Research

Snorkel AI emerged from a research project, and we remain closely connected to the research community. Students and professors associated with the Snorkel project continue to publish academic papers that push the field forward, and the Snorkel AI research team integrates the most promising of those ideas into our platform.

Our picks

Getting better performance from foundation models (with less data)
August 4, 2023
Fred Sala
Snorkel AI researchers present 18 papers at NeurIPS 2023
The Snorkel AI team will present 18 research papers and talks at the 2023 Neural Information Processing Systems (NeurIPS) conference from December 10-16. The Snorkel papers cover a broad range of topics including fairness, semi-supervised learning, large language models (LLMs), and domain-specific models. Snorkel AI is proud of its roots in the research community and endeavors to remain at the forefront
October 31, 2023
Team Snorkel
Long context models in the enterprise: benchmarks and beyond
Snorkel researchers devised a new way to evaluate long context models and address their “lost-in-the-middle” challenges with medoid voting.
June 6, 2024
Amanda Dsouza

All articles on Research

How Skill-it! enables faster, better LLM training
Humans learn tasks better when taught in a logical order. So do LLMs. Researchers developed a way, called “Skill-it!”, to exploit this tendency.
March 12, 2024
Fred Sala
Large language model training: how three training phases shape LLMs
Training large language models is a multi-layered stack of processes, each with its unique role and contribution to the model’s performance.
February 27, 2024
Stephen Bach
LoRA: Low-Rank Adaptation for LLMs
Low-rank adaptation (LoRA) lets data scientists customize GenAI models like LLMs faster than traditional full fine-tuning methods.
February 21, 2024
Matt Casey
New benchmark results demonstrate value of Snorkel AI approach to LLM alignment
Snorkel researchers’ state-of-the-art methods created a 7B LLM that ranked 2nd, behind only GPT-4 Turbo, on the AlpacaEval 2.0 leaderboard.
January 24, 2024
Cate Lochead
Retrieval augmented generation (RAG): a conversation with its creator
Snorkel CEO Alex Ratner spoke with Douwe Kiela, an author of the original paper about retrieval augmented generation (RAG).
January 16, 2024
Team Snorkel
Stanford professor discusses exciting advances in foundation model evaluation
Snorkel CEO Alex Ratner chatted with Stanford Professor Percy Liang about evaluation in machine learning and in AI generally.
January 2, 2024
Team Snorkel
Snorkel AI researchers present 18 papers at NeurIPS 2023
The Snorkel AI team will present 18 research papers and talks at the 2023 Neural Information Processing Systems (NeurIPS) conference from December 10-16. The Snorkel papers cover a broad range of topics including fairness, semi-supervised learning, large language models (LLMs), and domain-specific models. Snorkel AI is proud of its roots in the research community and endeavors to remain at the forefront
October 31, 2023
Team Snorkel
Two approaches to distill LLMs for better enterprise value
Distillation techniques allow enterprises to access the full predictive power of large language models at a tiny fraction of their cost.
October 31, 2023
Jason Fries
Bloomberg’s Gideon Mann on the power of domain specialist LLMs
Gideon Mann, head of ML Product and Research at Bloomberg LP, chatted with Snorkel CEO Alex Ratner about building BloombergGPT.
October 17, 2023
Team Snorkel
Which is better, retrieval augmentation (RAG) or fine-tuning? Both.
Professionals in the data science space often debate whether RAG or fine-tuning yields the better result. The answer is “both.”
September 20, 2023
Hoang Tran
Former U.S. Chief Data Scientist on past and future of data science
Past U.S. Chief Data Scientist DJ Patil talked with Snorkel AI CEO Alex Ratner on topics including the origin of the title “data scientist.”
September 12, 2023
Team Snorkel
4 new papers show foundation models can build on themselves
The surest way to improve foundation models is through more and better data, but Snorkel researchers showed FMs can learn from themselves.
August 31, 2023
Fred Sala
Accelerating predictive task time to value with generative AI
Generative AI can write poems, recite common knowledge, and extract information. GenAI can also help quickly build predictive pipelines.
August 17, 2023
Bradley Fowler
Getting better performance from foundation models (with less data)
August 4, 2023
Fred Sala
Data fuels enterprise AI value: 6 takeaways from the Gartner Hype Cycle for Artificial Intelligence, 2023
GenAI may be the most transformative technology of the past decade, but data is where enterprises realize real value from AI today.
August 2, 2023
Matt Casey