Research

Snorkel AI emerged from a research project, and we remain closely connected to the research community. Students and professors associated with the Snorkel project continue to publish academic papers that push the field forward, and the Snorkel AI research team integrates the most promising of those ideas into our platform.

Our picks

Getting better performance from foundation models (with less data)

August 4, 2023

•

Fred Sala

Snorkel AI researchers present 18 papers at NeurIPS 2023

The Snorkel AI team will present 18 research papers and talks at the 2023 Neural Information Processing Systems (NeurIPS) conference from December 10-16. The Snorkel papers cover a broad range of topics including fairness, semi-supervised learning, large language models (LLMs), and domain-specific models. Snorkel AI is proud of its roots in the research community and endeavors to remain at the forefront

October 31, 2023

•

Team Snorkel

Long context models in the enterprise: benchmarks and beyond

Snorkel researchers devised a new way to evaluate long context models and address their “lost-in-the-middle” challenges with medoid voting.

June 6, 2024

•

Amanda Dsouza

All articles on Research

How Skill-it! enables faster, better LLM training

Humans learn tasks better when taught in a logical order. So do LLMs. Researchers developed a way to exploit this tendency called “Skill-it!”

March 12, 2024

•

Fred Sala

Large language model training: how three training phases shape LLMs

Training large language models is a multi-layered stack of processes, each with its unique role and contribution to the model’s performance.

February 27, 2024

•

Stephen Bach

LoRA: Low-Rank Adaptation for LLMs

Low-rank adaptation (LoRA) lets data scientists customize GenAI models like LLMs faster than traditional full fine-tuning methods.

February 21, 2024

•

Matt Casey

New benchmark results demonstrate value of Snorkel AI approach to LLM alignment

Snorkel researchers’ state-of-the-art methods created a 7B LLM that ranked 2nd, behind only GPT-4 Turbo, on the AlpacaEval 2.0 leaderboard.

January 24, 2024

•

Cate Lochead

Retrieval augmented generation (RAG): a conversation with its creator

Snorkel CEO Alex Ratner spoke with Douwe Kiela, an author of the original paper about retrieval augmented generation (RAG).

January 16, 2024

•

Team Snorkel

Stanford professor discusses exciting advances in foundation model evaluation

Snorkel CEO Alex Ratner chatted with Stanford Professor Percy Liang about evaluation in machine learning and in AI generally.

January 2, 2024

•

Team Snorkel

Snorkel AI researchers present 18 papers at NeurIPS 2023

October 31, 2023

•

Team Snorkel

Two approaches to distill LLMs for better enterprise value

Distillation techniques allow enterprises to access the full predictive power of large language models at a tiny fraction of their cost.

October 31, 2023

•

Jason Fries

Bloomberg’s Gideon Mann on the power of domain specialist LLMs

Gideon Mann, head of ML Product and Research at Bloomberg LP, chatted with Snorkel CEO Alex Ratner about building BloombergGPT.

October 17, 2023

•

Team Snorkel

Which is better, retrieval augmentation (RAG) or fine-tuning? Both.

Professionals in the data science space often debate whether RAG or fine-tuning yields the better result. The answer is “both.”

September 20, 2023

•

Hoang Tran

Former U.S. Chief Data Scientist on past and future of data science

Past U.S. Chief Data Scientist DJ Patil talked with Snorkel AI CEO Alex Ratner on topics including the origin of the title “data scientist.”

September 12, 2023

•

Team Snorkel

4 new papers show foundation models can build on themselves

The surest way to improve foundation models is through more and better data, but Snorkel researchers showed FMs can learn from themselves.

August 31, 2023

•

Fred Sala

Accelerating predictive task time to value with generative AI

Generative AI can write poems, recite common knowledge, and extract information. GenAI can also help quickly build predictive pipelines.

August 17, 2023

•

Bradley Fowler

Getting better performance from foundation models (with less data)

August 4, 2023

•

Fred Sala

Data fuels enterprise AI value: 6 takeaways from the Gartner Hype Cycle for Artificial Intelligence, 2023

GenAI may be the most transformative technology of the past decade, but data is where enterprises can realize real value from AI today.

August 2, 2023

•

Matt Casey
