The Frontier AI Data Lab
Snorkel AI is the frontier AI data lab helping teams build the data and environments behind high-performing frontier models and agentic AI. Our platform, research, and embedded delivery model combine to create the datasets, environments, evaluation frameworks, and custom solutions that power real-world AI systems.
Founded out of the Stanford AI Lab in 2019.
Safely define and advance the frontier of AI with expert data development

From Stanford AI Lab to frontier data research
Snorkel started with a contrarian bet at the Stanford AI Lab: that training data, not models or compute, would decide whether machine learning worked. The 2017 VLDB paper that introduced Snorkel turned that bet into a system: data programming, weak supervision, and the radical idea that you could build state-of-the-art models without hand-labeling a single example.
A decade later, with 250+ publications and awards at NeurIPS, ICML, ICLR, UAI, VLDB, and more, the bet has only gotten stronger as models have gotten bigger.

Do the best work of your career

