We define and advance data and environments to push the AI frontier

Built on 10+ years of pioneering research in data-centric AI, including 250+ publications and benchmarks.

key research areas

Vision and impact

We help labs advance frontier models by working with domain experts to design and build complex, realistic datasets that drive model performance.

initiatives

Community and open science

Open benchmarks, conversations, and research for real-world AI performance.

Open Benchmarks Grants

Backed by a $3M commitment, the program funds open-source datasets, benchmarks, and evaluation artifacts that shape how frontier AI systems are built and evaluated.

Bench Talks

Our podcast series at the intersection of AI evaluation, data quality, and real-world impact.

Reading Group

A recurring forum for researchers and practitioners to explore the latest frontier developments in AI while building meaningful connections within the community.

Deep research expertise

Technical advisors and distinguished affiliates

Stephen Bach

Brown University
Eliot Horowitz Assistant Professor, Computer Science Department

Jason Fries

Stanford University
Assistant Professor of Biomedical Data Science and of Medicine

Jared Dunnmon

Co-Founder & Chief Scientist, Stealth Startup
Prev. Dir. of AI at DIU

Fred Sala

Chief Scientist, Snorkel AI
Assistant Professor @ University of Wisconsin-Madison

Chris Ré

Co-Founder, Snorkel AI
Professor @ Stanford University

Ludwig Schmidt

Stanford University · LAION
Stanford researcher and LAION collaborator

Karthik Narasimhan

Princeton University
Professor of Computer Science

Yu Su

Ohio State University
Associate Professor of Computer Science and Engineering

Lewis Tunstall

Hugging Face
Machine Learning Engineer
PUBLICATIONS

Browse research blogs and academic papers

Snorkel: Rapid Training Data Creation With Weak Supervision
Research Paper

This paper presents a flexible interface layer for writing labeling functions, informed by the authors' experience collaborating with companies, agencies, and research labs.

Oct 04, 2017
Alexander Ratner, Stephen H Bach, Henry Ehrenberg, Jason Fries, Sen Wu, Christopher Ré
Data Programming: Creating Large Training Sets, Quickly
Research Paper

A paradigm for labeling training datasets programmatically rather than by hand.

Dec 20, 2016
A. Ratner et al., 2016
Data Programming With DDLite: Putting Humans in a Different Part of the Loop
Research Paper

Introducing DDLite, an interactive development framework for data programming.

Dec 19, 2016
H. Ehrenberg et al., 2016

Let’s research together

Join our team of leading researchers and help shape the future of AI.