Chris Ré

Co-Founder, Snorkel AI
Professor @ Stanford University

I’m a professor in the Stanford AI Lab (SAIL), the Center for Research on Foundation Models (CRFM), and the Machine Learning Group (bio). Our lab works on the foundations of the next generation of AI systems.

  • On the AI side, I am fascinated by how we can learn from increasingly weak forms of supervision, the basis of new architectures, the role of data, and by the mathematical foundations of such techniques.
  • On the systems side, I am broadly interested in how machine learning is changing how we build software and hardware. I’m particularly excited when we can blend AI and systems, e.g., Snorkel, Overton (YouTube), or Together.

Our work is inspired by the observation that data is central to these systems, and so data management principles (re-imagined) play a starring role in our work. This sounds like Silicon Valley nonsense, but oddly enough, these ideas get used due to amazing students and collaborations with Google Ads, YouTube, Apple, and more.
While we’re very proud of our research ideas and their impact, the lab’s real goal is to help students become professors, entrepreneurs, and researchers. To that end, over a dozen members of our group have started their own professorships. With students and collaborators, I’ve been fortunate enough to cofound a number of companies and a venture firm. For transparency, I try to list companies I advise or invest in here and our research sponsors here. My students run the ML Sys Podcast.

The latest from Chris

Efficiently Modeling Long Sequences with Structured State Spaces
This paper introduces the Structured State Space sequence model (S4), which uses a new parameterization for the state-space model to improve long-range dependency handling both mathematically and empirically.
Research Paper

Mar 29, 2022

A. Gu, et al.

Cross-Modal Data Programming Enables Rapid Medical Machine Learning
This paper proposes cross-modal data programming (XMDP) for machine learning (ML) in medicine.
Research Paper

Nov 14, 2020

J. Dunnmon, et al., 2020

Train and You’ll Miss It: Interactive Model Iteration With Weak Supervision…
This paper provides a series of results studying how performance scales with changes in source coverage, source accuracy, and the Lipschitzness of label distributions in the embedding space, and compares this rate to standard weak supervision.
Research Paper

Nov 13, 2020

M. Chen, et al., 2020

Low-Dimensional Hyperbolic Knowledge Graph Embeddings
Knowledge graph (KG) embeddings learn low-dimensional representations of entities and relations to predict missing facts. KGs often exhibit hierarchical and logical patterns which must be preserved in the embedding space. For hierarchical data, hyperbolic embedding methods have shown promise for high-fidelity and parsimonious representations. However, existing hyperbolic embedding methods do not account for the rich logical patterns in KGs. In this work, we introduce a class of hyperbolic KG embedding models that simultaneously capture hierarchical and logical patterns. Our approach combines hyperbolic reflections and rotations with attention to model complex relational patterns. Experimental results on standard KG benchmarks show that...
Research Paper

Jul 05, 2020

I. Chami, et al.

Ivy: Instrumental Variable Synthesis for Causal Inference
A popular way to estimate the causal effect of a variable x on y from observational data is to use an instrumental variable (IV): a third variable z that affects y only through x. The more strongly z is associated with x, the more reliable the estimate is, but such strong IVs are difficult to find. Instead, practitioners combine more commonly available IV candidates—which are not necessarily strong, or even valid, IVs—into a single "summary" that is plugged into causal effect estimators in place of an IV. In genetic epidemiology, such approaches are known as allele scores. Allele scores require...
Research Paper

Jun 02, 2020

Z. Kuang, et al.

Extracting chemical reactions from text using Snorkel
Enzymatic and chemical reactions are key for understanding biological processes in cells. Curated databases of chemical reactions exist but these databases struggle to keep up with the exponential growth of the biomedical literature. Conventional text mining pipelines provide tools to automatically extract entities and relationships from the scientific literature, and partially replace expert curation, but such machine learning frameworks often require a large amount of labeled training data and thus lack scalability for both larger document corpora and new relationship types. We developed an application of Snorkel, a weakly supervised learning framework, for extracting chemical reaction relationships from biomedical literature...
Research Paper

May 27, 2020

E. Mallory, et al.

Utilizing Weak Supervision to Infer Complex Objects in Autonomous Driving Data
This paper explores the applicability of weak supervision, that is, relying on higher-level, noisier forms of supervision to label training data, specifically using data programming.
Research Paper

Dec 19, 2019

Z. Weng, et al., 2019

Training Complex Models with Multi-Task Weak Supervision
This paper proposes a framework for integrating and modeling weak supervision sources by viewing them as labeling different related sub-tasks of a problem, which the authors refer to as the multi-task weak supervision setting.
Research Paper

Dec 18, 2019

A. Ratner, et al., 2019

The Role of Massively Multi-Task and Weak Supervision in Software 2.0
This paper outlines a vision for a Software 2.0 lifecycle centered around the idea that labeling training data can be the primary interface to Software 2.0 systems.
Research Paper

Dec 17, 2019

A. Ratner, et al., 2019

