Resource library

Explore our complete library of resources including blogs, benchmarks, research papers and more.
Blog

Evaluating Coding Agent Capabilities with Terminal-Bench: Snorkel’s Role in Building the Next Generation Benchmark

Announcing a $3M commitment to launch Open Benchmarks Grants
September 30, 2025
Blog

Closing the Evaluation Gap in Agentic AI

February 11, 2026
Blog

Benchtalks #1: Alex Shaw (Terminal-Bench, Harbor) – Building the Benchmark Factory

March 31, 2026
Blog

Building FinQA: An Open RL Environment for Financial Reasoning Agents

March 30, 2026
Blog

The science of rubric design

September 11, 2025
Research Paper
Snorkel MeTaL: Weak Supervision for Multi-Task Learning

Presenting Snorkel MeTaL, an end-to-end system for multi-task learning.

Dec 18, 2018
A. Ratner et al., 2018
Research Paper
Fonduer: Knowledge Base Construction From Richly Formatted Data

Introducing Fonduer, a machine-learning-based KBC system for richly formatted data.

Dec 17, 2018
S. Wu et al., 2018
Research Paper
Deep Text Mining of Instagram Data Without Strong Supervision

This paper showcases methods for unsupervised mining of fashion attributes from Instagram text, which can enable a new kind of user recommendation in the fashion domain.

Dec 16, 2018
K. Hammar et al., 2018
Research Paper
Snorkel: Fast Training Set Generation for Information Extraction

Introducing Snorkel, a new system for quickly creating, managing, and modeling training datasets.

Dec 20, 2017
A. Ratner et al., 2017
Research Paper
Learning to Compose Domain-Specific Transformations for Data Augmentation

Automating data augmentation by learning a generative sequence model over user-specified transformation functions.

Dec 19, 2017
A. Ratner et al., 2017
Research Paper
Learning the Structure of Generative Models Without Labeled Data

Proposing a structure estimation method for generative models that is 100x faster than a maximum likelihood approach.

Dec 18, 2017
S. Bach et al., 2017
Research Paper
Inferring Generative Model Structure With Static Analysis

Presenting Coral, a paradigm that infers generative model structure, significantly reducing the amount of data required to learn structure.

Dec 17, 2017
P. Varma et al., 2017
Research Paper
SwellShark: A Generative Model for Biomedical Named Entity Recognition Without Labeled Data

Introducing SwellShark, a framework for building biomedical named entity recognition (NER) systems quickly.

Nov 13, 2017
J. Fries et al., 2017
Research Paper
Socratic Learning: Augmenting Generative Models to Incorporate Latent Subsets in Training Data

Introducing Socratic learning, a paradigm that uses feedback from a discriminative model to automatically identify latent data subsets in training data.

Nov 13, 2017
P. Varma et al., 2017