Snorkel AI

By: Snorkel Team

Published: May 16, 2026

Recommended articles

Why Coding Agents Need Better Data, Evals, and Environments

Coding agents have moved from tab-complete to teammate. They autonomously inspect repositories, edit files, run commands, diagnose failures, and work through multi-step engineering tasks. That creates a harder reliability problem. A model that only suggests code is easy for a human to evaluate. A coding agent refactoring your repository and testing its own changes is much harder to supervise –…

Justin Bauer

May 6, 2026

Research

Benchmarks should shape the frontier, not just measure it

Since launching the Open Benchmarks Grants, we’ve received more than 100 applications from academic groups and industry labs spanning a wide range of domains and capabilities. As the best benchmarks have driven how the field allocates research effort, the bar for what counts as useful has risen. Here, we share what’s now table stakes for useful benchmarks, and what separates…

Vincent Sunn Chen

April 6, 2026

Research

Benchtalks #1: Alex Shaw (Terminal-Bench, Harbor) – Building the Benchmark Factory

To kick off our inaugural Benchtalks, a series dedicated to the researchers building these measurement toolkits, Snorkel AI co-founder Vincent Sunn Chen sat down with Alex Shaw, Founding MTS at Laude Institute and co-creator of Terminal-Bench and Harbor. Highlights More on Terminal-Bench: See the leaderboard and the catalog of tasks at tbench.ai. Explore Harbor: Learn how to scale your agent…

Vincent Sunn Chen

March 31, 2026

