June 22, 2026

Observability: Role of Evals, Benchmarks & Data in Frontier AI | Alex Ratner from Snorkel AI

Alex Ratner

Snorkel AI

TYPE: Videos

YEAR: 2026

The excitement around agentic AI is real — backed by quantitative progress on model cards and genuine leaps in capability. But our ability to measure AI has been outpaced by our ability to develop it, and closing this evaluation gap is one of the most important problems facing the field. More enduring benchmarks are needed to advance the next vectors of capability and chart the path to reliable agents.

In this talk, Snorkel AI Co-Founder and CEO Alex Ratner will share insights from major research and benchmark collaborations on agentic coding and continual learning, along with practical tips from working with global frontier labs and leading academics. He’ll focus on three dimensions where today’s models most often break down, and where the next generation of benchmarks will need to deliver real signal: environment complexity (how dynamic and rich the operating world is), autonomy horizon (how far an agent can act independently), and output complexity (how sophisticated and verifiable the deliverable is).
