Artificial intelligence is advancing rapidly, but our ability to measure what these systems can actually do—and the risks they may pose—has lagged behind. Headline benchmarks and viral demos offer snapshots of a system's performance, but they say little about how AI behaves in complex real-world settings or how much autonomy models can sustain over time. As these systems take on more consequential roles, the challenge is not just building more powerful models, but developing credible ways to evaluate their capabilities and limits.
Chris Painter, president of Model Evaluation and Threat Research (METR), joins Oren to discuss how researchers are building new frameworks to assess AI systems and what those efforts reveal about the trajectory of machine intelligence. They explore “time horizon” as a measure of autonomy, the difficulty of evaluating alignment and sabotage risks, and the constraints posed by compute and organizational bottlenecks. They also consider what it will look like when AI systems begin contributing even more to their own development and their capabilities outpace our ability to measure them.
Measuring Machine Intelligence with Chris Painter
How should we evaluate AI’s capabilities, risks, and autonomy as systems grow more powerful?
Apr 17, 2026
The American Compass Podcast
Conversations aimed at developing the conservative economic agenda to supplant blind faith in free markets with a focus on workers, their families and communities, and the national interest.