clearpathinsight.org
AI in Technology

This is the most misunderstood graph in AI

February 7, 2026

This was certainly the case with Claude Opus 4.5, the latest version of Anthropic’s most powerful model, released at the end of November. In December, METR reported that Opus 4.5 appeared to be able to autonomously complete a task that would have taken a human about five hours – a vast improvement over what even the exponential trend would have predicted. An Anthropic security researcher tweeted that he would change the direction of his research in light of these findings; another employee of the company simply wrote: “Mom, come get me, I’m scared.”

But the truth is more complex than these dramatic responses suggest. For one thing, METR's estimates of specific model capabilities have substantial error bars. As METR explicitly stated, given the uncertainties inherent in the method, it was impossible to be sure.

“There are many reasons why people overinterpret the chart,” says Sydney Von Arx, a member of the METR technical team.

More fundamentally, the METR chart does not measure AI capabilities as a whole, nor does it claim to do so. In order to build the chart, METR tests the models primarily on coding tasks, rating the difficulty of each by measuring or estimating how long it takes humans to complete it, a metric that not everyone accepts. Claude Opus 4.5 might be able to complete some tasks that take humans five hours to complete, but that doesn’t mean it’s close to replacing a human worker.

METR was founded to assess the risks posed by frontier AI systems. Although it is best known for its exponential trend plot, it has also worked with AI companies to evaluate their systems in more detail and has published several other independent research projects, including a widely covered July 2025 study suggesting that AI coding assistants might actually slow down software engineers.

But the exponential plot made METR’s reputation, and the organization seems to have a complicated relationship with the graph’s often breathless reception. In January, Thomas Kwa, one of the main authors of the paper that introduced it, wrote a blog post responding to certain criticisms and explaining its limitations, and METR is currently working on a more complete FAQ document. But Kwa is not optimistic that these efforts will significantly change the narrative. “I think whatever we do, the hype machine will just eliminate all the caveats,” he says.

Nonetheless, the METR team believes the plot has something significant to say about the trajectory of AI progress. “You absolutely should not tie your life to this chart,” says Von Arx. “But also,” she adds, “I bet this trend will continue.”
