startup

Surprise upset: GPT-5.5 beats Claude Fable 5 on brutal new Agents’ Last Exam benchmark

Published: June 11, 2026
Source: venturebeat.com

You are reading a summary. The full content is hosted on venturebeat.com.

UC Berkeley RDI and more than 300 experts have launched Agents’ Last Exam, a benchmark of long-horizon professional workflows across 55 industries, with mostly deterministic grading and anti-contamination controls. GPT-5.5 via Codex leads the leaderboard with a 24.0% pass rate, which underscores how poorly even top models still perform; many score 0.0% on the hardest tier.


