AI News 2025-12-25

Research Insights

LLM

  • OpenAI update: GPT-5.2-Codex.
  • METR reports a record-setting time on their task-length benchmark: Opus 4.5 reaches almost 5 hours (though the sparsity of available evaluations at this time horizon makes the data increasingly unreliable).
    • From this, we can update our predictions. Progress appears to continue along the established exponential, with the task-length time horizon doubling every 4-5 months.
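
To make the doubling arithmetic concrete, here is a minimal sketch of what a fixed doubling time implies if the trend simply continues. It assumes a pure exponential, rounds "almost 5 hours" up to 5.0 as a starting point, and uses hypothetical function names; it is an illustration of the extrapolation, not METR's methodology, and the caveat about sparse data at this horizon applies to any number it produces.

```python
import math


def projected_horizon_hours(current_hours: float, months_ahead: float,
                            doubling_months: float) -> float:
    """Extrapolate a task time horizon, assuming a fixed doubling time."""
    return current_hours * 2 ** (months_ahead / doubling_months)


def months_until(target_hours: float, current_hours: float,
                 doubling_months: float) -> float:
    """Months needed to reach target_hours under the same assumption."""
    return doubling_months * math.log2(target_hours / current_hours)


if __name__ == "__main__":
    start_hours = 5.0  # assumption: "almost 5 hours", rounded up
    for doubling in (4.0, 5.0):  # the two ends of the 4-5 month range
        one_year = projected_horizon_hours(start_hours, 12, doubling)
        to_40h = months_until(40.0, start_hours, doubling)
        print(f"doubling every {doubling:.0f} months: "
              f"{one_year:.1f} h after 12 months, "
              f"{to_40h:.1f} months to reach 40 h")
```

The gap between the 4-month and 5-month cases widens quickly, which is one reason the exact doubling time matters for longer-range predictions.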

AI Agents

Safety
