AI News 2024-12-26

General

Research Insights

LLM

  • OpenAI revealed a new reasoning model: o3. It scores higher than earlier models on math and coding benchmarks, and sets a new record of 87.5% on the ARC-AGI Semi-Private Evaluation. This suggests that the model is exhibiting new kinds of generalization and adaptability.
    • The ARC-AGI result becomes even more impressive when one realizes that the prompt they used was incredibly simple. They do not appear to have used prompt engineering or a bespoke workflow for this benchmark (although the ARC-AGI public training set was included in o3 training). Moreover, some of the failures involve ambiguous tasks, and even when it fails, the solutions it outputs are not far off. While humans still outperform AI on this benchmark (by design), we are approaching the situation where the problem is not depth of search, but rather imperfect mimicking of human priors.
    • The success of o3 suggests that inference-time scaling still has plenty of headroom, and that we are not yet hitting a wall in terms of improving capabilities.
  • More research continues the trend of improving LLMs with more internal compute, rather than external/token-level compute (cf. Meta and Microsoft research).
  • Qwen released QvQ-72B-preview, a visual reasoning model.
  • DeepSeek released DeepSeek-V3-Base (weights), with 671B params. This is noteworthy for three reasons: it is a very large open-source model, its performance is competitive with the state of the art, and it (supposedly) required relatively little compute (15T tokens, 2.788M GPU-hours on H800, only $5.5M).
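
To put the DeepSeek-V3 training figures in perspective, here is a back-of-envelope calculation that uses only the three numbers quoted above (15T tokens, 2.788M GPU-hours, $5.5M). The derived quantities (implied hourly rate, tokens per GPU-hour, cost per million tokens) and the variable names are my own illustration, not figures from DeepSeek, and the $5.5M input is a rounded headline number.

```python
# Figures quoted in the news item above.
tokens_trained = 15e12      # 15T training tokens
gpu_hours = 2.788e6         # 2.788M H800 GPU-hours
reported_cost_usd = 5.5e6   # rounded headline training cost

# Derived quantities: illustrative arithmetic only, not reported values.
cost_per_gpu_hour = reported_cost_usd / gpu_hours
tokens_per_gpu_hour = tokens_trained / gpu_hours
cost_per_million_tokens = reported_cost_usd / (tokens_trained / 1e6)

print(f"Implied rental rate:     ${cost_per_gpu_hour:.2f} per GPU-hour")
print(f"Training throughput:     {tokens_per_gpu_hour / 1e6:.2f}M tokens per GPU-hour")
print(f"Cost per million tokens: ${cost_per_million_tokens:.3f}")
```

The implied rate of roughly $2 per GPU-hour shows that the headline cost is essentially a rental-price estimate for the final training run; it says nothing about hardware purchases, earlier experiments, or staff costs.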

Safety

Video

Audio

  • Adobe Sketch2Sound allows one to imitate a sound effect, then uses AI to convert the imitation into an appropriate sound. This enables art direction of Foley sound.
  • MMAudio enables video-to-audio; i.e. it can add a soundtrack to silent video.

World Synthesis

Science

Hardware

  • Nvidia unveils a small form-factor compute platform (suitable for robotics).
  • Raven Resonance is another attempt to deliver augmented reality glasses.

Robots
