AI News 2025-01-09

General

Research Insights

  • PRIME: Process Reinforcement Through Implicit Rewards (data/models, code)
    • Builds on prior work: Free Process Rewards without Process Labels.
    • The basic idea: chain-of-thought (CoT) is a useful way to improve reasoning, but how do you train better CoT? You can score good vs. bad chains, but then the model only gets whole-chain feedback. It would be better to know where the reasoning chain went right or wrong. In PRIME, alongside the policy LLM, they train a second LLM that acts as an implicit per-token reward model: it learns which CoT steps look good vs. bad, and so provides finer-grained credit assignment (a minimal sketch appears after this list).
  • Differential Transformer. Explanation: standard transformer attention spreads over the whole context and can get distracted by noise (especially with long contexts). The differential architecture instead computes attention as the difference of two softmax attention maps, which amplifies relevant context and cancels noise. This should improve retrieval and reduce hallucinations, especially for long contexts (see the sketch after this list).
  • Metadata Conditioning Accelerates Language Model Pre-training. Prepending training documents with metadata (e.g. “from wikipedia.org”) for part of the training gives more control: training becomes more data-efficient, and inference can be steered by invoking the metadata field associated with the desired output style (a brief example follows this list).
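
A minimal sketch of the implicit per-token reward behind PRIME, following the “Free Process Rewards without Process Labels” formulation rather than the authors’ released code; the function and argument names are illustrative assumptions:

    import torch
    import torch.nn.functional as F

    def implicit_token_rewards(policy_logits, ref_logits, response_tokens, beta=0.05):
        """Per-token implicit reward r_t = beta * log[pi_theta(y_t|y_<t) / pi_ref(y_t|y_<t)].

        policy_logits, ref_logits: (seq_len, vocab) logits of the trained reward model
        and a frozen reference model over the sampled chain-of-thought tokens.
        response_tokens: (seq_len,) token ids of that chain of thought.
        High-ratio positions mark reasoning steps the reward model considers good.
        """
        logp_policy = F.log_softmax(policy_logits, dim=-1)
        logp_ref = F.log_softmax(ref_logits, dim=-1)
        tok = response_tokens.unsqueeze(-1)
        per_token = logp_policy.gather(-1, tok) - logp_ref.gather(-1, tok)
        return beta * per_token.squeeze(-1)  # (seq_len,) fine-grained RL signal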
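
For concreteness, a single-head sketch of differential attention; the paper uses multiple heads, GroupNorm, and a learnable lambda, so the fixed lam below is a simplification:

    import math
    import torch

    def differential_attention(x, Wq1, Wk1, Wq2, Wk2, Wv, lam=0.5):
        """Compute attention as the difference of two softmax attention maps.

        Noise that shows up in both maps cancels out, while signal that only one
        map emphasizes is preserved, sharpening retrieval in long contexts.
        x: (seq_len, d_model); Wq*/Wk*/Wv: projection matrices.
        """
        q1, k1, q2, k2, v = x @ Wq1, x @ Wk1, x @ Wq2, x @ Wk2, x @ Wv
        d = q1.shape[-1]
        a1 = torch.softmax(q1 @ k1.transpose(-1, -2) / math.sqrt(d), dim=-1)
        a2 = torch.softmax(q2 @ k2.transpose(-1, -2) / math.sqrt(d), dim=-1)
        return (a1 - lam * a2) @ v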
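
And a toy illustration of the metadata-conditioning recipe (the tag format and schedule details below are assumptions, not the paper’s exact setup):

    def format_example(doc_text, source_url, progress, conditioned_fraction=0.9):
        """Prepend metadata to a training document for the first part of training.

        progress: fraction of training completed (0.0-1.0). For the final
        (1 - conditioned_fraction) of training the tag is dropped, so the model
        also behaves normally when no metadata is supplied at inference time.
        """
        if progress < conditioned_fraction:
            return f"URL: {source_url}\n\n{doc_text}"
        return doc_text

    # At inference, steer output style by invoking the desired metadata field:
    prompt = "URL: wikipedia.org\n\nThe differential transformer is"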

LLM

AI Agents

Video

  • Fine-tuning of video models to particular styles is now starting to appear; see these examples of Hunyuan Video LoRAs.
  • Nvidia’s new GeForce RTX 5090 graphics card can use neural rendering for real-time ray-tracing (where only ~10% of pixels are computed using traditional ray-tracing, and a neural model is used to interpolate from that).

World Synthesis

  • Nvidia presents Cosmos, a set of foundation models trained on 20 million hours of video, intended to accelerate training (e.g. via synthetic data generation) of models for robotics, autonomous driving, industrial settings, etc.

Science

Brain

Hardware

  • Nvidia described their GB200 NVL72 rack-sized supercomputer: 72 Blackwell GPUs, 1.4 exaFLOPS of compute, and 130 trillion transistors. For fun, Jensen Huang showed what the corresponding compute would look like if it were all placed on a single wafer as a superchip, though that is not how it is actually manufactured or used.
  • Nvidia announces a $3,000 personal AI supercomputer called Digits, built around a GB10 superchip. A single unit can run a 200B-parameter model; linking two should allow running 405B-parameter models (rough memory arithmetic sketched below).
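
A rough back-of-the-envelope check on those model sizes (the 4-bit weight format is an assumption; KV cache and activations are ignored):

    def weight_memory_gb(n_params, bits_per_weight=4):
        """Approximate memory needed for model weights alone."""
        return n_params * bits_per_weight / 8 / 1e9

    print(weight_memory_gb(200e9))  # ~100 GB of weights: plausible on a single unit
    print(weight_memory_gb(405e9))  # ~203 GB: hence two linked units for 405B models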

Robots
