AI News 2025-06-12

General

Analysis of the location of datacenters (750 AI supercomputers): Trends in AI Supercomputers.

Research Insights

Prompting Science Report 2: The Decreasing Value of Chain of Thought in Prompting. Reasoning models don’t benefit from “think step by step” prompting.
Epoch AI was given access to the full reasoning traces of o3-mini (normally only summaries are shown to the user) to conduct this research: Beyond benchmark scores: Analyzing o3-mini’s mathematical reasoning. Mathematicians reviewed traces of the model working on math problems; one evaluator described o3-mini as a “vibes-based inductive reasoner”.
Corrector Sampling in Language Models. Resampling prior tokens can be used to do small amounts of backtracking and thereby improve performance.
SPARTA ALIGNMENT: Collectively Aligning Multiple Language Models through Combat. The models compete and evaluate each other, without requiring external (human) scoring; models are updated on the computed preference ordering. This seems to provide a simple way to recursively self-improve.
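To make the peer-evaluation idea in the SPARTA item concrete, here is a minimal sketch of turning model-vs-model duels into a preference ordering. It is an illustration only, not the paper's method: the majority-vote aggregation, the `preference_ordering` function, the `toy_judge` stand-in (which just prefers longer answers), and the model names are all assumptions made for this example.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

# Hypothetical judge signature: (voter_name, answer_a, answer_b) -> True if answer_a is preferred.
Judge = Callable[[str, str, str], bool]


def preference_ordering(answers: Dict[str, str], judge: Judge) -> List[Tuple[str, int]]:
    """Rank models by duel wins; each duel is decided by the other models' votes."""
    wins = {name: 0 for name in answers}
    for a, b in combinations(answers, 2):
        # Only models outside the duel vote, so no model scores its own answer.
        voters = [m for m in answers if m not in (a, b)]
        votes_for_a = sum(judge(v, answers[a], answers[b]) for v in voters)
        if 2 * votes_for_a > len(voters):
            wins[a] += 1
        elif 2 * votes_for_a < len(voters):
            wins[b] += 1
        # An even split leaves both win counts unchanged.
    # Sort by wins (descending), breaking ties by name for determinism.
    return sorted(wins.items(), key=lambda kv: (-kv[1], kv[0]))


def toy_judge(voter: str, answer_a: str, answer_b: str) -> bool:
    # Stand-in for a real model-based judge (assumption): prefer the longer answer.
    return len(answer_a) > len(answer_b)


answers = {
    "model_a": "short",
    "model_b": "a longer answer",
    "model_c": "the longest answer of all",
    "model_d": "mid answer",
}
ranking = preference_ordering(answers, toy_judge)
print(ranking)
```

In a real setup the judge would be a call to each peer model, and the resulting ordering would supply (preferred, rejected) pairs for a preference-based update; how the paper actually aggregates judgments may differ from this win-count scheme.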

LLM

Anthropic adds Claude Gov; models intended for national security.
Mistral announces Magistral, a reasoning model. Two variants: 24B open-source or a larger enterprise version via API.
- An interesting result from the report (section 7.2: Eating the multimodal free lunch): The base model is multi-modal, but RL is done using text only. Yet they observe that this text-only training does not harm multi-modal performance; in fact, multi-modal performance improves. This suggests modalities are well-entangled and that transfer learning between modalities is naturally occurring.
OpenAI announced the release of o3-pro (release notes).
- Review from an early tester: God is hungry for Context: First thoughts on o3 pro.

Vision

Meta announces V-JEPA 2 (paper), a vision model that builds a world model and could be useful for robotic control.

Audio

ElevenLabs introduces v3, an expressive text-to-speech system that supports intonation, accent, and even non-words like laughs and sighs or sound effects (examples: joke, affecting accents, various).

World Synthesis

4DV is demoing 4D Gaussian Splatting, wherein multi-camera video data is converted into a temporal/video 3D-spatial reconstruction (videos showing interaction: 1, 2, 3).
- FreeTimeGS: Free Gaussian Primitives at Anytime Anywhere for Dynamic Scene Reconstruction (preprint).

Science

Cars

Tesla has provided some updated details on their current “full self-driving” (FSD) implementation. Some claims: 3.5B miles driven by FSD across 6 million vehicles, 54% safer than human drivers.

Robots

Video of Figure 02 robot sorting deformable packages. (Uncut 1-hour video of this activity, showing the prior footage was not cherry-picked.)
1X announces Redwood, their vision-language transformer model for robot control.