AI News 2025-06-19

General

The U.S. Army Reserve has formed a new unit, Detachment 201: Executive Innovation Corps. Its members (including OpenAI CPO Kevin Weil, Palantir CTO Shyam Sankar, Meta CTO Andrew Bosworth, and Bob McGrew) will advise the Army on technology issues.
Epoch reports continued progress in AI, including on their hard FrontierMath benchmark.

OpenAI has been awarded a US Department of Defense contract for $200M to develop AI models for defense applications.
Future of Work with AI Agents: Auditing Automation and Augmentation Potential across the U.S. Workforce. The paper offers an interesting delineation of jobs.

Research Insights

Distillation Robustifies Unlearning (preprint, demo, discussion). Normal unlearning suppresses knowledge in a model, but adversarial prompting or fine-tuning can bring the knowledge/behavior back. They show that distilling into a new model more reliably eradicates the undesired information.
Self-Adapting Language Models. The models generate their own fine-tuning data and update directives.
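
The distillation idea in the unlearning item above can be illustrated with a generic knowledge-distillation loss. This is a minimal sketch and not the paper's method: the function names, the temperature value, and the toy logits are all assumptions made for illustration. The point it shows is that a student trained this way only ever sees the teacher's outputs, so it has no direct access to whatever the teacher's weights still encode.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened output distributions.

    Hypothetical helper: a generic distillation objective, not the
    specific procedure used in the preprint.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy logits (made up): the teacher is the model that has already been unlearned.
teacher = [2.0, 0.5, -1.0]
student = [0.1, 0.2, 0.3]
gap = distillation_loss(teacher, student)      # positive: student disagrees
matched = distillation_loss(teacher, teacher)  # zero: outputs match exactly
```

In practice this loss would be minimized over the student's parameters with a gradient-based optimizer; many formulations also scale it by the squared temperature, which is omitted here for brevity.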

LLM

Google is nearing release of Gemini 2.5 Pro Deep Think, which deploys more inference-time compute to improve reasoning.
Google launched Gemini 2.5 Flash-Lite, a very fast, very low-cost reasoning model.
Google DeepMind technical report: Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities. The paper lists a thousand contributors. (Informal summary.)

Agents

Anthropic describes the multi-agent system underlying Claude’s research capabilities: How we built our multi-agent research system.

Safety

Anthropic blog post: SHADE-Arena: Evaluating sabotage and monitoring in LLM agents (paper).
Avoiding Obfuscation with Prover-Estimator Debate. They show that honesty is incentivized at equilibrium (under certain conditions).
OpenAI: Toward understanding and preventing misalignment generalization.
- Paper: Persona Features Control Emergent Misalignment.
- They show that intentional misalignment training (e.g. to write bad code) causes an emergent “evil” personality. But this can be detected and countered.

Video

Science

Hardware

AMD unveiled its new MI350 chip, optimized for AI workloads. They are focusing on open/compliant coding standards, and energy/cost efficiency.

Cars

Data from Waymo: New Insights for Scaling Laws in Autonomous Driving. They show scaling laws apply to autonomous driving: using more data and more compute for training yields reliable improvements in performance.
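
A scaling law of this kind is usually summarized as a power law relating training compute (or data) to an error metric. The sketch below fits such a curve by least squares in log-log space. It is only an illustration: the functional form, the helper name `fit_power_law`, and the numbers are assumptions, and none of the data comes from Waymo.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**(-b) in log-log space; returns (a, b)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope

# Synthetic, made-up measurements generated from error = 2.0 * compute**(-0.05).
compute = [1e18, 1e19, 1e20, 1e21]
error = [2.0 * c ** -0.05 for c in compute]

a, b = fit_power_law(compute, error)

# Extrapolate the fitted curve to a larger (hypothetical) compute budget.
predicted_error = a * 1e22 ** -b
```

A straight line in log-log space is what makes the improvements "reliable": once the exponent is estimated from smaller runs, the fitted curve gives a forecast for larger ones, subject to the usual caveat that extrapolation can break down outside the measured range.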

Robots

RoboBrain 2.0 is an open-source, general-purpose robot control model (video).
1X World Model: Evaluating Bits, not Atoms (preprint).
Generalist shows off a model that enables relatively simple robot arms to perform precise work.
Hexagon Robotics announces Aeon, a wheeled humanoid robot optimized for industrial work (video).