General
- OpenAI adds shopping research to ChatGPT.
- Anthropic: Estimating AI productivity gains from Claude conversations.
Safety
- Anthropic: From shortcuts to sabotage: natural emergent misalignment from reward hacking.
- Paper: Natural Emergent Misalignment from Reward Hacking in Production RL.
- They find that when models learn to reward hack, this becomes entangled with other undesired behaviors. Interestingly, changing the RL system prompt to explicitly allow reward hacking decouples it from those other behaviors. They frame this as “inoculation prompting”; it stops reward hacking from generalizing into broader misalignment.
LLM
- Anthropic unveils Claude Opus 4.5. It beats Gemini 3 Pro on many (but not all) benchmarks, making it competitive with the state of the art.
AI Agents
- Google is testing multi-agent systems for tasks such as refining ideas.
- OpenAI: Building an AI-native engineering team: How coding agents accelerate the software development lifecycle.
- Anthropic: Effective harnesses for long-running agents.
Image Synthesis
- Modern image synthesis relies on inferring patterns at different length-scales or doing patchwise prediction. But why not do next-pixel prediction? Traditionally, this has been considered too computationally expensive. Google has now published: Rethinking generative image pretraining: How far are we from scaling up next-pixel prediction? Their scaling analysis suggests it will take ~5 years to reach this capability.
World Synthesis
- Tencent Hunyuan 3D Engine is now available globally.
Science
Robots