AI News 2025-03-13

General

Research Insights

LLM

AI Agents

Safety

  • OpenAI blog post: Detecting misbehavior in frontier reasoning models. They study how the natural-language chain-of-thought (CoT) operates in reasoning models. They find that aggressive optimization of reasoning, especially optimizing the CoT so that it does not exhibit misaligned text, leads to models whose undesired thoughts are no longer expressed in the CoT (but are nevertheless active). Conversely, a CoT that is not heavily optimized remains human-legible, providing an opportunity to detect and correct undesired behavior. They advocate strongly against over-optimizing the CoT, so that it stays legible, and note that this may require hiding the CoT from the end-user (e.g. so the model can freely consider dangerous topics in the CoT while not expressing them in the response to the user).
  • Dan Hendrycks, Eric Schmidt, and Alexandr Wang released Superintelligence Strategy, a detailed essay about ASI risks, with concrete mitigation suggestions, including Mutual Assured AI Malfunction (MAIM).

Audio

  • ElevenLabs adds speed control for text-to-speech; speed can be adjusted down to the word level to shape a performance.
  • Tavus are demoing AI avatars (audio and video) that are fairly lifelike and responsive.
  • Nvidia release Audio Flamingo 2 (paper, code), an audio-language model with long-context support and understanding of non-speech audio.
  • Sesame has now released the weights for their remarkable conversational audio model (demo, example): use, code, weights.

Image Synthesis

Video

  • Hedra releases Character 3, an improved video avatar model that can lip-sync to provided audio.
  • Captions AI’s Mirage model also achieves more emotive lip-sync than older methods.

Science

Robots
