AI News 2025-05-29

General

  • Essay by Pete Koomen: AI Horseless Carriages (video version: Why AI Apps Still Feel Broken with Pete Koomen). It makes the case that our current approach of adding AI to existing applications is akin to early horseless carriages, which added engines to existing carriage designs instead of being designed from scratch to take full advantage of an engine. Future AI-first applications need to rethink the user experience in light of AI capabilities.

Research Insights

LLM

Agents

  • OpenAI updates Operator to use the o3 model.
  • Manus introduce a system that will build a slide deck on demand.

Safety & Interpretability

Audio

  • Kyutai demos Unmute, a text-to-speech and speech-to-text capability that will be open-sourced.
  • Anthropic announce that they will begin rolling out voice conversation mode.
  • Chatterbox TTS is a high-quality open source speech synthesis model (try).

Image Synthesis

Video

  • Viggle Live enables real-time avatar control.
  • Workflow: Use Google Street View imagery combined with image synthesis (e.g. Runway References) and then video generation (e.g. Runway Gen3) to generate a sequence of “on location” clips.
  • Google DeepMind report SignGemma, a forthcoming open model for converting sign language video into text.

World Synthesis

Science

  • OpenAI adds molecule visualization to the ChatGPT scaffolding (using the RDKit library).

Robots
