AI News 2024-10-24

General

Research Insights

Safety/Policy

LLM

  • The OpenAI Chat Completion API now supports audio input (allowing one to skip a separate transcription step).
  • Google’s Notebook LM has captured much attention, in part due to the useful “chat with my PDFs” feature, but mostly due to the cool “generate podcast” trick. You can now customize the podcast generation.
  • MotherDuck have added a “prompt()” function to their SQL database, such that you can weave LLM calls into your SQL lookups.
    • BlendSQL appears to be an open-source attempt to do something similar: combine LLM calls with SQL.
  • Meta released Meta Spirit LM, an open-source multimodal language model that freely mixes text and speech.
  • Anthropic announces a new Claude 3.5 Haiku model, as well as a new version of their excellent Claude 3.5 Sonnet model. The new Sonnet can “use a computer” (still experimental); this capability is available via API.
    • Ethan Mollick posts about his experience using this experimental mode.
    • An open-source version (using regular Claude 3.5 Sonnet via API) has appeared: agent.exe.
  • Perplexity plans to release a reasoning mode, where it can agentically search and collate information.

Tools

Audio

  • ElevenLabs adds Voice Design, allowing you to generate a new voice by text-prompting what it should sound like.

Image Synthesis

Video

Science

Hardware

Robots

  • A video of Fourier’s GR-2 robot standing up.
  • A video of the Engine AI robot walking. As noted, the more upright (locked-knee) gait is more energy-efficient than the squatting (bent-knee) walking of many other designs.
  • Clone Robotics continue to pursue their micro-hydraulic bio-mechanical approach to robotics; they now have a torso.