AI News 2024-07-25

Research Insights

Capabilities

  • Google DeepMind demonstrates vastly improved AI reasoning on math problems. AlphaProof and an improved AlphaGeometry 2 combine a language model with the AlphaZero reinforcement learning algorithm, leveraging the Lean formal language so that every proof step can be machine-checked (a tiny Lean example follows this list). The system achieves silver-medal performance on International Mathematical Olympiad problems. Combining LLM heuristics (as system-1 intuition) with more rigorous iteration (as system-2 reasoning) seems like a viable path toward improved intelligence.
    • It seems increasingly likely that AI will achieve gold-medal performance in the near future.
    • OpenAI presented similar work in 2022, and UC Berkeley just published a related result using Prolog. It is also known that tree search (e.g. MCTS) improves LLM math abilities (1, 2, 3, 4, 5). Overall, this body of work points toward a viable way to improve LLM math performance; a toy sketch of the underlying propose-and-verify loop follows this list. The hope is that this translates to improved general reasoning.
  • OpenAI announced SearchGPT, a web-searching prototype (not yet available to the public). Looks like it will be useful.
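
As a concrete illustration of what "leveraging Lean" means, here is a trivial machine-checkable statement and proof in Lean 4 (a stand-in, not an Olympiad problem). The point of the formal language is that the kernel verifies every step, so an LLM's proposed proofs can be accepted or rejected mechanically:

```lean
-- A minimal example of the kind of statement a formal prover targets:
-- if this compiles, the proof is correct by construction.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```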
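And here is the toy sketch of the propose-and-verify loop referenced above: a best-first search (deliberately simpler than the MCTS variants cited) in which a policy proposes scored steps and the search expands only states a verifier accepts. The toy domain and all function names are illustrative assumptions, not AlphaProof's actual interfaces:

```python
import heapq

# Toy stand-ins: the "proof state" is a number we try to reduce to 0,
# and "steps" subtract from it. In a real system the state would be a
# Lean proof state and propose_steps would query a language model.

def propose_steps(state):
    # An LLM policy would return candidate steps with heuristic scores
    # here; we propose subtracting 1, 2, or 3, preferring bigger jumps.
    return [(k, float(k)) for k in (1, 2, 3) if k <= state]

def apply_step(state, step):
    # A proof assistant would reject invalid steps; every proposed
    # subtraction here is valid by construction.
    return state - step

def is_solved(state):
    return state == 0

def best_first_proof_search(initial_state, budget=1000):
    counter = 0           # tie-breaker so the heap never compares states
    frontier = [(0.0, counter, initial_state, [])]
    while frontier and budget > 0:
        cost, _, state, steps = heapq.heappop(frontier)
        budget -= 1
        if is_solved(state):
            return steps  # a fully checked sequence of steps
        for step, score in propose_steps(state):
            counter += 1
            heapq.heappush(frontier, (cost - score, counter,
                                      apply_step(state, step), steps + [step]))
    return None           # search budget exhausted

print(best_first_proof_search(10))  # -> [3, 3, 3, 1]
```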

AI Agents

LLM

  • Llama 3.1 405B is now released: roughly 750 GB on disk, requiring 8-16 GPUs to run inference, with a 128k-token context length. Benchmarks show it to be competitive with the state of the art (OpenAI's GPT-4o and Anthropic's Claude 3.5 Sonnet); a back-of-the-envelope check of the GPU requirement follows this list.
    • Zuckerberg published a companion essay: Open Source AI Is the Path Forward.
    • The Llama 3.1 release also includes smaller models (8B and 70B) distilled from the larger one.
    • Of course, we are already seeing real-time voice chatbots that take advantage of the small/fast models: RTVI (demo, code) runs Llama on Groq for responsive voice chatting; a minimal sketch of that serving pattern also follows this list.
  • Mistral Large 2 released (download). 123B parameters, 128k context length. Appears roughly competitive with Llama 3.1, GPT-4o, etc.
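
The quick memory math behind the 8-16 GPU figure, assuming 80 GB accelerators (e.g. H100 80GB) and counting weights only; KV cache and activations push a real deployment toward the top of that range:

```python
import math

params = 405e9  # Llama 3.1 405B
for precision, bytes_per_param in [("bf16", 2), ("fp8", 1)]:
    weight_gb = params * bytes_per_param / 1e9
    # Minimum GPU count to hold the weights alone (no KV cache headroom):
    gpus = math.ceil(weight_gb / 80)
    print(f"{precision}: ~{weight_gb:.0f} GB of weights -> at least {gpus} GPUs")
    # bf16: ~810 GB -> at least 11 GPUs; fp8: ~405 GB -> at least 6 GPUs
```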
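And a minimal text-only sketch of the serving pattern RTVI builds on: streaming tokens from a small Llama model on Groq's OpenAI-compatible endpoint, so that a downstream text-to-speech stage can start speaking before the full reply arrives. The model ID is an assumption; check Groq's current model list:

```python
from openai import OpenAI

# Groq exposes an OpenAI-compatible API, so the standard client works.
client = OpenAI(base_url="https://api.groq.com/openai/v1",
                api_key="YOUR_GROQ_API_KEY")

stream = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # assumed model ID; verify before use
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    stream=True,  # low time-to-first-token is what makes voice feel responsive
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)  # a real pipeline hands this to TTS
```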

Multi-modal Models

Audio

  • Suno AI has added stem extraction, allowing users to separate a song into its vocal and instrumental tracks.
  • Udio released v1.5 with improved audio quality. Also added the ability to download stems.

3D

World Synthesis

Policy

Hardware

  • xAI has just turned on their cluster: 100,000 Nvidia H100 GPUs, which is roughly 100 exaflops (FP16) of compute (hardware cost ~$3B). They claim this is the most powerful single cluster for AI applications. (Supposedly, OpenAI’s next cluster will have 100k GB200s, which would be ~250 exaflops and cost ~$6.5B.)
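
The arithmetic behind those aggregate figures, for the record. Per-unit throughputs are approximations: ~0.99 PFLOPS is the H100's dense FP16 spec, while the GB200 number is simply the per-unit figure implied by the reported estimate:

```python
# cluster name -> (unit count, approx. dense FP16 PFLOPS per unit)
clusters = {"xAI (100k H100)": (100_000, 0.99),
            "OpenAI, rumored (100k GB200)": (100_000, 2.5)}
for name, (count, pflops_each) in clusters.items():
    # 1 exaflop = 1000 petaflops
    print(f"{name}: ~{count * pflops_each / 1000:.0f} exaflops FP16")
    # -> ~99 and ~250 exaflops respectively
```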

Robots
