AI News 2025-02-20

General

  • Perplexity adds a Deep Research capability (similar to the offerings from Google and OpenAI). You can try it even in the free tier (5 queries per day). It scores 21% on the challenging “Humanity’s Last Exam” benchmark, second only to OpenAI at 26%.
  • TechCrunch reports: A job ad for Y Combinator startup Firecrawl seeks to hire an AI agent for $15K a year. Undoubtedly a publicity stunt. And yet, it hints towards a near-future economic dynamic: offering pay based on desired results (instead of salary), and allowing others to bid using human or AI solutions.
  • Mira Murati (formerly at OpenAI) announces Thinking Machines, an AI venture.
  • Fiverr announces Fiverr Go, where freelancers can train a custom AI model on their own assets and make this AI model/agent available for use through the Fiverr platform. This gives freelancers a way to serve more clients.
    • ElevenLabs Payouts is a similar concept, where voice actors can be paid when clients use their customized AI voice.
    • In the short term, this provides an extra revenue stream to these workers. Of course, these workers are also the most at risk of full replacement by these very AI methods. (And, indeed, one could worry that the companies in question are gathering the data they need to eventually eliminate profit-sharing with contributors.)

Research Insights

LLM

  • Nous Research releases DeepHermes 3 (8B), which combines conventional LLM responses with long-CoT reasoning responses in a single model.
  • InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU.
  • ByteDance has released a new AI-first coding IDE: Trae AI (video intro).
  • LangChain Open Canvas provides a user interface for LLMs, including memory features, UI for coding, display artifacts, etc.
  • xAI announces the release of Grok 3 (currently available for use here), including a reasoning variant and “Deep Search” (equivalent to Deep Research). Early testing suggests a model closing in on the abilities of o1-pro (but not yet matching the full o3). So, while it has not demonstrated any record-setting capabilities, it confirms that frontier models are not yet using any methods that cannot be reproduced by others.

AI Agents

Safety

Image

Video

3D

World Synthesis

  • Microsoft report: Introducing Muse: Our first generative AI model designed for gameplay ideation (publication in Nature: World and Human Action Models towards gameplay ideation). They train a model (World and Human Action Model, WHAM) on gameplay videos; the model can subsequently forward-simulate gameplay from a provided frame. The model has thus learned an implicit world model for the video game. Forward-predicting gameplay from artificially edited frames (introducing a new character or situation) allows rapid exploration of gameplay ideas before actually updating the video game. More generally, this points towards direct neural rendering of games and other interactive experiences.

Science

Brain

Robots

  • Unitree video shows robot motion that is fairly fluid and resilient.
  • Clone Robotics is moving towards combining their biomimetic components into a full-scale humanoid: Protoclone.
  • MagicLab Robot with dextrous MagicHand S01.
  • Figure AI claims a breakthrough in robotic control software (Helix: A Vision-Language-Action Model for Generalist Humanoid Control). The video shows two humanoid robots handling a novel task based on natural-language voice instructions from a human. Assuming the video is genuine, it shows real progress in the capability of autonomous robots to understand instructions and conduct simple tasks (including working with a partner in a team).