AI News 2025-01-16

General

The US White House issued a statement: FACT SHEET: Ensuring U.S. Security and Economic Strength in the Age of Artificial Intelligence. It calls for unrestricted access to AI hardware and software for 18 “key allies and partners”, with correspondingly restricted access for others.
OpenAI’s Economic Blueprint: policy proposals for how the US can maximize AI’s benefits, bolster national security, and drive economic growth. Full report: AI in America.
From chalkboards to chatbots: Transforming learning in Nigeria, one prompt at a time. The article reports major gains in education when using AI as a tutor (supposedly: 6 weeks of after-school AI tutoring = 2 years of typical learning gains).
Simple discussion of the environmental cost of genAI: Using ChatGPT is not bad for the environment.
- Relatedly: The carbon emissions of writing and illustrating are lower for AI than for humans.
Here’s a press release that provides a general-audience intro to my exocortex concept.

Research Insights

Safety

Writing Doom. A short film (27m) about superintelligence. The film does a good job of going over the basic arguments for ASI threat; useful for those who haven’t heard these before. (Cf. my attempt to summarize the arguments.)

LLM

LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs. They introduce a multi-step visual reasoning benchmark, along with LlamaV-o1, a visual reasoning model that leverages curriculum learning.
AutoRAG: RAG AutoML tool for automatically finding an optimal RAG pipeline for your data.
Enhancing Retrieval-Augmented Generation: A Study of Best Practices.
OpenAI introduces Tasks: the ability to schedule ChatGPT to perform an action and report the result (examples). Although simple, it points towards increasingly agentic, background activity by commercial LLMs.
MiniMax released the open-source models MiniMax-Text-01 and MiniMax-VL-01 (multimodal, with vision). You can try them here. Using lightning attention, they support a context length of up to 4M tokens.
- Paper: MiniMax-01: Scaling Foundation Models with Lightning Attention.
Interesting developments to improve LLM reasoning over image/video data:
- VideoRAG: Retrieval-Augmented Generation over Video Corpus.
- Imagine while Reasoning in Space: Multimodal Visualization-of-Thought.
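
As a rough illustration of what automatically searching for a RAG pipeline (the idea behind the AutoRAG item above) can involve, here is a minimal sketch. It is not AutoRAG's API: the documents, the evaluation questions, the two scoring functions, and every function name below are hypothetical, chosen only to show the loop of "try each candidate configuration, score it on labeled questions, keep the best".

```python
from itertools import product

# Toy corpus and labeled evaluation set (both hypothetical).
DOCS = {
    "d1": "lightning attention scales to long context windows",
    "d2": "curriculum learning orders training examples by difficulty",
    "d3": "retrieval augmented generation grounds answers in documents",
}
# Each entry: (question, id of the document that answers it).
EVAL_SET = [
    ("how does attention scale to long context", "d1"),
    ("what is curriculum learning", "d2"),
    ("what grounds answers in retrieved documents", "d3"),
]

def overlap_score(query, doc):
    """Number of words shared by the query and the document."""
    return len(set(query.split()) & set(doc.split()))

def jaccard_score(query, doc):
    """Shared words divided by all distinct words in query and document."""
    q, d = set(query.split()), set(doc.split())
    return len(q & d) / len(q | d)

SCORERS = {"overlap": overlap_score, "jaccard": jaccard_score}

def retrieve(query, scorer, top_k):
    """Return the ids of the top_k documents ranked by the scorer."""
    ranked = sorted(DOCS, key=lambda doc_id: scorer(query, DOCS[doc_id]), reverse=True)
    return ranked[:top_k]

def hit_rate(scorer, top_k):
    """Fraction of questions whose gold document appears in the top_k results."""
    hits = sum(gold in retrieve(question, scorer, top_k) for question, gold in EVAL_SET)
    return hits / len(EVAL_SET)

def search_configs():
    """Evaluate every (scorer, top_k) combination and return the best one."""
    results = {}
    for name, top_k in product(SCORERS, (1, 2)):
        results[(name, top_k)] = hit_rate(SCORERS[name], top_k)
    best = max(results, key=results.get)
    return best, results

best_config, all_results = search_configs()
```

A real tool would search over far more choices (chunking, embedding models, rerankers, prompts) and use richer metrics; the exhaustive grid here is only the simplest possible search strategy.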

AI Agents

Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains (preprint, code). A base model is finetuned into a variety of specialized models using synthetic data.

Audio

Image Synthesis

Video

Science

Update to the NextBrain segmentation method: Bayesian Segmentation with Histological Atlas “NextBrain”.
- Previously, researchers evaluated whether Meta’s Segment Anything Model (SAM) was suitable for MRI.
A generative model for inorganic materials design. Uses the denoising concept (as used in image synthesis) to enable generation of novel inorganic material unit cells. This essentially allows text-to-material prompting.
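
For readers unfamiliar with the denoising idea mentioned above, the following is a minimal sketch of the generic noising/denoising relation used in standard image diffusion models. It is not the paper's method, which is not described here: the linear noise schedule, the function names, and the toy "coordinates" are all assumptions made for illustration, and the true noise is used as a stand-in for what a trained network would predict.

```python
import math
import random

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule (assumed)."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, alpha_bar, rng):
    """Forward process: blend the clean sample with Gaussian noise."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
          for x, e in zip(x0, eps)]
    return xt, eps

def estimate_clean(xt, eps_pred, alpha_bar):
    """Invert the forward process, given a prediction of the added noise."""
    return [(x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
            for x, e in zip(xt, eps_pred)]

rng = random.Random(0)
alpha_bars = make_alpha_bars(1000)
x0 = [0.0, 0.25, 0.5, 0.75]  # toy stand-in for structural coordinates (hypothetical)
xt, eps = add_noise(x0, alpha_bars[500], rng)
# A trained model would predict eps from xt; here the true noise acts as an oracle.
x0_hat = estimate_clean(xt, eps, alpha_bars[500])
```

With a perfect noise prediction the clean sample is recovered exactly (up to floating-point error); training a generative model amounts to learning that prediction, so that generation can start from pure noise and denoise step by step.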

Robots

The latest video of Unitree’s humanoid robot shows it walking with a more humanlike gait and navigating more rugged terrain.