AI News 2024-10-31

General

Research Insights

  • adi has proposed a new benchmark for evaluating agentic AI: MC bench (code). Each agent is tasked with building an elaborate structure in Minecraft; humans then A/B-rank the visual outputs, and the pairwise judgments yield a capability ranking of the agents.
  • Anthropic have provided an update on their interpretability work, in which the activation space is projected sparsely into a higher-dimensional space using sparse autoencoders (SAEs). The new post, Evaluating feature steering: A case study in mitigating social biases, builds on earlier work showing that certain model behaviors or personalities can be enforced by exploiting a discovered interpretable feature. Here they focus on features related to social bias, and find that they can indeed steer the model (e.g. elicit more neutral and unbiased responses). They also find that pushing too far from a central “sweet spot” degrades capabilities.
  • RL, but don’t do anything I wouldn’t do. In traditional training, parts of the semantic space without data are simply interpolated, which can lead to unintended AI behaviors in those regions: when an AI isn’t sure what to do, it does exactly that undefined thing. This new approach accounts for uncertainty, so when the AI is unsure about an action, it is biased against taking it. This captures a sort of “don’t do anything I might not do” signal.
  • Mixture of Parrots: Experts improve memorization more than reasoning. The “mixture-of-experts” method (having different weights that get triggered depending on context) seems to improve memorization (more knowledge for a given inference-time parameter budget) but not reasoning. This makes sense: memorization benefits from more stored parameters, while reasoning is more of an “iterative deliberation” process that benefits from the parameters active in a single pass and from multi-pass refinement.
  • The Geometry of Concepts: Sparse Autoencoder Feature Structure. Tegmark et al. report that the feature space of LLMs spontaneously organizes hierarchically: “atomic” structures at small scale, “brain”-like organization at intermediate scale, and a “galaxy”-like distribution at large scale.
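The A/B ranking idea in MC bench can be sketched with a simple Elo update over pairwise human judgments. This is a hypothetical illustration of the general technique, not the actual MC bench code; the agent names and K-factor are made up.

```python
# Hypothetical sketch: turning pairwise A/B judgments into agent rankings
# via Elo updates. Not the actual MC bench implementation.

def elo_update(ra, rb, a_won, k=32.0):
    """Update Elo ratings for players A and B after one comparison."""
    expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
    score_a = 1.0 if a_won else 0.0
    ra_new = ra + k * (score_a - expected_a)
    rb_new = rb + k * ((1.0 - score_a) - (1.0 - expected_a))
    return ra_new, rb_new

ratings = {"agent_a": 1000.0, "agent_b": 1000.0}
# Each tuple: (winner, loser) from one human A/B judgment.
judgments = [("agent_a", "agent_b"), ("agent_a", "agent_b"), ("agent_b", "agent_a")]
for winner, loser in judgments:
    rw, rl = elo_update(ratings[winner], ratings[loser], a_won=True)
    ratings[winner], ratings[loser] = rw, rl

leaderboard = sorted(ratings, key=ratings.get, reverse=True)
```

Elo is zero-sum per comparison, so the total rating mass is conserved and the ordering reflects aggregate win rates weighted by opponent strength.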
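The feature-steering idea from the Anthropic bullet amounts to nudging activations along one learned SAE feature direction. A minimal sketch, assuming a toy decoder matrix with unit-norm rows; all sizes and names here are illustrative, not Anthropic's implementation.

```python
import numpy as np

# Hypothetical sketch of SAE-style feature steering: push the residual
# activations along one learned feature direction. Sizes, the decoder
# W_dec, and the steering strength are all illustrative assumptions.

rng = np.random.default_rng(0)
d_model, d_features = 8, 32  # toy sizes

W_dec = rng.normal(size=(d_features, d_model))          # SAE decoder: feature -> activation space
W_dec /= np.linalg.norm(W_dec, axis=1, keepdims=True)   # unit-norm feature directions

def steer(activations, feature_idx, strength):
    """Add `strength` times the chosen feature's decoder direction.

    Too large a |strength| pushes activations off-distribution, which
    matches the "past the sweet spot" capability loss described above.
    """
    return activations + strength * W_dec[feature_idx]

acts = rng.normal(size=d_model)
steered = steer(acts, feature_idx=3, strength=4.0)

# The steered activations now project more strongly onto feature 3:
before = float(acts @ W_dec[3])
after = float(steered @ W_dec[3])
```

Because the feature direction is unit-norm, the projection onto it increases by exactly the steering strength.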
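The uncertainty-averse RL bullet can be illustrated with ensemble disagreement: score each action by its mean estimated value minus a penalty on how much the ensemble disagrees. The penalty form and the numbers below are illustrative assumptions, not the paper's method.

```python
import statistics

# Hypothetical sketch of the "don't do anything I might not do" idea:
# an ensemble of value estimates scores each action, and disagreement
# among the estimates is penalized, biasing the agent away from
# actions it is uncertain about.

def pessimistic_score(value_estimates, penalty=1.0):
    """Mean value minus a penalty proportional to ensemble disagreement."""
    return statistics.mean(value_estimates) - penalty * statistics.pstdev(value_estimates)

# Two candidate actions: a modest, well-understood one and a
# higher-variance one the ensemble disagrees about.
estimates = {
    "safe":  [1.0, 1.1, 0.9, 1.0],   # ensemble agrees
    "risky": [3.0, -1.0, 2.5, -0.5], # same mean, wild disagreement
}

chosen = max(estimates, key=lambda a: pessimistic_score(estimates[a]))
```

Both actions have the same mean value here, but the disagreement penalty makes the well-understood action win, which is exactly the bias toward not taking uncertain actions described above.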
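The mixture-of-experts mechanism in the Mixture of Parrots bullet can be sketched as a router selecting the top-k experts per input, so only a fraction of the total parameters run on any one forward pass. A toy sketch with assumed sizes and a softmax gate; not the paper's architecture.

```python
import numpy as np

# Hypothetical toy mixture-of-experts layer: a router picks the top-k
# experts per input, so the model stores n_experts weight matrices but
# only k of them are active at inference time. All sizes are illustrative.

rng = np.random.default_rng(1)
d, n_experts, k = 4, 8, 2

W_router = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

def moe_forward(x):
    logits = x @ W_router
    top = np.argsort(logits)[-k:]        # indices of the k highest-scoring experts
    gates = np.exp(logits[top])
    gates /= gates.sum()                 # renormalized gate weights over the top-k
    # Only k of the n_experts matrices are evaluated for this input.
    y = sum(g * (x @ experts[i]) for g, i in zip(gates, top))
    return y, top

x = rng.normal(size=d)
y, used = moe_forward(x)
```

The total parameter count grows with n_experts (more capacity for memorized knowledge) while per-token compute depends only on k, which is the trade-off the bullet describes.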

LLM

Audio

Image Synthesis

Video

World Synthesis

Robots
