AI News 2024-10-10

General

  • Ethan Mollick writes about “AI in organizations: Some tactics”, discussing how individuals are seeing large gains from using AI, but organizations (so far) are not.
    • Many staff are hiding their use of AI, and with legitimate cause: organizations often signal risk-averse, punitive bureaucracy around AI; staff worry that productivity gains won’t be rewarded (or will even be punished, as expectations rise); staff worry their contributions won’t be credited; etc.
    • Mollick offers concrete things that orgs can do to increase use of AI:
      • Reduce fear. Do not have punitive rules. Publicly encourage the use of AI.
      • Provide concrete, meaningful incentives to those who use AI to increase efficiency.
      • Build a sort of “AI Lab” where domain experts test all the tools and see whether they can help with business processes.
  • The 2024 Nobel Prize in Physics has been awarded to John J. Hopfield and Geoffrey E. Hinton, for foundational discoveries and inventions that enable machine learning with artificial neural networks.
  • The 2024 Nobel Prize in Chemistry has been awarded to David Baker for computational protein design, and to Demis Hassabis and John Jumper for protein structure prediction (AlphaFold).
  • Lex Fridman interviews the team building Cursor. Beyond just Cursor/IDEs, the discussion includes many insights about the future of LLMs.

Research Insights

LLM

AI Agents

  • Altera is using GPT-4o to build agents. As an initial proof-of-concept, they have AI agents that can play Minecraft.
  • CORE-Bench is a new benchmark (leaderboard) for assessing agentic abilities. The task consists of reproducing published computational results, using provided code and data. This task is non-trivial (top score right now is only 21%) but measurable.
  • OpenAI released a new benchmark: MLE-bench (paper) which evaluates agents using machine-learning engineering tasks.
  • AI Agents are becoming more prominent, but a wide range of definitions is being used implicitly, ranging from “any software process” (“agent” has long been applied to any software program that tries to accomplish some goal) all the way to “AGI” (it must be completely independent and intelligent). This thread is trying to crowd-source a good definition.
    • Some that resonate with me:
      • (1): agent = llm + memory + planning + tools + while loop
      • (2): An AI system that’s capable of carrying out and completing long-running, open-ended tasks in the real world.
      • (3): An AI agent is an autonomous system (powered by a Large Language Model) that goes beyond text generation to plan, reason, use tools, and execute complex, multi-step tasks. It adapts to changes to achieve goals without predefined instructions or significant human intervention.
    • To me, a differentiating aspect of an agent (compared to a base LLM) is the ability to operate semi-autonomously (without oversight) for some amount of time and make productive progress on a task. A module that simply returns an immediate answer to a query is not an agent; there must be some kind of iteration (multiple calls to the LLM, as sketched after this list) for it to count. So I might offer something like:
      • AI Agent: A persistent AI system that autonomously and adaptively completes open-ended tasks through iterative planning, tool-use, and reasoning.
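
Given the recurring “llm + memory + planning + tools + while loop” framing above, here is a minimal Python sketch of what such an agent loop might look like. All names here (call_llm, search_web, run_agent) and the message format are hypothetical placeholders rather than any specific library’s API; a real agent would swap in an actual LLM client and real tools.

```python
# Minimal sketch of "agent = llm + memory + planning + tools + while loop".
# call_llm and search_web are hypothetical stand-ins, not a real API.

def call_llm(messages):
    """Stand-in for an LLM call: first request a tool, then give an answer."""
    if any(m["role"] == "tool" for m in messages):
        return {"content": "Final answer, based on the tool result.", "tool": None}
    return {"content": "I should search first.",
            "tool": "search_web", "args": messages[0]["content"]}

def search_web(query):
    """Stand-in tool that would normally hit a search API."""
    return f"(search results for {query!r})"

TOOLS = {"search_web": search_web}

def run_agent(task, max_steps=10):
    memory = [{"role": "user", "content": task}]      # memory
    for _ in range(max_steps):                        # (bounded) while loop
        reply = call_llm(memory)                      # planning / reasoning
        memory.append({"role": "assistant", "content": reply["content"]})
        if reply.get("tool"):                         # tool use
            result = TOOLS[reply["tool"]](reply["args"])
            memory.append({"role": "tool", "content": result})
        else:                                         # no tool requested: done
            return reply["content"]
    return "stopped: step budget exhausted"

if __name__ == "__main__":
    print(run_agent("Summarize today's AI news"))
```

The point of the sketch is the iteration emphasized above: the LLM is called repeatedly, with tool results fed back into its memory, rather than returning a single immediate answer.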

Image Synthesis

  • FacePoke is a real-time image editor that allows one to change a face’s pose (code, demo), based on LivePortrait.
  • A few months ago, Paints-UNDO (code) unveiled an AI method that not only generates an image but also approximates the stepwise sketching/drawing process leading up to it. This is fun, and perhaps useful as a sort of drawing tutorial; but it also undermines one of the few ways that digital artists can “prove” their art is not AI generated (by screen-capturing the creation process).

Video

World Synthesis

Science
