Heather Gorham (@heather_gorham_) - Twitter Profile

Pinned Tweet

Heather Gorham @heather_gorham_ · Apr 10

Heather Gorham @heather_gorham_ · Jun 7

I had the privilege of sitting down with Ida Momennejad, Principal Researcher in Reinforcement Learning at Microsoft Research. Her years of brilliant research on multi-step planning and cognition could not be more timely and important amid today's conversations around agentic AI. Here are some highlights from our conversation, many drawn from her published papers. I encourage you to read the papers linked below and share your thoughts.

- Despite many claims about what LLMs can do, they are not good at executive functions like abstract reasoning and multi-step planning. To understand where the gaps in their abilities are (and to close them), we need established methods for proper evaluation. Ida studied whether LLMs could extract cognitive maps to use in planning and inference (spoiler alert: they could not). See: microsoft.com/en-us/research…

- To enable agents and systems to perform better at these kinds of tasks, Ida proposes that LLMs need a prefrontal cortex, and she therefore recommends a modular system approach for agentic AI. Different modules (agents) function as the generator, orchestrator, evaluator, predictor, etc., working together in different roles to better accomplish compositional tasks that require multi-step planning, such as decision making. This is similar to the approach seen in Microsoft's AutoGen framework. See: arxiv.org/abs/2310.00194 microsoft.com/en-us/research…

- We wrapped up the conversation with human-AI alignment. Her recent paper on multi-player games studied the differences in behavior between humans and AI agents in gameplay. This matters for scaling AI safely, as well as for designing agent behavior that humans enjoy interacting with. Ida proposes the “Task-sets” framework, an interpretable approach to human-AI alignment. See: arxiv.org/abs/2402.03575

Heather Gorham @heather_gorham_ · Feb 22

@eladgil I would note that in the speed and quality conversation both model + underlying hardware matter.

Elad Gil @eladgil · Feb 21

How do we think about speed vs performance for LLMs? We probably have slow, very smart models outputting major work - e.g. a legal doc. Then faster, workhorse models for e.g. social apps. How extreme does this segmentation get?

Elad Gil @eladgil · Feb 21

Things I don't know about AI markets - a list of questions about our ever changing world :)
