Sam Wiseman

16 posts

@_samwiseman

NLP and ML things, formerly @duke_nlp.

Joined June 2016
727 Following · 593 Followers
Sam Wiseman reposted
Reflection
Reflection@reflection_ai·
Most approaches to "agentic AI" focus on post-training fixes. In this conversation, @achowdhery, a member of our technical staff, argues that the bottleneck is pre-training itself. Drawing on her work on PaLM and early Gemini, she explains why next-token prediction breaks down for long-horizon planning, and how objectives, attention, and training data must evolve to support true agentic behavior.
The TWIML AI Podcast@twimlai

Today, we're joined by @achowdhery, member of technical staff at @reflection_ai, to explore the fundamental shifts required to build true agentic AI. While the industry has largely focused on post-training techniques to improve reasoning, Aakanksha draws on her experience leading pre-training efforts for Google's PaLM and early Gemini models to argue that pre-training itself must be rethought to move beyond static benchmarks. We explore the limitations of next-token prediction for multi-step workflows and examine how attention mechanisms, loss objectives, and training data must evolve to support long-form reasoning and planning. Aakanksha shares insights on the difference between context retrieval and actual reasoning, the importance of "trajectory" training data, and why scaling remains essential for discovering emergent agentic capabilities like error recovery and dynamic tool learning.

🗒️ For the full list of resources for this episode, visit the show notes page: twimlai.com/go/759.

📖 CHAPTERS
00:00 - Introduction
02:26 - Reflection
04:54 - Limitations of post-training for building agents
07:31 - Rethinking pre-training in agents
10:51 - Scaling
11:27 - Evolving attention mechanisms for agentic capabilities
12:39 - Memory as a tool
14:13 - Loss objectives and training data
15:50 - Fine-tuning loss in agent performance
19:37 - Training data
21:29 - Augmenting dominant training data source
24:11 - Overcoming challenges in training on synthetic data
25:47 - Benchmarks
30:44 - Scaling laws in large models versus small models
33:20 - Long-form versus short-form reasoning
37:57 - Agent's ability to recover from failure
40:15 - Hallucinations and failure recovery
43:53 - Tool use in agents
46:38 - Coding agents
48:37 - How researchers can contribute to agentic AI

Sam Wiseman
Sam Wiseman@_samwiseman·
Thrilled to share I’ve joined Reflection AI, just as it’s announced its Series B funding round! I joined because I’m excited about its ambitious plan to train its own open-weight frontier models, and because of its enormously talented people. Congrats to the team!
Reflection@reflection_ai

Today we're sharing the next phase of Reflection. We're building frontier open intelligence accessible to all. We've assembled an extraordinary AI team, built a frontier LLM training stack, and raised $2 billion.

Why Open Intelligence Matters

Technological and scientific progress is driven by values of openness and collaboration. The internet, Linux, and the protocols and standards that underpin modern computing are all open. This isn't a coincidence. Open software is what gets forked, customized, and embedded into systems worldwide. It's what universities teach, what startups build on, what enterprises deploy. Open science enables others to learn from the results, be inspired by them, interrogate them, and build upon them in order to push the frontier of human knowledge and scientific advancement.

AI got to where it is today through scaling ideas (e.g. self-attention, next-token prediction, reinforcement learning) that were shared and published openly. Now AI is becoming the technology layer that everything else runs on top of. The systems that accelerate scientific research, enhance education, optimize energy usage, supercharge medical diagnoses, and run supply chains will all be built on AI infrastructure.

But the frontier is currently concentrated in closed labs. If this continues, a handful of entities will control the capital, compute, and talent required to build AI, creating a runaway dynamic that locks everyone else out. There's a narrow window to change this trajectory. We need to build open models so capable that they become the obvious choice for users and developers worldwide, ensuring the foundation of intelligence remains open and accessible rather than controlled by a few.

What We've Built

Over the last year, we've been preparing for this mission. We've assembled a team who have pioneered breakthroughs including PaLM, Gemini, AlphaGo, AlphaCode, and AlphaProof, and contributed to ChatGPT and Character AI, among many others.

We built something once thought possible only inside the world's top labs: a large-scale LLM and reinforcement learning platform capable of training massive Mixture-of-Experts (MoE) models at frontier scale. We saw the effectiveness of our approach first-hand when we applied it to the critical domain of autonomous coding. With this milestone unlocked, we're now bringing these methods to general agentic reasoning.

We've raised significant capital and identified a scalable commercial model that aligns with our open intelligence strategy, ensuring we can continue building and releasing frontier models sustainably. We are now scaling up to build open models that bring together large-scale pretraining and advanced reinforcement learning from the ground up.

Safety and Responsibility

Open intelligence also changes how we think about safety. It enables the broader community to participate in safety research and discourse, rather than leaving critical decisions to a few closed labs. Transparency allows independent researchers to identify risks, develop mitigations, and hold systems accountable in ways that closed development cannot.

But openness also requires confronting the challenges of capable models being widely accessible. We're investing in evaluations to assess capabilities and risks before release, security research to protect against misuse, and responsible deployment standards. We believe the answer to AI safety is not "security through obscurity" but rigorous science conducted in the open, where the global research community can contribute to solutions rather than a handful of companies making decisions behind closed doors.

Join Us

There is a window of opportunity today to build frontier open intelligence, but it is closing, and it may be the last. If this mission resonates, join us.

Ethan Brooks
Ethan Brooks@ponderousbs·
@_samwiseman Welcome to the team, Sam! Looking forward to working together!
Sam Wiseman reposted
Ruoming Pang
Ruoming Pang@ruomingpang·
As Apple Intelligence is rolling out to our beta users today, we are proud to present a technical report on our Foundation Language Models that power these features on devices and cloud: machinelearning.apple.com/research/apple…. 🧵
Sam Wiseman reposted
Craig Thomson
Craig Thomson@ThomsonSoftware·
I keep seeing published or under-review papers in #NLProc data-to-text generation using the RotoWire dataset where problems of train/test contamination are not addressed. Evaluations are also poor. I created a simple post explaining both, including a TL;DR: github.com/nlgcat/sport_s…
Sam Wiseman
Sam Wiseman@_samwiseman·
So we train a model to mimic this shortest sequence, which gives fairly interpretable derivations of text from neighbors, like the one at the top of the thread. 3/3 Also, hello twitter people 👋
Sam Wiseman
Sam Wiseman@_samwiseman·
The idea is to allow arbitrary span insertion/replacement (from neighbors) into a canvas. Turns out finding the *shortest* sequence of these splice ops that yields a sentence given some neighbors reduces to parsing under a WCFG. 2/3
Sam Wiseman tweet media
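The reduction claimed in the tweet above, that finding a minimum-cost sequence of splice operations amounts to parsing under a weighted CFG, can be illustrated with a generic minimum-weight CKY parser. This is a hedged sketch, not the paper's implementation: the function name, toy grammar, and rule weights are invented for illustration; in the splice-op setting, rule weights would encode per-operation costs.

```python
def min_weight_cky(words, unary, binary, start="S"):
    """Minimum-weight derivation of `words` under a weighted CFG in
    Chomsky normal form. `unary` maps (A, word) -> weight for rules
    A -> word; `binary` maps (A, B, C) -> weight for rules A -> B C.
    Returns the minimum total rule weight, or None if no parse exists."""
    n = len(words)
    # chart[i][j] maps nonterminal -> best weight covering words[i:j]
    chart = [[dict() for _ in range(n + 1)] for _ in range(n + 1)]
    # Base case: terminal rules over single words.
    for i, w in enumerate(words):
        for (A, word), wt in unary.items():
            if word == w and wt < chart[i][i + 1].get(A, float("inf")):
                chart[i][i + 1][A] = wt
    # Combine adjacent spans with binary rules, keeping minimum weights.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = chart[i][j]
            for k in range(i + 1, j):
                for (A, B, C), wt in binary.items():
                    if B in chart[i][k] and C in chart[k][j]:
                        cand = wt + chart[i][k][B] + chart[k][j][C]
                        if cand < cell.get(A, float("inf")):
                            cell[A] = cand
    return chart[0][n].get(start)
```

With a toy grammar (S -> N V at weight 0.5, N -> "dogs" and V -> "bark" at weight 1.0 each), `min_weight_cky(["dogs", "bark"], unary, binary)` returns 2.5, the summed weight of the cheapest derivation; an unparseable input returns None.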
Sam Wiseman
Sam Wiseman@_samwiseman·
Newish #EMNLP2021 work w/ Arturs Backurs & Karl Stratos: we try to generate text (in a data-to-text setting) by splicing together pieces of retrieved neighbor text. Paper: arxiv.org/pdf/2101.08248… 1/3
Sam Wiseman tweet media