Stanford NLP Group
@stanfordnlp
15.1K posts

Natural Language Processing/Machine Learning @chrmanning @jurafsky @percyliang @ChrisGPotts @tatsu_hashimoto @MonicaSLam @Diyi_Yang @YejinChoinka @StanfordAILab

Stanford, CA, USA · Joined February 2010
338 Following · 182.3K Followers

Pinned Tweet
Stanford NLP Group @stanfordnlp
For this week's NLP Seminar, we are excited to host @universeinanegg from UChicago!
Date and Time: Thursday, April 2, 11:00 AM – 12:00 PM Pacific Time
Zoom Link: stanford.zoom.us/j/93941842999?…
Title: Seeing Like a Language Model
Abstract: How does a language model perceive its input? What aspects of reality does it find legible, and which elude it? How can we know? Current approaches to studying LLMs, focused on engineering progress, are insufficiently exploratory. I will discuss new approaches we have been incubating, what it means more conceptually for interpretability approaches to be predictive rather than mechanistic, defend prompting as a form of scientific inquiry, and caution against formalizing concepts too early, without doing the required amount of stamp collecting.
Bio: Ari Holtzman is an Assistant Professor of Computer Science and Data Science at the University of Chicago, where he leads the Conceptualization Lab. His motto is: "I'm doing it with LLMs or I'm not doing it at all."
Replies: 0 · Reposts: 8 · Likes: 66 · Views: 20.5K

Stanford NLP Group retweeted
vitrupo @vitrupo
Chris Manning says Yann LeCun sees language as a low bandwidth communication channel compared to vision. But the gap between a chimp and a human wasn’t produced by superior eyes. What took off for humans was language. Not just for communication, but as a cognitive tool.
Replies: 74 · Reposts: 111 · Likes: 917 · Views: 86K

Stanford NLP Group retweeted
Artificial Intelligence
If you want to truly understand how modern LLMs work under the hood, Stanford's CS336: Language Modeling from Scratch is one of the best resources out there. Instead of treating LLMs as black boxes, the course walks through the full stack:
• tokenization & dataset pipelines
• transformer architecture implementation
• attention & positional encoding
• training dynamics & scaling laws
• distributed training and GPU efficiency
• alignment and post-training
You end up implementing a full language model training pipeline yourself. It's rare to see a course that connects model architecture, training theory, and systems optimization this cleanly. Highly recommended for anyone interested in LLM research, systems, or building foundation models. cs336.stanford.edu
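To give a flavor of what implementing this stack from scratch involves, here is a minimal NumPy sketch of single-head scaled dot-product attention. It is an illustration in the spirit of the course, not code from CS336 itself, and all names and shapes are made up:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: shift by the max before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)       # (n_q, n_k) similarity scores
    weights = softmax(scores, axis=-1)  # each query's distribution over keys
    return weights @ V                  # weighted sum of value vectors

# Tiny example: 3 positions, head dimension 4.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

The course covers the batched, multi-head, causally masked version of this, plus everything around it; this sketch only shows the core formula.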
Replies: 2 · Reposts: 27 · Likes: 171 · Views: 11.1K

Stanford NLP Group retweeted
SUBRATA (SUBRATA) chatterjee
@percyliang You are the best. I did not go to Stanford but watch and follow your CS336 lectures. Amazing teaching by you and Tatsu. Thank you very much!
Replies: 0 · Reposts: 1 · Likes: 2 · Views: 3.6K

Stanford NLP Group retweeted
Megagon Labs @MegagonLabs
What does it take for AI to truly collaborate with humans, not just respond to them? We hosted Stanford Assistant Professor @Diyi_Yang as a guest speaker at Megagon Labs to discuss just that. ⤵️

Professor Yang works at the intersection of NLP and Human-Computer Interaction, focusing on how AI systems can better understand people, their goals, and their context. Her work offers important perspectives for anyone building human-centered AI.

A little about her talk, "Optimizing Human-AI Collaboration": Recent advances in LLMs have transformed human-AI interaction, but effective collaboration requires systems that can reason about users and adapt to how they work, not just strong models. In this talk, Dr. Yang examined how automation and augmentation are shaping the future of work, and why grounding AI system design in real worker perspectives is essential. She introduced General User Models (GUMs), which learn about users from interaction signals, and Next Action Prediction (NAP), a framework for anticipating user intent from multimodal interaction histories.

As an AI research lab consistently conducting human-centered AI research and developing frameworks, we enjoyed the dialogue on how systems can move from being reactive to supporting meaningful, proactive collaboration.

Do you have insight on this topic? Comment below. #AI #HCI #Stanford @Stanford
Replies: 0 · Reposts: 5 · Likes: 22 · Views: 4K

Stanford NLP Group retweeted
Augmented Mind Podcast @augmind_fm
"Actually, we (vLLM) get more users from the simple UX than from vLLM performance."

For our third guest, we welcome @woosuk_k, co-founder & CTO of @inferact and creator of @vllm_project. To us, Woosuk is a unique guest, and we are amazed by the user-centric perspective on LLM inference he shared: what makes the vLLM project successful, new application scenarios to tailor inference to, how to support continual learning from user signals, and more.

0:00 - Prelude: Introducing Woosuk and Inferact
3:00 - Woosuk's First PhD Project
6:00 - How the vLLM Project Got Started
9:18 - AI Infra Needs More Than Just Efficiency
14:08 - How AI Infra and Human-centered AI Are Connected
15:01 - How to Prioritize Feature Requests for Popular AI Infra
18:18 - Streaming Requests and Realtime API
24:05 - Multi-turn, Agentic, Proactive LLMs
27:03 - How to Design AI Infra in a Principled Way
29:13 - How to Design an AI Inference Engine for Continual Learning with RL
35:05 - Would LoRA Training Affect RL Infra Design?
37:28 - Why Start an AI Inference Infra Startup?
40:46 - What Effortless Inference with Open-source Models Means for Developers
43:46 - A Vision for On-device AI Inference
46:19 - Can Today's Coding Agents Create vLLM?
Replies: 1 · Reposts: 5 · Likes: 30 · Views: 13.4K

Stanford NLP Group retweeted
jessica dai @jessicadai_
personallyyy I think academics should play the role of clarifying messy things in the discourse, not optimize for convincing journalists to blast sensationalist headlines. but what do I know
Replies: 3 · Reposts: 7 · Likes: 147 · Views: 11K

Stanford NLP Group retweeted
Aryaman Arora @aryaman2020
I’m very glad to see that Anthropic interp has caught up to the idea of generating a bunch of contrastive synthetic data for extracting supervised steering vectors from! It’s unfortunate that there’s no prior work to cite on this…
Anthropic@AnthropicAI

New Anthropic research: Emotion concepts and their function in a large language model. All LLMs sometimes act like they have emotions. But why? We found internal representations of emotion concepts that can drive Claude’s behavior, sometimes in surprising ways.
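For context, the contrastive-data recipe under discussion can be sketched in a few lines: run the model on contrastive prompt pairs (concept present vs. absent), then take the difference of mean hidden-state activations as a steering vector. The toy NumPy sketch below substitutes synthetic vectors for real model activations, so the dimensions and the planted "concept" direction are entirely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # toy hidden size

# Plant a hidden "concept" direction that the positive class carries.
concept = rng.standard_normal(d)
concept /= np.linalg.norm(concept)

# Synthetic stand-ins for hidden states from contrastive prompt pairs.
pos_acts = rng.standard_normal((32, d)) + 2.0 * concept  # concept present
neg_acts = rng.standard_normal((32, d))                  # concept absent

# Supervised steering vector: difference of class means, normalized.
steer = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)
steer /= np.linalg.norm(steer)

# The recovered direction aligns closely with the planted concept.
print(float(steer @ concept))  # high cosine similarity
```

In practice the activations come from a chosen transformer layer, and the steering vector is added to the residual stream at inference time to push the model toward (or away from) the concept.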

Replies: 20 · Reposts: 19 · Likes: 444 · Views: 53.6K

Stanford NLP Group retweeted
Jeff Dean @JeffDean
Today we're releasing Gemma 4, our new family of open foundation models, built on the same research and technology as our Gemini 3 series. These models set a new standard for open intelligence, offering SOTA reasoning capabilities from edge-scale (2B and 4B w/ vision/audio) up to a 26B parameter MoE model and a 31B dense model.

By releasing Gemma 4 under the Apache 2.0 license, we hope to enable more innovation across the research and developer communities. Our earlier Gemma 3 models were downloaded 400M times, and over 100,000 variants of those models have been published, so we're excited to see what the community will do with the even better Gemma 4 models!

Learn more at blog.google/innovation-and… and goo.gle/gemma-4-apache… Great work by everyone involved! #Gemma4 #AI #OpenSource #ML
Replies: 55 · Reposts: 176 · Likes: 1.5K · Views: 94.1K

Stanford NLP Group retweeted
Percy Liang @percyliang
Our 1e23 Delphi run finished last night. Its loss was within 0.005 of the projected (preregistered) loss. Note that these projections were based on training only models over 100x smaller (3e20)! Still more work to do: we still had loss spikes, and if you look closely, our scaling laws are bending. We have some ideas for fixing both...
Will Held@WilliamBarrHeld

How far do Marin's scaling laws extrapolate? At least 100x, apparently! Despite spooky spikes, our 1e23 Delphi finished on forecast. The compute-optimal ladder costs ~1e21 FLOPs to train. Good scaling science lets you “run” this (not tiny) experiment at 1/100th the cost.
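The kind of extrapolation described here can be sketched as fitting a power law L(C) = a·C^(-b) to small-compute runs in log-log space and predicting loss at far greater compute. The constants below are invented for illustration and are not Marin's actual scaling data:

```python
import numpy as np

def power_law(C, a, b):
    # Simple scaling-law form: loss = a * C^(-b) (irreducible term omitted).
    return a * C ** (-b)

# Illustrative small-compute runs (FLOPs, loss) generated from a known law.
true_a, true_b = 50.0, 0.07
compute = np.array([1e18, 3e18, 1e19, 3e19, 1e20, 3e20])
loss = power_law(compute, true_a, true_b)

# Fit in log-log space: log L = log a - b log C is linear in log C.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), 1)
a_hat, b_hat = np.exp(intercept), -slope

# Extrapolate far past the largest fitted run, out to 1e23 FLOPs.
predicted = power_law(1e23, a_hat, b_hat)
print(a_hat, b_hat, predicted)
```

The preregistration step amounts to committing to `predicted` before the big run starts; the tweet above reports the realized loss landing within 0.005 of such a projection.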

Replies: 7 · Reposts: 13 · Likes: 186 · Views: 30.8K

Stanford NLP Group retweeted
clem 🤗 @ClementDelangue
Hot take: Git was the wrong abstraction for 90% of ML data. Checkpoints, optimizer states, training logs, agent traces - none of this needs version control. It needs fast, cheap, mutable storage. So we built Buckets. S3-like storage on the @huggingface Hub with Xet dedup and zero egress. Train in a bucket. Publish to a repo. One platform. 🤗🤗🤗
Replies: 54 · Reposts: 68 · Likes: 952 · Views: 82.3K

Stanford NLP Group retweeted
Steven Feng @stevenyfeng
We’re bringing back Stanford’s CS25 Transformers course tomorrow! 🤖 It’s open to everyone (in-person + online). Weekly talks (every Thursday) from top AI researchers. One of Stanford’s most popular AI seminar courses. Don’t miss out! More info below 👇 (1/7)
Replies: 9 · Reposts: 89 · Likes: 625 · Views: 47.8K

Stanford NLP Group retweeted
Ari Holtzman @universeinanegg
making fresh slides for this now, should be fun :)
Quoted tweet: the pinned @stanfordnlp NLP Seminar announcement above.
Replies: 0 · Reposts: 1 · Likes: 21 · Views: 7.9K

Stanford NLP Group retweeted
Seungju Han @SeungjuHan3
can synthetic training beat RAG in data-constrained domains? we suggest a simple recipe for better synthetic training:
- Synth Mixed Training: train on both synth QAs and synth docs
- Focal Rewriting: rewrite docs with targeted topic prompts
results:
- beats RAG by +2.6% on QuaLITY
- improves to +4.4% with Focal Rewriting
- reaches +6.7% when combined with RAG
Paper: arxiv.org/abs/2603.23562
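A rough sketch of how such a mixed synthetic corpus might be assembled. The prompt template, function names, and example strings below are assumptions for illustration, not taken from the paper:

```python
# Hypothetical sketch of "Synth Mixed Training" data assembly:
# interleave synthetic QA pairs with (focally rewritten) synthetic documents.

FOCAL_TEMPLATE = (
    "Rewrite the following document focusing on the topic '{topic}', "
    "preserving all facts relevant to that topic:\n\n{doc}"
)

def focal_rewrite_prompt(doc: str, topic: str) -> str:
    # Builds the rewriting prompt; an LLM call would produce the rewrite.
    return FOCAL_TEMPLATE.format(topic=topic, doc=doc)

def build_mixed_corpus(synth_qas, synth_docs):
    # Train on both formats: QA pairs as "Q: ... A: ..." strings, docs as-is.
    examples = [f"Q: {q}\nA: {a}" for q, a in synth_qas]
    examples += list(synth_docs)
    return examples

corpus = build_mixed_corpus(
    synth_qas=[("Who wrote the memo?", "The auditor.")],
    synth_docs=["The auditor's memo summarized the quarterly findings."],
)
print(len(corpus))  # 2
```

The point of the mix is that the model sees the same facts in both declarative (document) and interrogative (QA) form, which is what the thread credits for beating RAG in data-constrained domains.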
Replies: 2 · Reposts: 16 · Likes: 65 · Views: 10.9K

Stanford NLP Group retweeted
Florea AI @FloreaAI
Myra Cheng, Cinoo Lee, Pranav Khadpe, Sunny Yu, Dyllan Han, and Dan Jurafsky, researchers at Stanford University and Carnegie Mellon, published one of the most important studies on AI and human behavior. science.org/doi/10.1126/sc…
Replies: 0 · Reposts: 1 · Likes: 3 · Views: 1.8K