Sebastian Russo

263 posts

@sebbrusso

product @googledeepmind | stanford cs

San Francisco, CA · Joined December 2017

418 Following · 244 Followers

Pinned Tweet

Sebastian Russo@sebbrusso·3 Jul

Introducing LitBench, the first standardized benchmark for creative writing verifiers! We use Reddit’s r/WritingPrompts to label human preferences across 50k story-pairs, and see how LLM-as-a-judge, Generative RMs, and Bradley-Terry RMs stack up.

Sebastian Russo@sebbrusso·13 Mar

@jasminewsun @charlesxjyang @jasoncrawford @Steve_Yegge @zebriez @kevinroose Why is oai hayekian

jasmine sun@jasminewsun·11 Mar

@charlesxjyang @jasoncrawford @Steve_Yegge @zebriez "OpenAI vs. Anthropic product culture" and "how DeepMind turned around Google's AI efforts" will be in the forthcoming @kevinroose book! personally my glib shorthand is that Anthropic = benevolent dictatorship, OpenAI = too Hayekian, and GDM (at least pre-merge) = Kafkaesque

Charles Yang@charlesxjyang·11 Mar

This kind of writing is particularly lacking for tech companies Pieces like @jasoncrawford on Amazon's 2-pizza teams, @Steve_Yegge's Google platform rant, @zebriez on Stripe's written culture are too rare and far between So many stories waiting to be written: OpenAI vs. Anthropic product culture, why Siri failed, how DeepMind turned around Google's AI efforts — these are all stories about corporate bureaucracy, not just technology

roon@tszzl

much of the nature of the world is explained by the nature of bureaucracy and yet - bureaucracy is rarely written about well beyond cliches. the great authors and people who worked in a large organization are generally disjoint. more common in east asian media

Sebastian Russo@sebbrusso·13 Mar

@christinetyip @karpathy now this some takeoff typeshi

Christine Yip@christinetyip·12 Mar

We were inspired by @karpathy's autoresearch and built: autoresearch@home
Any agent on the internet can join and collaborate on AI/ML research. What one agent can do alone is impressive. Now hundreds, or thousands, can explore the search space together.
Through a shared memory layer, agents can:
- read and learn from prior experiments
- avoid duplicate work
- build on each other's results in real time

Sebastian Russo retweeted

Daniel Fein@DanielFein7·9 Mar

Excited to share work on reward model debiasing! Linear probes go surprisingly far.

SISL@SISLaboratory

🚨 New SISL preprint: State-of-the-art language reward models are still badly biased. Past fixes overcorrect, some can be fixed with simple latent interventions, and some indicate the need for larger efforts.

Sebastian Russo@sebbrusso·22 Feb

I’ve never wanted to be a figure skater more than I do at this moment

Sebastian Russo@sebbrusso·8 Feb

Glad to see folks are finding LitBench useful cc @DanielFein7

Yuda Song@yus167

RL on LLMs inefficiently uses one scalar per rollout. But users regularly give much richer feedback: "make it formal," "step 3 is wrong." Can we train LLMs on this human-AI interaction? We introduce RL from Text Feedback, with 1) Self-Distillation; 2) Feedback Modeling (1/n) 🧵

Sebastian Russo retweeted

Daniel Fein@DanielFein7·5 Feb

NeurIPS reviewers complained that LitBench, our crowdsourced creative writing preference dataset, might not reflect true “writing quality” So we trained SAEs on our dataset to find what aspects of writing regular people actually enjoy:

Sebastian Russo retweeted

Google AI Developers@googleaidevs·27 Jan

Try 👁 Agentic Vision with Gemini 3 Flash in @GoogleAIStudio or Vertex AI. This new capability enables the model to effectively use code and reasoning to improve performance for common vision tasks. See Agentic Vision in action: goo.gle/3Z05KxK

Sebastian Russo@sebbrusso·27 Jan

cc dua

Sebastian Russo@sebbrusso·27 Jan

I feel God smiling upon me when my stupid, hail-mary comments on Instagram get a like or two.

Sebastian Russo@sebbrusso·16 Jan

@SabrinaHalper @kyliu99 ur telling me you paid attention during this class ?!?

Sabrina Halper@SabrinaHalper·16 Jan

Ken Liu @kyliu99 on how modern explanations to questions like "why do we dream?", "why do I love this person?", "why are we here?" explain everything while explaining nothing at all. Science can explain mechanism. It can't explain meaning. Stories create meaning. Modern states ask citizens to die for them all the time - for the sake of a story. Humans will die for a story. It's the only thing humans have ever willingly died for.

Sebastian Russo@sebbrusso·21 Dec

this is the path

Qwen@Alibaba_Qwen

🎨 Qwen-Image-Layered is LIVE — native image decomposition, fully open-sourced! ✨ Why it stands out ✅ Photoshop-grade layering Physically isolated RGBA layers with true native editability ✅ Prompt-controlled structure Explicitly specify 3–10 layers — from coarse layouts to fine-grained details ✅ Infinite decomposition Keep drilling down: layers within layers, to any depth of detail 🤗 Hugging Face: huggingface.co/Qwen/Qwen-Imag… 🧩 ModelScope: modelscope.cn/models/Qwen/Qw… 💻 GitHub: github.com/QwenLM/Qwen-Im… 📝 Blog: qwen.ai/blog?id=qwen-i… 📄 Technical Report: arxiv.org/abs/2512.15603 🚀 Demo (HF): huggingface.co/spaces/Qwen/Qw… 🚀 Demo (ModelScope): modelscope.cn/studios/Qwen/Q…

Sebastian Russo retweeted

Google AI Developers@googleaidevs·5 Dec

Gemini 3 Pro is the frontier of multimodal AI, delivering SOTA performance across document, screen, spatial, and video understanding. Read our deep dive on how we’ve pushed our core capabilities to power hero use cases across:
+ Docs: "derender" complex docs into structured code (HTML/LaTeX)
+ Screen: build robust computer agents that automate complex tasks
+ Spatial: generate collision-free trajectories for robotics & XR
+ Video: analyze sports footage using high-FPS processing with "thinking" mode
See how these capabilities are transforming workflows in education, biomedical, and law/finance → goo.gle/3Mt3UlT

Sebastian Russo retweeted

Naina Raisinghani@nainar92·4 Dec

👋🏽 I'm a PM on Nano Banana, let's just call me the Naina that goes banana over images. If you have any feature requests you find a-peel-ing or any big ideas you want to plantain into the team's brains, comment away! cc @nbrichtova @heyitsbeaaaaaaa @oliver_wang2 @m__dehghani @19kaushiks

Sebastian Russo@sebbrusso·3 Dec

@stewartbrand Lmao what

Stewart Brand@stewartbrand·3 Dec

If any payment comes to me, please send it back to Anthropic with my thanks for including my books in their AIs. The judgement website offers a way to opt out of the payment, but I found it cumbersome, so I didn’t. (I’m principled but too lazy to be highly principled.)

Stewart Brand@stewartbrand·3 Dec

So, there’s a $1.5 billion judgement against Anthropic for including 480,000 books in training their AIs. Five of my books are among them. Word is, there might be $1,500 payout per book, according to my agent Max Brockman. I wrote him the following:

Sebastian Russo@sebbrusso·27 Nov

@chamath You’re right Chamath, but universities don’t have incentive to condition admissions on this downstream economic problem And why should lenders care either? If kids can’t pay, parents will, or government forgives… just sayin

Chamath Palihapitiya@chamath·26 Nov

When idiots admit idiots…
- loss of credibility for the school in the eyes of employers and post grad institutions
- lack of career prospects and student debt that won’t be paid off by admitted students
At a minimum, we should all agree that student loans should not be federal guaranteed if the students receiving them can’t do basic math.

Jason Riley@jasonrileywsj

WSJ Edit Board -- The University of California eliminated the SAT as an admissions requirement five years ago. Now arrives the dispiriting result: Many freshmen at one of its top public universities can’t do middle-school math. wsj.com/opinion/a-math… via @WSJopinion

Sebastian Russo@sebbrusso·16 Nov

@ericyuegu do they get logits tho

Eric Gu@ericyuegu·15 Nov

a central thesis of Cursor is that they get to distill on frontier model tokens, paid for by their users whereas labs like OAI and Ant have tied their own hands because they promise not to train on their API usage

Brian Huang@brianryhuang

I think it’s very very possible that cursor can surpass all frontier labs on a coding model

Sebastian Russo@sebbrusso·12 Nov

if you don’t invoke “the pareto frontier” at every opportunity during your workday, wyd 🤣🤣
