Tao Xu

296 posts

Tao Xu

@txhf

Learning Machine at OpenAI

Joined February 2013
1.2K Following · 9.4K Followers
Pinned Tweet
Tao Xu
Tao Xu@txhf·
Back in the day, @alexandr_wang helped recruit me to OpenAI. Now, after 6 years and all the craziness, I still chose to stay at OpenAI. A long time ago I also worked at Meta and built a few things there; I am definitely happier here.
Tao Xu tweet media
28 replies · 22 reposts · 1.1K likes · 144.6K views
Tao Xu
Tao Xu@txhf·
@percyliang @Jianlin_S is QB too greedy, so that it leads to suboptimal specialization? Would it require injecting Gumbel noise before QB, or a temperature-scaled QB?
0 replies · 0 reposts · 0 likes · 113 views
Percy Liang
Percy Liang@percyliang·
Marin is using quantile balancing from @Jianlin_S (who developed RoPE, which was also a good idea) to train our current 1e23 FLOPs MoE. The idea is elegant: assigning tokens to experts by solving a linear program. No hyperparameters to tune. Yields stable training.
Larry Dial@classiclarryd

Researchers' brilliant ideas often get lost in the sea of endless SOTA claims on weak baselines. At Marin we battle-test ideas in an open arena, where anyone's idea can be promoted to the next hero run. One that recently rose up was @Jianlin_S's MoE Quantile Balancing, used in our last 1e22 FLOPs run and our ongoing 130B run. Animated visuals of how QB performed are available in the OpenAthena blog. openathena.ai/blog/quantile-…

4 replies · 32 reposts · 312 likes · 71.1K views
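Neither post spells out the formulation, but the "assign tokens to experts by solving a linear program" idea can be illustrated with a small transportation-style LP. This is only a sketch of capacity-balanced assignment under an equal-load assumption, not the actual Quantile Balancing or Marin implementation; `balanced_assignment`, `router_logits`, and the capacity rule are hypothetical names chosen for illustration.

```python
# Illustrative sketch: balanced token-to-expert assignment posed as a linear
# program (a transportation problem). NOT the actual QB/Marin implementation.
import numpy as np
from scipy.optimize import linprog

def balanced_assignment(router_logits: np.ndarray) -> np.ndarray:
    """Assign tokens to experts, maximizing router affinity subject to
    every expert receiving an equal share of tokens."""
    num_tokens, num_experts = router_logits.shape
    capacity = num_tokens / num_experts            # equal load per expert

    # Variables x[t, e] in [0, 1]: fraction of token t routed to expert e.
    # Maximize sum of logits[t, e] * x[t, e]  ==  minimize the negative.
    c = -router_logits.reshape(-1)

    # Each token is fully assigned: sum_e x[t, e] == 1.
    A_eq = np.zeros((num_tokens, num_tokens * num_experts))
    for t in range(num_tokens):
        A_eq[t, t * num_experts:(t + 1) * num_experts] = 1.0
    b_eq = np.ones(num_tokens)

    # Each expert receives at most `capacity` tokens: sum_t x[t, e] <= capacity.
    A_ub = np.zeros((num_experts, num_tokens * num_experts))
    for e in range(num_experts):
        A_ub[e, e::num_experts] = 1.0
    b_ub = np.full(num_experts, capacity)

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=(0.0, 1.0), method="highs")
    return res.x.reshape(num_tokens, num_experts)

# Tiny usage example: 8 tokens, 4 experts, random router scores.
print(balanced_assignment(np.random.randn(8, 4)).round(2))
```

Solving an exact LP per batch would be expensive at training scale, so the real method presumably uses a more specialized solve; the point of the sketch is only that load balance is enforced as a constraint rather than tuned through an auxiliary loss, which matches the "no hyperparameters to tune" claim above.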
Tao Xu
Tao Xu@txhf·
@FakePsyho very cool! are we all the last generation of programmers?
1 reply · 0 reposts · 1 like · 319 views
Psyho
Psyho@FakePsyho·
The CodinGame contest is over and my pure vibecoded solution placed 13th (out of 2382 people who submitted anything). I won't lie, this feels unfair. My estimate is that I spent around 3 active hours (reading, thinking, prompting, looking at replays, etc.) on the problem itself in a two-week contest. A reasonable number would probably be something in the 20-40 hour range to break into the top 20, assuming you're good. And just to be clear, I didn't do any manual coding, and I didn't look at the code, except once, when the agent couldn't find a bug. I invested a few more hours into experimenting with evolve-like approaches, but didn't get any value from that. If I can find enough time, I'll write a short summary of this (very) short journey.
Psyho tweet media
7 replies · 6 reposts · 156 likes · 8.7K views
Tao Xu
Tao Xu@txhf·
@pmddomingos other than sugar-coating unification, what can it do that DL cannot do?
0 replies · 0 reposts · 0 likes · 221 views
Tao Xu
Tao Xu@txhf·
@natolambert wondering how many of those equations will survive the test of time.
0 replies · 0 reposts · 1 like · 143 views
Nathan Lambert
Nathan Lambert@natolambert·
Made a language model RL cheatsheet for the extra page on the inside back cover of the physical edition RLHF Book.
Nathan Lambert tweet media
29 replies · 101 reposts · 965 likes · 53.8K views
Tao Xu
Tao Xu@txhf·
@Yuhu_ai_ Congrats on your achievements on the ride of a lifetime; looking forward to seeing your next adventure.
0 replies · 0 reposts · 1 like · 879 views
Yuhuai (Tony) Wu
Yuhuai (Tony) Wu@Yuhu_ai_·
I resigned from xAI today. This company - and the family we became - will stay with me forever. I will deeply miss the people, the warrooms, and all those battles we have fought together. It's time for my next chapter. It is an era with full possibilities: a small team armed with AIs can move mountains and redefine what's possible. Thank you to the entire xAI family. Onward. 🚀 And to Elon @elonmusk - thank you for believing in the mission and for the ride of a lifetime.
742 replies · 366 reposts · 9.3K likes · 3.6M views
Tao Xu retweeted
Trevor Cai
Trevor Cai@trevorycai·
3 years ago, we emailed Jensen with requests for Blackwell. Today, we released GPT-5.3-Codex, a SOTA model designed for GB200-NVL72. Nitpicking ISA, simming rack designs, and tailoring our arch to the system has been a fun experience! I'm grateful to our collaborators at NVIDIA.
Trevor Cai tweet media
89 replies · 75 reposts · 1.6K likes · 399K views
Tao Xu
Tao Xu@txhf·
@adityaag just quoting Charles Dickens, combining the last few words due to the length limit.
0 replies · 0 reposts · 7 likes · 695 views
Tao Xu
Tao Xu@txhf·
@adityaag It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of light, it was the season of darkness, it was the spring of hope, was the winterofdespair
5 replies · 19 reposts · 390 likes · 96K views
Aditya Agarwal
Aditya Agarwal@adityaag·
It's a weird time. I am filled with wonder and also a profound sadness. I spent a lot of time over the weekend writing code with Claude. And it was very clear that we will never ever write code by hand again. It doesn't make any sense to do so. Something I was very good at is now free and abundant. I am happy...but disoriented. At the same time, something I spent my early career building (social networks) was being created by lobster-agents. It's all a bit silly...but if you zoom out, it's kind of indistinguishable from humans on the larger internet. So both the form and function of my early career are now produced by AI. I am happy but also sad and confused. If anything, this whole period is showing me what it is like to be human again.
465 replies · 1.8K reposts · 15.7K likes · 3.3M views
Tao Xu retweeted
Neil Rathi
Neil Rathi@neil_rathi·
New paper, w/ @AlecRad. Models acquire a lot of capabilities during pretraining. We show that we can precisely shape what they learn simply by filtering their training data at the token level.
Neil Rathi tweet media
27 replies · 98 reposts · 1.1K likes · 106.1K views
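The post doesn't describe the mechanism, but one common way to realize token-level filtering of training data is to mask the filtered tokens out of the language-modeling loss. The sketch below is an assumption about what that could look like, not the paper's actual method; `masked_lm_loss` and `keep_mask` are hypothetical names.

```python
# Hypothetical sketch: token-level data filtering as loss masking.
# Tokens with keep_mask == 0 contribute no gradient, so the model
# receives no learning signal from them.
import torch
import torch.nn.functional as F

def masked_lm_loss(logits: torch.Tensor,       # (batch, seq, vocab)
                   targets: torch.Tensor,      # (batch, seq) token ids
                   keep_mask: torch.Tensor):   # (batch, seq), 1 = keep, 0 = filter out
    per_token = F.cross_entropy(
        logits.transpose(1, 2), targets, reduction="none"
    )                                          # per-position loss, (batch, seq)
    per_token = per_token * keep_mask          # zero out filtered tokens
    return per_token.sum() / keep_mask.sum().clamp(min=1)
```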
Tao Xu
Tao Xu@txhf·
@soumithchintala thank you for your work on PyTorch. I left FB 12+ years ago; it was pretty damned good outside.
0 replies · 0 reposts · 1 like · 786 views
Soumith Chintala
Soumith Chintala@soumithchintala·
Leaving Meta and PyTorch I'm stepping down from PyTorch and leaving Meta on November 17th. tl;dr: Didn't want to be doing PyTorch forever, seemed like the perfect time to transition right after I got back from a long leave and the project built itself around me. Eleven years at Meta. Nearly all my professional life. Making many friends for life. Almost eight years leading PyTorch, taking it from nothing to 90%+ adoption in AI. Walking away from this was one of the hardest things I've ever done. But I'm leaving with a full heart. PyTorch handles exascale training now. It powers foundation models that are redefining intelligence. It's in production at virtually every major AI company. It's taught in classrooms from MIT to rural India. The tools I dreamed about making accessible? They are. The barrier to entry I wanted to lower? It's almost gone. To be clear, there’s so much more to do. As long as AI evolves at a breakneck pace, PyTorch will continue to play catch up. Obsessing over the yet-to-come sometimes makes us forget how much we’ve already done. To everyone who built this with me—who believed research should be joyful, that tools should be elegant, that open source changes everything—thank you. This wasn't my journey. It was ours. What's next for me? Something small. Something new. Something I don't fully understand yet. Something uncomfortable. I could have moved to something else inside Meta. But I needed to know what's out there. I needed to do something small again. I couldn't live with the counterfactual regret of never trying something outside Meta. It's very hard to leave. I probably have one of the AI industry’s most leveraged seats, I lead the software layer that powers the entire AI industry. Every major AI company and hardware vendor are on a speed dial. This kind of power is really hard to give up. But curiosity ultimately won out in my head. Keep making AI delicious and accessible. I'll be watching. Probably filing issues. Definitely staying involved. Is PyTorch going to be okay? I don't want to be doing PyTorch forever. I don't want to be like Guido or Linus— bound to a single thing for decades. Last November, coinciding with the birth of my daughter, I started planning my exit with Aparna. My goal was to leave PyTorch in a good and stable place. By this August, during the second half of my parental leave, I knew: Edward, Suo, Alban, Greg, John, Joe and Jana were ready. The team faced hard people, product, technical and organizational problems and didn’t feel the need to lean back on me to solve these for them (unlike in the past). The product story they crafted for the PyTorch Conference was coherent—really coherent. The things I'd flagged red were turning healthy. The project didn't need me anymore. Unlike 2020-2022 (when I stepped down to go do robotics and came back when Lin, Dima and Dwarak left), I have strong confidence that this time PyTorch is truly resilient. The most aligned culture carriers of PyTorch – Greg, Alban, Ed, Jason and Joe are at the decision table now, and people with strong value alignment – Suo, John and Jana have joined them at the table. And there’s a long list of equally value-aligned people willing to sit at the table should any of these people leave. 
There are many little things that make up my confidence on the people – John worked on Julia and open-source for a very long time (in fact we hacked a Torch.jl in 2015), Suo has been the strongest systems builder and strategic partner I’ve had for the past two years, and Jana worked on resilient core systems for a very long time, I’ve had long technical and organizational discussions with her over the past few months that give me confidence. And the product lineup and execution in 2025 should be sufficient evidence for any remaining doubt. I’m confident that this band of PyTorchers are going to do exceptionally well. PyTorch might change in flavor because I no longer impose my own taste from the top, but I’m confident that the values are going to stay intact and the product is going to be awesome. My time at Meta The early years of FAIR were absolutely magical. I was part of a small family of absolutely brilliant people building state-of-the-art AI out in the open. From working on GANs with Emily Denton, Rob Fergus, Leon Bottou, Martin Arjovsky and the (now legendary) Alec Radford to building Starcraft bots with Gabriel Synnaeve, to building the first FAIR Cluster with Howard Mansell, to working on object detection with Adam Lerer and Piotr Dollar, to building PyTorch. It was more fun than I can describe in words. 2015 and 2016 were probably the most productive and professionally enjoyable years of my life. I’ll probably romanticize this period of my life forever. When I joined FAIR, I had massive impostor syndrome, and the first 3 months were very very difficult. I can’t credit Andrew Tulloch enough for being the most thoughtful, kind and welcoming mentor, without whom I wouldn’t have made it. I’m so damn bullish for Meta just from the fact that he’s back. --- My time on PyTorch was special. I loved every part of building it—designing it, managing it, being the PM, TL, comms lead, doc engineer, release engineer, squashing bugs, growth hacking, turning it into a coherent product with hundreds of people, transitioning it to industry stakeholdership – the whole nine yards. To the core PyTorch team at Meta: the engineers, researchers, open-source maintainers, docs writers, CI infrastructure folks, hardware partners, the community builders. To the hundreds more inside and outside Meta—thank you. You turned a library into a movement. There are too many people to credit and thank, but I can't not mention Adam Paszke, Sam Gross, Greg Chanan, Joe Spisak, Alban Desmaison, Edward Yang, Richard Zou, Tongzhou Wang, Francisco Massa, Luca Antiga, Andreas Köpf, Zach DeVito, Zeming Lin, Adam Lerer, Howard Mansell and Natalia Gimelshein. And Schrep. They made the launch happen. And so many more people became centrally important later: Lu Fang, Xiaodong Wang, Junjie Bai, Nikita Shulga, Horace He, Mark Saroufim, Jason Ansel, Dmytro Dzhulgakov, Yangqing Jia, Geeta Chauhan, Will Constable, Briah Hirsh, Jane Xu, Mario Lezcano, Piotr Balecki, Yinghai Lu, Less Wright, Andrew Tulloch, Bruce Lin, Woo Kim, Helen Suk, Chris Gottbrath, Peng Wu, Joe Isaacson, Eli Uriegas, Tristan Rice, Yanan Cao, Elias Ellison, Animesh Jain, Peter Noordhuis, Tianyu Liu, Yifu Wang, Lin Qiao and hundreds more. It’s criminal of me to not take the space to list out everyone else I should be mentioning here. PyTorch is nothing without its people ❤️. The most joyful moments of building PyTorch was meeting users eager to share their happiness, love and feedback. 
I remember a grad student coming to me at Neurips 2017, in a slurring emotional voice he said he’d been trying to make progress on his research for 3 years but within 3 months of using PyTorch he made so much progress that he was ready to graduate. That moment made it tangible that what we do matters, a lot, to a lot of people, even if you don't constantly hear from them. I do miss the intimacy of the PyTorch community, with a 300 person conference that felt like an extended family gathering, but I feel that’s a small price to pay considering the scale of impact PyTorch is truly having today – yes the Conference is now 3,000 people where market-moving deals get brokered, but it’s helping orders of magnitude more people to do their best AI work. I miss the intimacy, but I'm proud of that growth. --- To Mark Zuckerberg and Mike Schroepfer, who believed that open-sourcing is fundamentally important and is a sound business strategy. This is so hard to understand for most people within the course of business, but we’ve run lock-step on this strategy without ever having to discuss it. Without you two, neither FAIR nor PyTorch would’ve happened. And those mean so much to me. To Yann LeCun and Rob Fergus, for building the magical early FAIR that I so revere. To Aparna Ramani, a leader that I find so rare at Meta in her ability to hold a really high bar for the org, technically brilliant with the span to discuss deep infra systems and industry-strategy within the same conversation and for being an absolute execution-machine! I’ve learned so much from you. To Santosh, Kaushik, Delia, Oldham and Ben for being so welcoming to Infra. For someone coming over from FAIR with a wildly different culture, you all made me feel at home and made me part of the family, and thank you for that. To all my managers who've championed me through the PSC video game – Serkan, Howard, Jerome, Abhijit, Yoram, Joelle, Aparna and Damien – I owe you a lifetime of drinks. --- Signing off for now. —Soumith
Soumith Chintala tweet media
490 replies · 569 reposts · 10.8K likes · 2.5M views
Tao Xu retweeted
Sam Altman
Sam Altman@sama·
if i were like, a sports star or an artist or something, and just really cared about doing a great job at my thing, and was up at 5 am practicing free throws or whatever, that would seem pretty normal right? the first part of openai was unbelievably fun; we did what i believe is the most important scientific work of this generation or possibly a much greater time period than that. this current part is less fun but still rewarding. it is extremely painful as you say and often tempting to nope out on any given day, but the chance to really "make a dent in the universe" is more than worth it; most people don't get that chance to such an extent, and i am very grateful. i genuinely believe the work we are doing will be a transformatively positive thing, and if we didn't exist, the world would have gone in a slightly different and probably worse direction. (working hard was always an extremely easy trade until i had a kid, and now an extremely hard trade.) i do wish i had taken equity a long time ago and i think it would have led to far fewer conspiracy theories; people seem very able to understand "ok that dude is doing it because he wants more money" but less so "he just thinks technology is cool and he likes having some ability to influence the evolution of technology and society". it was a crazy tone-deaf thing to try to make the point "i already have enough money". i believe that AGI will be the most important technology humanity has yet built, i am very grateful to get to play an important role in that and work with such great colleagues, and i like having an interesting life.
979 replies · 816 reposts · 16.4K likes · 1.9M views
Ivanka Trump
Ivanka Trump@IvankaTrump·
6. True freedom begins in that stillness — the ability to choose, rather than react. As Viktor Frankl wrote, “Between stimulus and response there is a space. In that space is our power to choose our response. In our response lies our growth and our freedom.”
36 replies · 68 reposts · 561 likes · 42.5K views
Ivanka Trump
Ivanka Trump@IvankaTrump·
Each birthday (and today is my 44th!) invites reflection—on what I’ve learned, what I hold dear, and how I want to walk forward with greater clarity, courage, and grace.
Ivanka Trump tweet media
7K replies · 2.3K reposts · 20.6K likes · 2.3M views
Tao Xu
Tao Xu@txhf·
@kyutai_labs a bug here? shouldn't it be z_quantized in the rhs?
Tao Xu tweet media
1 reply · 0 reposts · 1 like · 599 views
kyutai
kyutai@kyutai_labs·
1/2 We’re releasing an in-depth tutorial on neural audio codecs, the secret sauce that makes it possible for audio LLMs to not sound like a horror movie:
12 replies · 55 reposts · 435 likes · 47.3K views
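The formula being questioned isn't visible here, but the question reads like it is about the straight-through estimator commonly used in VQ-style neural audio codecs, where the quantized code does appear on the right-hand side. A minimal PyTorch sketch under that assumption (all tensors hypothetical):

```python
# Straight-through estimator for vector quantization (VQ-VAE style), assumed
# to be the formula in question: the forward pass uses z_quantized on the RHS,
# while gradients flow back to the encoder output z as if quantization were
# the identity.
import torch

z = torch.randn(4, 8, requires_grad=True)       # encoder output (4 frames, dim 8)
codebook = torch.randn(16, 8)                   # 16 codebook entries of dim 8

idx = torch.cdist(z, codebook).argmin(dim=-1)   # nearest codebook entry per frame
z_quantized = codebook[idx]

z_st = z + (z_quantized - z).detach()           # equals z_quantized in the forward pass
```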
Tao Xu
Tao Xu@txhf·
@RichardSSutton TD is fundamentally a value-based approach, but RL in LLMs is almost purely policy-based; GRPO even gets rid of the value function completely?
0 replies · 0 reposts · 3 likes · 599 views
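For context on the question above: GRPO drops the learned value function and instead uses a group-relative baseline, normalizing each sampled completion's reward against the mean (and standard deviation) of its group. A minimal sketch of that advantage computation; `rewards` is a hypothetical array of per-completion scores for one prompt.

```python
# GRPO-style advantages: the baseline is the group mean reward, so no critic
# (value function) is needed. Illustrative only.
import numpy as np

def group_relative_advantages(rewards: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    return (rewards - rewards.mean()) / (rewards.std() + eps)

print(group_relative_advantages(np.array([1.0, 0.0, 0.0, 1.0, 1.0])))
```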
Richard Sutton
Richard Sutton@RichardSSutton·
To learn more about temporal difference learning, you could read the original paper (incompleteideas.net/papers/sutton-…) or watch this video (videolectures.net/videos/deeplea…).
Khurram Javed@kjaved_

The Dwarkesh/Andrej interview is worth watching. Like many others in the field, my introduction to deep learning was Andrej’s CS231n. In this era when many are involved in wishful thinking driven by simple pattern matching (e.g., extrapolating scaling laws without nuance), it’s refreshing to hear an influential voice that is tethered to reality. One clarification for the podcast is that when Andrej says humans don’t use reinforcement learning, he is really saying humans don't use returns as learning targets. His example of LLMs struggling to learn to solve math problems from outcome-based rewards also elucidates the problem with learning directly from returns. Fortunately for RL, this exact problem is solved by temporal difference (TD) learning. All sample-efficient RL algorithms that show human-like learning (e.g., sample-efficient learning on Atari, and our work on learning from experience directly on a robot) rely on TD learning. Now Andrej is not primarily an RL person; he is looking at RL through the lens of LLMs these days, and all RL done in LLMs uses returns as targets, so it’s understandable that he is assuming that RL is all about learning from observed returns. But this assumption leads him to the incorrect conclusion that we need process-based dense rewards for RL to work. If you embrace TD learning, then you don't necessarily need a dense reward. Once you have learned a value function that encodes useful knowledge about the world, you can learn on the fly in the absence of rewards, just like humans and animals. This is possible because in TD learning there is no difference between learning from an unexpected reward and learning from an unexpected change in perceived value.

19 replies · 120 reposts · 1.1K likes · 159.4K views
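To make the distinction in the quoted post concrete: a Monte Carlo update regresses a state's value toward the observed return, while a TD(0) update bootstraps from the current estimate of the next state, so learning can happen even when the immediate reward is zero. A tabular sketch with hypothetical states and numbers, for illustration only:

```python
# Monte Carlo vs. TD(0) value updates in a tabular setting (illustrative).

def mc_update(V, state, observed_return, alpha=0.1):
    # Target is the full return observed at the end of the episode.
    V[state] += alpha * (observed_return - V[state])

def td0_update(V, state, reward, next_state, gamma=0.99, alpha=0.1):
    # Target bootstraps from V(next_state): an unexpected change in perceived
    # value drives learning just like an unexpected reward does.
    V[state] += alpha * (reward + gamma * V[next_state] - V[state])

V = {"s0": 0.0, "s1": 0.5}
td0_update(V, "s0", reward=0.0, next_state="s1")
print(V["s0"])  # moved toward gamma * V["s1"] despite a zero reward
```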