

Xinyang (Young) Geng
68 posts

@younggeng
Research scientist at Google DeepMind. Opinions are my own.




Gemini 2.5 Flash is described as being optimized for speed and scalability. Despite its lighter design, the community voted for its impressive performance on Hard Prompts, Coding, and Long Queries, matching the strength of its older sibling, Gemini 2.5 Pro, at #1 in these categories.

Big update to our MathArena USAMO evaluation: Gemini 2.5 Pro, which was released *the same day* as our benchmark, is the first model to achieve a non-trivial score (24.4%). The speed of progress is really mind-blowing.






Unpopular opinion: benchmarks like these are moving the field in the wrong direction. No, I don't want an AI to be able to memorize (useless?) questions like "How many paired tendons are supported by a sesamoid bone?" in its weights. I want the "intern", as @karpathy is suggesting

two aidanbench updates:
> gemini-2.0-flash-thinking is now #2 (explanation for score change below)
> deepseek v3 is #22 (thoughts below)

We released Gemini 2.0 Flash Thinking today! ⚡️🤔 It's a small step towards improved reasoning via inference-time compute, built on top of our small and mighty 2.0 Flash!

This has been and will continue to be my recommendation for anyone in this position. Learn JAX and sign up for sites.research.google/trc/about/. It's one of the best things Google has ever done. You can do meaningful research for free, but the learning curve is steep. Strap in
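To give a sense of what that learning curve buys you, here is a minimal sketch of the core JAX idioms (`jax.grad` for automatic differentiation and `jax.jit` for XLA compilation) applied to a toy linear-regression loss. This is an illustrative example, not taken from the tweet; all names and values are made up.

```python
# Minimal JAX sketch: jit-compiled gradient descent on a toy MSE loss.
import jax
import jax.numpy as jnp

def loss(w, x, y):
    # Mean squared error of a linear model.
    pred = x @ w
    return jnp.mean((pred - y) ** 2)

# jax.grad differentiates loss w.r.t. its first argument (w);
# jax.jit compiles the whole update step with XLA.
step = jax.jit(lambda w, x, y: w - 0.1 * jax.grad(loss)(w, x, y))

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (64, 3))
true_w = jnp.array([1.0, -2.0, 0.5])
y = x @ true_w  # noiseless targets, so w should recover true_w

w = jnp.zeros(3)
for _ in range(200):
    w = step(w, x, y)
```

The same functional style (pure functions transformed by `grad`, `jit`, `vmap`, `pmap`) is what makes JAX a good fit for TPU research on TRC, and also what makes the learning curve steep if you are coming from stateful frameworks.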