Nadav Timor

300 posts

@NadavTimor

AI inference, speculative decoding, open source. Built novel decoding algorithms, now the default in Hugging Face Transformers (155+ ⭐). Making AI faster + cheaper.

NYC · Joined December 2017
7.5K Following · 1.3K Followers
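The bio's mention of decoding algorithms shipped in Hugging Face Transformers presumably refers to assisted generation (speculative decoding). A minimal sketch of invoking that API, assuming a recent transformers release; the model names below are illustrative placeholders, not taken from the profile:

```python
# Sketch: speculative (assisted) generation in Hugging Face Transformers.
# Model choices are illustrative assumptions, not taken from the profile.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("EleutherAI/pythia-1.4b")
target = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-1.4b")
draft = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-160m")  # small draft

inputs = tok("Speculative decoding works by", return_tensors="pt")
# Passing assistant_model enables assisted generation: the draft proposes
# tokens and the target verifies them in a single forward pass.
out = target.generate(**inputs, assistant_model=draft, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```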
Tanishq Kumar @tanishqkumar07
I've been working on a new LLM inference algorithm. It's called Speculative Speculative Decoding (SSD) and it's up to 2x faster than the strongest inference engines in the world. Collab w/ @tri_dao @avnermay. Details in thread.
Replies 134 · Reposts 454 · Likes 4.1K · Views 603.9K
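SSD's specifics are in the linked thread; as background, classic speculative decoding accepts a draft token x with probability min(1, p_target(x)/p_draft(x)) and, on rejection, resamples from the normalized residual max(0, p_target − p_draft). A toy numpy sketch of that acceptance rule (not SSD itself):

```python
# Toy sketch of the standard speculative-decoding acceptance rule
# (Leviathan et al.); SSD itself is described in the linked thread.
import numpy as np

rng = np.random.default_rng(0)

def accept_or_resample(p_target, p_draft, draft_token):
    """Accept the draft token w.p. min(1, p/q); else resample from max(0, p-q)."""
    p, q = p_target[draft_token], p_draft[draft_token]
    if rng.random() < min(1.0, p / q):
        return draft_token, True
    residual = np.maximum(p_target - p_draft, 0.0)
    residual /= residual.sum()
    return rng.choice(len(p_target), p=residual), False

# Tiny 4-token vocabulary example.
p_target = np.array([0.5, 0.2, 0.2, 0.1])
p_draft = np.array([0.25, 0.25, 0.25, 0.25])
draft_token = rng.choice(4, p=p_draft)
token, accepted = accept_or_resample(p_target, p_draft, draft_token)
print(token, accepted)
```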
Mira Murati @miramurati
We have parted ways with Barret Zoph. Soumith Chintala will be the new CTO of Thinking Machines. He is a brilliant and seasoned leader who has made important contributions to the AI field for over a decade, and he’s been a major contributor to our team. We could not be more excited to have him take on this new responsibility.
Replies 250 · Reposts 136 · Likes 4.2K · Views 989.7K
Nadav Timor @NadavTimor
@sgl_project released EAGLE-3 checkpoints for SOTA models (incl. Kimi-K2, GPT-OSS, DeepSeek-V3.2) + the training recipe
LMSYS Org@lmsysorg

Speculative decoding has shown a lot of promise, though broader adoption has taken time due to the complexity of building production-ready tooling and high-quality draft models. We’re releasing SpecBundle, a collection of large-scale EAGLE-3 draft models trained with SpecForge v0.2. This release brings major system improvements, including refactored training pipelines, multi-backend support with SGLang and @huggingface , and better usability at scale. We also built a performance dashboard to make real end-to-end speedups visible across models and settings. See the dashboard and blog in the thread 👇

Replies 0 · Reposts 0 · Likes 10 · Views 1.5K
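For concreteness, a sketch of loading one of these EAGLE-3 drafts in SGLang's offline engine. The keyword arguments mirror SGLang's speculative-decoding server flags as of recent versions; treat the argument names and the placeholder checkpoint path as assumptions and check the release docs:

```python
# Sketch: offline SGLang engine with an EAGLE-3 draft model.
# Argument names follow SGLang's speculative-decoding server flags;
# treat them as assumptions and verify against your SGLang version.
import sglang as sgl

llm = sgl.Engine(
    model_path="meta-llama/Llama-3.1-8B-Instruct",             # example target
    speculative_algorithm="EAGLE3",
    speculative_draft_model_path="<eagle3-draft-checkpoint>",  # placeholder
    speculative_num_steps=3,          # draft autoregressive steps per round
    speculative_eagle_topk=4,         # branches kept per draft step
    speculative_num_draft_tokens=8,   # total draft tokens verified per round
)

print(llm.generate("Explain EAGLE-3 in one sentence.",
                   {"temperature": 0, "max_new_tokens": 64}))
```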
Sergey Levine @svlevine
@NadavTimor Varies by task but typical numbers are around 8 hours (which is about two work days for one person to collect with breaks, resets, etc.). Using a pre-trained robot foundation model drastically lowers the per-task data requirements, which I guess is not surprising.
Replies 1 · Reposts 0 · Likes 7 · Views 444
Sergey Levine @svlevine
A while back Benjie Holson described a set of "Robot Olympics" challenge tasks: washing a pan, making a peanut butter sandwich, and more. We tried fine-tuning our models at PI on these tasks and found that we could do most of them. A few highlights below.
Replies 9 · Reposts 30 · Likes 303 · Views 81.1K
Nadav Timor @NadavTimor
Tons of high-impact opportunities! And btw, our NYC open-space inference hub is still welcoming active vLLM/SGLang contributors
Greg Brockman@gdb

inference is perhaps the most valuable emerging software category. as models get smarter and more economically valuable, compute will increasingly be spent drawing samples from the models. if you'd like to work on inference at openai, reach out — gdb@openai.com. include a description of an exceptional team you've been a part of, and your contribution towards that team's goals. also indicate any experience in inference, large-scale system optimization, or other areas where you've built up domain expertise. lots of exciting problems to work on, ranging from deeply understanding the model forward pass (including simulating/finding creative opportunities for optimization); to system-level efficiencies such as speculative decoding or kv offloading or workload-aware load balancing; to managing and making observable a massive fleet at scale.

Replies 0 · Reposts 1 · Likes 11 · Views 2.1K
Stefano Ermon @StefanoErmon
When we began applying diffusion to language in my lab at Stanford, many doubted it could work. That research became the Mercury diffusion LLM: 10X faster, more efficient, and now the foundation of @_inception_ai. Proud to raise $50M with support from top investors.
Inception@_inception_ai

Today’s LLMs are painfully slow and expensive. They are autoregressive and spit out words sequentially. One. At. A. Time. Our dLLMs generate text in parallel, delivering answers up to 10X faster. Now we’ve raised $50M to scale them. Full story from @russellbrandom in @TechCrunch. techcrunch.com/2025/11/06/inc…

Replies 40 · Reposts 81 · Likes 1.3K · Views 200.2K
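The "one at a time" framing is the serial dependency of autoregressive decoding; a diffusion LM instead refines every position across a fixed number of denoising steps. A back-of-the-envelope count of forward passes under that framing (the step count is an illustrative assumption, not Mercury's actual setting):

```python
# Back-of-the-envelope: forward passes needed to emit n tokens.
# Autoregressive: one pass per token. Diffusion: a fixed number of
# denoising steps over the whole sequence, regardless of n.
n_tokens = 512
T_denoise = 32   # illustrative assumption, not Mercury's actual value

ar_passes = n_tokens     # serial: token t depends on tokens < t
diff_passes = T_denoise  # parallel: all positions updated each step

print(f"autoregressive: {ar_passes} passes, diffusion: {diff_passes} passes "
      f"({ar_passes / diff_passes:.0f}x fewer)")
```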
Elon Musk @elonmusk
Diffusion will obviously work on any bitstream. With text, since humans read from first word to last, there is just the question of whether the delay to first sentence for diffusion is worth it. That said, the vast majority of AI workload will be video understanding and generation, so good chance diffusion is the biggest winner overall. Also means that the ratio of compute to memory bandwidth will increase.
Replies 129 · Reposts 186 · Likes 2.3K · Views 581.3K
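The compute-to-memory-bandwidth claim follows from roofline arithmetic: batch-1 autoregressive decoding re-reads every weight to emit one token, while a parallel denoising step amortizes the same weight traffic over many positions. A rough sketch, assuming fp16 weights and ignoring KV-cache and activation traffic:

```python
# Rough roofline arithmetic for the compute-vs-bandwidth point.
# fp16 weights; KV-cache and activation traffic ignored for simplicity.
params = 70e9                 # illustrative 70B-parameter model
bytes_per_param = 2           # fp16

weight_bytes = params * bytes_per_param
flops_per_token = 2 * params  # ~2 FLOPs per parameter per token

# Autoregressive, batch 1: all weights read to emit ONE token.
ai_autoregressive = flops_per_token / weight_bytes       # ~1 FLOP/byte

# Denoising step over a block of n positions: same weight read,
# n tokens' worth of compute.
n = 512
ai_diffusion = (n * flops_per_token) / weight_bytes      # ~n FLOPs/byte

print(f"AR intensity: {ai_autoregressive:.1f} FLOP/byte, "
      f"diffusion step: {ai_diffusion:.0f} FLOP/byte")
```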
Matt Hartman @MattHartman
.@NadavTimor and I are going to train a SO-ARM101 with @LeRobotHF at the @huggingface office next week. If you’re in NYC and have an ARM101 and want to join us let me know! BYO arm 🦾
Replies 5 · Reposts 1 · Likes 15 · Views 4K
Nadav Timor @NadavTimor
@WajahatAli_231 Just drop links to your PRs here and we’ll add you to the next sprint 🙂
Replies 0 · Reposts 0 · Likes 0 · Views 201
Nadav Timor @NadavTimor
NYC open-source AI infra contributors — we’ve launched a community research hub above Grand Central where GPUs go brrr 🔥🗽 A place to hack, benchmark, and collaborate — vLLM, SGLang, kernels, inference optimizations all welcome. Open space. Open source. Weekends too. Huge thanks to @Company for supporting this initiative 🙌 Limited seats. Drop your PRs in the comments to join the next sprint!
Replies 8 · Reposts 11 · Likes 92 · Views 9.3K
Nadav Timor @NadavTimor
@vamshi_ihsmav Just drop links to your PRs here and we’ll add you to the next sprint 🙂
Replies 0 · Reposts 0 · Likes 0 · Views 263
Nadav Timor reposted
Ravid Shwartz Ziv @ziv_ravid
Come to work with Nadav, you will not regret it...
Nadav Timor@NadavTimor

NYC open-source AI infra contributors — we’ve launched a community research hub above Grand Central where GPUs go brrr 🔥🗽 A place to hack, benchmark, and collaborate — vLLM, SGLang, kernels, inference optimizations all welcome. Open space. Open source. Weekends too. Huge thanks to @Company for supporting this initiative 🙌 Limited seats. Drop your PRs in the comments to join the next sprint!

Replies 1 · Reposts 1 · Likes 6 · Views 3K