Saar Eliad

448 posts

@saareliad

AI sys Researcher & Tech lead | Computer Eng. | Computer Science | Father x3 🇮🇱

Israel · Joined August 2013
675 Following · 93 Followers
Saar Eliad reposted
Andrej Karpathy@karpathy·
Expectation: the age of the IDE is over. Reality: we’re going to need a bigger IDE (imo). It just looks very different because humans now move upwards and program at a higher level - the basic unit of interest is not one file but one agent. It’s still programming.
Andrej Karpathy@karpathy

@nummanali tmux grids are awesome, but i feel a need to have a proper "agent command center" IDE for teams of them, which I could maximize per monitor. E.g. I want to see/hide toggle them, see if any are idle, pop open related tools (e.g. terminal), stats (usage), etc.

Saar Eliad@saareliad·
I'm pretty familiar with Groq's technology and have led research in similar directions. Serving from SRAM is recognized as a frontier for achieving high Tokens/sec/user, becoming increasingly critical as more complex apps (agentic loops, test-time compute) emerge.
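One way to see why memory placement matters for tokens/sec/user: in single-user decoding, each generated token typically requires reading the model's active weights once, so memory bandwidth caps the decode rate. The sketch below computes only that ceiling. The function name and every number are hypothetical placeholders, not figures for Groq or any real device, and KV-cache traffic, compute time, and interconnect are ignored.

```python
def decode_tokens_per_second_ceiling(weight_bytes, bandwidth_bytes_per_s):
    """Upper bound on batch-1 decode rate when weight reads dominate.

    Assumes every token reads all active weights exactly once.
    """
    if weight_bytes <= 0:
        raise ValueError("weight_bytes must be positive")
    return bandwidth_bytes_per_s / weight_bytes


# Hypothetical placeholder values, chosen only to make the arithmetic readable.
WEIGHT_BYTES = 10 * 10**9          # 10 GB of active weights
OFF_CHIP_BANDWIDTH = 1 * 10**12    # 1 TB/s, an assumed off-chip memory system
ON_CHIP_BANDWIDTH = 50 * 10**12    # 50 TB/s, an assumed aggregate SRAM system

off_chip_ceiling = decode_tokens_per_second_ceiling(WEIGHT_BYTES, OFF_CHIP_BANDWIDTH)
on_chip_ceiling = decode_tokens_per_second_ceiling(WEIGHT_BYTES, ON_CHIP_BANDWIDTH)

print(f"off-chip ceiling: {off_chip_ceiling:.0f} tokens/s/user")
print(f"on-chip ceiling:  {on_chip_ceiling:.0f} tokens/s/user")
```

Under these assumptions the ceiling scales linearly with bandwidth, which is the intuition behind serving weights from faster memory.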
Saar Eliad@saareliad·
Nice. I would mention "improved inference systems" as an enabler behind many of the highlights. A wide combination of domains enables wrapping AI as a readily available product (with an affordable price, reasonable response times, and quality) behind our favorite tailored GUIs.
Andrej Karpathy@karpathy

x.com/i/article/2002…

Saar Eliad reposted
vLLM@vllm_project·
How does @deepseek_ai Sparse Attention (DSA) work? It has 2 components: the Lightning Indexer and Sparse Multi-Latent Attention (MLA). The indexer keeps a small key cache of 128 per token (vs. 512 for MLA) and scores incoming queries. The top-2048 tokens are then passed to Sparse MLA.
DeepSeek@deepseek_ai

🚀 Introducing DeepSeek-V3.2-Exp — our latest experimental model! ✨ Built on V3.1-Terminus, it debuts DeepSeek Sparse Attention(DSA) for faster, more efficient training & inference on long context. 👉 Now live on App, Web, and API. 💰 API prices cut by 50%+! 1/n
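A minimal sketch of the two-stage idea in the vLLM post above: a cheap indexer scores every cached token, and full attention runs only over the top-k tokens. It assumes a plain dot product as the indexer score, since the post does not specify the scoring function, and it uses toy sizes rather than the 128-per-token indexer cache and top-2048 selection quoted there. All function names are hypothetical and this is not vLLM or DeepSeek code.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k_indices(scores, k):
    """Return the indices of the k highest scores, in ascending position order."""
    k = min(k, len(scores))
    best = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(best)


def sparse_attention(query, index_query, index_keys, keys, values, k):
    """Two-stage attention: indexer scoring, then softmax over the top-k tokens."""
    # Stage 1: score every cached token using the small indexer keys
    # (dot product is an assumption made for this sketch).
    index_scores = [dot(index_query, ik) for ik in index_keys]
    selected = top_k_indices(index_scores, k)

    # Stage 2: scaled dot-product attention restricted to the selected tokens.
    scale = 1.0 / math.sqrt(len(query))
    logits = [dot(query, keys[i]) * scale for i in selected]
    peak = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    output = [
        sum(w * values[i][d] for w, i in zip(weights, selected)) / total
        for d in range(len(values[0]))
    ]
    return selected, output


# Toy sizes for illustration only.
rng = random.Random(0)
n_tokens, index_dim, main_dim, k = 64, 8, 16, 4


def rand_vec(dim):
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


index_keys = [rand_vec(index_dim) for _ in range(n_tokens)]
keys = [rand_vec(main_dim) for _ in range(n_tokens)]
values = [rand_vec(main_dim) for _ in range(n_tokens)]

selected, output = sparse_attention(
    rand_vec(main_dim), rand_vec(index_dim), index_keys, keys, values, k
)
print(f"attended to {len(selected)} of {n_tokens} tokens: {selected}")
```

The point of the split is that stage 1 touches every token but only through small vectors, while the more expensive stage 2 touches at most k tokens regardless of context length.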

Saar Eliad@saareliad·
People who took a swing at LLM disaggregation in 2023 probably invested all their savings in Nvidia during 2021-2022 (like I did). I guess that disaggregation will be key in mitigating Nvidia's dominance, and mass-tailored disaggregated AI systems will come.
Saar Eliad@saareliad·
Accompanying this are frameworks and literature on overlapping communication, computation, and memory in flexible pipeline parallelism for throughput (work I started in 2019).
Saar Eliad@saareliad·
A very nice article from SemiAnalysis: semianalysis.com/2025/09/10/ano… Nice to see disaggregated serving, and specialized hardware for it, arriving more than 2.5 years after my first PPT discussing disaggregated serving for LLMs.
Saar Eliad@saareliad·
I found that I can't join the X "Machine Learning" community, as I am blocked by the moderators. Guess who... I probably blocked him randomly as a practice of avoiding hate. Let's talk tech, not politics
Saar Eliad@saareliad·
Besides clear technical reasons, another reason might be public hate speech and antisemitic propaganda by HF organization leaders. I do not feel safe promoting this by referring to a model on HF, or a space on Gradio. Time has come for a change
Saar Eliad@saareliad·
PyTorch and vLLM are under the Linux Foundation. Looking forward to a "Hugging Face" replacement or similar, pushed forward by the LF. This will surely advance AI infrastructure and competition.
Saar Eliad@saareliad·
Deepwiki blows my mind
Saar Eliad reposted
Visual Studio Code@code·
Introducing GitHub Copilot agent mode (preview): the next evolution in AI-assisted coding. Copilot in agent mode is capable of running terminal commands, iterating on its own code to fix errors, and so much more. Available in VS Code Insiders today. Learn more: code.visualstudio.com/blogs/2025/02/…