Rahul
@selfawareatom

Founding member and leading the foundation models team @sarvamai.

Joined August 2009
317 Following · 4.6K Followers
Rahul retweeted
Haocheng Xi @HaochengXiUCB
K-means is simple. Making it fast on GPUs isn't. That's why we built Flash-KMeans, an IO-aware implementation of exact k-means that rethinks the algorithm around modern GPU bottlenecks.

By attacking the memory bottlenecks directly, Flash-KMeans achieves a 30x speedup over cuML and a 200x speedup over FAISS, with the same exact algorithm, just engineered for today's hardware. At the million scale, Flash-KMeans can complete a k-means iteration in milliseconds.

A classic algorithm, redesigned for modern GPUs.

Paper: arxiv.org/abs/2603.09229
Code: github.com/svg-project/fl…
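The exact k-means step the tweet describes can be sketched with the classic expansion ||x − c||² = ||x||² − 2x·c + ||c||², which turns the distance table into one large matrix multiply, the same reformulation GPU implementations lean on to stay compute-bound. A minimal NumPy sketch of one exact iteration (an illustration of the standard algorithm, not the Flash-KMeans implementation itself):

```python
import numpy as np

def kmeans_iteration(x, centroids):
    """One exact k-means step: assign points, then recompute centroids.

    Squared distances use ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2, so the
    dominant cost is a single (n, d) @ (d, k) matmul.
    """
    x_sq = (x ** 2).sum(axis=1, keepdims=True)      # (n, 1)
    c_sq = (centroids ** 2).sum(axis=1)             # (k,)
    d2 = x_sq - 2.0 * x @ centroids.T + c_sq        # (n, k) squared distances
    labels = d2.argmin(axis=1)                      # nearest centroid per point

    new_centroids = centroids.copy()
    for j in range(centroids.shape[0]):
        members = x[labels == j]
        if len(members):                            # leave empty clusters as-is
            new_centroids[j] = members.mean(axis=0)
    return labels, new_centroids

# Two tight blobs; one iteration from nearby seeds recovers them.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
labels, cents = kmeans_iteration(pts, np.array([[0.0, 0.0], [5.0, 5.0]]))
```

The paper's contribution is in how the (n, k) distance tile is scheduled against GPU memory, not in the math above, which is unchanged.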
Rahul @selfawareatom
Finally got time to assemble this. What's the best thing to do with it, apart from playing Hogwarts Legacy?
Rahul @selfawareatom
@officialKrishD Beelink came with it, so I didn't bother changing it. With WSL and Docker, you can live with Windows these days.
Krish Dasgupta @officialKrishD
@selfawareatom Just curious: why Windows, though? What benefits have you seen running Windows so far?
Rahul @selfawareatom
@imkin Beelink GTi15 ultra + GeForce RTX 5060 Ti
Rahul @selfawareatom
@min_maxxer Thanks for letting me know!
-10xdev @min_maxxer
@selfawareatom This might be a bug: the model thinks even if the reasoning effort is None.
Rahul @selfawareatom
Now that our 15-member LLM team is infamous, time to expand for next time! If you have done one or more of the following, please reach out:
- pretrained a model of any size, from scratch
- posttrained any base model, end to end (data curation, SFT, RL)
- are a PyTorch wizard
- are a CUDA kernel master
- have any other relevant skills, and the work to back it up
firstnamesarvamai
Rahul @selfawareatom
@svembu Thank you, sir!
Sridhar Vembu @svembu
Sarvam's highly competitive AI models illustrate an important point: we must do catch-up R&D, however un-prestigious or thankless it feels, and as we start to catch up, innovative new ideas will emerge. Sarvam is on a great trajectory! This is why we quietly persist in all the efforts we do.
Quoting Pratyush Kumar @pratykumar: 📢 Open-sourcing the Sarvam 30B and 105B models! […]
Rahul @selfawareatom
The @vllm_project PR has been merged! You can now use the 30B and 105B models directly by installing vLLM's nightly wheel.
Rahul @selfawareatom
@rasbt Thanks for the shout-out, Sebastian. I've been following your work since mlxtend!
Sebastian Raschka @rasbt
While waiting for DeepSeek V4, we got two very strong open-weight LLMs from India yesterday. There are two size flavors, Sarvam 30B and Sarvam 105B (both reasoning models).

Interestingly, the smaller 30B model uses "classic" Grouped Query Attention (GQA), whereas the larger 105B variant switched to DeepSeek-style Multi-Head Latent Attention (MLA). As I wrote in my analyses before, both are popular attention variants to reduce KV cache size (the longer the context, the more you save compared to regular attention). MLA is more complicated to implement, but it can give you better modeling performance if we go by the ablation studies in the 2024 DeepSeek V2 paper (as far as I know, still the most recent apples-to-apples comparison).

Speaking of modeling performance, the 105B model is on par with LLMs of similar size: gpt-oss 120B and Qwen3-Next (80B). Sarvam is better on some tasks and worse on others, but roughly the same on average. It's not the strongest coder in SWE-Bench Verified terms, but it is surprisingly good at agentic reasoning and task completion (Tau2); there it's even better than DeepSeek R1 0528.

For the smaller Sarvam 30B, perhaps the most comparable model is Nemotron 3 Nano 30B, which is slightly ahead in coding per SWE-Bench Verified and in agentic reasoning (Tau2) but slightly worse in some other aspects (Live Code Bench v6, BrowseComp). Unfortunately, Qwen3-30B-A3B is missing from the benchmarks, which is, as far as I know, the most popular model of that size class. Interestingly, though, the Sarvam team compared their 30B model to Qwen3-30B-A3B in a computational performance analysis, where they found that Sarvam gets 20-40% more tokens/sec throughput than Qwen3 thanks to code and kernel optimizations.

Anyway, one thing not captured by the benchmarks above is Sarvam's good performance on Indian languages. Using a judge model, the Sarvam team found that their model is preferred 90% of the time over others on Indian texts. (Since they built and trained the tokenizer from scratch as well, Sarvam also comes with a 4x higher token efficiency on Indian languages.)
Quoting Pratyush Kumar @pratykumar: 📢 Open-sourcing the Sarvam 30B and 105B models! […]
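The KV-cache saving Raschka refers to is simple arithmetic: per sequence, the cache holds a K and a V tensor per layer, sized by the number of KV heads, so GQA shrinks it in direct proportion to the head reduction. A back-of-the-envelope calculator, with illustrative configuration numbers (not Sarvam's published config):

```python
def kv_cache_bytes(layers, tokens, kv_heads, head_dim, bytes_per_elem=2):
    """Per-sequence KV-cache size: 2 tensors (K and V) per layer,
    each of shape (tokens, kv_heads, head_dim), fp16 by default."""
    return 2 * layers * tokens * kv_heads * head_dim * bytes_per_elem

# Illustrative only: a 32-layer model, head_dim 128, 32k context, fp16.
mha = kv_cache_bytes(layers=32, tokens=32_768, kv_heads=32, head_dim=128)
gqa = kv_cache_bytes(layers=32, tokens=32_768, kv_heads=8, head_dim=128)
print(f"MHA: {mha / 2**30:.1f} GiB, GQA (8 kv heads): {gqa / 2**30:.1f} GiB")
# → MHA: 16.0 GiB, GQA (8 kv heads): 4.0 GiB
```

MLA goes further by caching a compressed latent per token instead of per-head K/V, which is why it wins at long context despite the extra implementation complexity.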
neural nets. @cneuralnetwork
Running on Modal now, on an L4.
Rahul retweeted
Pratyush Kumar @pratykumar
📢 Open-sourcing the Sarvam 30B and 105B models! Trained from scratch with all data, model research and inference optimisation done in-house, these models punch above their weight in most global benchmarks plus excel in Indian languages. Get the weights at Hugging Face and AIKosh. Thanks to the good folks at SGLang for day 0 support, vLLM support coming soon. Links, benchmark scores, examples, and more in our blog - sarvam.ai/blogs/sarvam-3…
Rahul @selfawareatom
3.. 2.. 1..