Pratyush Choudhury (PC)

1K posts

@177pc

Activating AI in India | Past: @awscloud, @scaletogether | Previously backed/helped: @emergentlabs, @composio, @rocketdotnew, @thesysdev & more | Views my own

Join 7.8k+ founders & execs → · Joined June 2020
176 Following · 3.4K Followers
Pinned Tweet
Pratyush Choudhury (PC)
I like @deedydas's work, but this take misses context. Sarvam-M isn't a vanity fine-tune; it's India's first open-weights 24B Indic-centric LLM, built under brutal GPU and data scarcity. Judging it by a few hours of Hugging Face stats badly misses the point.

Most people outside India don't appreciate that compute is the invisible ceiling:
- H100 clusters are still not commercially stocked in India
- US export caps tightening next week will squeeze supply even further
- Indian teams literally queue for hours of A100/H100 time that US and Chinese labs get on tap

Data is the long-tail problem. Indic languages form <0.01% of CommonCrawl. You read that right: two orders of magnitude less than Chinese or Spanish. Any local lab must build its corpus first, then train. That's months of ETL before the first gradient step, and synthetic data generation is itself GPU-constrained.

The talent pipeline is still forming. HPC + RLHF + compiler-level optimisation is new ground in India; Sarvam's run has already up-skilled dozens of engineers who now know how to wrangle 10k GPU-hours, FP8 PTQ and GRPO reward engines. Their detailed blog post democratizes a lot of this learning. You can't AWS-credit your way to that muscle memory.

What Sarvam actually shipped:
- 3.7M high-diversity Indic prompts, deduped and quality-scored
- Two-phase non-think/think alignment that adds +2 pp on IndicGen
- GRPO RL with partial-credit rewards: LiveCodeBench jumps 0.23 → 0.44
- FP8 + look-ahead decoding: 2× tokens/s, half the $/M tokens on H100

That means a 🇮🇳-hosted midsize model now matches Gemma-3 27B and Llama-3.3 70B on Indic reasoning while costing a fraction to serve. That's engineering leverage, not hype. Model adoption is in any case a long tail: you need to ship multiple non-frontier models before you get to the one that's truly at the frontier (at least along the dimensions we care about).
Plus, there's a whole host of Indic-language use cases where this sovereign model works much better than any other open-weights model; look at the numbers (LiveCodeBench 0.23 → 0.44, 2× tokens/s). And if you ask for stats, you'll learn that their conversational AI platform reaches 50M+ people in a single week.

What's next, possibly?
- We all recognize the data problem and run nation-scale data-collection drives (something like a CommonCrawl-IN)
- Public RL-as-a-service clusters so smaller labs can replicate GRPO
- Devs who want to push Indic NLP forward can fork Sarvam-M, fine-tune on a domain corpus, benchmark on Indic-Eval, and contribute patches back. Each derivative model widens the knowledge base and closes the English-Indic gap.

In summary, celebrating Sarvam's work (I'm not an investor) isn't nationalism; it's recognizing an innovation feat under constraints. India can't out-GPU Mountain View today, but there's real technical merit on display here, regardless of the download metrics. 👏 @pratykumar, @AashaySachdeva, @HarveenChadha & other friends from @SarvamAI. Here's to more AI in 🇮🇳
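The GRPO-with-partial-credit setup described above can be sketched in a few lines. This is a toy illustration, not Sarvam's actual reward engine: the reward shape (credit for compiling, proportional credit for tests passed) and the group size are my own illustrative assumptions. The group-relative advantage, i.e. normalizing each sampled completion's reward against its own group's mean and standard deviation instead of training a value model, is the core GRPO idea.

```python
import statistics

def partial_credit_reward(tests_passed: int, tests_total: int, compiles: bool) -> float:
    """Hypothetical partial-credit reward for a code task: a small
    credit for compiling at all, the rest proportional to tests passed."""
    if not compiles:
        return 0.0
    return 0.1 + 0.9 * (tests_passed / tests_total)

def grpo_advantages(rewards: list[float]) -> list[float]:
    """GRPO's group-relative advantage: normalize each completion's
    reward against the mean/std of its own sampled group, removing
    the need for a separate learned value function."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero spread
    return [(r - mean) / std for r in rewards]

# Four completions sampled for one prompt (group size 4, assumed):
rewards = [partial_credit_reward(passed, 10, compiles)
           for passed, compiles in [(10, True), (4, True), (0, True), (0, False)]]
advantages = grpo_advantages(rewards)
```

Partial credit matters here because a pass/fail reward is almost always 0 early in training, so the group advantage collapses to zero and no learning signal flows; graded rewards keep the gradient alive.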
Deedy@deedydas

India's biggest AI startup, $1B Sarvam, just launched its flagship LLM. It's a 24B Mistral Small post-trained on Indic data, with a mere 23 downloads 2 days after launch. In contrast, two Korean college students trained an open-source model that did ~200k downloads last month. Embarrassing.

20 · 74 · 584 · 98K
Devansh Shah
Devansh Shah@theboyinatux·
@aakrit @177pc Who's building Scale AI for robotics? I think that's the real arbitrage, and I can connect you with robotics labs in Berkeley
1 · 0 · 0 · 46
Aakrit Vaish
Aakrit Vaish@aakrit·
In the 3 weeks since the IndiaAI Summit, @177pc and I have met 43 founding teams. Deeply technical, most in their mid-to-late 20s, and building for India or with a strong India edge. Some common themes we saw and like:
- AI-led IT Services: What does the Palantir for India look like? Reimagining Infosys/TCS/Wipro AI-first.
- Compute for India: As enterprise demand ramps up, the country needs more sovereign inference and infra capabilities.
- AI for the "real world": Purpose-built models for material sciences, biotech, manufacturing, security & defence.
- Healthcare AI: Both India's AI doctors and AI agents that transform primary care.
- Physical AI: More than just the end robots, can India provide the data infrastructure for the world? (Scale AI for robotics)
- Voice AI: Everything from foundational research to vertical agents to full-stack solutions in what will probably be the largest Voice AI market globally.
Indian AI startups will look different and have their own lane. We're just starting to see the first signs.
19 · 26 · 266 · 16.6K
Pratyush Choudhury (PC)
🇮🇳's AI arrives on the world stage, representing a critical milestone in establishing a sovereign AI stack for India
(1) @SarvamAI just open-sourced 2 MoE reasoning models - 30B & 105B - trained from scratch entirely in India on IndiaAI Mission compute
(2) It's a robust GRPO-based RL pipeline, validating that frontier-tier reasoning, programming & agentic capabilities can be indigenized
(3) BrowserComp is a sleeper hit for the 105B - a nearly 17x improvement over DeepSeek's R1 suggests their agentic RL pipeline (tool use, search integration) is genuinely differentiated
(4) The 30B model seems to be the true disruptor - combining Grouped Query Attention (GQA) w/ an ultra-efficient Indic tokenizer (yielding up to a 10x performance delta), it fundamentally alters inference economics for edge & real-time enterprise deployments in the subcontinent
(5) The 30B's benchmark numbers would have been frontier-class ~12 months ago, & the inference optimization story (3-6x throughput on H100) makes it a plausible production model for cost-sensitive Indian enterprise deployments
(6) The 105B model demonstrates exceptional depth in tool interaction & environment reasoning. The RL pipeline's use of an asynchronous GRPO architecture (notably bypassing standard KL-divergence constraints against a reference model) explicitly rewards verifiable multi-step execution over mere conversational chattiness.
(7) The full-stack inference optimization, achieving 20-40% higher token throughput via custom-shaped MLA optimizations and vocabulary parallelism, creates stickiness at the infrastructure layer that pure model-builders lack.
(8) If Sarvam 30B becomes the default Indic voice/conversational model (which the inference economics support), it creates a meaningful wedge in the Indian BFSI conversational AI market. The 2.4B active parameter count at this quality level is a structural cost advantage vs. deploying GPT/Claude for Hindi/Tamil telephonic agents.
(9) I see the Indic tokenizer + inference optimization stack as a compounding advantage. Every other model serving Indian languages pays a "tax" in token inefficiency and latency, and this compounds across millions of API calls.
(10) There are a couple of areas where I'd like to see improvements, though:
(10.1) SWE-Bench is the elephant in the room. For a model positioned around agentic workflows, the ~20-point gap on real-world software engineering tasks is material. It signals that while the model can reason well in structured settings, it struggles w/ the messy, multi-file, context-heavy nature of real codebases
(10.2) In an era where vision-language is table stakes, both models are text-only. They acknowledge this - mentioning future models for "multimodal conversational tasks" - but it's a gap today.
(11) For Sarvam as a company, this is a credibility-establishing release. @pratykumar & co have demonstrated they can train competitive models from scratch - a very short list globally. The question is whether the model business itself captures value, or whether Sarvam's value creation is upstream (Samvaad platform, enterprise deployments) using these models as proprietary infrastructure.
(12) If I had a say, I'd suggest a couple of things:
(12.1) Offer the 30B model completely free (including localized inference hosting) to Indian telcos & financial institutions for edge deployment, explicitly in exchange for federated access to their anonymized customer interaction data. This would create an insurmountable, proprietary data moat for future RLHF.
(12.2) Aggressively commercialize the "romanized colloquial" capability into a proprietary API for WhatsApp/Telegram business layers. Indian commerce runs on WhatsApp in code-mixed "Hinglish" or "Tanglish" - dominating this exact syntactic niche captures the entire B2C transactional layer.
(12.3) Voice AI vertical integration - combining the 30B w/ their existing TTS/STT APIs into an end-to-end voice agent stack purpose-built for Indian BFSI could be a very high-ROI product move.
Regardless, this is the most credible "sovereign AI" release from India to date - long AI in India.
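The "token tax" argument in (9) is really just linear arithmetic: serving cost and latency scale with token count, so a tokenizer that segments Indic scripts efficiently compounds into real money at API-call volume. A toy sketch, where every number (tokens-per-word fertility, call volume, price) is an assumption chosen purely for illustration, not a measured Sarvam or competitor figure:

```python
def monthly_cost(calls: int, words_per_call: float,
                 tokens_per_word: float, usd_per_m_tokens: float) -> float:
    """Cost scales linearly with total tokens, so tokenizer fertility
    (tokens emitted per word) multiplies straight through the bill."""
    total_tokens = calls * words_per_call * tokens_per_word
    return total_tokens / 1e6 * usd_per_m_tokens

# Assumed numbers for illustration: 10M calls/month, ~200 words each,
# at $0.50 per million tokens.
generic_bpe = monthly_cost(10_000_000, 200, 6.0, 0.50)  # English-centric BPE on Hindi text
indic_tok   = monthly_cost(10_000_000, 200, 1.5, 0.50)  # Indic-optimized tokenizer
```

Under these assumed fertilities the Indic tokenizer cuts the bill 4x at identical pricing, and the same factor applies to time-to-first-token and context-window headroom, which is why the advantage compounds rather than being a one-off discount.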
Pratyush Kumar@pratykumar

📢 Open-sourcing the Sarvam 30B and 105B models! Trained from scratch with all data, model research and inference optimisation done in-house, these models punch above their weight in most global benchmarks plus excel in Indian languages. Get the weights at Hugging Face and AIKosh. Thanks to the good folks at SGLang for day 0 support, vLLM support coming soon. Links, benchmark scores, examples, and more in our blog - sarvam.ai/blogs/sarvam-3…

3 · 23 · 139 · 8.8K
Pratyush Choudhury (PC) retweeted
Activate
Activate@ActivateSignal·
Signal Dialogues #01 is live. We built Signal because there wasn't a clear, consistent voice telling the story of AI in India - across founders, capital, research, and policy. For our first episode, @aakrit sits down with @vkhosla and @mukundjha to go over the rise of Emergent - probably the fastest-growing software company in history to hit $100M ARR. Shot on the sidelines of the @OfficialINDIAAI summit at the iconic @iitdelhi.
Timestamps:
00:00 - Intro: Fastest Growing Software Company Ever?
07:28 - Mukund's Journey: Google → Dunzo → Emergent
22:23 - You're Limited by What You Think You Can Do
27:08 - The $3M Bet That Built the Internet
47:11 - $100B AI Company From India?
1 · 21 · 97 · 83.6K
Pratyush Choudhury (PC)
.@aakrit & I started Activate ~12 months ago w/ a conviction that India will be both a top consumer and a top builder of AI. Today, on the sidelines of the AI Summit, NVIDIA made that conviction official. NVIDIA and Activate are now exclusive partners, and this unlocks 3 things for founders/startups:
(1) Activate startups will have direct access to NVIDIA technical expertise for co-building support and CUDA platform integrations, open-source models like Nemotron, tools, libraries and SDKs.
(2) They will get early access to new NVIDIA products, features and releases as fits their use case, to accelerate product development.
(3) NVIDIA and Activate will jointly identify high-potential startups to invest in & support, & Activate will join the recently announced VC Alliance to work not only with NVIDIA Inception but also with NVentures for potential investments.
Excited to be working with Tobias, Unnikrishnan & the entire NVIDIA leadership for betting on what India's AI ecosystem can become, not just what it is today. And to every founder building at the frontier: the runway just got longer and the ceiling just got higher. We're just getting started 💪 techcrunch.com/2026/02/19/nvi…
8 · 3 · 84 · 5.4K
Pratyush Choudhury (PC)
Sarvam just built India's first full-stack sovereign AI stack that actually works for 1.4 billion people.
Hype-chasers will miss this, but Sarvam built something special yesterday at the IndiaAI Summit. Unfortunately, a lot of the discourse continues to miss the real story. Let's take stock of the state of the union first:
(1) Most people outside India (and even many inside) don't fully appreciate the invisible ceilings we operate under here. H100 and Blackwell clusters are still not commercially stocked at a meaningful scale.
(2) US export caps haven't helped, squeezing supply even further.
(3) Indian teams literally queue for hours - sometimes days - of A100/H100/Blackwell time that US and Chinese labs get on tap.
(4) The IndiaAI Mission provides shared compute, but it comes with strict allocation queues and governance.
(5) Data is the even harder long-tail nightmare. Indic languages plus heavy code-mixing across 22 scheduled tongues form a tiny fraction of global corpora. You can't simply scrape your way to high-quality pretraining data the way English-centric labs do.
(6) Any serious local team must first build its own corpus - months of curation, cleaning, deduplication, and synthetic generation - before the very first gradient step.
(7) The talent pipeline for HPC-scale MoE training, edge optimisation, and state-space architectures is still forming.
Despite all of this, the entire effort was pulled off w/ a core team of just 15 engineers & a meager ~4k GPUs - a REAL feat. They shipped India's first credible sovereign full stack in one coordinated go. Let's take a look at what Sarvam actually built:
(1) A 30B MoE model trained from scratch on 16T pure Indic tokens, 32k context length, ~1B active parameters per token - purpose-engineered for real-time voice conversations and agentic loops that feel completely native in Hinglish or any regional tongue.
(2) A 105B MoE model (128k context, ~9B active parameters) reaching GLM-4.5-Air-class performance on complex reasoning and long-form tasks - the practical walk-phase semi-frontier model that punches far above its headline size.
(3) A 3B state-space Vision model that sets new SOTA on Indic OCR, tables, charts, and even historic Devanagari manuscripts - linear scaling that lets it handle 50-page mixed-language documents where transformers would choke on memory.
(4) Sub-350MB edge models that finally make everything truly offline and population-scale: a 74M Saaras STT with automatic language ID running 8.5× real-time on Snapdragon 8 Gen 3 (TTFT under 300 ms), a 24M Bulbul TTS with natural voice cloning from just one hour of audio inside a 60MB footprint, and a 150M bidirectional translation model covering 110 language pairs across 10 Indic languages + English with zero English pivot.
Smart choices everywhere that scream first-principles engineering. They chose a proven high-sparsity MoE backbone, layered Multi-head Latent Attention (MLA) for massive KV-cache compression wins, & partnered with NVIDIA's Nemotron co-design for both training stability (MoE reinforcement learning is notoriously unstable) & 4× inference throughput on Blackwell. This is real pretraining plus RL, solved under constraints that would make most global teams blink. The 105B isn't 1T-parameter fireworks, but it is the walk-phase model that actually lands on ₹8k feature phones & smart glasses. That is exactly how you reach semi-frontier capability in 2026 w/o burning years on wheel reinvention.
Model adoption is always long-tail. You need to ship multiple non-frontier pieces until the one that truly owns the dimensions we care about arrives.
Sarvam just handed every Indian founder, builder, SME & policymaker a stack that actually works: for farmers checking fertiliser prices in their dialect, street vendors negotiating deals in Hinglish, government departments processing 22-language documents & forms w/o any cloud round-trips, and millions more everyday vernacular scenarios.
This isn't hype. This isn't nationalism. It's recognising a genuine engineering feat under constraints that most of the world never has to face - compute scarcity, data fragmentation, a talent pipeline still maturing. A cracked team of engineers gave it their all over the past several weeks to do what many doubted could be done in/from India: build usefully large, globally competitive models from scratch. India's own AI moment is arriving, & everything this amazing team has done tells us, "Yes, India can & India will" 👏
@SarvamAI, @pratykumar, @vivek_raghavan, @_mohit_singla, @anand_404, @kediaharshit9, @AashaySachdeva, @sumanthd17, @ArpitDwivedi100, @HarveenChadha, @rkal4, @sushil_khyalia, @ManavSinghal157, @sohampetkar, @selfawareatom, @AnnaUpreti, @MeghMakwan33973 & the rest of the team
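The economic punchline of the thread above - a 30B MoE with ~1B active parameters per token - is easy to make concrete with the standard back-of-envelope rule that a decoder forward pass costs roughly 2 FLOPs per *active* parameter per generated token. This is a rough sketch (the rule of thumb ignores attention's sequence-length term, and the dense-70B comparison point is my own choice for scale):

```python
def fwd_flops_per_token(active_params: float) -> float:
    """Rough rule of thumb: ~2 FLOPs per active parameter per token
    for a decoder forward pass (ignores the attention term, which
    grows with sequence length)."""
    return 2.0 * active_params

dense_70b = fwd_flops_per_token(70e9)  # a dense 70B model: all weights fire every token
moe_30b   = fwd_flops_per_token(1e9)   # 30B MoE w/ ~1B active params, per the thread
ratio = dense_70b / moe_30b            # per-token compute advantage of the sparse model
```

Under this rule of thumb the sparse 30B does roughly 70x less arithmetic per generated token than a dense 70B, which is the structural reason MoE models can hit real-time voice latency budgets on modest hardware, provided the full 30B of weights still fits in memory.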
9 · 96 · 402 · 16.5K
Pratyush Choudhury (PC)
Isn't the AC analogy directionally right but structurally different, @GavinSBaker? AC demand is continuous, passive & tied to physical climate, while inference demand is spiky, parallelizable, & subject to algorithmic efficiency gains (distillation, quantization, speculative decoding). The ceiling could be even higher than AC - or efficiency gains could compress it. The key question is whether demand elasticity outpaces efficiency gains. Historically, in computing, it always has (Jevons paradox).
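The elasticity-vs-efficiency question above can be made precise with a toy constant-elasticity model (my own simplification, not from the thread): if efficiency improves k-fold, per-query cost falls by k, demand responds as k to the power of the price elasticity, so total compute scales as k**(elasticity - 1). Jevons' paradox is exactly the elasticity > 1 regime.

```python
def total_compute(base_compute: float, efficiency_gain: float,
                  price_elasticity: float) -> float:
    """Toy constant-elasticity model: a k-fold efficiency gain cuts
    per-query cost by k while demand grows by k**elasticity, so net
    compute scales as k**(elasticity - 1). It *rises* iff elasticity > 1."""
    return base_compute * efficiency_gain ** (price_elasticity - 1)

# A 10x efficiency gain on a baseline of 100 units of compute:
inelastic = total_compute(100.0, 10.0, 0.5)  # demand barely responds: usage shrinks
elastic   = total_compute(100.0, 10.0, 1.5)  # demand responds strongly: usage grows
```

So the AC-vs-inference debate reduces to an empirical question about that single exponent, and the historical pattern in computing (the Jevons observation in the tweet) is that it has sat above 1.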
0 · 0 · 3 · 913
Zaid
Zaid@zaidmukaddam·
Can anyone recommend good India-based VCs or investors?
26 · 3 · 86 · 12.7K
Pratyush Choudhury (PC) retweeted
Aakrit Vaish
Aakrit Vaish@aakrit·
Packed house at the first AI Engineers Day in Bengaluru. Amazing set of technical builders from all over the country.
3 · 7 · 176 · 10K
Pratyush Choudhury (PC) retweeted
Aakrit Vaish
Aakrit Vaish@aakrit·
Doors open today. Welcome to our new home in Bengaluru :)
130 · 23 · 1.5K · 99.7K
Pratyush Choudhury (PC)
Spot on, @karpathy. It feels like the transition from hand-crafting furniture to directing a workshop of master apprentices. The emerging superpower isn't writing pristine code but curating a codebase: knowing when to accept, refactor, or reject an AI output w/ taste + foresight. And the best engineers in 2026 might be those who maintain the lightest touch - guiding agents toward simplicity & elegance, catching the subtle drifts toward over-engineering before they compound. Excited (and a bit nervous) to see how this changes what we value in "great" code.
0 · 0 · 2 · 288
Andrej Karpathy
Andrej Karpathy@karpathy·
A few random notes from claude coding quite a bit last few weeks.

Coding workflow. Given the latest lift in LLM coding capability, like many others I rapidly went from about 80% manual+autocomplete coding and 20% agents in November to 80% agent coding and 20% edits+touchups in December. i.e. I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit but the power to operate over software in large "code actions" is just too net useful, especially once you adapt to it, configure it, learn to use it, and wrap your head around what it can and cannot do. This is easily the biggest change to my basic coding workflow in ~2 decades of programming and it happened over the course of a few weeks. I'd expect something similar to be happening to well into double digit percent of engineers out there, while the awareness of it in the general population feels well into low single digit percent.

IDEs/agent swarms/fallibility. Both the "no need for IDE anymore" hype and the "agent swarm" hype is imo too much for right now. The models definitely still make mistakes and if you have any code you actually care about I would watch them like a hawk, in a nice large IDE on the side. The mistakes have changed a lot - they are not simple syntax errors anymore, they are subtle conceptual errors that a slightly sloppy, hasty junior dev might make. The most common category is that the models make wrong assumptions on your behalf and just run along with them without checking. They also don't manage their confusion, they don't seek clarifications, they don't surface inconsistencies, they don't present tradeoffs, they don't push back when they should, and they are still a little too sycophantic. Things get better in plan mode, but there is some need for a lightweight inline plan mode. They also really like to overcomplicate code and APIs, they bloat abstractions, they don't clean up dead code after themselves, etc. They will implement an inefficient, bloated, brittle construction over 1000 lines of code and it's up to you to be like "umm couldn't you just do this instead?" and they will be like "of course!" and immediately cut it down to 100 lines. They still sometimes change/remove comments and code they don't like or don't sufficiently understand as side effects, even if it is orthogonal to the task at hand. All of this happens despite a few simple attempts to fix it via instructions in CLAUDE.md. Despite all these issues, it is still a net huge improvement and it's very difficult to imagine going back to manual coding. TLDR everyone has their developing flow; my current is a small few CC sessions on the left in ghostty windows/tabs and an IDE on the right for viewing the code + manual edits.

Tenacity. It's so interesting to watch an agent relentlessly work at something. They never get tired, they never get demoralized, they just keep going and trying things where a person would have given up long ago to fight another day. It's a "feel the AGI" moment to watch it struggle with something for a long time just to come out victorious 30 minutes later. You realize that stamina is a core bottleneck to work and that with LLMs in hand it has been dramatically increased.

Speedups. It's not clear how to measure the "speedup" of LLM assistance. Certainly I feel net way faster at what I was going to do, but the main effect is that I do a lot more than I was going to do because 1) I can code up all kinds of things that just wouldn't have been worth coding before and 2) I can approach code that I couldn't work on before because of knowledge/skill issue. So certainly it's a speedup, but it's possibly a lot more an expansion.

Leverage. LLMs are exceptionally good at looping until they meet specific goals and this is where most of the "feel the AGI" magic is to be found. Don't tell it what to do, give it success criteria and watch it go. Get it to write tests first and then pass them. Put it in the loop with a browser MCP. Write the naive algorithm that is very likely correct first, then ask it to optimize it while preserving correctness. Change your approach from imperative to declarative to get the agents looping longer and gain leverage.

Fun. I didn't anticipate that with agents programming feels *more* fun because a lot of the fill-in-the-blanks drudgery is removed and what remains is the creative part. I also feel less blocked/stuck (which is not fun) and I experience a lot more courage because there's almost always a way to work hand in hand with it to make some positive progress. I have seen the opposite sentiment from other people too; LLM coding will split up engineers based on those who primarily liked coding and those who primarily liked building.

Atrophy. I've already noticed that I am slowly starting to atrophy my ability to write code manually. Generation (writing code) and discrimination (reading code) are different capabilities in the brain. Largely due to all the little mostly syntactic details involved in programming, you can review code just fine even if you struggle to write it.

Slopacolypse. I am bracing for 2026 as the year of the slopacolypse across all of github, substack, arxiv, X/instagram, and generally all digital media. We're also going to see a lot more AI hype productivity theater (is that even possible?), on the side of actual, real improvements.

Questions. A few of the questions on my mind:
- What happens to the "10X engineer" - the ratio of productivity between the mean and the max engineer? It's quite possible that this grows *a lot*.
- Armed with LLMs, do generalists increasingly outperform specialists? LLMs are a lot better at fill in the blanks (the micro) than grand strategy (the macro).
- What does LLM coding feel like in the future? Is it like playing StarCraft? Playing Factorio? Playing music?
- How much of society is bottlenecked by digital knowledge work?

TLDR Where does this leave us? LLM agent capabilities (Claude & Codex especially) have crossed some kind of threshold of coherence around December 2025 and caused a phase shift in software engineering and closely related fields. The intelligence part suddenly feels quite a bit ahead of all the rest of it - integrations (tools, knowledge), the necessity for new organizational workflows, processes, diffusion more generally. 2026 is going to be a high energy year as the industry metabolizes the new capability.
1.6K · 5.4K · 39.4K · 7.6M
Pratyush Choudhury (PC)
How can India build defining AI companies? What kind of founders will build those & why do we love to meet them even before they're ready? What are we doing to solve this? @aakrit & I sit down w/ @waitin4agi_ to peel back the layers in the latest edition of @ActivateSignal
1 · 1 · 10 · 636
Pratyush Choudhury (PC) retweeted
Aakrit Vaish
Aakrit Vaish@aakrit·
Why Activate? What's the thesis? And how is it designed to be more than just an AI fund? @177pc & I sat down with @waitin4agi_ to unpack all of this, why we work with founders at the idea stage, how talent & research matters more than capital, and what it will take for India to build globally relevant AI companies. @ActivateSignal
2 · 2 · 33 · 3.4K
Pratyush Choudhury (PC)
India's Voice AI market is fundamentally different from the rest of the world for 2 main reasons:
(1) Voice is the primary interface for 500M+ users in India, while it's a premium/escalation channel in developed markets
(2) Given that, there's price sensitivity but high volume
Excited to deep-dive w/ @krandiash on how @cartesia views the opportunity. RSVP: luma.com/xwbhew36 (limited slots) W/ @aakrit, @ActivateSignal, @mumbai_tech_
2 · 1 · 35 · 2.4K
Pratyush Choudhury (PC)
Clearly, in practice, these AI agents struggle w/ reliability on the real, messy internet: diverse website designs, anti-bot measures, dynamic content, hallucinations (making up info), failing multi-step workflows, or violating terms of service via scraping/automation. Google launching a new protocol that requires retailers to actively implement it is effectively an admission that agents aren't good enough yet to handle shopping-like tasks autonomously on the existing web.
1 · 3 · 30 · 12K