Dharmesh Kakadia
@dharmeshkakadia
Building https://t.co/VcaMs28aTa to give post-training superpowers to everyone. @mixtrainai Past @nuro @zoox @Microsoft @MSFTResearch

Breaking: we release SYNTH, a fully synthetic generalist dataset for pretraining, and two new SOTA reasoning models trained exclusively on it. Despite having seen only 200 billion tokens, Baguettotron is currently best-in-class in its size range.

Post-training is the way forward. For what it's worth, the original RAG paper actually fine-tunes the LLM. It's useful to post-train models to learn how to use search. If you are doing agentic RAG, you should figure out how to do RL there. When OpenAI wants to build a Deep Research agent, they post-train their models into these agents rather than prompting them. So I am yearning for a future where everyone can do this, not just the labs. And if you believe in the era of experience, it's really prescribing post-training (RL) as well.
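For a concrete picture of what "post-train the model to use search" can mean, here is a minimal sketch assuming Hugging Face TRL's GRPO trainer; the model choice, toy dataset, and format-only reward below are illustrative stand-ins, not anything from the post:

```python
# Sketch: RL post-training an LLM to emit search tool calls with TRL's GRPO.
# The model choice, dataset, and reward below are illustrative assumptions.
import re

from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Toy prompts where the right behavior is to call a search tool.
train_dataset = Dataset.from_dict({
    "prompt": [
        "Who won the 2024 Nobel Prize in Physics? Use <search>query</search> if unsure.",
        "What is the population of Jakarta today? Use <search>query</search> if unsure.",
    ]
})

def search_call_reward(completions, **kwargs):
    """Score 1.0 for completions that emit a well-formed <search>...</search> call.

    A real agentic-RAG reward would execute the query and score the final
    grounded answer; this stub only rewards the tool-call format.
    """
    pattern = re.compile(r"<search>.+?</search>", re.DOTALL)
    return [1.0 if pattern.search(c) else 0.0 for c in completions]

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # small model, cheap to experiment with
    reward_funcs=search_call_reward,
    args=GRPOConfig(output_dir="search-grpo", num_generations=4, logging_steps=1),
    train_dataset=train_dataset,
)
trainer.train()
```

A format-only reward like this is just the cheapest end-to-end starting point; the agentic-RAG version swaps it for one that actually runs the search and scores the final answer.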

So excited for Enfabrica x Nvidia. I've been an investor + advisor in Enfabrica for 2+ years. Nvidia's strategy on CX9 + NVSwitch, while ahead of the industry, could be better with a different approach to fabric boundaries + KV management. Jensen recognized this imo. cnbc.com/2025/09/18/nvi…

Shh, don't tell anyone, way too preliminary to be amplified: github.com/newton-physics… github.com/isaac-sim/Isaa…

1/ Been asked by 6+ funds and my partners about RFT and environment-creation startups. Here’s what I’m looking to invest in and why the first wave of startups here has gotten it wrong 👇

i'm increasingly convinced that "transformative ai" is going to look like an abundance of specialized models for everything from drug design to weather sims to robotics to supply chains, not one agent to rule them all. we're going to need a lot more ai researchers
