Ravikant Dewangan

3.2K posts

@ronitkd

Seattle, WA · Joined October 2009

466 Following · 211 Followers

Pinned Tweet

Ravikant Dewangan@ronitkd·Feb 6

Last weekend I hosted Greg Glassman at my gym for a 2-day @MetFixByBSI seminar. What I learned broke my brain. I went down a rabbit hole I couldn't escape and when I finally came up for air, I felt like I'd been unplugged from the Matrix. Now I'm building a course to teach what I learned. 🧵👇

Ravikant Dewangan@ronitkd·13m

@tunguz Old hardware running new models is the most underrated path. Pulled a 2018 box off the shelf last month, dropped qwen on it, now it handles overnight batch jobs while my main rig codes. Cheapest agent infra I have built.

Bojan Tunguz@tunguz·10h

After seeing these tweets, I decided to try it out on my own old Ubuntu computer with a GTX 1070 GPU (the one that I just upgraded from 16.04 all the way to 24.04 the other day). Asked Codex on my Mac to connect to that machine and install and test qwen3 8b. So far really impressive - running 30 tok/s!

Sudo su@sudoingX

ok this is wild. 10 year old gtx 1080 8gb pascal card running qwen3 8b locally at 18-20 tok/s via hermes agent and it's actually doing the thing. asked it to build a wireworld cellular automata simulator with 10 tests. autonomous run, no hand holding. expected it to fail on the tool calls. that's not what happened. write_file works. browser_navigate works. terminal commands work. file ops, package installs, version probes, environment setup. agent is firing tool calls cleanly and the model is reasoning about next steps at 18-20 tok/s. on hardware that pre dates "agentic" as a word. it even hit an npm install fail because node 12 is too old. didn't crash. didn't ask me. just started bootstrapping nvm on its own to fix the environment. 10 minutes in. 40% context used. 7.5gb of 8gb vram occupied. still going. i did not think this would work on this hardware. this is the most i've been wrong this month.

Ravikant Dewangan@ronitkd·14m

@mckbrando Cross-modal handoff is the unlock. Built a coaching ops agent last month that reads athlete PDFs, queries Postgres history, jumps to terminal to push programs, then back to browser to schedule. Pure click loops fall apart on anything real.

Brandon McKinzie@mckbrando·5h

this is the way. cua isn't just clicking, it's everything

Neal Chopra@nealchopra

Many people imagine computer use as just clicking. In practice, the best systems are highly cross-modal. The agent drops into a terminal to parse a dense email archive or pull an exact figure out of a PDF, then switches back to the mouse and keyboard to do the work inside the actual application. Left alone, models over-rotate on code, because it's fast and heavily rewarded in training. But a spreadsheet built by a Python library often looks right without being right. So we pushed the executor toward the GUI and let it reach for code only where it genuinely helps. What matters isn't the method but choosing whatever gets the task done. 🧵 5/7

Ravikant Dewangan@ronitkd·16m

@tszzl Token use tracks the ceiling of what you tried. I burn 10M a week now and ship 4x what I shipped a year ago. The metric is undercounted because most builders still write code like it is 2024.

roon@tszzl·38m

token use gets too much hate as a metric - in times of technological transition people's default will be to underuse and underestimate the new tech. “steam power used” would have been a good KPI for pre industrial civilization just as kardashev scaling remains for ours

Ravikant Dewangan@ronitkd·7h

The hollow position is the most skipped gymnastics fundamental in CrossFit. A 30-second active hold separates athletes who can kip safely from those who break. We screen for it at Persistence Athletics before greenlighting kipping work. Most pull-up shoulder issues trace back to weak hollow control.

Ravikant Dewangan@ronitkd·7h

Watching builders chase context length numbers. 1M, 2M, 10M tokens. The real bottleneck isn't context window size, it's retrieval relevance. 200K of useful context beats 2M of noise every time. Most evals never measure this.

Ravikant Dewangan@ronitkd·7h

@comradCornelius @AdamKellyFA @vandy_62 S&C catches heat for injuries the position coach and training staff actually own. Pitching volume, mechanics, sleep, off-season programming, split across 4 to 5 people with no integrated plan. CrossFit drills aren't the problem. Lack of a single accountable owner is.

Comrade Cornelius@comradCornelius·12h

@AdamKellyFA @vandy_62 Why did so many pitchers get injured? Is it the S&C coach's job to mitigate that? Does Corbin have a say in who that coach is? Does that coach basically run them through CrossFit drills all year?

Vandy Sixty Two (62)@vandy_62·1d

Tim Corbin has no one to blame but himself for missing the NCAA Tournament. It all came down to pitching:
12th in the SEC in ERA with a 5.23
2nd in doubles given up
2nd in triples given up
Missed on portal additions
Relied too much on freshman pitchers

Ravikant Dewangan@ronitkd·7h

@bindureddy Long-running complex tasks are where eval discipline shows. Most builders never set baselines so they can't tell if a new model is better or just newer. I run the same agent loop with 3 different models per change. The wins are usually obvious only in retrospect.
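The baseline habit described above can be sketched roughly as follows. This is a minimal illustration, not the author's harness: `pass_rates`, `deltas_vs_baseline`, and the idea of a model as a callable returning pass/fail per task are all assumptions made for the example.

```python
from typing import Callable, Iterable

# Hypothetical stand-in: each "model" is a callable that runs the same agent
# loop on one task and reports whether the task passed.
ModelRunner = Callable[[str], bool]


def pass_rates(models: dict[str, ModelRunner], tasks: Iterable[str]) -> dict[str, float]:
    """Run every model over the same fixed task list and return its pass rate."""
    task_list = list(tasks)
    if not task_list:
        raise ValueError("need at least one task to compute a pass rate")
    return {
        name: sum(1 for task in task_list if run(task)) / len(task_list)
        for name, run in models.items()
    }


def deltas_vs_baseline(rates: dict[str, float], baseline: str) -> dict[str, float]:
    """Express each candidate's pass rate as a difference from the baseline."""
    base = rates[baseline]
    return {name: rate - base for name, rate in rates.items() if name != baseline}
```

Holding the task list fixed across models is the point: without a recorded baseline, a new model's result has nothing to be compared against.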

Bindu Reddy@bindureddy·10h

GPT 5.5 xHigh is exceptionally good at very long running complex tasks You can create extremely complex apps with a single prompt It's strangely underrated given how good the model is....

Ravikant Dewangan@ronitkd·7h

@PeterDiamandis Running 3 to 5 agents in parallel for coding is my practical limit. Beyond that, context coordination overhead kills the gains. 1000 concurrent agents needs a managerial layer most builders don't have wired up yet.

Peter H. Diamandis, MD@PeterDiamandis·10h

Power users will soon want 1,000 concurrent agents. Engineers, architects, designers... all orchestrating swarms. Compare that demand to the current supply. It's peanuts.

Ravikant Dewangan@ronitkd·12h

@npew Native settings UI is the unsexy boss fight for computer use. macOS hides toggles three submenus deep, names change every release. Two clicks vs seven is real IT helpdesk savings. The browser is the easy demo. The OS is where the value lives.

Peter Welinder@npew·15h

Using Codex Computer Use to navigate the Settings app is incredible. It knows MacOS so much better than me. Similar experience to it using the command line. Superhuman.

Ravikant Dewangan@ronitkd·12h

@serenaa_ge DeepSWE is useful but still a snapshot. The signal that matters is what happens at run 87 of an unattended loop: read a stack trace, edit the file, rerun the test, six tools chained without supervision. Harness quality is the moat, not the score.

Serena Ge (Datacurve)@serenaa_ge·17h

Today we’re releasing DeepSWE, a new standard for agentic coding benchmarks. On public leaderboards, top models often look relatively close in capability. DeepSWE shows where they actually diverge, reflecting the realistic experience of developers in their day-to-day work.

Ravikant Dewangan@ronitkd·12h

@kimmonismus Ramp's enterprise data: Anthropic 34.4%, OpenAI 32.3%. Gemini does not register in serious paid-seat tracking. Default app swaps inflate the topline. Distribution is real, but it is one OS setting away from being someone else's.

Chubby♨️@kimmonismus·18h

I'm not sure if Google is winning the AI race. However, I think they're winning the AI distribution race, which is a different thing.

900M Gemini users is impressive on a slide. But a huge chunk of that is Android users who got a default app swap and Search users who got AI Overviews without opting in. But that doesn't mean it's a bad thing.

9.7 trillion tokens/month two years ago. 480 trillion last year. 3.2 quadrillion now. That's a 7x jump in twelve months. To keep that going, Google plans to spend $190 billion on infrastructure this year.

OpenAI has been trying to reach the 1b user milestone for some time now. For Google, on the other hand, it's a simpler game. Why? With billions of Android devices, and combined with Google and its AI mode, they have the ability to introduce everyone to AI, specifically Gemini, for free.

How do they do it? TPUs! Google not only laid the foundation for modern LLMs with their 2017 paper "Attention is all you need," but also made a far-sighted decision back in 2012 to invest in TPUs - their own in-house chips that are particularly well-suited for machine learning tasks. Now in its eighth iteration, they even have two chips: one particularly good for inference, and one particularly good for training. This makes them more independent.

Furthermore, they have a solid foundation that generates strong revenue and good profits, allowing them to subsidize AI usage for free, and without ads, unlike OpenAI (this is not a judgment, just a statement of fact).

Therefore, Google has a very good chance of winning the game thanks to this outstanding starting position and free distribution. But to be fair: the game *is* far *from over*. However, the starting position is outstanding for Google.

Image: The Economist article

Ravikant Dewangan@ronitkd·17h

Two-stage VLM content scoring. Cheap model gates which frames the expensive one looks at. 12 cents per clip vs 90 cents running GPT-4o on every frame. Same accuracy on the cohort. Eval harness took longer to wire than the model. Keeps happening.
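The gating idea in that post can be sketched as below. This is only an illustration under stated assumptions: `two_stage_score`, the scorer callables, the threshold, and the per-frame costs are hypothetical placeholders, not the author's pipeline or any real model API.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, TypeVar

Frame = TypeVar("Frame")


@dataclass
class GateResult:
    scores: dict[int, float]  # frame index -> score from the expensive model
    cost: float               # total spend, in whatever unit the costs use


def two_stage_score(
    frames: Sequence[Frame],
    cheap_score: Callable[[Frame], float],      # hypothetical cheap gate model
    expensive_score: Callable[[Frame], float],  # hypothetical expensive model
    threshold: float,
    cheap_cost: float,
    expensive_cost: float,
) -> GateResult:
    """Score every frame with the cheap model; send only frames at or above
    the threshold to the expensive model, and track the total cost."""
    scores: dict[int, float] = {}
    cost = 0.0
    for index, frame in enumerate(frames):
        cost += cheap_cost  # the cheap gate runs on every frame
        if cheap_score(frame) >= threshold:
            scores[index] = expensive_score(frame)
            cost += expensive_cost  # the expensive model runs only on gated frames
    return GateResult(scores=scores, cost=cost)
```

The savings depend on how many frames the gate rejects, and the accuracy claim depends on the gate not rejecting frames the expensive model would have scored highly, which is why an eval harness on a fixed cohort matters here.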

Ravikant Dewangan@ronitkd·20h

@HarryStebbings I don't see the market going in any other direction! This is true even for small businesses!

Harry Stebbings@HarryStebbings·21h

I just interviewed a CEO who said three things that blew my mind: 1. We replaced our $600K Salesforce contract with a vibe-coded CRM, built within 3 weeks. 2. We will get rid of 80% of the SaaS we use internally. 3. If Anthropic doubled pricing, we would not change usage in any way.

Ravikant Dewangan@ronitkd·1d

@emollick Builders are running their own n=1 experiments. The data lives in private commit graphs and Stripe dashboards. I ship about 3x what I did in early 2025 on EDSO and MetFix. Most of the signal never enters formal research channels.

Ethan Mollick@emollick·1d

We have, as far as I can tell, no good tests of the productivity impact of the autonomous coding tools that appeared starting in December 2025. Every paper out there is from prior to the Claude Code/Codex revolution. A huge gap in our knowledge about what is happening in coding.

Ravikant Dewangan@ronitkd·1d

@karrisaarinen Solo builders are absorbing it. I shipped EDSO and MetFix in months; each would have taken me a year before. Big orgs still carry the same approval, review, and meeting overhead, so the gains get eaten by the system around the code.

Karri Saarinen@karrisaarinen·1d

We keep hearing about 10x or 100x productivity gains in engineering and knowledge work. But outside the model labs, I haven’t seen the corresponding 10-100x revenue growth across the market or increase in quality. So where is the productivity going?

Ravikant Dewangan@ronitkd·1d

@lennysan @danshipper Point 1 is real. Last six months I've been shipping EDSO and MetFix from inside Claude Code, not my IDE. The agent holding context across sessions is what changed the loop for me.

Lenny Rachitsky@lennysan·1d

My biggest takeaways from @danshipper:

1. The future of work will happen inside Codex or Claude Code. Instead of putting AI into your SaaS tool, you’ll use your SaaS tools inside your favorite AI agents' in-app browser. Dan spends all his time in Codex now—writing documents, managing email, doing research, everything. He's using Google Docs, PostHog, and everything he needs within the agent's in-app browser. The agent can see what he’s doing, and has all of his context, so he and his agent collaborate quickly and super effectively.

2. Automation is a lie—every automation needs a human. Dan's company doubled in size this year despite being incredibly AI-forward. Why? Because in order to make automation work well, you need humans making sure everything keeps working. This is why benchmarks are misleading—they measure AI on problems we’ve already framed and can score, but there’s always a higher frame.

3. PMs will win the AI era. Marcus, a former PM who previously ran Axios’s writing product, joined Every after getting super AI-pilled. Now he runs their product Spiral, and ships faster than anyone on the team. He pairs technical knowledge with spiky product sense, deep user empathy, and an eye for what matters. Dan thinks any PM who gets really AI-native will be incredibly dangerous because the building is done for you—what matters is figuring out what to build and if it’s great.

4. Full-stack designers are becoming superheroes. Designers used to make beautiful interactions that engineers didn’t want to build or couldn’t execute properly. Now designers don’t need to hand things off; they can build it themselves. Designers are naturally creative people, and AI is the perfect tool for them because it lets them bring their vision to life without the traditional bottlenecks.

5. SaaS is not dead. In fact, Dan is bullish on SaaS stocks. When users bring their own AI (via Codex or Claude Code) to use SaaS products, the user—not the SaaS company—pays for tokens. This saves SaaS companies’ margins. Since the agents need their own seats, Dan predicts that agents will create massive new demand for SaaS because there will be tons of agents using these products at high volume.

6. Every company will have one “super-agent” inside their Slack that every employee will use. Dan initially thought every employee would have their personal work agent, like a shadow AI org chart, but he’s completely flipped his view. He realized agents need humans who care about them. When someone gets tired of maintaining their personal agent, it becomes useless. The winning model is one forward-deployed engineer or AI-savvy person who maintains a company-wide agent (like Shopify’s River or Viktor), and then it trickles down to more specialized team agents as models improve and become less fiddly.

7. The AI job apocalypse is not happening, but you do need to evolve to stay relevant. Models make yesterday’s human competence cheap. But because everyone uses the same models, it all looks the same if you use it the default way; it becomes commoditized slop. Humans then take that frozen competence and use it to make something new and interesting for their specific situation. The key: “ride the models”—use them for everything you do, try new models when they drop, keep turning over rocks.

8. We will read way more AI-generated writing, and we will like it. Human writing is incredibly important for things that matter, but for internal docs, planning, and email, AI-generated is often better because most people are bad at writing strategy documents.

9. Build software for humans and agents to use together. The current model is building a CLI that an agent uses independently. Instead, you and your agent should be using the app together. This creates new design challenges—agents can make a billion requests in three seconds, so you need approval flows, inboxes that summarize what happened, logs, and easy rollback.

10. Forward-deployed engineers are the new most essential role. The big model companies have teams of people managing their internal agents, and those teams aren’t going away. It’s different from traditional software building, and certain engineers love it. As models get better, this role will evolve—you’ll be managing more agents doing more things.

Lenny Rachitsky@lennysan

Automation is a lie. CLIs are over. The SaaSpocalypse is dumb.

A year ago @danshipper came on the podcast to predict where AI was heading. He was remarkably right—including the call that everyone was sleeping on Claude Code. Dan has a unique lens into where things are going because his team at @every is possibly the most AI-pilled group of people in tech. I always learn a ton talking to Dan. So I brought him back for round two.

We'll score these in exactly a year:
🔸 Every company will have one “super-agent” in Slack.
🔸 Codex and Claude Code will become the new operating system for knowledge work.
🔸 The AI job apocalypse is not happening.
🔸 PMs and designers will thrive.
🔸 We will read way more AI-generated writing and we will like it.
🔸 "I would buy SaaS stocks right now."

Listen now 👇 youtube.com/watch?v=4D3hDm…

Ravikant Dewangan@ronitkd·1d

Memorial Day Murph at Persistence today. 4 of 12 athletes went unbroken on pull-ups, and all four run our 3x weekly farmer carries. Carries are the most underprogrammed accessory in most CrossFit gyms.

Ravikant Dewangan@ronitkd·1d

@yunta_tsai This is exactly how I run agent work now. Last week Claude Code spawned three sub-agents in parallel to audit token counts in my Cowork plugin. Cut the review from 25 minutes to 6. The orchestrator pattern is the bottleneck killer.

Yun-Ta Tsai@yunta_tsai·1d

My favorite prompt:
a) make a plan for <task>
b) orchestrate and launch sub-agents to execute the plan
c) validate the results from the sub-agents
d) repeat b and c until you finish the plan
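Steps (b) through (d) of that prompt amount to a launch, validate, retry loop. A rough sketch of the control flow, assuming a plan is just a list of step names and that `run_subagent` and `validate` are hypothetical callables supplied by the caller (no real agent framework is implied):

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def orchestrate(
    plan: list[str],
    run_subagent: Callable[[str], str],    # hypothetical: executes one plan step
    validate: Callable[[str, str], bool],  # hypothetical: checks a step's result
    max_rounds: int = 3,
) -> dict[str, str]:
    """Run plan steps in parallel, keep validated results, retry the rest."""
    done: dict[str, str] = {}
    pending = list(plan)
    for _ in range(max_rounds):
        if not pending:
            break
        # (b) launch one sub-agent per pending step, in parallel
        with ThreadPoolExecutor(max_workers=len(pending)) as pool:
            results = list(pool.map(run_subagent, pending))
        # (c) validate each result; failed steps go back into the queue
        retry: list[str] = []
        for step, result in zip(pending, results):
            if validate(step, result):
                done[step] = result
            else:
                retry.append(step)
        # (d) repeat with whatever is still unfinished
        pending = retry
    return done
```

The `max_rounds` cap is a design choice added for the sketch so that a step which never validates cannot loop forever; steps still failing after the last round are simply absent from the returned dictionary.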

xAI@xai

Grok Build is now available in Beta for all SuperGrok and X Premium+ users. Use Plan Mode, create images and videos with Imagine, and build automations or orchestrators with the CLI. Visit x.ai/cli to get started.

Ravikant Dewangan@ronitkd·1d

@CrossFit Forward rolls into wall walks is the trap. Heart rate spikes from the rolls, then you need shoulder stability inverted. Going to test it Tuesday on six athletes at Persistence and see how the order holds up.

CrossFit@CrossFit·1d

Workout of the Day
Tuesday 260526

Adrian
7 rounds for time of:
3 forward rolls
5 wall walks
7 toes-to-bars
9 box jumps

♀ 24-inch box ♂ 30-inch box

Compare to 120923. Post time to comments.
📍CrossFit Drummond in Quebec, Canada
#CrossFit #WorkoutoftheDay

Ravikant Dewangan@ronitkd·1d

@scion_x_ The agent loop test is what matters. Claude Code rewrites a failing pytest, re-runs, and ships in one turn for me. Curious how Grok Build handles that same read-error-fix-rerun cycle.

Laurent@scion_x_·1d

Just spent a full coding session with Grok Build and honestly? It's right there with Claude. The model is sharp, the agentic flow holds up on complex tasks, and it has actual personality. Few rough edges on the UX side (sent detailed feedback to the team), but the core is genuinely impressive. @xai cooked 🔥
