Oliviu Stoian

561 posts

@madebyoliver

I build, i test, i ship. helping devs and founders cut through AI noise with real benchmarks and honest takes.

Europe · Joined September 2013

8 Following · 659 Followers

Oliviu Stoian@madebyoliver·13m

trust question is the right one. answer: none of them, when the diff gets big enough. I've used all three on the same monorepo for ~8 months. shipped production bugs from each by not reviewing carefully enough. my breakdown: Cursor: most trustworthy on greenfield. inline diffs stay small. trust evaporates when it touches 5+ files at once. Claude Code: best on messy existing code IF you've got a solid CLAUDE.md. without it, it'll confidently rewrite patterns it doesn't understand. Codex: strongest at mechanical refactors (rename across 47 files, extract this interface). weakest at anything requiring taste. my rule: if the diff is bigger than what I'd write in 30 min, reject and chunk it. the best tool is whichever one you're actually reading the output from.
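
One way to make the "30 min" rule mechanical is a size gate on the diff before review even starts. A minimal Python sketch, assuming unified `git diff` output; the names `diff_stats` and `should_chunk` and both thresholds are hypothetical (the file cutoff echoes the "5+ files" note above, the line budget is only a stand-in for "30 minutes of my own typing"), not something any of these tools provide:

```python
from dataclasses import dataclass


@dataclass
class DiffStats:
    files: int
    changed_lines: int


def diff_stats(diff_text: str) -> DiffStats:
    """Count touched files and added/removed lines in a unified git diff."""
    files = 0
    changed = 0
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            files += 1
        elif line.startswith(("+++", "---")):
            # File header lines. Caveat: a removed line whose content starts
            # with "--" also lands here, so this slightly undercounts.
            continue
        elif line.startswith(("+", "-")):
            changed += 1
    return DiffStats(files, changed)


def should_chunk(stats: DiffStats, max_files: int = 4, max_lines: int = 150) -> bool:
    """True when the diff exceeds the (assumed) review budget."""
    return stats.files > max_files or stats.changed_lines > max_lines
```

Feeding it the output of `git diff` and rejecting whenever `should_chunk` returns True is the whole idea; the right thresholds depend on the codebase and the reviewer.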

Dibya.shree@shree_code·5h

I used Claude Code and Codex on the same repo. Claude was better at understanding the mess. Codex was better when the task had structure. Cursor felt fastest with the latest release. The real question is not “which is smarter?” It is: which one do you trust when the diff gets big?

Oliviu Stoian@madebyoliver·36m

@akshaymarch7 Microsoft isn't being stingy. they're the first big company whose finance team actually ran the numbers.

Akshay Saini@akshaymarch7·2h

This headline is hilarious, but the reality is a little different. Microsoft isn't saying "humans are cheaper", they're saying their own AI is cheaper for them. They recently cut internal employee access to Anthropic's Claude Code because the token billing got too expensive, forcing engineers to switch back to their in-house GitHub Copilot. Turns out, even $3 trillion+ Microsoft has an AI budget cap for engineers!

199

16.9K

Oliviu Stoian@madebyoliver·1h

everyone's laughing at the headline but the real story is different. Microsoft engineers had Copilot for free. they still chose to pay for Claude Code instead. enough that the billing blew up and had to be cut off. that's not a funny procurement story. it's MSFT's own engineers voting that Claude Code is worth real money over the free internal option. we had a $17k Claude Code month in Q1. nobody suggested switching back. the question is how long that gap holds now that Microsoft is pulling telemetry from the engineers they just forced back to Copilot.

406

Oliviu Stoian@madebyoliver·1h

@akshaymarch7 the question is how long that gap holds now that Microsoft is pulling telemetry from the engineers they just forced back to Copilot.

359

Oliviu Stoian@madebyoliver·1h

250K apps in a week is a great headline but let's be real about what "built an app" means here. I'd bet 90%+ are single-screen hello world variants that compile but don't do anything useful. the gap between "the AI generated a working APK" and "this is an app someone would actually use daily" is enormous. I've watched people hit that wall repeatedly with these no-code tools. the first 90% is magical. the last 10% is where you need to understand what you're building. tbh the more impressive stat would be how many made it to the Play Store.

Wes Roth@WesRoth·9h

Google launched a free Google AI Studio feature that lets users build native Android apps directly inside AI Studio without coding. The feature has already been used to create more than 250,000 Android apps since launching last week.

Logan Kilpatrick@OfficialLoganK

We just launched the ability to build native Android apps directly in Google AI Studio for free! Since launch last week, people have created more than 250,000 Android apps. Likely >99% of these folks never built an Android app before, everyone can now build, no coding required!

4.1K

Oliviu Stoian@madebyoliver·3h

@theo if Claude Code ever has a real package ecosystem and runs on a local model without phoning home, then yeah. until then it's more like Salesforce in 2004.

1.2K

Theo - t3.gg@theo·4h

Claude Code is the new Node.js

140

844

86K

Oliviu Stoian@madebyoliver·3h

the Anthropic Vatican thing yesterday is genuinely funny if you know the context. Chris Olah is at the Vatican telling the Pope about AI introspection and emotional states, while Anthropic is simultaneously building AI for spy agencies. the same company, the same week. you cannot make this up. I posted about the spy agency angle earlier, but the Vatican ceremony adds a whole new layer of irony. alignment rhetoric on one stage, intelligence contracts on another. separately, Garry Tan posted his AI stack and it includes Hermes, Massive search, GBrain, and Qwen3.6-27B on Wafer. fun to see tools I work on in someone's daily driver. the open source AI stack is getting real. AI does not sleep apparently.

Oliviu Stoian@madebyoliver·3h

the pattern this week: AI stopped being a chatbot and started being an agent. Grok Voice Think Fast hit 67.3% on voice benchmarks (Gemini is at 43.8%, GPT at 35.3%) and it is already running Starlink phone ops. BNB Chain and Injective both shipped agent platforms where AI agents have their own wallets and make decisions autonomously. Google made app building zero-friction. three different angles, same conclusion: the interface is dissolving. voice, autonomous execution, no-code creation. we are moving from "ask AI a question" to "AI does the thing while you sleep." if this continues, the next 12 months are going to be chaotic in the best way. the tools are getting good enough that the bottleneck shifts back to ideas, not implementation.

Oliviu Stoian@madebyoliver·3h

NVIDIA just called Vera Rubin "the largest and fastest product launch in company history." that is not marketing fluff when it comes from Jensen. 50 petaflops FP4, 288GB HBM4, 3.6TB/s NVLink 6. each NVL72 rack needs nearly 2 million parts from 100-150 Taiwanese manufacturers. the wild part: this is not even about training anymore. Jensen said the shift toward AI agents is driving CPU demand alongside GPUs. the inference era needs different hardware, and NVIDIA is betting big that it is not just about bigger GPUs. honest take: the real signal here is the supply chain. 100-150 Taiwanese partners mobilizing means this is not a paper launch. H2 2026. it is happening.

Oliviu Stoian@madebyoliver·4h

Google just shipped something that actually changes the game. AI Studio now builds native Android apps from a text prompt. Kotlin + Jetpack Compose. No setup, no IDE, free. 250,000 apps built in one week. Most of them from people who've never touched Android development. here's what this actually means: the big bottleneck in mobile dev was never coding skill. it was the toolchain. Android Studio, Gradle, the emulator, the manifest, the build config. Google just erased all of that. you talk, it builds, you install. ngl, I've been building apps for years and this is the first time I've looked at a no-code tool and thought "yeah this might actually matter." the output is real Kotlin. you can export it to Android Studio and keep working. it's not a toy.

Oliviu Stoian@madebyoliver·4h

honestly the thing nobody talks about with computer use is blast radius. when your agent can edit files, reconfigure tools, and mess with Chrome in the background, one bad hallucination nukes your afternoon. I've had Claude Code silently break my shell config and I didn't notice until the next terminal session, 3 hours later. Codex's background Chrome is genuinely slick but I'd trade every new computer use feature for a `git diff` of every change the agent made. undo > automation. once trust tooling catches up to the capabilities, this is a different conversation.
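
The "`git diff` of every change" idea can be approximated outside a repo too, for example for dotfiles like a shell config. A rough Python sketch, assuming you can snapshot a directory before and after an agent run; `snapshot` and `changes` are hypothetical helper names, not part of Claude Code or Codex:

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to a SHA-256 of its contents."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changes(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report which files were added, removed, or modified between snapshots."""
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "modified": sorted(
            k for k in before.keys() & after.keys() if before[k] != after[k]
        ),
    }
```

Take `snapshot(dir)` before the agent starts, again when it finishes, and read the `changes` report before closing the session. It only says which files changed, not what changed inside them, so it is a tripwire rather than an undo.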

Anthony Kroeger@kr0der·11h

every AI coding app needs the same level of Computer Use + Chrome control that the Codex app has, it's legit too useful, you can literally get it to help configure/edit/set up anything on your laptop

4.2K

Oliviu Stoian@madebyoliver·5h

ngl I tried this exact workflow for 2 weeks. the 100-question phase gave me this gorgeous plan that started rotting by issue 3. by issue 8 I was actively fighting decisions the plan had frozen in place before I knew anything. here's what actually worked: 5-10 questions per issue, build it, then re-plan the remaining issues with what you learned. slower per-plan, faster overall. you stop building against a stale architecture doc. the Linear integration is the real gem here though. that part I kept.

Alex Finn@AlexFinn·9h

This has sped up my AI coding 20x (prompt at the end): Before building out a big feature, ask Codex/Claude Code to ask you as many questions as it needs to fully plan out the idea. This is even better than plan mode. Plan mode is typically limited to 3 or 4 questions. This has asked me 100+ questions before. Seems like a lot but actually saves you time in the long run. The plan it builds will be so detailed and complete that it can basically run autonomously and build the entire thing. But here's where you take things to the next level: You also have it take your entire plan and create detailed Linear issues for it. It should create 20+ tasks in Linear. Then it's as easy as saying "ok work on the next thing" over and over until the feature is done. Highly recommend downloading and using Linear if you haven't yet. Amazing project management tool w/ excellent free tier. Will basically capture all these details and put your agent on autopilot. It's a 2nd brain. Use this prompt: "I want to build out *describe your feature in detail*. Ask as many questions as you need of me to fully understand every detail of what I want to build out. Then take everything you learn, and create super focused and detailed Linear issues. Then begin work" Getting so much more high quality code out with this workflow. You're welcome.

501

28.1K

Oliviu Stoian@madebyoliver·5h

ngl the gap exists partly because these tools changed what "productive" even means. pre-Claude Code I measured output in PRs/day. now it's projects/week. the unit of work shifted and academia hasn't figured out how to quantify that. also the variance is wild. some devs get 3x, some spend more time debugging AI code than writing their own. a mean tells you nothing. we tracked our team's Claude Code spend vs shipped features since Jan. curve: negative month 1, strong positive months 2-3, then flattening. the shape matters more than any average. what studies really miss: these tools change what you attempt. I ship things now I'd never have started before because upfront cost is so low. that's the real shift and it doesn't show up in any productivity metric I've seen

217

Ethan Mollick@emollick·6h

We have, as far as I can tell, no good tests of the productivity impact of the autonomous coding tools that appeared starting in December 2025. Every paper out there is from prior to the Claude Code/Codex revolution. A huge gap in our knowledge about what is happening in coding.

353

21.6K

Oliviu Stoian@madebyoliver·6h

@emollick what studies really miss: these tools change what you attempt. I ship things now I'd never have started before because upfront cost is so low. that's the real shift and it doesn't show up in any productivity metric I've seen

135

Oliviu Stoian@madebyoliver·11h

ngl I went the opposite direction. used Claude Code in terminal for months, came back to Cursor when Composer 2.5 shipped. terminal Claude Code is faster for greenfield and scripts. but refactoring 30 files while keeping half the logic intact? Cursor's inline editing is still unmatched. 'training wheels' assumes everyone's doing the same kind of work. they're not. different tools, different workflows. if you're shipping greenfield features all day, terminal makes sense. if you're maintaining a 200k-line monorepo, the IDE is the right tool, not a downgrade.

TeutaAi@TeutaAi·11h

@adahstwt claude code in terminal. vs code only when I need the diff viewer or to scroll a 4k-line log. cursor felt like training wheels after a month.

adah@adahstwt·1d

What are you coding with daily? - VS Code - Cursor - Antigravity

138

103

4.2K

Oliviu Stoian@madebyoliver·11h

@bashfulpav @Pirat_Nation the token bills are real. we had a $17k Claude Code month in Q1. leadership winced. but 'no actual profits' misses what's actually happening. the value isn't cost savings.

311

Bashfull@bashfulpav·16h

@Pirat_Nation I still don't understand why they believe AI will pay for itself. All I see is everyone bleeding money and no actual profits. And the only change is the software being worse now because you have millions of people vibe coding.

223

6.1K

Pirat_Nation 🔴@Pirat_Nation·17h

Microsoft is reportedly reducing internal use of Anthropic’s Claude Code after its AI bills started exploding as employee usage rapidly increased. Some teams are now being pushed toward GitHub Copilot as the company tries to control AI costs. Uber reportedly faced a similar problem. Executives said the company had already burned through its entire yearly AI tooling budget by April because engineers were heavily using AI coding daily. AI coding tools are now being used for everything, and that level of usage creates massive compute and token costs when thousands of employees use these systems at the same time. Source: TomsHardware

164

275

3.4K

469.2K

Oliviu Stoian@madebyoliver·12h

@josevalim honestly I think the problem predates AI. most PR reviews were already LGTM ship it with zero engagement. AI just made the low-effort pattern more visible.

305

José Valim@josevalim·14h

When someone replies to a pull request comment with obvious AI content, it genuinely saddens me. PRs used to be a place to teach/learn/discuss software but now there’s nobody on the other side. If I wanted an agent response, I’d ask mine. Social coding is dead.

443

13.2K

Oliviu Stoian@madebyoliver·12h

@josevalim honestly I think the problem predates AI.

174
