Shreyans Bhansali

22.2K posts

@askcodi

Building AI stuff I find cool | Makersfuel x AskCodi

Joined October 2021
860 Following · 697 Followers
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@haider1 Slack channels becoming autonomous research labs was not on my bingo card
0
0
0
8
Haider.
Haider.@haider1·
Boris Cherny says Anthropic internally uses the same models as everyone else, with some Mythos, which will eventually ship as a descendant to the public. "There's no manually written code anywhere at the company." Internally, Claudes talk to each other all day over Slack, coding in loops and resolving unknowns across teams.
43
33
484
98.8K
The Boring Marketer
The Boring Marketer@boringmarketer·
I've been using Codex as my primary coding agent for the last two months. I fired up 4.7 today, and it continually makes mistakes that are table stakes for Codex on 5.5. Huge difference in depth, thinking quality, and precision. Claude Code still shines for design-oriented stuff, but that's about it right now.
32
1
149
12.5K
Puneet Patwari
Puneet Patwari@system_monarch·
System Design Round at Anthropic: You are running an LLM in production that costs $0.40 per query. At 100,000 queries a day that is $40,000 a day. You check your logs and find 60,000 of those queries are users asking slight variations of the same 200 questions. Your model is generating a fresh answer every single time. How do you cut your inference cost by 60% without the user ever feeling like they got a cached or stale response?
59
38
893
273.3K
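The cost question in the tweet above is usually answered with a semantic cache: embed each incoming query, serve a stored answer when a previous query is similar enough, and only pay for inference on a miss. To avoid the "stale response" feel, production systems often lightly rephrase the cached answer with a much cheaper model. A minimal sketch of the cache itself; `embed` here is a toy bag-of-words stand-in (a real system would use a sentence-embedding model), and `SemanticCache`, `llm`, and the threshold value are illustrative assumptions, not any real API:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding" for illustration only; swap in a real
    # sentence-embedding model in practice.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Serve near-duplicate queries from cache; call the model otherwise."""

    def __init__(self, llm, threshold=0.85):
        self.llm = llm              # the expensive model call (fallback)
        self.threshold = threshold  # similarity needed to count as a hit
        self.entries = []           # list of (embedding, answer) pairs

    def query(self, text):
        q = embed(text)
        best, best_sim = None, 0.0
        for emb, answer in self.entries:
            sim = cosine(q, emb)
            if sim > best_sim:
                best, best_sim = answer, sim
        if best is not None and best_sim >= self.threshold:
            return best             # cache hit: zero inference cost
        answer = self.llm(text)     # cache miss: pay for one generation
        self.entries.append((q, answer))
        return answer
```

If 60% of traffic really is variations of ~200 questions, a hit rate near that fraction cuts cost by roughly the same 60%, with the threshold trading off savings against the risk of serving a subtly wrong cached answer.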
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@asaio87 basics help, but most people learn them faster by breaking things with AI...
0
0
0
25
andrei saioc
andrei saioc@asaio87·
Vibecoding is overly inflated. It's these AI companies that make you, a regular non-developer, think you can be a developer and actually build production apps. Learn some basics before even getting to code with these AI agents. I can assure you that you will end up like this guy: you'll just spend a lot of tokens and time, which is even more valuable, only to end up in the same place. For a developer, AI agents are great (to a degree), but for a non-developer they can be a lot of frustration.
andrei saioc tweet media
36
5
37
3.5K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@0xSero feels like reasoning got overrated, most tasks just need fast iteration, not deep thinking...
0
0
0
21
0xSero
0xSero@0xSero·
I started using GPT-5.5 on low/no reasoning because of Ben and since then:
1. I can activate fast mode all day without running out of credits
2. Time to task completion is 10% of what it was with thinking
3. The model feels significantly more like a Claude model
4. Cheap AF
Ben Davis@davis7

This is very late, but I'm finally done with my 5.5 vid:
- use low reasoning
- the name sucks
- it's fast
- best code I've ever seen a model write came from this model
- openai's new pre-training is amazing
- price looks worse than it is
- over-sensitive to every little thing in its context window
- feels wildly different compared to 5.4
- turn reasoning off. try it. turn the reasoning off. do it.

57
41
1.4K
178.4K
Adam Holter
Adam Holter@AdamHoltererer·
I'm starting to miss Claude after switching to Codex. I used to get nice breaks to go on a walk whenever I hit my rate limit, but now I'm not hitting it anymore with Codex.
27
13
560
14.1K
Tom Goodwin
Tom Goodwin@tomfgoodwin·
I swear, the first time I used generative AI for recommendations, it was amazing. I remember being in Kyoto and asking for a little cocktail place, and the answer was incredible: I gave a very vague prompt and it "knew me". I did the same in Melbourne, and with remarkably little input it found me 3 stunning, unexpected places. I asked for atypical things in Miami, and it suggested places that I'd never heard of. But now, if you do it, it basically just uses TripAdvisor and Reddit, or Google Maps. If you ask for places to go in NYC, it will tell you the High Line, not like the past when it could say OCDChinatown. It went from being magical, astonishing, and invaluable to rather useless. What happened? Commercial deals? Laziness?
51
1
100
28.6K
Alexander Embiricos
Alexander Embiricos@embirico·
codex can work in the future: "tomorrow, check in on this discussion and ping me if it isn't resolved" "let me know if this bug isn't fixed by the day before launch" "bug me if this flaky test doesn't go green after retry" i do this all the time. powerful but not obvious—yet
36
28
566
38.5K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@sudoingX this works until the spec is wrong, then you just get confidently wrong output at scale...
0
0
8
370
Sudo su
Sudo su@sudoingX·
few days into codex plus and i think i found the hack. nobody is talking about it and the value sitting in this subscription is wild.

the hack: do not prompt the agent. write a single detailed task doc with every requirement laid out plus the final vision of what you are building, then fire codex cli with one line: accomplish this and test until done.

it goes. hours of uninterrupted agentic coding on gpt 5.5 xhigh, no throttling, no rate cap, no 'can you clarify' loop. the agent has everything it needs in one place, so it works the problem instead of working you.

i have been grinding it since this morning, screenshot below shows the session past 24 mins and still running. anthropic burns through your daily allowance in three opus 4.7 prompts and then your entire tier is gone for the day. codex plus on the same money goes on and on while you go take a walk.

this is the most underrated subscription in the agentic stack right now. the value is there if you front-load the prompt instead of conversation-mode it. give codex the brief, walk away, come back to a finished task. try this. loot the value while the math still favors you.
Sudo su tweet media
66
46
1.1K
102.9K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@VraserX compute matters, but distribution and product surface decide who actually wins...
0
0
0
25
VraserX e/acc
VraserX e/acc@VraserX·
Dario Amodei must be crying in his sleep watching OpenAI 10x Codex limits after GPT-5.5. Anthropic didn’t fumble because Claude is bad. They fumbled because compute is the game now. Without enough compute, growth doesn’t slow down. It grinds to a halt.
VraserX e/acc tweet media
51
15
203
22.1K
Sarah Sachs
Sarah Sachs@sarahmsachs·
DeepSeek v4 works fine, but it’s not the frontier-pressing moment we saw with Kimi 2.6. On Notion eval data, it’s similar performance to GPT 5.2, with understandable failings. Most interesting — it doesn’t scale well. It’s ridiculously slow. On multiple major, trusted, and performant US inference providers we see it 15x slower than GPT 5.2 and 2x slower than Opus 4.7, a problem Kimi never had. Curious if it’s a fundamental issue in architecture, or a matter of time til inference providers make it work. Doesn’t seem urgent either way, if Kimi can outperform. Cheaper maybe, but not groundbreaking.
88
25
573
84.6K
George Pu
George Pu@TheGeorgePu·
OpenAI spent $2.25 for every $1 it made in 2025. $13.1B in revenue. ~$8B in losses. Missed the 1B user target. Missed monthly revenue goals. Lost enterprise share to Anthropic. Now heading to the largest IPO in history. The CFO is already worried they can't fund their compute contracts. The IPO has to work because nothing else does.
29
4
64
6K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@diegohaz gpt doing self reviews like a startup founder, opus like a tired senior engineer...
0
0
0
298
Haz
Haz@diegohaz·
I gave the same task to GPT-5.5 xhigh and Opus 4.7 max. – GPT took ~30 minutes. – Opus took ~2 hours. Then I asked them to review each other's work and give an honest verdict on which was better. – GPT said its own code was better. – Opus said there was no obvious winner. Then I asked them to learn from each other and apply the best parts they had learned to their own code. In the end, I asked them to review each other again and give an honest verdict after they had both improved. – GPT kept saying its own code was better. – Opus said GPT's code was better. What a journey.
166
67
2.7K
299.8K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@JJEnglert everyone focuses on models, but distribution inside orgs is breaking on UX, permissions, and naming...
0
0
1
11
JJ Englert
JJ Englert@JJEnglert·
10 things I'm seeing on the frontlines of AI adoption in the enterprise:

1. Chat is where 90% of employees still live. It's the gateway drug. Everything else is downstream of getting people comfortable here first.
2. Power users discover Cowork and lose their minds. It's the "wait, it can actually do the work?" moment.
3. Claude Code still has very little penetration with non-technical users in the enterprise.
4. Microsoft being the "approved" tool doesn't matter. Employees route around Copilot and pitch their managers for Claude access on their own.
5. Artifacts in Claude are a breakout feature. People don't want to just view them; they want to deploy them, connect them to Snowflake, etc., and ship them as internal MVPs for their org to actually use.
6. Cowork is crossing the line from "demo" to "real work": legal teams redlining contracts, ops teams running workflows, then immediately asking how to automate it for production.
7. The next unlock: automated cloud workflows that leverage an agent like Claude while keeping non-technical users within the tools they're already using, in a chat interface. The demand is screaming.
8. Terminology is a major blocker. Projects vs. skills vs. plugins vs. agents. I've explained "what is a skill" 200+ times. The moment it clicks, people get excited, but the path there is too long.
9. Enterprise IT restrictions (locked connectors, no browser access) quietly strip Cowork of its superpowers. The features that make it magical are the first ones IT disables.
10. There is a high level of "AI insecurity." For the first time in a long time, people at all levels (even the C-suite) need to significantly upskill to stay world-class in their positions, and this is making people insecure about their skill set across the org.
General note on Microsoft: I spent a lot of this past week deep in Power Automate and Copilot Studio trying to build an automated solution in the cloud — given it's the native tool with sanctioned access to their org's data. It's ~90% there. But the final 10% is riddled with terrible UX, inconsistent behavior, and a generally poor experience. Honestly feels like Microsoft is fumbling the biggest moment in their company's history with software that has all the features on paper but lacks the magical "just works" moment for non-technical team members. The gap is wide open and they're letting others "eat their lunch" right now.
71
60
691
85.6K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@AlexFinn /goal isn't magic, it's just removing the friction that used to stop bad loops early...
1
0
2
215
Alex Finn
Alex Finn@AlexFinn·
The biggest advancement in AI coding this year has been /goal. And it isn't even close.

It allows your AI agent to quite literally work for days without stopping. You give it a mission, and it works until the mission is complete.

Here's the thing though: /goal is useless if you don't use it properly. You NEED a good prompt for it. I found that basically any prompt I hand-write after /goal is never good enough. It produces results that might as well have come from a normal prompt.

Meta-prompting is the answer. Go to any AI that has context around the project you're working on and say: "I'm working with Codex and I want to use their new /goal feature. Please research their /goal feature. Then take a look at our project and give me 3 options for how we could use /goal to be maximally productive. Then give me a highly detailed /goal prompt for each."

Take one of the prompts, go into the Codex CLI, type /goal, and give it the new prompt. I 100% guarantee the AI does better work than you've ever seen before.
Alex Finn tweet media
139
109
1.8K
101.4K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@nxhaaa19 feels less like different strengths and more like different tradeoffs in training and inference budgets...
0
0
0
9
neha
neha@nxhaaa19·
Claude Opus 4.7 vs GPT-5.5, what nobody is telling you:
- Opus 4.7 wins on coding. It's not close.
- GPT-5.5 wins on terminal workflows and math.
- Both have 1M-token context windows.
- Opus 4.7 sees images at 3x higher resolution.
- GPT-5.5 uses 72% fewer tokens for the same task.
- Opus 4.7 is better at fixing real GitHub issues.
- GPT-5.5 is faster and cheaper to run at scale.

They're not competing. They're optimized for different work.
144
25
517
49.8K
Drew Bredvick
Drew Bredvick@DBredvick·
Rumored today that both OAI + Anthropic are in some way funding consulting firms that help companies adopt AI. This is neither bearish nor bullish; it just proves that it's going to be a 10-20 year slog to get all businesses agentified. Wrote about this a while back: drew.tech/posts/agi-some…
54
43
649
59.1K
Dr Milan Milanović
Dr Milan Milanović@milan_milanovic·
We've never experienced this level of low quality in software. Everything we did was a fight to improve quality. Now, all of a sudden, everyone is chasing more lines of code with AI, everything is broken, and no one is bothered by it. Weird times to be alive.
175
106
1.4K
37.4K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@haider1 better models don't reduce demand, they multiply it faster than infra can catch up...
0
0
0
7
Haider.
Haider.@haider1·
openai can still have a compute shortage even if gpt-5.5 is extremely fast and good they shut down Sora to free up compute and are probably cutting some research compute to handle rising demand, especially now that gpt-5.5 is driving much heavier usage as models improve, usage will rise even more
24
8
169
13K