Shreyans Bhansali

22.2K posts

@askcodi

Building AI stuff I find cool | Makersfuel x AskCodi

Joined October 2021
860 Following · 697 Followers
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@haider1 Slack channels becoming autonomous research labs was not on my bingo card
0
0
0
8
Haider.
Haider.@haider1·
Boris Cherny says Anthropic internally uses the same models as everyone else, with some Mythos, which will eventually ship as a descendant to the public. "There's no manually written code anywhere at the company." Internally, Claudes talk to each other all day over Slack, coding in loops and resolving unknowns across teams.
43
33
484
98.8K
The Boring Marketer
The Boring Marketer@boringmarketer·
I've been using Codex as my primary coding agent for the last two months. I fired up 4.7 today, and it continually makes mistakes that are table stakes for Codex on 5.5. Huge difference in depth, thinking quality, and precision. Claude Code still shines for design-oriented stuff, but that's about it right now.
32
1
149
12.5K
Puneet Patwari
Puneet Patwari@system_monarch·
System Design Round at Anthropic: You are running an LLM in production that costs $0.40 per query. At 100,000 queries a day that is $40,000 a day. You check your logs and find 60,000 of those queries are users asking slight variations of the same 200 questions. Your model is generating a fresh answer every single time. How do you cut your inference cost by 60% without the user ever feeling like they got a cached or stale response?
59
38
893
273.3K
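The cost question in the tweet above is usually answered with a semantic cache: embed each incoming query, serve a stored answer when a previous query is similar enough, and only pay for inference on a miss. To avoid the "stale response" feel, production systems often lightly rephrase the cached answer with a much cheaper model. A minimal sketch of the cache itself; `embed` here is a toy bag-of-words stand-in (a real system would use a sentence-embedding model), and `SemanticCache`, `llm`, and the threshold value are illustrative assumptions, not any real API:

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding" for illustration only; swap in a real
    # sentence-embedding model in practice.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Serve near-duplicate queries from cache; call the model otherwise."""

    def __init__(self, llm, threshold=0.85):
        self.llm = llm              # the expensive model call (fallback)
        self.threshold = threshold  # similarity needed to count as a hit
        self.entries = []           # list of (embedding, answer) pairs

    def query(self, text):
        q = embed(text)
        best, best_sim = None, 0.0
        for emb, answer in self.entries:
            sim = cosine(q, emb)
            if sim > best_sim:
                best, best_sim = answer, sim
        if best is not None and best_sim >= self.threshold:
            return best             # cache hit: zero inference cost
        answer = self.llm(text)     # cache miss: pay for one generation
        self.entries.append((q, answer))
        return answer
```

If 60% of traffic really is variations of ~200 questions, a hit rate near that fraction cuts cost by roughly the same 60%, with the threshold trading off savings against the risk of serving a subtly wrong cached answer.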
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@asaio87 basics help, but most people learn them faster by breaking things with AI...
0
0
0
25
andrei saioc
andrei saioc@asaio87·
Vibecoding is overly inflated. It's these AI companies that make you, a regular non-developer, think you can be a developer and actually build production apps. Learn some basics before even getting to code with these AI agents. I can assure you that you will end up like this guy: you'll just spend a lot of tokens and time, which is even more valuable, only to end up in the same place. For a developer, AI agents are great (to a degree), but for a non-developer they can be a lot of frustration.
andrei saioc tweet media
36
5
37
3.5K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@0xSero feels like reasoning got overrated, most tasks just need fast iteration, not deep thinking...
0
0
0
21
0xSero
0xSero@0xSero·
I started using GPT-5.5 on low/no reasoning because of Ben and since then:
1. I can activate fast mode all day without running out of credits
2. Time to task completion is 10% of what it was with thinking
3. The model feels significantly more like a Claude model
4. Cheap AF
Ben Davis@davis7

This is very late, but I'm finally done with my 5.5 vid:
- use low reasoning
- the name sucks
- it's fast
- best code I've ever seen a model write came from this model
- openai's new pre-training is amazing
- price looks worse than it is
- over-sensitive to every little thing in its context window
- feels wildly different compared to 5.4
- turn reasoning off. try it. turn the reasoning off. do it.

57
41
1.4K
178.4K
Adam Holter
Adam Holter@AdamHoltererer·
I'm starting to miss Claude after switching to Codex. I used to get nice breaks to go on a walk whenever I hit my rate limit, but now I'm not hitting it anymore with Codex.
27
13
560
14.1K
Tom Goodwin
Tom Goodwin@tomfgoodwin·
I swear, the first time I used generative AI for recommendations, it was amazing. I remember being in Kyoto and asking for a little cocktail place, and the answer was incredible: I gave a very vague prompt and it "knew me". I did the same in Melbourne, and with remarkably little input it found me 3 stunning, unexpected places. I asked for atypical things in Miami, and it suggested places that I'd never heard of. But now, if you do it, it basically just uses TripAdvisor and Reddit, or Google Maps. If you ask for places to go in NYC, it will tell you the High Line, not like the past when it could say OCDChinatown. It went from being magical, astonishing, and invaluable to rather useless. What happened? Commercial deals? Laziness?
51
1
100
28.6K
Alexander Embiricos
Alexander Embiricos@embirico·
codex can work in the future: "tomorrow, check in on this discussion and ping me if it isn't resolved" "let me know if this bug isn't fixed by the day before launch" "bug me if this flaky test doesn't go green after retry" i do this all the time. powerful but not obvious—yet
36
28
566
38.5K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@sudoingX this works until the spec is wrong, then you just get confidently wrong output at scale...
0
0
8
370
Sudo su
Sudo su@sudoingX·
few days into codex plus and i think i found the hack. nobody is talking about it and the value sitting in this subscription is wild.

the hack: do not prompt the agent. write a single detailed task doc with every requirement laid out plus the final vision of what you are building, then fire codex cli with one line: accomplish this and test until done.

it goes. hours of uninterrupted agentic coding on gpt 5.5 xhigh, no throttling, no rate cap, no 'can you clarify' loop. the agent has everything it needs in one place, so it works the problem instead of working you.

i have been grinding it since this morning, screenshot below shows the session past 24 mins and still running. anthropic burns through your daily allowance in three opus 4.7 prompts and then your entire tier is gone for the day. codex plus on the same money goes on and on while you go take a walk.

this is the most underrated subscription in the agentic stack right now. the value is there if you front-load the prompt instead of conversation-mode it. give codex the brief, walk away, come back to a finished task. try this. loot the value while the math still favors you.
Sudo su tweet media
66
46
1.1K
102.9K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@VraserX compute matters, but distribution and product surface decide who actually wins...
0
0
0
25
VraserX e/acc
VraserX e/acc@VraserX·
Dario Amodei must be crying in his sleep watching OpenAI 10x Codex limits after GPT-5.5. Anthropic didn’t fumble because Claude is bad. They fumbled because compute is the game now. Without enough compute, growth doesn’t slow down. It grinds to a halt.
VraserX e/acc tweet media
51
15
203
22.1K
Sarah Sachs
Sarah Sachs@sarahmsachs·
DeepSeek v4 works fine, but it’s not the frontier-pressing moment we saw with Kimi 2.6. On Notion eval data, it’s similar performance to GPT 5.2, with understandable failings. Most interesting — it doesn’t scale well. It’s ridiculously slow. On multiple major, trusted, and performant US inference providers we see it 15x slower than GPT 5.2 and 2x slower than Opus 4.7, a problem Kimi never had. Curious if it’s a fundamental issue in architecture, or a matter of time til inference providers make it work. Doesn’t seem urgent either way, if Kimi can outperform. Cheaper maybe, but not groundbreaking.
88
25
573
84.6K
George Pu
George Pu@TheGeorgePu·
OpenAI spent $2.25 for every $1 it made in 2025. $13.1B in revenue. ~$8B in losses. Missed the 1B user target. Missed monthly revenue goals. Lost enterprise share to Anthropic. Now heading to the largest IPO in history. The CFO is already worried they can't fund their compute contracts. The IPO has to work because nothing else does.
29
4
64
6K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@diegohaz gpt doing self reviews like a startup founder, opus like a tired senior engineer...
0
0
0
298
Haz
Haz@diegohaz·
I gave the same task to GPT-5.5 xhigh and Opus 4.7 max. – GPT took ~30 minutes. – Opus took ~2 hours. Then I asked them to review each other's work and give an honest verdict on which was better. – GPT said its own code was better. – Opus said there was no obvious winner. Then I asked them to learn from each other and apply the best parts they had learned to their own code. In the end, I asked them to review each other again and give an honest verdict after they had both improved. – GPT kept saying its own code was better. – Opus said GPT's code was better. What a journey.
166
67
2.7K
299.8K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@JJEnglert everyone focuses on models, but distribution inside orgs is breaking on UX, permissions, and naming...
0
0
1
11
JJ Englert
JJ Englert@JJEnglert·
10 things I'm seeing on the frontlines of AI adoption in the enterprise:

1. Chat is where 90% of employees still live. It's the gateway drug. Everything else is downstream of getting people comfortable here first.
2. Power users discover Cowork and lose their minds. It's the "wait, it can actually do the work?" moment.
3. Claude Code still has very little penetration with non-technical users in the enterprise.
4. Microsoft being the "approved" tool doesn't matter. Employees route around Copilot and pitch their managers for Claude access on their own.
5. Artifacts in Claude are a breakout feature. People don't want to just view them; they want to deploy them, connect them to Snowflake, etc., and ship them as internal MVPs for their org to actually use.
6. Cowork is crossing the line from "demo" to "real work": legal teams redlining contracts, ops teams running workflows, then immediately asking how to automate it for production.
7. The next unlock: automated cloud workflows that leverage an agent like Claude while keeping non-technical users within the tools they're already using, in a chat interface. The demand is screaming.
8. Terminology is a major blocker. Projects vs. skills vs. plugins vs. agents. I've explained "what is a skill" 200+ times. The moment it clicks, people get excited, but the path there is too long.
9. Enterprise IT restrictions (locked connectors, no browser access) quietly strip Cowork of its superpowers. The features that make it magical are the first ones IT disables.
10. There is a high level of "AI insecurity." For the first time in a long time, people at all levels (even the C-suite) need to significantly upskill to stay world-class in their positions, and this is making people insecure about their skill set across the org.
General note on Microsoft: I spent a lot of this past week deep in Power Automate and Copilot Studio trying to build an automated solution in the cloud — given it's the native tool with sanctioned access to their org's data. It's ~90% there. But the final 10% is riddled with terrible UX, inconsistent behavior, and a generally poor experience. Honestly feels like Microsoft is fumbling the biggest moment in their company's history with software that has all the features on paper but lacks the magical "just works" moment for non-technical team members. The gap is wide open and they're letting others "eat their lunch" right now.
71
60
691
85.6K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@AlexFinn /goal isn't magic, it's just removing the friction that used to stop bad loops early...
1
0
2
215
Alex Finn
Alex Finn@AlexFinn·
The biggest advancement in AI coding this year has been /goal. And it isn't even close.

It allows your AI agent to quite literally work for days without stopping. You give it a mission, and it works until the mission is complete.

Here's the thing though: /goal is useless if you don't use it properly. You NEED a good prompt for it. I found that basically any prompt I hand-write after /goal is never good enough. It produces results that might as well have come from a normal prompt.

Meta-prompting is the answer. Go to any AI that has context around the project you're working on and say: "I'm working with Codex and I want to use their new /goal feature. Please research their /goal feature. Then take a look at our project and give me 3 options for how we could use /goal to be maximally productive. Then give me a highly detailed /goal prompt for each."

Take one of the prompts, go into the Codex CLI, type /goal, and give it the new prompt. I 100% guarantee the AI does better work than you've ever seen before.
Alex Finn tweet media
139
109
1.8K
101.4K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@nxhaaa19 feels less like different strengths and more like different tradeoffs in training and inference budgets...
0
0
0
9
neha
neha@nxhaaa19·
Claude Opus 4.7 vs GPT-5.5, what nobody is telling you:
- Opus 4.7 wins on coding. It's not close.
- GPT-5.5 wins on terminal workflows and math.
- Both have 1M-token context windows.
- Opus 4.7 sees images at 3x higher resolution.
- GPT-5.5 uses 72% fewer tokens for the same task.
- Opus 4.7 is better at fixing real GitHub issues.
- GPT-5.5 is faster and cheaper to run at scale.

They're not competing. They're optimized for different work.
144
25
517
49.8K
Drew Bredvick
Drew Bredvick@DBredvick·
Rumored today that both OAI + Anthropic are in some way funding consulting firms that help companies adopt AI. This is neither bearish nor bullish; it just proves that it's going to be a 10-20 year slog to get all businesses agentified. Wrote about this a while back: drew.tech/posts/agi-some…
54
43
649
59.1K
Dr Milan Milanović
Dr Milan Milanović@milan_milanovic·
We've never experienced this level of low quality in software. Everything we did was a fight to improve quality. Now, all of a sudden, everyone is chasing more lines of code with AI, everything is broken, and no one is bothered by it. Weird times to be alive.
175
106
1.4K
37.4K
Shreyans Bhansali
Shreyans Bhansali@askcodi·
@haider1 better models don't reduce demand, they multiply it faster than infra can catch up...
0
0
0
7
Haider.
Haider.@haider1·
openai can still have a compute shortage even if gpt-5.5 is extremely fast and good they shut down Sora to free up compute and are probably cutting some research compute to handle rising demand, especially now that gpt-5.5 is driving much heavier usage as models improve, usage will rise even more
24
8
169
13K