Ahmed Janim
70 posts

@janim007
building the strongest AI assistant.
Burlingame, CA · Joined June 2023
60 Following · 4 Followers

New CursorBench results just dropped.
Two big takeaways.
Composer 2.5 is way better than most people think.
63.2% score at $0.55 per task.
Nearly matching Opus 4.7 Max and GPT 5.5 Extra High at 20x less cost.
This is insane value.
Gemini 3.5 Flash is #10 at 49.8%.
Below GPT 5.5 Low.
Below Opus 4.7 Low.
Google's newest model can't even beat budget tier competition.
Composer 2.5 is the sleeper.
Gemini 3.5 Flash is the disappointment.

@Tech_girlll That’s how computers work: you get input, you process it, and you send back the new response

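@suni_code Start and end execute immediately, in order. The other two get queued: the promise callback goes to the microtask queue and the setTimeout callback goes to the macrotask queue, so once the synchronous code finishes the engine runs the promise callback before the setTimeout one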
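
The ordering in that reply can be checked with a small sketch. The snippet being discussed is not shown in the thread, so the four labels below ("start", "timeout", "promise", "end") are assumptions standing in for whatever it logged; only standard `setTimeout` and `Promise` behavior is relied on.

```typescript
// Hypothetical stand-in for the snippet under discussion (the original is not shown).
const log: string[] = [];

log.push("start"); // synchronous: runs immediately

// Timer callback: queued as a (macro)task, runs after the current script
// and after all pending microtasks.
setTimeout(() => log.push("timeout"), 0);

// Promise reaction: queued as a microtask, runs as soon as the current
// synchronous code finishes, before any timer callback.
Promise.resolve().then(() => log.push("promise"));

log.push("end"); // synchronous: still part of the current script

// A second zero-delay timer, scheduled after the first, prints the final order.
setTimeout(() => console.log(log.join(" -> ")), 0);
// prints: start -> end -> promise -> timeout
```

The timer is scheduled before the promise here on purpose: the promise callback still runs first, which shows the order comes from the queue type and not from the order of the calls.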

@BacLeodiv I’m a lead engineer and have an engineer on my team who pushes PRs using AI without verifying them.
It’s a freaking headache to review their code, and there’s at least a 30% chance the AI replaced/deleted code that has to be there,
and another 20% chance its logic is not reliable

@weswinder Stop building mobile apps
Start building VR apps
Trust me bro

@ztyan AI security is a good bet but the real unlock is an AI toolchain that works on the desktop, not just in the cloud. most devs still live in their terminal and editor.

@0xNairolf the `overflow-x: hidden` on body and 14 nested divs for a login form are the tell. vibecoding is great for prototypes, less so when you inherit the CSS.

@AdamHoltererer Plain wrong? Have you actually tried using AI on a production frontend, like consuming thousands of incoming WebSocket messages, displaying multiple charts, and playing videos in a single page view?

@AdamHoltererer It works like 80% of the time, same range as vibe-coding

@suni_code Their first video of someone coding the entire app was clearly heavily edited. You could see the timestamps between messages were hours apart, but they made it seem like a few minutes and hid all the back-and-forth error handling

@kapilansh_twt it's not the coding that's addictive. it's watching something you described in 3 sentences actually work

@forgebitz the smarter it gets the more creative its mistakes. had one rename all my files to emojis yesterday

@ryanvogel the scary part isn't llama-7b. it's the fine-tuned version trained on your last 6 months of work

@TTrimoreau been thinking about this. the moat isn't the model. it's the context you feed it. same AI, different data, completely different behavior

@Prathkum the CLI has one advantage GUIs never solved: you can automate it without reverse engineering someone's CSS