Elie Steinbock — oss/acc

34.7K posts

@elie2222

Building https://t.co/0MTUhgDLIE, your executive assistant for email. 15k users. OSS | Cursor Ambassador | YouTube on open source: https://t.co/qf66pPJzgf

Tel Aviv · Joined June 2010
3.2K Following · 13.7K Followers
Pinned Tweet
Elie Steinbock — oss/acc
OMG I got Qwen 3.5 4b running on my emails now. And it's handling them correctly on just 5GB RAM 🤯 using @inboxzero_ai as the harness
18
9
240
29.7K
Wilson Wilson @euboid
@elie2222 I find it hallucinates way too much in internal benchmarks. So much that I can't trust it for almost any use-case 😅
1
0
1
80
Wes Bos @wesbos
Composer 2 vs Opus 4.6 vs GPT 5.4 - a totally unscientific test

> Create a Twitter clone. Use Better Auth, Vite, Sqlite, Drizzle, Typescript, and React with Tanstack Start.

Each one took ~5 mins in plan mode. Each had access to a browser to test.

Composer: 5 mins $6.04 1,250 LOC
Opus: 19 mins $10.43 1,000 LOC
GPT: 22 mins $14.15 2,000 LOC

Opus seems to use the cache WAY more than composer, so it's not really 10x more expensive.

Composer app ran first try. Other two needed a bit of CORS debugging but did work.

Code between all was extremely similar.

All three done inside cursor - so Claude Code / Codex may have been different

Models used:
Composer 2.0 (regular, not fast)
Opus 4.6 Medium Thinking
GPT 5.4 Medium Thinking
Wes Bos @wesbos

Cursor just launched Composer 2 - their own model. It's 10× cheaper than Opus 4.6 and supposed to rival it. I've been using it for a few days, I don't have any skewed graphs to show you but from a pure vibes POV I can tell you it's pretty good™. My litmus test right now is if it can build a 3D Printable model with Manifold CAD. I built a GIF zoetrope generator and it did fantastic.

58
27
738
180.6K
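The figures quoted in the test above can be put on a common footing. The sketch below is an illustration, not part of the original post: it computes each run's cost relative to Composer and its cost per 1,000 lines of code. It takes the quoted dollar amounts at face value, so it does not model the caching effect the post mentions, and the `summarize` helper is a hypothetical name.

```python
# Figures as quoted in the post: (minutes, cost in USD, approximate lines of code).
runs = {
    "Composer 2": (5, 6.04, 1250),
    "Opus 4.6": (19, 10.43, 1000),
    "GPT 5.4": (22, 14.15, 2000),
}


def summarize(runs, baseline="Composer 2"):
    """Return cost relative to the baseline run and cost per 1,000 LOC for each run."""
    base_cost = runs[baseline][1]
    summary = {}
    for name, (_minutes, cost, loc) in runs.items():
        summary[name] = {
            "cost_ratio": cost / base_cost,
            "usd_per_kloc": cost / loc * 1000,
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize(runs).items():
        print(f"{name}: {row['cost_ratio']:.2f}x baseline, ${row['usd_per_kloc']:.2f} per 1,000 LOC")
```

On this single run, Opus and GPT come out at roughly 1.7x and 2.3x Composer's cost, which is consistent with the post's remark that the gap is well short of 10x.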
Elie Steinbock — oss/acc
@harriskennyx @inboxzero_ai ya. solid option to use in product. i wouldn't worry too much about exact model. best model for price changes the whole time. so you want to be adaptable. if you use openrouter/vercel ai gateway/ai sdk, it makes it easy to switch to the best option when things change
0
0
1
19
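The adaptability described in the reply above amounts to keeping the model choice in configuration rather than at each call site. Here is a minimal sketch of that idea, assuming a hypothetical catalog: the model identifiers, quality scores, and prices are placeholders, not figures from the thread. It is not the API of OpenRouter, Vercel AI Gateway, or the AI SDK; the `complete` argument stands in for whichever client is actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    model_id: str            # hypothetical identifier handed to the gateway client
    quality: float           # hypothetical internal eval score in [0, 1]
    usd_per_m_output: float  # hypothetical price per million output tokens


# Placeholder catalog; updating this list is the only change needed to switch models.
CATALOG = [
    ModelOption("provider-a/frontier", 0.95, 20.00),
    ModelOption("provider-b/fast", 0.88, 3.00),
    ModelOption("provider-c/open-weights", 0.86, 1.00),
]


def pick_model(catalog, min_quality):
    """Return the cheapest option that meets the quality bar."""
    eligible = [m for m in catalog if m.quality >= min_quality]
    if not eligible:
        raise ValueError("no model meets the quality bar")
    return min(eligible, key=lambda m: m.usd_per_m_output)


def classify_email(subject, complete, catalog=CATALOG, min_quality=0.85):
    """Call the injected `complete(model_id, prompt)` with the currently best-value model."""
    model = pick_model(catalog, min_quality)
    return complete(model.model_id, f"Label this email: {subject}")
```

Because the call site only sees `complete` and a quality bar, a price change or a new model is handled by editing the catalog, not the product code.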
Harris Kenny @harriskennyx
@elie2222 @inboxzero_ai this is exactly what i was thinking… i have some things i want to use AI for in our product but would be potentially very high volume… this might be it!
1
0
1
9
Wilson Wilson @euboid
Has anybody figured out how to do this?
- @getsentry issue reported
- codex agent spun up with access to sentry + axiom logs & traces
- Draft PR auto-created w/ root cause analysis + fix
33
1
80
22.9K
Elie Steinbock — oss/acc
@harriskennyx @inboxzero_ai You need good quality at a good price point. 3 Flash is very strong for the price. There is a new wave of models like Kimi/Qwen/Minimax that may even be stronger. But privacy concerns are the problem there.
1
0
0
11
Elie Steinbock — oss/acc
@harriskennyx So for your day to day dev, I wouldn't recommend it. It's fine and cheap. But just use the frontier models for that. But if you have an AI product and you're spending thousands on tokens, e.g. processing millions of emails as we do for @inboxzero_ai, then you don't need Opus.
1
0
0
13
Elie Steinbock — oss/acc
This is massive!

I'm yet to be convinced it's as strong as Opus 4.6. But it's strong, and price point is very good. I need to test more to see if it really competes with Opus.

But why this is so big: #1 reason people have been moving away from Cursor is the price. Anthropic and OpenAI have been massively subsidizing tokens. Cursor had to resell someone else's model.

With this upgrade, Cursor can finally sell their own model. But sell a version that's stronger and cheaper. Long term this is a huge advantage.
Cursor @cursor_ai

Composer 2 is now available in Cursor.

0
0
9
1K
Elie Steinbock — oss/acc
@wickedguro @euboid @getsentry Ah, we weren't talking about customer support above. But simple approach that'll work across Codex/Claude Code is to write a skill, simple CLI it can call. That's about it. Also look at Claude SDK if you haven't already.
1
0
1
58
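The "simple CLI it can call" idea in the reply above can be sketched as a small command that prints JSON, which a coding agent can run from a shell and parse. This is a hypothetical illustration: the `issues` command, the `DEMO-1` record, and the in-memory data are all made up, and a real tool would query the error tracker or log store instead.

```python
import argparse
import json
import sys

# Hypothetical in-memory data standing in for a real issue tracker lookup.
ISSUES = {
    "DEMO-1": {"title": "TypeError in checkout handler", "events": 42},
}


def build_parser():
    parser = argparse.ArgumentParser(
        prog="issues", description="Look up an issue for a coding agent."
    )
    sub = parser.add_subparsers(dest="command", required=True)
    show = sub.add_parser("show", help="print one issue as JSON")
    show.add_argument("issue_id")
    return parser


def handle(argv):
    """Parse arguments and return (exit_code, payload) without printing."""
    args = build_parser().parse_args(argv)
    issue = ISSUES.get(args.issue_id)
    if issue is None:
        return 1, {"error": f"unknown issue {args.issue_id}"}
    return 0, {"id": args.issue_id, **issue}


def main(argv=None):
    code, payload = handle(argv)
    print(json.dumps(payload))  # machine-readable output is easy for an agent to consume
    return code


# Only act as a script when arguments are supplied on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main())
```

Saved as a script, it would be invoked as `python issues.py show DEMO-1`. The skill file then only needs to tell the agent that this command exists and what it returns.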
Nevo David @wickedguro
@elie2222 @euboid @getsentry This is interesting. I was thinking of using @openai/codex-sdk (npmjs.com/package/@opena…). But it's CLI-based, so I'm not sure how well I can run it with Docker on a server. This cursor automation stuff sounds good
1
0
1
47
Chris Tate @ctatedev
~100% of my dev is done in sandboxes in the cloud

Highly recommend it:
- Unlimited parallel agent sessions
- My local machine stays safe
- Can work from anywhere
- Can close laptop
- Lap stays cool

Interesting idea to visualize with Kanban
Ryan Carson @ryancarson

100% of dev is going to be done in sandboxes in the cloud, controlled by kanban boards. Trust me, I love my local machine and gorgeous mac apps, but all of it is just a terrible form factor for running a team of agents effectively.

60
37
880
137.7K
Elie Steinbock — oss/acc retweeted
Elie Steinbock — oss/acc
Minimax 2.7 looking strong at $1.20/M output 🤯🤯🤯
1
1
12
1.1K
Jan @janschultecom
@elie2222 Gemini Flash is just an amazing model and the only one that was actually useful for cuttlekit.com generative UI
English
1
0
1
37