
Rohan Arun
@RohanArun
Co-founded a 3D sports startup with the CNET founder. Co-founded the 1st startup approved by OpenAI to sell automation, in '21: https://t.co/vP75GBJwgV AgentsBase: acquired. StarterStory: https://t.co/hivciZ26pi








A bunch of people are starting to dunk on this as if us releasing this were bad. If you're 18 and a brilliant engineer, you aren't born with this. This is 101, basic knowledge. And is it complicated? Yes. You could spend a lifetime learning accounting! Hard to distill to 1 page.

Side-by-side benchmarks beating @OpenAI Codex computer-use using their own models! 👀
Round 1: Clip a YouTube video from our channel and upload it to TikTok
✅ GetSupers.com + GPT 5.4 + our computer-use-kit: Successfully uploads a clip with subtitles and a hook after 16 minutes (and works on iPhone/Android)
❌ Codex + GPT 5.4: Gets the clip format wrong 3 times, asks for human intervention, and finally fails after 21 minutes.
Codex actually does try iPhone mirroring and CapCut, which is very cool and kudos to the team, but it ultimately fails after burning credits.
@sama this is not easy to do, but happy to help you guys integrate our computer-use-kit. 😀
I co-founded the first startup approved by OpenAI to sell GPT-3 for automation in August 2021 (Cheatlayer), months before adept.ai, so I've been working on this for a long time.
We automate Mac/Windows/Linux/Chrome/Android/iPhone + @daytonaio sandboxes and @browserbase cloud browsers out of the box.
We also just shipped automated benchmarks, so we're building the most comprehensive computer-use benchmark for long-running tasks on the planet.

Deepseek V4 Pro vs GPT-5.5 in a gamedev contest (full prompt is below)🏎️
Cost:
Deepseek V4 Pro: $0.07656
GPT-5.5: $0.33063
Output stats:
Deepseek: 34 tok/s · 9m 5s · 18,869 tokens
GPT-5.5: 25 tok/s · 7m 5s · 10,580 tokens
Conclusion: GPT-5.5 clearly made the better karting game. Deepseek V4 Pro was 4.3x cheaper and generated almost 2x more tokens, but the final result was weaker. It struggled with graphics, visual polish, and creative direction, while GPT-5.5 delivered better game quality, better visuals, more creativity, and stronger overall execution.
Even though Deepseek positions itself as a strong model for coding, in this gamedev test it still felt far behind GPT-5.5.
Try the same karting prompt with another AI model and share your result below.
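
For anyone who wants to sanity-check the ratios quoted in the post above, here is a minimal Python sketch. The figures are copied from the post, not measured independently, and the names `runs` and `tokens_per_second` are made up for this illustration.

```python
# Figures copied from the post; nothing here is an independent measurement.
runs = {
    "Deepseek V4 Pro": {"cost_usd": 0.07656, "seconds": 9 * 60 + 5, "tokens": 18_869},
    "GPT-5.5": {"cost_usd": 0.33063, "seconds": 7 * 60 + 5, "tokens": 10_580},
}


def tokens_per_second(run):
    """Average output speed over the whole run (hypothetical helper)."""
    return run["tokens"] / run["seconds"]


# How many times more the GPT-5.5 run cost than the Deepseek run.
cost_ratio = runs["GPT-5.5"]["cost_usd"] / runs["Deepseek V4 Pro"]["cost_usd"]

# How many times more tokens the Deepseek run produced.
token_ratio = runs["Deepseek V4 Pro"]["tokens"] / runs["GPT-5.5"]["tokens"]

print(f"cost ratio:  {cost_ratio:.2f}x")
print(f"token ratio: {token_ratio:.2f}x")
for name, run in runs.items():
    print(f"{name}: {tokens_per_second(run):.1f} tok/s")
```

The cost ratio comes out at about 4.3x, matching the post. The token ratio is closer to 1.8x than 2x, and the per-second figures agree with the post to within rounding.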


🚨Striking new benchmarks for long-running computer-use beating @OpenAI with their own models!
❌ Codex Computer Use: 21 minutes and fails
Computer-use-kit + our native Mac app:
✅ @Alibaba_Qwen Qwen 3.6 Plus: 5m 27s success $0.325/$1.95👀
✅ @deepseek_ai V4: 3m 34s success. $1.74/$3.48 👀
✅ GPT 5.4: 2m 41s success $2.50/$15
✅ GPT 5.5 Pro: 4m 34s success $30/$180
Task: clip the latest video from our YouTube channel and post it to TikTok.
We just published a new realtime upgrade to optimize our computer-use-kit runtimes (API launching soon). We finally solved reliable computer-use for long-running tasks, and we use benchmarks to rigorously test and report which use-cases will work best on which models. The cool thing is our benchmarks clearly show an upper limit to tasks, so you can use open models to run them!
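
A rough Python sketch for comparing the successful runs listed above. The times and dollar figures are copied from the post. The post does not say what the two dollar figures in each pair mean, so the code stores them as quoted without interpreting them, and `Run` and `parse_duration` are hypothetical names for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    model: str
    seconds: int
    first_price: float   # first dollar figure as quoted; meaning not stated in the post
    second_price: float  # second dollar figure as quoted; meaning not stated in the post


def parse_duration(text: str) -> int:
    """Convert a string such as '5m 27s' into seconds."""
    minutes, seconds = text.split()
    return int(minutes.rstrip("m")) * 60 + int(seconds.rstrip("s"))


# Successful runs as reported in the post.
runs = [
    Run("Qwen 3.6 Plus", parse_duration("5m 27s"), 0.325, 1.95),
    Run("DeepSeek V4", parse_duration("3m 34s"), 1.74, 3.48),
    Run("GPT 5.4", parse_duration("2m 41s"), 2.50, 15.0),
    Run("GPT 5.5 Pro", parse_duration("4m 34s"), 30.0, 180.0),
]

fastest = min(runs, key=lambda r: r.seconds)
lowest_listed_price = min(runs, key=lambda r: (r.first_price, r.second_price))

# Print the runs from fastest to slowest.
for run in sorted(runs, key=lambda r: r.seconds):
    print(f"{run.model:<14} {run.seconds:>4}s  ${run.first_price}/${run.second_price}")

print("fastest:", fastest.model)
print("lowest listed price:", lowest_listed_price.model)
```

On the reported numbers, the fastest run and the lowest-priced run are different models, which is the trade-off the post is pointing at.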

@murr: 🥇@Super_Powers_AI: A combination of free, open-source models and parallelizing agents is very powerful. 🥈@ArgonathAI: Removing bottlenecks in defence is important. 🥉@Yorby_ai: Distribution is the most important thing to solve for companies.

Today we're announcing two product changes for organizing communities on X: 1. XChat now supports joinable links for groupchats. Create a public link & share direct to Timeline. With support for 350 members per chat (and growing), Groupchat Links are the fastest way to bring people together on X. 2. Due to declining usage, we're deprecating X Communities on May 6. To migrate your Community's members, pin your groupchat link so people can join it over the next 2 weeks. This is part of our broader effort to simplify the experience on X. Make no mistake: we are investing heavily in niche communities with the launch of Custom Timelines—and much more to come.

@mikemarg_: 🥇@Super_Powers_AI: There is no substitute for strong open-source traction. 🥈@askgrapple: Likes the focus on solving analytics in the age of AI. 🥉@askgrapple: Has a very clear ICP, and the buying process in the federal government is stuck in the past.











