Rohan Arun (@RohanArun) - Twitter Profile

Pinned Tweet

Side-by-side benchmarks beating @OpenAI Codex computer-use using their own models! 👀
Round 1: Clip a YouTube video from our channel and upload it to TikTok
✅ GetSupers.com + GPT 5.4 + our computer-use-kit: Successfully uploads a clip with subtitles and a hook after 16 minutes (and works on iPhone/Android)
❌ Codex + GPT 5.4: Gets the clip format wrong 3 times, asks for human intervention, and finally fails after 21 minutes.
Codex actually does try iPhone mirroring and CapCut, which is very cool, and kudos to the team, but it ultimately fails after burning credits.
@sama this is not easy to do, but happy to help you guys integrate our computer-use-kit. 😀
I co-founded the first startup approved by OpenAI to sell GPT-3 for automation in August 2021 (Cheatlayer), months before adept.ai, so I've been working on this for a long time.
We automate Mac/Windows/Linux/Chrome/Android/iPhone + @daytonaio sandboxes and @browserbase cloud browsers out of the box.
We also just shipped automated benchmarks, so we're building the most comprehensive computer-use benchmark for long-running tasks on the planet.

47

35

99

106.8K

Rohan Arun@RohanArun·11h

@AlexanderTw33ts I'll pay $0.10 on the dollar; it might run 5 benchmarks for us 😂

1

0

5

373

Alex@AlexanderTw33ts·12h

guys what should I do with this?

70

0

90

10.7K

Rohan Arun@RohanArun·14h

🚨Striking new benchmarks for long-running computer-use beating @OpenAI with their own models! 3 random people who follow me, comment, and quote-repost the tweet below win $100 in 48 hours! X will be removing communities soon, so follow me anyway to stay tuned for more launches and promos.

Rohan Arun@RohanArun

🚨Striking new benchmarks for long-running computer-use beating @OpenAI with their own models!
❌ Codex Computer Use: 21 minutes and fails
Computer-use-kit + our native Mac app:
✅ @Alibaba_Qwen Qwen 3.6 Plus: 5m 27s success $0.325/$1.95 👀
✅ @deepseek_ai V4: 3m 34s success $1.74/$3.48 👀
✅ GPT 5.4: 2m 41s success $2.50/$15
✅ GPT 5.5 Pro: 4m 34s success $30/$180
Task: clip the latest video from our YouTube channel and post it to TikTok.
We just published a new realtime upgrade to optimize our computer-use-kit runtimes (API launching soon). We finally solved reliable computer-use for long-running tasks, and we use benchmarks to rigorously test and report which use-cases will work best on which models. The cool thing is our benchmarks clearly show an upper limit to tasks, so you can use open models to run them!

7

6

23

765

Rohan Arun@RohanArun·14h

🚨Striking new benchmarks for long-running computer-use beating @OpenAI with their own models!
❌ Codex Computer Use: 21 minutes and fails
Computer-use-kit + our native Mac app:
✅ @Alibaba_Qwen Qwen 3.6 Plus: 5m 27s success $0.325/$1.95 👀
✅ @deepseek_ai V4: 3m 34s success $1.74/$3.48 👀
✅ GPT 5.4: 2m 41s success $2.50/$15
✅ GPT 5.5 Pro: 4m 34s success $30/$180
Task: clip the latest video from our YouTube channel and post it to TikTok.
We just published a new realtime upgrade to optimize our computer-use-kit runtimes (API launching soon). We finally solved reliable computer-use for long-running tasks, and we use benchmarks to rigorously test and report which use-cases will work best on which models. The cool thing is our benchmarks clearly show an upper limit to tasks, so you can use open models to run them!

Rohan Arun@RohanArun

Side-by-side benchmarks beating @OpenAI Codex computer-use using their own models! 👀
Round 1: Clip a YouTube video from our channel and upload it to TikTok
✅ GetSupers.com + GPT 5.4 + our computer-use-kit: Successfully uploads a clip with subtitles and a hook after 16 minutes (and works on iPhone/Android)
❌ Codex + GPT 5.4: Gets the clip format wrong 3 times, asks for human intervention, and finally fails after 21 minutes.
Codex actually does try iPhone mirroring and CapCut, which is very cool, and kudos to the team, but it ultimately fails after burning credits.
@sama this is not easy to do, but happy to help you guys integrate our computer-use-kit. 😀
I co-founded the first startup approved by OpenAI to sell GPT-3 for automation in August 2021 (Cheatlayer), months before adept.ai, so I've been working on this for a long time.
We automate Mac/Windows/Linux/Chrome/Android/iPhone + @daytonaio sandboxes and @browserbase cloud browsers out of the box.
We also just shipped automated benchmarks, so we're building the most comprehensive computer-use benchmark for long-running tasks on the planet.

9

10

24

1.7K

Rohan Arun@RohanArun·17h

Just 5 days ago this video went viral for beating @OpenAI computer-use using their own models in 16 minutes on long-running tasks. @deepseek_ai V4 now runs the same benchmark with our latest realtime computer-use breakthrough in ~3 minutes 34 seconds! 🤯 More striking benchmarks coming soon...

Rohan Arun@RohanArun

Side-by-side benchmarks beating @OpenAI Codex computer-use using their own models! 👀
Round 1: Clip a YouTube video from our channel and upload it to TikTok
✅ GetSupers.com + GPT 5.4 + our computer-use-kit: Successfully uploads a clip with subtitles and a hook after 16 minutes (and works on iPhone/Android)
❌ Codex + GPT 5.4: Gets the clip format wrong 3 times, asks for human intervention, and finally fails after 21 minutes.
Codex actually does try iPhone mirroring and CapCut, which is very cool, and kudos to the team, but it ultimately fails after burning credits.
@sama this is not easy to do, but happy to help you guys integrate our computer-use-kit. 😀
I co-founded the first startup approved by OpenAI to sell GPT-3 for automation in August 2021 (Cheatlayer), months before adept.ai, so I've been working on this for a long time.
We automate Mac/Windows/Linux/Chrome/Android/iPhone + @daytonaio sandboxes and @browserbase cloud browsers out of the box.
We also just shipped automated benchmarks, so we're building the most comprehensive computer-use benchmark for long-running tasks on the planet.

4

3

19

493

Rohan Arun@RohanArun·1d

That's what I'm talking about! Won #1 twice in the Launch pitch competition this week!

LAUNCH@LAUNCH

@murr: 🥇@Super_Powers_AI: A combination of free, open-source models and parallelizing agents is very powerful. 🥈@ArgonathAI: Removing bottlenecks in defence is important. 🥉@Yorby_ai: Distribution is the most important thing to solve for companies.

13

3

35

924

Rohan Arun@RohanArun·1d

Twitter is taking down communities! The other way to basically emulate a community is for everyone to follow me and subscribe to notifications, and I can pin posts on my profile: twitter.com/@RohanArun

Nikita Bier@nikitabier

Today we're announcing two product changes for organizing communities on X: 1. XChat now supports joinable links for groupchats. Create a public link & share direct to Timeline. With support for 350 members per chat (and growing), Groupchat Links are the fastest way to bring people together on X. 2. Due to declining usage, we're deprecating X Communities on May 6. To migrate your Community's members, pin your groupchat link so people can join it over the next 2 weeks. This is part of our broader effort to simplify the experience on X. Make no mistake: we are investing heavily in niche communities with the launch of Custom Timelines—and much more to come.

5

0

17

395

Rohan Arun@RohanArun·1d

LFG that's what I'm talking about!

LAUNCH@LAUNCH

@mikemarg_: 🥇@Super_Powers_AI: There is no substitute for strong open-source traction. 🥈@askgrapple: Likes the focus on solving analytics in the age of AI. 🥉@askgrapple: Has a very clear ICP, and the buying process in the federal government is stuck in the past.

8

2

23

343

Rohan Arun@RohanArun·1d

@AlliySalaudeen You can see in the post that I said I would DM the winners and I have already done that

0

1

26

Salaudeen Alliy Adeniran@AlliySalaudeen·1d

@RohanArun Hi Rohan, what great work you’ve put in so far. Must commend you for that. Just wanna ask if you’ve picked the winners for the contest that earned 83k views and prolly sent a DM as you said? @RohanArun

1

0

26

Rohan Arun@RohanArun·3d

Great job guys! We hit 83k views and got on the radar of several famous accounts. I'll DM the winners tonight. We got on the radar of Alibaba: I got credits from Alibaba to test their new Qwen model, and they'll share it when I finish benchmarks showing them beating OpenAI.

Rohan Arun@RohanArun

Side-by-side benchmarks beating @OpenAI Codex computer-use using their own models! 👀
Round 1: Clip a YouTube video from our channel and upload it to TikTok
✅ GetSupers.com + GPT 5.4 + our computer-use-kit: Successfully uploads a clip with subtitles and a hook after 16 minutes (and works on iPhone/Android)
❌ Codex + GPT 5.4: Gets the clip format wrong 3 times, asks for human intervention, and finally fails after 21 minutes.
Codex actually does try iPhone mirroring and CapCut, which is very cool, and kudos to the team, but it ultimately fails after burning credits.
@sama this is not easy to do, but happy to help you guys integrate our computer-use-kit. 😀
I co-founded the first startup approved by OpenAI to sell GPT-3 for automation in August 2021 (Cheatlayer), months before adept.ai, so I've been working on this for a long time.
We automate Mac/Windows/Linux/Chrome/Android/iPhone + @daytonaio sandboxes and @browserbase cloud browsers out of the box.
We also just shipped automated benchmarks, so we're building the most comprehensive computer-use benchmark for long-running tasks on the planet.

21

6

37

2.8K

Rohan Arun@RohanArun·1d

@Grandyt55 You can see in the post that I said I would DM the winners and I have already done that

0

1

2

34

Grandyt@Grandyt55·1d

@RohanArun @RohanArun what’s the update on winners announcement, ser?

1

0

22

Rohan Arun@RohanArun·2d

We published our "breakthrough" that makes open models way more reliable than GPT 5.4 in the latest Mac client. It's working very well with Qwen 3.6 Plus. Try it on the TikTok agent 😁

8

1

24

718

Rohan Arun@RohanArun·2d

Switch to Qwen 3.6 Plus! It's very good and free for a limited time since we partnered with their team. It actually beats GPT 5.4 using our computer-use-kit 👀

Rohan Arun@RohanArun

Our initial @Alibaba_Qwen 3.6 Plus benchmarks on long-running computer-use are very good! It beats even @OpenAI GPT 5.4 using our computer-use-kit 👀

1

0

16

689

Rohan Arun@RohanArun·2d

Our initial @Alibaba_Qwen 3.6 Plus benchmarks on long-running computer-use are very good! It beats even @OpenAI GPT 5.4 using our computer-use-kit 👀

Qwen@Alibaba_Qwen

🚀 Meet Qwen3.6-27B, our latest dense, open-source model, packing flagship-level coding power! Yes, 27B, and Qwen3.6-27B punches way above its weight. 👇 What's new: 🧠 Outstanding agentic coding — surpasses Qwen3.5-397B-A17B across all major coding benchmarks 💡 Strong reasoning across text & multimodal tasks 🔄 Supports thinking & non-thinking modes ✅ Apache 2.0 — fully open, fully yours Smaller model. Bigger results. Community's favorite. ❤️ We can't wait to see what you build with Qwen3.6-27B! 👀 🔗👇 Blog: qwen.ai/blog?id=qwen3.… Qwen Studio: chat.qwen.ai/?models=qwen3.… Github: github.com/QwenLM/Qwen3.6 Hugging Face: huggingface.co/Qwen/Qwen3.6-2… huggingface.co/Qwen/Qwen3.6-2… ModelScope: modelscope.cn/models/Qwen/Qw… modelscope.cn/models/Qwen/Qw…

1

0

15

1K

Rohan Arun@RohanArun·2d

@Alibaba_Qwen Our initial benchmarks on long-running computer-use are very good! It beats even @OpenAI GPT 5.4 👀

1

14

982

Qwen@Alibaba_Qwen·2d

🚀 Meet Qwen3.6-27B, our latest dense, open-source model, packing flagship-level coding power! Yes, 27B, and Qwen3.6-27B punches way above its weight. 👇 What's new: 🧠 Outstanding agentic coding — surpasses Qwen3.5-397B-A17B across all major coding benchmarks 💡 Strong reasoning across text & multimodal tasks 🔄 Supports thinking & non-thinking modes ✅ Apache 2.0 — fully open, fully yours Smaller model. Bigger results. Community's favorite. ❤️ We can't wait to see what you build with Qwen3.6-27B! 👀 🔗👇 Blog: qwen.ai/blog?id=qwen3.… Qwen Studio: chat.qwen.ai/?models=qwen3.… Github: github.com/QwenLM/Qwen3.6 Hugging Face: huggingface.co/Qwen/Qwen3.6-2… huggingface.co/Qwen/Qwen3.6-2… ModelScope: modelscope.cn/models/Qwen/Qw… modelscope.cn/models/Qwen/Qw…

502

1.7K

12.4K

3.5M

Rohan Arun@RohanArun·2d

@iScienceLuvr Benchmarks are expensive, but if you build them and show Qwen winning, you may be able to shame OpenAI into giving you credits to run them.

0

1

7

153

Tanishq Mathew Abraham, Ph.D.@iScienceLuvr·2d

Any medical benchmarks? 🥺

Qwen@Alibaba_Qwen

🚀 Meet Qwen3.6-27B, our latest dense, open-source model, packing flagship-level coding power! Yes, 27B, and Qwen3.6-27B punches way above its weight. 👇 What's new: 🧠 Outstanding agentic coding — surpasses Qwen3.5-397B-A17B across all major coding benchmarks 💡 Strong reasoning across text & multimodal tasks 🔄 Supports thinking & non-thinking modes ✅ Apache 2.0 — fully open, fully yours Smaller model. Bigger results. Community's favorite. ❤️ We can't wait to see what you build with Qwen3.6-27B! 👀 🔗👇 Blog: qwen.ai/blog?id=qwen3.… Qwen Studio: chat.qwen.ai/?models=qwen3.… Github: github.com/QwenLM/Qwen3.6 Hugging Face: huggingface.co/Qwen/Qwen3.6-2… huggingface.co/Qwen/Qwen3.6-2… ModelScope: modelscope.cn/models/Qwen/Qw… modelscope.cn/models/Qwen/Qw…

10

2

59

9.4K

Rohan Arun@RohanArun·2d

We're coordinating a big launch next week with multiple influencers! To help, you can:
1) Help us build benchmarks for the bounties: x.com/RohanArun/stat…
2) Plug in an Android device with our APK and switch it to "monetize" mode: app.getsupers.com/docs#android-p… You can earn up to $1/device/hour, and I'll start activating devices today on our global network.
3) Test the new Qwen models in our native Mac client, which automates benchmarks, because Alibaba will share it after we record enough benchmarks. Free for a limited time. It works really well for an open model and beats GPT 5.4 in our benchmarks using our computer-use-kit!

Qwen@Alibaba_Qwen

🚀 Meet Qwen3.6-27B, our latest dense, open-source model, packing flagship-level coding power! Yes, 27B, and Qwen3.6-27B punches way above its weight. 👇 What's new: 🧠 Outstanding agentic coding — surpasses Qwen3.5-397B-A17B across all major coding benchmarks 💡 Strong reasoning across text & multimodal tasks 🔄 Supports thinking & non-thinking modes ✅ Apache 2.0 — fully open, fully yours Smaller model. Bigger results. Community's favorite. ❤️ We can't wait to see what you build with Qwen3.6-27B! 👀 🔗👇 Blog: qwen.ai/blog?id=qwen3.… Qwen Studio: chat.qwen.ai/?models=qwen3.… Github: github.com/QwenLM/Qwen3.6 Hugging Face: huggingface.co/Qwen/Qwen3.6-2… huggingface.co/Qwen/Qwen3.6-2… ModelScope: modelscope.cn/models/Qwen/Qw… modelscope.cn/models/Qwen/Qw…

4

2

25

1.2K
