Cua

662 posts


@trycua

Open-source infrastructure for Computer-Use Agents // YC X25

San Francisco, CA · Joined January 2025
1.8K Following · 4.9K Followers
Pinned Tweet
Cua @trycua
We've been using Cua-Bench internally—and with customers—for the last few months to evaluate every computer-use agent we deploy. Today it's open-source. 15 public tasks, 40 variations, adapters for OSWorld and Windows Agent Arena. One CLI, self-hostable.
17 replies · 11 reposts · 80 likes · 86.3K views
Cua retweeted
Francesco @francedot
Excited to see our research partners at Snorkel AI launch Open Benchmarks Grants - $3M to close the evaluation gap in agentic AI. This is one of the biggest bottlenecks in the space and it's finally getting funded properly. Apply → benchmarks.snorkel.ai/apply
vincent sunn chen @vincentsunnchen

Our ability to measure AI has been outpaced by our ability to develop it, and this evaluation gap is one of the most important problems in AI. Today we're launching Open Benchmarks Grants — a $3M commitment to fund open benchmarks for frontier AI and close the evaluation gap. Grateful to be partnering with @HuggingFace, @togethercompute, @PrimeIntellect, Factory HQ, @harborframework, and @PyTorch to back the teams building these benchmarks! 🚀

0 replies · 1 repost · 9 likes · 2.6K views
Cua @trycua
Human-agent-computer interaction has been waiting for the right primitive. Today we ship it. Tune in 6PM PST - live from ClawCon SF.
3 replies · 2 reposts · 17 likes · 1.8K views
Cua retweeted
GitHub Projects Community @GithubProjects
Make your agents better at computers. Cua-Bench evaluates computer-use agents on real tasks across desktop and mobile platforms.
3 replies · 6 reposts · 46 likes · 6.8K views
Cua retweeted
Erik Thorelli @esthor
I keep saying this... 2026 is going to be very interesting for computer-use agents. a lot of folks are building with these frameworks right now. that means we're going to see the products they're building launch in the coming months... 👀
Francesco @francedot

Trending on GitHub again in Python, right next to our friends at @Prince_Canuma - thank you for 12k ⭐ - your support means everything. Chef @trycua cuala will keep cooking releases this week👨‍🍳🐨

0 replies · 4 reposts · 10 likes · 1.8K views
vas @vasuman
Guy who has never shipped a single thing with Claude Code is now telling everyone they need to buy 6 Mac Minis to run Clawd
83 replies · 56 reposts · 1.5K likes · 85K views
Cua retweeted
Sarina Li @sarinajnli
This is why I encourage anyone to try a startup, only @francedot could push me to try something trending on X/Twt after he saw it 2 h ago and make it production ready + actually ship it! Result? We did 130k+ impressions on our blog post and video and we open sourced it and we continue to work on computer use :) @trycua gratitude!!
Francesco @francedot

A bunch of you asked about our Remotion setup after the article. It's now open-source: github.com/trycua/launchp…
• Video templates for product launches
• Shared animation components
• Works with Claude Code + Remotion skills
• How we made the Cua-Bench video in 2 hours

0 replies · 1 repost · 13 likes · 4.3K views
Cua retweeted
Chris Barber @chrisbarber
"recording a human demonstration via noVNC, turning it into a Claude skill, then a CUA agent replaying it on the same spreadsheets task from the ShowUI-Aloha paper"
Cua @trycua

x.com/i/article/2014…

0 replies · 2 reposts · 16 likes · 3.9K views
Cua retweeted
Kyle Wong @ewveggies
Computer use agents that can learn, remember, and reapply your workflows on the fly! Cua has been shipping awesome stuff this entire week, check it out!
Cua @trycua

x.com/i/article/2014…

0 replies · 3 reposts · 14 likes · 2.2K views
Cua retweeted
Snorkel AI @SnorkelAI
Congrats to @trycua on open-sourcing cua-bench! We're collaborating on task design & data curation for GUI workflows - bringing the same systematic evaluation approach from Terminal-Bench to computer-use agents.
15 native tasks / 40 variations + OSWorld & Windows Agent Arena.
Early cua-bench-basic results:
-> Opus 4.5: 83.8% (57/68)
-> Sonnet 4.5: 79.4% (54/68)
-> Haiku 4.5: 79.4% (54/68)
These are basic tasks - stay tuned for much more complex, realistic workflows 👀
Cross-platform. MIT licensed: cua.ai/docs/cuabench
Cua @trycua

We've been using Cua-Bench internally—and with customers—for the last few months to evaluate every computer-use agent we deploy. Today it's open-source. 15 public tasks, 40 variations, adapters for OSWorld and Windows Agent Arena. One CLI, self-hostable.

0 replies · 2 reposts · 12 likes · 1.8K views