Christopher Settles

919 posts

Christopher Settles

@never_settles_

Building RL gyms @refresh_dev | Prev AI @Uber, CS @UofIllinois | believer in community

San Francisco · Joined February 2022
2K Following · 2.2K Followers
Christopher Settles retweeted
Shuyan Zhou @shuyanzh36
In 2023, WebArena took 7 grad students more than 6 months to build just 5 environments with 812 variable browser-use tasks. Now, it takes under 10 hours and less than $100 per environment, with easy support for parallel generation. Excited to introduce WebArena-Infinity: a scalable approach for automatically generating high-authenticity, high-complexity browser environments with verifiable tasks suitable for RL training and benchmarking. Even strong open-source models that already achieve 60%+ success rates on WebArena and OSWorld complete fewer than 50% of tasks here. Project page: webarena.dev/webarena-infin… Repo: github.com/web-arena-x/we… 🧵 (1/n)
12
46
317
38.8K
Christopher Settles retweeted
Arcee.ai @arcee_ai
Here are a few of our favorite shots from our recent out-of-home campaign. Loving how the Arcee teal cuts right through the noise of downtown SF and the traffic on the 101 + a bonus shot from the DC metro.
1
6
26
1.7K
Christopher Settles retweeted
Luke Melas-Kyriazi @lukemelas
Our first frontier-level model! It's the result of our first continued pretraining run as well as further scaling RL. Very excited to hear how people like it! Feel free to send me feedback and we'll incorporate it into future models.
Cursor @cursor_ai

Composer 2 is now available in Cursor.

7
3
86
4.9K
Christopher Settles retweeted
Tzafon @tzafon_company
We're open sourcing Northstar CUA Fast, a frontier 4B open-source Computer Use Action (CUA) model, built for accuracy and long-horizon action planning.
3
7
29
1.7K
Arlan @arlanr
If you do a work trial or work at @nozomioai, the least you get is:
- unlimited doordash and steaks
- unlimited Hinge and Tinder
- Airbnb
- unlimited access to white monster and diet coke
- $5,000 worth of claude code every week
- handsome founder
46
5
264
17.9K
Christopher Settles retweeted
RunRL @runrl_com
[Image-only post]
0
1
6
241
Christopher Settles retweeted
Ishaan Sehgal @ishaansehgal
every dev wants to code from anywhere but SSH is a pain. cloud sandboxes don't know your environment. remote control apps die when the laptop closes. so we mapped every approach 🔗 omnara.com/blog/mobile-co…
5
5
15
820
Christopher Settles @never_settles_
It's finally hot in SF because Claude has been running all the GPUs overclocked
1
0
7
202
Christopher Settles retweeted
Xiangyi Li @xdotli
Room ready for the largest Agent Skills hackathon at @fdotinc Sat March 7. 🌁
We added the following speakers: @underyx from Anthropic @FurqanR founder of @fdotinc @thirdweb @nebulagg @ryanmart3n creator of Harbor and Terminal Bench @xdotli yours truly who made SkillsBench as well.
We have two tracks:
- Make skills in economically valuable domains where models are less trained on, and help the model. This can be OpenClaw skills for marketing, writing compliance, excels, etc.
- Make skills continual learning pipelines.
Join us to make agent skills reliable and self-evolving! @belindmo @roeybc @turboblitzzz @ruslanjabari @never_settles_ @fdotinc
4
5
50
3.9K
Christopher Settles retweeted
Daanish Khazi @bertgodel
We’re announcing Kos-1 Lite, a medical model that achieves SOTA on HealthBench Hard at 46.6%. As a medium-sized language model (~100B), it achieves these results at a fraction of the serving cost of frontier trillion-parameter models.
40
59
319
25K
Shubhan Dua @defi_dua
I’m pleased to announce that @AnswersAi_ai has been acquired and our Team SF is joining @quizlet.
We started this journey 3 years ago as juniors at Cal and UCLA as a hackathon project and built our way through senior year, three offices and more. I’m proud of our team that got us 2M users, 1M followers, 2 billion views and $3.5M+ across our lifetime.
We’re grateful for our investors, supporters and team that took a bet on us, starting with my co-founders. Thank you to Kurt Beidler, Ismail Orujov and the entire Quizlet team for taking a bet on us. We continue our journey there alongside the incredible @satapathy_dev_, @DanielBerezhnoy and @angeldzzz23.
Always day one as we continue making a dent in the universe.
107
19
371
110.1K
Christopher Settles @never_settles_
Claude Code took off partly because its core feature was dead clear (Agent Mode). Cursor led with next edit prediction for a while, even though its Agent Mode was just as good as Claude Code, so lots of people formed an opinion before ever trying it. Any parallels in CUA apps?
3
0
4
379
Christopher Settles retweeted
Sam Altman @sama
We have raised a $110 billion round of funding from Amazon, NVIDIA, and SoftBank. We are grateful for the support from our partners, and have a lot of work to do to bring you the tools you deserve.
4.2K
2.6K
39.5K
8.9M