Seth Karten

3.1K posts

@sethkarten

Agents….PokeChamp, PokeAgent, LLM Economist | CS PhD @Princeton | Former CMU Waymo

🐯 Joined October 2012
615 Following · 1.6K Followers
Seth Karten
Seth Karten@sethkarten·
@emollick Train LLMs to manage resources in Civ and they will be better at managing gpus for ai science
English
1
0
0
120
Benjamin Todd
Benjamin Todd@ben_j_todd·
Opus 4.6 is hugely better at Pokemon: • Opus 4.0 took 1,000 hours to get halfway through • Opus 4.5 could almost finish in 1,000 hours • Opus 4.6 was another 10x faster!
English
28
50
745
108.2K
ikka
ikka@Shahules786·
(1/n) Today, we’re releasing Cloning Bench. Labs are paying 6-7 figures for clones of web apps to do web/computer use-based RL training. At @VibrantLabsAI, our fundamental goal is to automate the creation of RL environments. For web/CUAs, one way that we do that is by using coding agents and a custom harness to automatically generate the simulation environment. We tested Codex, Gemini, Claude Code, and GLM using our harness on their ability to recreate a Slack workspace and benchmarked their performance. We have published our methods, results, and analysis here today: vibrantlabs.com/blog/cloning-b…
English
7
12
142
11.9K
Seth Karten
Seth Karten@sethkarten·
@yasei_no_otoko We can add support for other languages for the rpg. We will just need to add a checksum per language. PRs welcome!
English
0
0
0
97
Seth Karten
Seth Karten@sethkarten·
@lateinteraction Thanks, Omar! I’ve been testing a version of GEPA for PokeAgent recently. More soon!
English
0
1
5
393
Seth Karten
Seth Karten@sethkarten·
reddit.com/r/LocalLLaMA/c… With quantization and CPU offloading, it may be possible, since it is an MoE with 3B active parameters. I def think it is worth exploring with the largest model that works on your GPU at the very least. Coding agents will only become more prevalent and require smaller models for the same performance
English
1
0
1
47
sacha🥝
sacha🥝@alexUnder_sky·
@sethkarten I don't think I can run anything locally, as I've got like 16 gigs of ram. Maybe through kaggle notebooks or smth.
English
1
0
0
38
sacha🥝
sacha🥝@alexUnder_sky·
@sethkarten, a quick question: have you actually learned Rust (or can you debug the code you've got), or do you fully rely on codex/claude as your SWE executors?
English
1
0
0
106
Seth Karten
Seth Karten@sethkarten·
@alexUnder_sky Can you try Qwen3.5-35B-A3B quantized with OpenCode? I haven't tried open source models yet but it could be interesting depending on your local gpu.
English
1
0
1
61
sacha🥝
sacha🥝@alexUnder_sky·
@sethkarten Thank you so much. I don't have coding agents, but it would be fun to understand at least something
English
1
0
0
21
Seth Karten
Seth Karten@sethkarten·
@haoailab Very cool! Do you have plans to open-source or is this a start-up product launch?
English
0
0
7
853
Hao AI Lab
Hao AI Lab@haoailab·
(1/N) We're launching Dreamverse. Most AI video models take minutes to generate a 5 s 1080p clip. In 4.5 seconds, we can generate 30 s 1080p clips on a single GPU. Our videos generate faster than you can watch them: stop waiting on prompts and start directing scenes live. 🕹️Demo: dreamverse.fastvideo.org 📑 Blog: haoailab.com/blogs/dreamver… Welcome to the era of vibe-directing 👇
English
39
56
540
76K
sacha🥝
sacha🥝@alexUnder_sky·
@sethkarten I just looked at your work and came across the concepts of Rust, and they're quite different from cpp. Maybe with quite a bit of familiarity I could debug the Rust code, but it seems really heroic on your part to do it in Rust. However, everyone does everything in Rust, idk...
English
2
0
0
33
h
h@harlanv11·
@sethkarten This is like the coolest intersection of all my favorite things. Where might someone hypothetically contribute to something like this in the future
English
1
0
1
498
Seth Karten
Seth Karten@sethkarten·
Economic alignment is a difficult problem to address since you must balance the individual’s autonomy with the collective’s welfare and growth. Even a slightly misaligned objective can be disastrous and unstable. I’m staying followed :) I’ll have some more fleshed-out thoughts soon
English
0
0
0
1.8K
Peter McCrory
Peter McCrory@PeterMcCrory·
I want to share a bit more about my vision for the Economic Research team at Anthropic in the coming years. This is a forward-looking vision. Some pieces we’ve yet to develop. Aspects of this work will surely change. Consider joining the effort. 1/6 docs.google.com/document/d/1OM…
English
22
114
1.2K
194.6K
Jack Clark
Jack Clark@jackclarkSF·
I'm scaling the economic research function here @AnthropicAI to meet the challenge of powerful AI. This team today produces the best data in the industry via the Anthropic Economic Index + recent work on job exposure to AI. We have many very ambitious plans in the works. Join!
Peter McCrory@PeterMcCrory

I want to share a bit more about my vision for the Economic Research team at Anthropic in the coming years. This is a forward-looking vision. Some pieces we’ve yet to develop. Aspects of this work will surely change. Consider joining the effort. 1/6 docs.google.com/document/d/1OM…

English
28
36
367
48.4K