Christina @ATX

18.5K posts

@truffle

Veteran game developer turned AI researcher. Creator of League of Legends: Wild Rift. Lead Designer Mass Effect 1-3. Co-Founder of Elodie Games.

Austin, TX · Joined June 2007
587 Following · 10K Followers
Christina @ATX (@truffle)
@MegaCrit I think pressing x to view your relics while in the rewards screen may be the trigger for not being able to navigate to card rewards
0 replies · 0 reposts · 0 likes · 14 views
Christina @ATX (@truffle)
@MegaCrit a couple of bugs I have noticed with the controller. Sometimes going up/down during a battle I can't select units to highlight them; it fixes itself and only lasts a turn. Possibly related to drawing cards. Sometimes when browsing rewards I can't select a card reward at all and need to quit and restart the game, at which point it is fixed. So minor, game is great!
1 reply · 0 reposts · 1 like · 63 views
Christina @ATX (@truffle)
@thdxr I don't use opencode at all but I still like your tweets!
0 replies · 0 reposts · 0 likes · 42 views
dax (@thdxr)
do you use the opencode sidebar
84 replies · 0 reposts · 80 likes · 24.7K views
Christina @ATX reposted
Reads with Ravi (@readswithravi)
Peak productivity advice from James Clear: “You are not your grand plans. You are your daily patterns.”
39 replies · 790 reposts · 6.2K likes · 90.9K views
Ava (@noampomsky)
friend is in the stage of claude psychosis where he asks claude to send him newspapers about what claude is doing for him
256 replies · 449 reposts · 8.6K likes · 394.9K views
Christina @ATX (@truffle)
@iruletheworldmo I find it hard to believe anyone thinks closed source can win in an era where quality coding has become so inexpensive
0 replies · 0 reposts · 0 likes · 27 views
🍓🍓🍓 (@iruletheworldmo)
do people really think open source has a chance. this guy's argument feels like, give me money and i can build a netflix competitor from my living room. the only way ‘open source’ wins is if openai or dario do it. and they won’t. because if they did. composer 3 would be agi.
0xSero (@0xSero)
x.com/i/article/2034…
87 replies · 3 reposts · 120 likes · 37.5K views
PMtheBuilder (@PMThebuilder)
"The agents do not listen to my instructions" is the PM problem hiding in plain sight. Every engineering fix in this thread — lint rules, review agents, second passes — is downstream of the real gap: nobody wrote a spec precise enough that the agent couldn't misinterpret it. We're learning that vague requirements don't just slow down humans. They actively mislead agents.
3 replies · 0 reposts · 5 likes · 1.8K views
Andrej Karpathy (@karpathy)
Thank you Sarah, my pleasure to come on the pod! And happy to do some more Q&A in the replies.
sarah guo (@saranormous)

Caught up with @karpathy for a new @NoPriorsPod: on the phase shift in engineering, AI psychosis, claws, AutoResearch, the opportunity for a SETI-at-Home-like movement in AI, the model landscape, and second order effects
02:55 - What Capability Limits Remain?
06:15 - What Mastery of Coding Agents Looks Like
11:16 - Second Order Effects of Coding Agents
15:51 - Why AutoResearch
22:45 - Relevant Skills in the AI Era
28:25 - Model Speciation
32:30 - Collaboration Surfaces for Humans and AI
37:28 - Analysis of Jobs Market Data
48:25 - Open vs. Closed Source Models
53:51 - Autonomous Robotics and Atoms
1:00:59 - MicroGPT and Agentic Education
1:05:40 - End Thoughts
268 replies · 368 reposts · 5.1K likes · 606.1K views
ᴅᴀɴɪᴇʟ ᴍɪᴇssʟᴇʀ 🛡️
MCP is confusing. Some think it died when CLIs took over, but you can also view them as complementary: MCP is how you stay fresh on what's available in the tool/service and how to use it, and the CLI is how you actually execute.
11 replies · 4 reposts · 38 likes · 4.9K views
dax (@thdxr)
i still don't get why we need to push code up to get an LLM review via awkward github ui hacks. opencode has /review, which can also do things like run your code to check things, but a full-time team focused on this would do it better. i just don't like the workflow they offer
85 replies · 9 reposts · 907 likes · 58.5K views
Louis Arge (@louisvarge)
i made a thing where now any Claude Code can send messages to any other Claude Code on my machine. they can ask clarifying questions about work, or become friends
246 replies · 227 reposts · 3.9K likes · 644K views
Christina @ATX (@truffle)
@noahzweben It would be nice if you could configure Claude's CWD; currently it is the first directory in the files array. Often the files the user wants to search and browse are not the same files we want Claude to have visibility into.
0 replies · 0 reposts · 0 likes · 466 views
Noah Zweben (@noahzweben)
A few 🔥 Claude Code VSCode drops 1. Remote Control in VSCode
43 replies · 23 reposts · 585 likes · 99.3K views
dax (@thdxr)
we've been experimenting with getting rid of the bash tool. agents can write js fine, which can do what bash can (though with some gaps around things like git) and is more cross-platform. and then we could run that in this
Rivet (@rivet_dev)

Introducing the Secure Exec SDK
Secure Node.js execution without a sandbox
⚡ 17.9 ms coldstart, 3.4 MB mem, 56x cheaper
📦 Just a library – supports Node.js, Bun, & browsers
🔐 Powered by the same tech as Cloudflare Workers
$ npm install secure-exec
89 replies · 26 reposts · 1K likes · 209.7K views
Christina @ATX reposted
Andrej Karpathy (@karpathy)
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This is the bread and butter of what I have done daily for two decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things:

- It noticed an oversight that my parameterless QKnorm didn't have a scaler multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
968 replies · 2.1K reposts · 19.4K likes · 3.5M views
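The workflow described above (propose a change, run a cheap evaluation, keep it only if the validation metric improves) can be reduced to a greedy loop. This is an illustrative sketch, not the actual autoresearch harness; `propose` and `evaluate` are stand-in callables for the LLM proposer and a short training run.

```python
def autoresearch(baseline_cfg, propose, evaluate, rounds=20):
    """Greedy propose-evaluate loop: accept any config change that
    lowers the validation metric, then propose from the new best."""
    best_cfg = dict(baseline_cfg)
    best_loss = evaluate(best_cfg)
    accepted = []
    for _ in range(rounds):
        candidate = propose(best_cfg)   # stand-in for an agent's suggested tweak
        loss = evaluate(candidate)      # stand-in for a short training run
        if loss < best_loss:            # keep only real improvements
            best_cfg, best_loss = candidate, loss
            accepted.append(candidate)
    return best_cfg, best_loss, accepted
```

With a toy `evaluate` like `lambda c: abs(c["lr"] - 0.1)` and a `propose` that shrinks the learning rate, the loop accepts a handful of changes and then stalls, mirroring how each accepted tweak becomes the baseline for the next proposal.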
Christina @ATX (@truffle)
@andrewchen @emilybenn12 Reviewing every line is based on the idea that the cost of errors is high (impact, detection, fixing). The real cost right now is human attention.
0 replies · 0 reposts · 0 likes · 19 views
andrew chen (@andrewchen)
One question I've been asking founders is: do you try to review all the code that the LLMs write or do you just accept it? I think it's about 50-50 right now but the momentum is towards just accepting the AI-generated code and I think that number will eventually go to 100%. To me this is one of the most telling indications of how AI-native a team is. It's hard to get super high throughput if you are reviewing every line.
15 replies · 0 reposts · 19 likes · 4.8K views
Emily Bennett (@emilybenn12)
One pattern from conversations with technical founders in our current cohort: there is a massive hiring divide emerging between engineers who are AI native and those who are not. This is not about seniority. It is about posture.

The AI native engineers:
- Treat Codex, Cursor, and agents as default infrastructure
- Orchestrate systems instead of writing every line themselves
- Spend more time reviewing, shaping, and stress testing than writing code

They function more like Engineering Managers for AI than ICs in practice.

The engineers stuck in older workflows:
- See AI tools as shortcuts rather than leverage
- Measure output in lines written
- Default to manual implementation even when automation is available

The hiring question is no longer just: Are they a strong coder? It is: Do they know how to compound their output with AI? If they don't, they will almost certainly get left behind by those that do.
18 replies · 1 repost · 56 likes · 8K views
Christina @ATX (@truffle)
@rodarmor I do hate this. Sometimes typos send Claude off in the wrong direction.
0 replies · 0 reposts · 0 likes · 33 views
Casey (@rodarmor)
I feel like coding agents have no backbone or depth to their thinking.
Me: [question]
Agent: [churns for 5 minutes] You should do X.
Me: What about minor problem Y?
Agent: You're right, X is completely inappropriate, do Z.
80 replies · 5 reposts · 222 likes · 16.5K views
Shinji (@Sh1nj1_kor)
@truffle @AnthropicAI Most people won't notice, and they'll notice their allowances getting smaller much faster because their conversations will accumulate a lot of useless information that otherwise would've been compacted away at 200k tokens.
1 reply · 0 reposts · 1 like · 18 views
Shinji (@Sh1nj1_kor)
I am calling it now: the 1M context window by default is @AnthropicAI rugpulling the MAX subscribers. Opus now by default eats way more tokens than before and is not as aggressive about cleaning the context, so it easily eats up your weekly allowance. Time for codex i guess
4 replies · 0 reposts · 6 likes · 509 views
exQUIZitely 🕹️ (@exQUIZitely)
A glitch in the Matrix, all games after the year 2000 are wiped off the earth and you can only play one pre-2000 era game for the rest of your life. Which do you pick? I'd go with Civilization (1991).
641 replies · 23 reposts · 1.1K likes · 73.6K views
BOOTOSHI 👑 (@KingBootoshi)
HOLY FUK I JUST LEARNED ABOUT TLA+ AND IT'S SO GOOD FOR AGENTIC CODING ur telling ME that i can mathematically fact check every possible scenario of my design STATE to prevent bugs and crashes AND IF IT FINDS SOMETHING THE AGENTS GET INSTANT FEEDBACK AND LOOP FIXING IT TILL ALL POSSIBLE BUGS IN THE DESIGN ARE PATCHED LOL THIS IS OP
93 replies · 58 reposts · 1.4K likes · 270.9K views
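What the tweet above is excited about is model checking: TLA+'s TLC tool exhaustively enumerates every reachable state of a spec and checks an invariant in each, handing back a concrete counterexample when one fails. As a rough illustration of that idea only (this is a toy Python sketch, not TLA+ or any TLC API; the counter model and all names are made up):

```python
from collections import deque

def check_invariant(init, next_states, invariant):
    """Breadth-first exploration of every reachable state.
    Returns a violating state if the invariant ever fails, else None."""
    seen, frontier = {init}, deque([init])
    while frontier:
        state = frontier.popleft()
        if not invariant(state):
            return state                  # concrete counterexample
        for s in next_states(state):
            if s not in seen:
                seen.add(s)
                frontier.append(s)
    return None                           # invariant holds in all reachable states

# Toy model: a counter that may step +1 (while below 5) or reset to 0.
step = lambda n: {0, n + 1} if n < 5 else {0}
assert check_invariant(0, step, lambda n: n <= 5) is None   # invariant holds
assert check_invariant(0, step, lambda n: n < 3) == 3       # counterexample: 3
```

The agent-feedback loop in the tweet corresponds to feeding that counterexample back to the agent and re-running the check until it returns no violation.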