Aaron

2.5K posts

@aaronbatilo

Putting the ML in YAML. Eng @CoreWeave. Previously at @MicrosoftAI, @InflectionAI, @Cohere, @Color. I like Smash Melee.

Colorado, USA · Joined June 2014
172 Following · 319 Followers
Pinned Tweet
Aaron
Aaron@aaronbatilo·
Whichever model completes The Witness is AGI
0
0
0
436
Anuj Saharan
Anuj Saharan@anujsaharan_·
gigakernel is next?
1
0
1
245
Aaron
Aaron@aaronbatilo·
/goal in codex is like SOOOO much better. /goal in Claude Code ends up just giving up sometimes, like it literally just stops working while "Goal active" is just sitting there. I think it has something to do with loop prompts that execute in the middle of goal completion
1
0
0
92
Aaron
Aaron@aaronbatilo·
@difficultyang Has anyone ever been experiment limited, ever
1
0
0
165
difficultyang
difficultyang@difficultyang·
At what GPU cluster size do you stop being experiment limited and become gpu capacity limited
5
0
43
5.4K
Aaron
Aaron@aaronbatilo·
@charles_irl How do you conceptualize the difference with your idea versus dropout?
0
0
1
62
Charles 🎉 Frye
Charles 🎉 Frye@charles_irl·
my gut says that to solve float numerics problems from nondeterminism x nonassociativity, we need to think bigger than determinism. models could eg be trained with large amounts of "implementation noise" so that the learned network is more robust to implementation skew.
10
1
57
5.4K
Aaron
Aaron@aaronbatilo·
Me getting ready to have Claude do all my work
roon@tszzl

0
0
1
68
Aaron
Aaron@aaronbatilo·
Let's start a cooking show where we try to get various ingredients to have the Maillard reaction. We'll call it: Will It Brown?
kalomaze@kalomaze

bro

0
0
2
85
Aaron retweeted
dax
dax@thdxr·
just got inside info that openai is working on a new model
254
12
2.3K
143.7K
Aaron
Aaron@aaronbatilo·
Do you think Lord Dario asked Karpathy to tweet on the day of Google I/O?
0
0
1
41
Aaron
Aaron@aaronbatilo·
@KranenKyle It's criminal that you only have 500 followers
1
0
1
28
Kyle Kranen
Kyle Kranen@KranenKyle·
No matter the amount of full attention sparsity, decreased attention dim, MLA/GQA, infinite long context will always have to deal with quadratic prefill cost. Kind of begs the question on if we can successfully train a model with local prefill and global decode?
8
0
31
3.7K
Aaron
Aaron@aaronbatilo·
My onboarding experience to Antigravity CLI was to login (great), ask the model what it thinks my current repo is about, and then to have it respond purely in Mandarin
1
0
1
92
Aaron
Aaron@aaronbatilo·
@bubbleboi The new gamer girl bath water
0
0
0
27
bubble boi
bubble boi@bubbleboi·
I would drink data center water.
37
17
426
18.2K