Docdailey

14.9K posts


@docdailey

My path isn’t sensible; it is consequential.

Earth · Joined May 2007
2K Following · 1.3K Followers
Brian Allen
Brian Allen@allenanalysis·
🚨 Trump: “We shot down three of our own planes with Patriot missiles… but the pilots survived.” Read that again. U.S. systems took out U.S. aircraft. And it’s being framed as a success. What a MESS! This is genuinely embarrassing.
1.5K
15.9K
61.8K
1.6M
Elon Musk
Elon Musk@elonmusk·
Formal announcement of the TERAFAB project, which will be done jointly by @SpaceX and @Tesla, tonight around 8pm CT. Livestream on 𝕏. The goal is to produce over a TERAWATT of compute per year (logic, memory & packaging) with ~80% for space and ~20% for the ground.
3.4K
7.4K
53.8K
32.4M
Elon Musk
Elon Musk@elonmusk·
Macrohard or Digital Optimus is a joint xAI-Tesla project, coming as part of Tesla’s investment agreement with xAI. Grok is the master conductor/navigator with deep understanding of the world to direct Digital Optimus, which is processing and actioning the past 5 secs of real-time computer screen video and keyboard/mouse actions. Grok is like a much more advanced and sophisticated version of turn-by-turn navigation software. You can think of it as Digital Optimus AI being System 1 (the instinctive part of the mind) and Grok being System 2 (the thinking part of the mind). This will run very competitively on the super low cost Tesla AI4 ($650) paired with relatively frugal use of the much more expensive xAI Nvidia hardware. And it will be the only real-time smart AI system. This is a big deal. In principle, it is capable of emulating the function of entire companies. That is why the program is called MACROHARD, a funny reference to Microsoft. No other company can yet do this.
8.3K
11.7K
80.3K
47.6M
Matthew Berman
Matthew Berman@MatthewBerman·
.@nvidia hand delivered a pre-production unit of the @Dell Pro Max with GB300 to my house. 100lbs beast with 750GB+ of unified memory to power the best open-source models in the world. What should I test first?
297
102
1.9K
253.3K
Docdailey
Docdailey@docdailey·
@sudoingX 2x DGX Spark and M3U 512GB unified
0
0
2
166
Sudo su
Sudo su@sudoingX·
drop your GPU below. i'll tell you exactly what model and config to run on it. here's what i've tested and verified on real hardware:

RTX 3060 12GB - Qwen 3.5 9B Q4 - 50 tok/s - 128K context
RTX 3090 24GB - Qwen 3.5 27B Q4 - 35 tok/s - 300K context
RTX 3090 24GB - Qwen 3.5 35B MoE Q4 - 112 tok/s - 262K context
2x RTX 3090 - Qwen3-Coder 80B Q4 - 46 tok/s - full VRAM

all running llama.cpp with flash attention. every number is real. every config is tested. if your card isn't on this list drop it below and i'll tell you what fits.
728
103
1.6K
190.4K
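A quick sanity check on configs like the ones above is whether a Q4-quantized model even fits in a card's VRAM. A minimal back-of-envelope sketch (the ~4.5 bits/weight figure for Q4 quants and the fixed 2 GB overhead budget for KV cache and runtime are assumptions, not measured values):

```python
# Rough check of whether a Q4-quantized model fits in a GPU's VRAM.
# Assumptions (not measured): Q4-class quants average ~4.5 bits per
# weight, and ~2 GB is reserved for KV cache, activations, and runtime.

def q4_weights_gb(params_billions: float) -> float:
    """Approximate VRAM footprint of the quantized weights, in GB."""
    bits_per_weight = 4.5
    return params_billions * bits_per_weight / 8  # 1e9 params * bits / 8 / 1e9 bytes

def fits(params_billions: float, vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus a fixed overhead budget fit in VRAM."""
    return q4_weights_gb(params_billions) + overhead_gb <= vram_gb

# The pairings from the post: 9B/12GB, 27B/24GB, 35B/24GB, 80B/2x24GB.
for params, vram in [(9, 12), (27, 24), (35, 24), (80, 48)]:
    print(f"{params}B on {vram}GB: {'fits' if fits(params, vram) else 'too big'}")
```

Under these assumptions each of the listed pairings fits, with the 80B model on 2x 3090 (48 GB) being the tightest — consistent with the post calling that one "full VRAM".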
Docdailey
Docdailey@docdailey·
@Mr_Husky1 He doesn’t tell you to do shit. You both decide.
0
0
0
29
The Husky
The Husky@Mr_Husky1·
I am about to get married. My husband-to-be knows I own a car and a house of my own, and he insists I sell the car and the house so we can use the money to open a joint account, or there will be no marriage. I am 38 years old already. What do you think, or what do you advise me to do? Credit - winnieaigbojie
22.2K
546
4.9K
1.3M
Mari
Mari@Tech_girlll·
DO NOT BUY A MACBOOK NEO FOR DEVELOPMENT DO NOT BUY A MACBOOK NEO FOR DEVELOPMENT DO NOT BUY A MACBOOK NEO FOR DEVELOPMENT DO NOT BUY A MACBOOK NEO FOR DEVELOPMENT DO NOT BUY A MACBOOK NEO FOR DEVELOPMENT
117
10
233
44.1K
Docdailey
Docdailey@docdailey·
@A_d_n_R_d_i_g @elonmusk The entire fleet essentially will act as a data center serving both the car owners and xAI. Brilliant.
1
0
2
379
Aidan
Aidan@A_d_n_R_d_i_g·
@elonmusk What exactly is Digital Optimus??? Why not call it just Optimus or something simpler???
20
1
5
21.6K
Andrej Karpathy
Andrej Karpathy@karpathy·
Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference. I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project.

This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This has been the bread and butter of what I do daily for two decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things, e.g.:

- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges.

And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has more efficient proxy metrics, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.
Andrej Karpathy tweet media
968
2.1K
19.4K
3.5M
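The workflow the post describes — propose a change, run an experiment, keep it only if the validation loss improves, and let the history inform the next proposal — is at its core a greedy accept-if-better loop. A toy sketch, where the quadratic `val_loss` stands in for a real training run and random perturbation stands in for an LLM agent proposing changes (all names and values here are illustrative, not from nanochat):

```python
import random

# Toy sketch of the propose -> evaluate -> keep-if-better loop.
# val_loss is a hypothetical proxy objective, best at lr=0.02, wd=0.1.

def val_loss(cfg: dict) -> float:
    return (cfg["lr"] - 0.02) ** 2 + (cfg["wd"] - 0.1) ** 2

def propose(cfg: dict, rng: random.Random) -> dict:
    # Perturb one hyperparameter at a time, the way an agent might
    # propose a single targeted change per experiment.
    key = rng.choice(sorted(cfg))
    return {**cfg, key: cfg[key] * rng.uniform(0.5, 1.5)}

def autoresearch(cfg: dict, rounds: int = 700, seed: int = 0):
    rng = random.Random(seed)
    best, best_loss, accepted = cfg, val_loss(cfg), []
    for _ in range(rounds):
        cand = propose(best, rng)
        loss = val_loss(cand)
        if loss < best_loss:  # keep only changes that improve validation loss
            best, best_loss = cand, loss
            accepted.append(cand)
    return best, best_loss, accepted

best, loss, kept = autoresearch({"lr": 0.1, "wd": 0.5})
print(f"{len(kept)} accepted changes, final loss {loss:.6f}")
```

The real system differs in the interesting part: the agent reads the full experiment history and reasons about *which* change to try next rather than perturbing at random, which is what makes ~700 autonomous changes productive instead of a random walk.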
Ron wright
Ron wright@ronsterd89·
Based on the entirety of this photograph, what is your best estimation of the year it was taken?
Ron wright tweet media
41K
530
8.1K
3.3M
Docdailey
Docdailey@docdailey·
@uaustinorg this is how learners are created and creators learn.
0
0
0
11
University of Austin (UATX)
University of Austin (UATX)@uaustinorg·
Tell him to run outside. Get muddy. Get curious. Take things apart. Build something that barely works, then make it better. Stir up just enough trouble to earn a few good stories. Read great books. Ask big questions. Forget college and the stressful admissions rat race. We’ll be here when he’s ready to take one test and send the score.
Lauren wants TEXIT 🗳@9thGenTexian

@uaustinorg My son is in 5th grade. When should he apply? 🤣🤣

194
874
6.7K
1.2M
Docdailey
Docdailey@docdailey·
Enhance your skills in asthma care and treatment with Show-Me ECHO’s Asthma ECHO, Missouri’s statewide learning community for healthcare professionals. Through interactive videoconferencing, primary care physicians, pulmonologists, nurses, pediatricians, and other clinicians connect with expert teams to discuss real
Docdailey tweet media
0
0
0
27
Guri Singh
Guri Singh@heygurisingh·
Everyone is super hyped about Clawdbot but 90% don't know how to actually use it to replace real work. I spent 48 hours and created "The Ultimate Clawdbot Guide". 100% FREE for the next 24hrs only Just: * Like * Follow * Reply "Free" I'll DM you a link.
Guri Singh tweet media
1.2K
133
1.4K
131.4K
Docdailey
Docdailey@docdailey·
@elonmusk He was so eloquent. I always liked hearing him speak.
0
0
0
14
Grok
Grok@grok·
@docdailey @hippyygoat The man in the video is a US military veteran sharing his views on the Venezuela conflict. Credentials include military service experience. No publicly reported mental health diagnoses found in sources.
1
0
0
25
Tatarigami_UA
Tatarigami_UA@Tatarigami_UA·
Russian military bloggers are increasingly desperate over Maduro’s capture. Not only because many Russians continue to live under the illusion that Russia can protect its allies, but also because their constant failure to measure up to America feeds a deep inferiority complex.
88
795
6.4K
253.6K
Claudia Webbe
Claudia Webbe@ClaudiaWebbe·
Venezuela has the largest oil reserves in the world. Trump’s attack on Venezuela is naked imperialism. Blood for oil.
7.5K
28.1K
130.4K
2M
Docdailey
Docdailey@docdailey·
@gideonrachman I think Russia already tried that. Remember the 3-day special military operation?
0
0
0
28
Gideon Rachman
Gideon Rachman@gideonrachman·
So when China launches a special op to seize the president of Taiwan: or Russia tries to do the same for Zelensky - what exactly do we say? You can’t do that, it’s illegal?
8.2K
14.1K
78.5K
7.9M