zach

6.9K posts

@swe_zach

riding the tiger of accelerationism

SF → Seattle Joined August 2016
1.6K Following 2.6K Followers
Pinned Tweet
zach
zach@swe_zach·
Sometimes I think I’m a good engineer then I see dudes like Karpathy and geohot cook and I’m like oh I suck
Andrej Karpathy@karpathy

Day 24 of llm.c: we now do multi-GPU training, in bfloat16, with flash attention, directly in ~3000 lines of C/CUDA, and it is FAST! 🚀 We're running ~7% faster than PyTorch nightly, with no asterisks, i.e. this baseline includes all modern & standard bells-and-whistles: mixed precision training, torch compile and flash attention, and manually padding vocab. (Previous comparisons included asterisks like *only inference, or *only fp32 etc.) Compared to the current PyTorch stable release 2.3.0, llm.c is actually ~46% faster.

My point in these comparisons is just to say "llm.c is fast", not to cast any shade on PyTorch. It's really amazing that PyTorch trains this fast in a fully generic way, with ability to cook up and run ~arbitrary neural networks and run them on a ton of platforms. I see the goals and pros and cons of these two projects as different, even complementary. Actually I started llm.c with my upcoming education videos in mind, to explain what PyTorch does for you under the hood.

How we got here over the last ~1.5 weeks - added:
✅ mixed precision training (bfloat16)
✅ many kernel optimizations, including e.g. a FusedClassifier that (unlike current torch.compile) does not materialize the normalized logits.
✅ flash attention (right now from cudnn)
✅ Packed128 data structure that forces the A100 to utilize 128-bit load (LDG.128) and store (STS.128) instructions.

It's now also possible to train multi-GPU - added:
✅ First version of multi-gpu training with MPI+NCCL
✅ Profiling the full training run for NVIDIA Nsight Compute
✅ PR for stage 1 of ZeRO (optimizer state sharding) merging imminently

We're still at "only" 3,000 lines of code of C/CUDA. It's getting a bit less simple, but still bit better than ~3 million. We also split off the fp32 code base into its own file, which will be pure CUDA kernels only (no cublas or cudnn or etc), and which I think would make a really nice endpoint of a CUDA course. You start with the gpt2.c pure CPU implementation, and see how fast you can make it by the end of the course on GPU, with kernels only and no dependencies.

Our goal now is to create a reliable, clean, tested, minimal, hardened and sufficiently optimized LLM stack that reproduces the GPT-2 miniseries of all model sizes, from 124M to 1.6B, directly in C/CUDA.

A lot more detail on: "State of the Union [May 3, 2024]" github.com/karpathy/llm.c…

15
35
838
114K
zach
zach@swe_zach·
@FOS That’s my intern wtf
0
0
2
82
Front Office Sports
The Cubs broadcast showed fans working remotely from Wrigley Field during the team's day game 💻
537
959
25.1K
18.5M
Bryan Johnson
Bryan Johnson@bryan_johnson·
i’ve started a sunmaxxing protocol let me cook
522
56
3.7K
496.4K
ConnectedSF
ConnectedSF@ConnectedSF·
Happening Now: ➡️Protest opposing 25-story Marina Safeway Development proposal. Last week, SF Planning Dept deemed eligibility of AB 2011 approval, fast-tracking design review. Fight back by signing the petition that will be emailed to City Hall today form.jotform.com/253625476102151
112
3
84
392.3K
zach
zach@swe_zach·
I guess I’ll take the other side of this. When you click on the website it loads. It’s a blog not NASA flight software. That said - using LOC or god forbid token usage as a way to measure productivity is obviously retarded
gregorein@Gregorein

so... I audited Garry's website after he bragged about 37K LOC/day and a 72-day shipping streak. here's what 78,400 lines of AI slop code actually looks like in production. a single homepage load of garryslist.org downloads 6.42 MB across 169 requests. for a newsletter-blog-thingy. 1/9🧵

0
0
3
1.7K
zach
zach@swe_zach·
I’ve had my head down securing the bag for the past couple years and I’m just now starting to get a feeling of existential dread about missing the AI wave. 600k TC can completely zap your agency. It’s the paradigm shift of a lifetime and I’m still mostly on the sidelines.
18
0
252
32.3K
zach
zach@swe_zach·
@Aella_Girl Is this still true when you control for other group differences in IQ?
0
0
1
282
Aella
Aella@Aella_Girl·
Happy trans day of visibility! This is your reminder that there's a good chance transwomen have higher avg IQs than cis men
215
46
1.3K
181.6K
Aella
Aella@Aella_Girl·
my friend just went to a lesbian orgy for women only. She was nervous, but said it went great. Said she was a bit surprised tho to find at the end of the night that 50% of the visible genitals were penises
240
72
3.3K
507.7K
zach
zach@swe_zach·
@artwithinpod Who is that little shit?!? That’s not HP
0
0
0
157
Georgia Coley
Georgia Coley@artwithinpod·
WHY IS IT SO COMICALLY DARK AND BLUE?????
131
338
6.1K
3.1M
atlas
atlas@creatine_cycle·
"literally in the trenches. intense warzone"
>air conditioned office
>all you can eat catered meals
>making 7 figures
>live in palo alto
>3 months
Rich@im_rich_zou

I just left @xai

It was not an easy decision. The past three months were an absolute blast - I've been in many trenches in my life and can say this was by far one of the most intense warzones. I love fighting. Especially being in the trenches with my friends, working on problems that will actually advance humanity.

But the current environment wasn't serving my growth. And that's a really hard thing to admit - I've always looked up to Elon, and I genuinely believe xAI will win. I still do.

One thing I'll say: don't stay somewhere just because of the name. If you're unhappy, and you know you can't grow 100x where you are - it's the right call to leave.

What's next? Get some sleep back. Then find the next trench worth fighting in. I'll always be meeting exceptional people - that was never because of a recruiting title. I just love finding smart people and helping however I can. Many more side quests to come!!!

55
66
4.2K
255.4K
zach
zach@swe_zach·
@nickgillespie Prolly cuz income doesn’t actually matter that much cuz boomers pulled the ladder up on assets
0
0
1
95
Nick Gillespie
Nick Gillespie@nickgillespie·
'“I think we’re middle class for this area,” Mr. O’Leary said.' In fact, per a link in the article: median household income in 2023 for their neighborhood was $155,710; for the city as a whole, $79,480. Why do so many people live in fantasyworlds abt their own wealth?
181
135
2.2K
406.4K
Crémieux
Crémieux@cremieuxrecueil·
Across RCTs for ADHD medications, we often see outcomes like increased grades, reduced suicide risk, and lower odds of misbehavior. But one that I think is neglected is the notable increase in reported quality of life. These drugs make people happier about their lives!
Crémieux@cremieuxrecueil

DEA quotas and related controls are still driving an Adderall shortage. Unfortunately, the DEA is run by busybodies. When one manufacturer asked the DEA to please hurry up, the DEA responded by threatening to shut them down completely.

34
36
626
39K
Crémieux
Crémieux@cremieuxrecueil·
@swe_zach Not convinced it does that and can't recommend it due to cancer risk. Am currently on it.
1
0
3
250
Crémieux
Crémieux@cremieuxrecueil·
It's worth noticing that the crowd that supports ivermectin and failed to hype things that actually worked for COVID also seems to hate Ozempic. Perhaps they just don't like drugs that work.
Crémieux@cremieuxrecueil

It really is incredible that we found tons of extremely cheap, largely generic drugs that actually help with COVID and the conspiracists only embraced drugs that didn't work. Every nutjob sells ivermectin, but none of them are harping on dexamethasone and other corticosteroids.

35
29
629
28.4K
zach
zach@swe_zach·
@samswoora And they’ll likely be evaluated by their token spend. Wild times
0
0
0
34
Samswara
Samswara@samswoora·
I think engineers will start competing over their monthly ai spend as a status marker of AI adoption, I already do this a bit. My coworker beat me on token usage and I 100% resent him for it
12
0
42
1.8K
pbs
pbs@powerbottomson·
@swe_zach how much is a ton
1
0
3
206
zach
zach@swe_zach·
Yesterday I took a ton of adderall and prompted 5 parallel sessions of Claude code with speech to text for 10 hours. Puppet master
3
0
12
714
Nikita Bier
Nikita Bier@nikitabier·
@TurnerNovak I was investigating a guy running 30 accounts with Indonesian IP addresses and I was trying to figure out what tools he was using. I found out it was AI: Actual Indonesians.
1K
1.9K
14.6K
2.3M
Turner Novak 🍌🧢
Turner Novak 🍌🧢@TurnerNovak·
Just spent two hours talking to this guy about AI:
- completely changed the game
- most efficient team member
- outperformed with limited resources
- misunderstood in the court of public opinion
- always has the answer
Turns out he was actually talking about Allen Iverson.
43
69
1.7K
235K
Crémieux
Crémieux@cremieuxrecueil·
I think what we have to do is clear: Semaglutide (Ozempic/Wegovy) needs to be treated as the standard of care for weight loss and it needs to be an active comparator in trials going forward. Everyone knows these drugs work, so we can't do inert placebos anymore.
Crémieux@cremieuxrecueil

This is not good: People have learned that GLP-1s are really effective, so if they're not losing weight, they know they're in the placebo group. So these people getting placebos are getting mad and leaving the trials.

40
48
988
74.7K