Michael Griffiths

5.5K posts


@msjgriffiths

Data Science

Brooklyn · Joined October 2007
3K Following · 4.2K Followers
Donut Gup (@DonutGup)
@msjgriffiths @StreetsblogNYC No judge would ever hold that driver liable. How is he supposed to know that a person, one already well hidden by their clothing, is going to start running in the direction of his car? He had no time to react once she entered his lane of travel.
3 replies · 0 reposts · 1 like · 221 views
Donut Gup (@DonutGup)
@StreetsblogNYC New York is permissive yellow, which means that driver had every right to go. There is no reason for her to have started walking, and she's literally running diagonally instead of to the curb, and then even stops running midway.
2 replies · 0 reposts · 9 likes · 1.3K views
Michael Griffiths (@msjgriffiths)
Humorously, the LLM falls into the same trap many people do: wanting to write what it did, not what the reader should take away. It also handwaves references in literature search! Very human. Read here: prism.openai.com/?u=aba1c799-69…
0 replies · 2 reposts · 2 likes · 64 views
Michael Griffiths (@msjgriffiths)
For fun, I had Codex w/ GPT 5.4 on xhigh do a series of small-scale experiments with transformers (i.e. can we use Tracr to encode functions in transformers before the grokking stage, and thus shorten pretraining time?). It's not amazing, but fun to see how far they can go now!
1 reply · 0 reposts · 1 like · 97 views
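The idea behind the experiment above, hand-constructing transformer weights that compute a known function instead of learning them (which is what Tracr does for RASP programs), can be sketched without the Tracr library itself. Below is a toy numpy attention head whose weights are set by hand so it copies the previous token; the one-hot input layout and all names (`make_prev_token_head`, `attend`) are illustrative assumptions for this sketch, not Tracr's API.

```python
import numpy as np

def make_prev_token_head(seq_len, vocab, scale=100.0):
    """Hand-construct attention weights that copy the previous token.

    Input features are [token one-hot | position one-hot].
    Query at position i selects one-hot(i); key at position j selects
    one-hot(j + 1), so q_i . k_j is large only when j = i - 1 and the
    softmax becomes (nearly) hard attention to the previous position.
    """
    d = vocab + seq_len
    Wq = np.zeros((d, seq_len))
    Wk = np.zeros((d, seq_len))
    Wv = np.zeros((d, vocab))
    Wq[vocab:, :] = np.eye(seq_len) * scale         # query = my own position
    shift = np.zeros((seq_len, seq_len))
    shift[:-1, 1:] = np.eye(seq_len - 1)            # key = my position + 1
    Wk[vocab:, :] = shift
    Wv[:vocab, :] = np.eye(vocab)                   # value = token identity
    return Wq, Wk, Wv

def attend(x, Wq, Wk, Wv):
    """Single causal attention head over a (seq_len, d) input."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T
    mask = np.tril(np.ones(scores.shape, dtype=bool))  # causal mask
    scores = np.where(mask, scores, -1e9)
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w /= w.sum(-1, keepdims=True)
    return w @ v
```

The point of compiled weights like these is that the behavior is exact by construction, so (in the spirit of the experiment) they could in principle be dropped into a model before any gradient steps.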
Michael Griffiths (@msjgriffiths)
I wonder if this would make continual learning easier.
0 replies · 0 reposts · 0 likes · 17 views
Michael Griffiths (@msjgriffiths)
Tokenization in general seems to have progressed less than I expected since fastText. We're still doing subwords and common words, despite work showing compressed tokens work (e.g. zip2zip), and new domains (code, harnesses) are iffy. Feels like byte-level n-grams should be part of mid-training.
1 reply · 0 reposts · 0 likes · 53 views
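As a toy illustration of the byte-level n-gram idea, here is a sketch that mines frequent byte n-grams from a corpus as candidate vocabulary additions. The frequency-only selection criterion is a deliberate simplification (schemes like zip2zip weigh compression gain instead), and the function name and parameters are made up for this example.

```python
from collections import Counter

def byte_ngram_counts(corpus, n_max=4, top_k=10):
    """Count frequent byte-level n-grams (2..n_max bytes) in a corpus.

    Returns the top_k (ngram_bytes, count) pairs. Working on raw bytes
    sidesteps the subword question entirely: code, new scripts, and
    agent-harness logs all decompose into the same 256 symbols.
    """
    counts = Counter()
    for doc in corpus:
        data = doc.encode("utf-8")
        for n in range(2, n_max + 1):
            for i in range(len(data) - n + 1):
                counts[data[i:i + n]] += 1
    return counts.most_common(top_k)
```

In a mid-training setting one could imagine promoting the surviving n-grams to new token embeddings, though nothing here speaks to how those embeddings would be initialized.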
Michael Griffiths (@msjgriffiths)
Dumb question: why isn't BPE computed as a streaming algorithm with a warmup period? People seem to train on a fraction of the full corpus in an exact way, but that doesn't seem obviously better.
0 replies · 0 reposts · 0 likes · 50 views
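To make the question concrete, here is a minimal sketch of what streaming BPE with a warmup might look like: pair counts accumulate over a document stream, and merges are committed incrementally once the warmup has passed. Everything here (one merge per document after warmup, tolerating stale counts from pre-merge segmentations) is an illustrative assumption, not a claim about how production tokenizers work.

```python
from collections import Counter

def streaming_bpe(stream, num_merges=3, warmup_docs=2):
    """Toy streaming BPE: accumulate symbol-pair counts over a
    document stream, committing one merge per document once at
    least `warmup_docs` documents have been seen.

    Exact BPE would instead re-scan the whole corpus after every
    merge; here we just apply already-committed merges to each
    incoming document and live with stale counts.
    """
    merges = []
    pair_counts = Counter()
    for seen, doc in enumerate(stream, start=1):
        symbols = list(doc)
        # Apply merges learned so far to the incoming document.
        for a, b in merges:
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and symbols[i] == a and symbols[i + 1] == b:
                    out.append(a + b)
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            symbols = out
        pair_counts.update(zip(symbols, symbols[1:]))
        if seen >= warmup_docs and len(merges) < num_merges and pair_counts:
            best = max(pair_counts, key=pair_counts.get)
            merges.append(best)
            del pair_counts[best]  # crude: forget the committed pair's count
    return merges
```

On a stream of repeated "aaab" documents this learns ("a", "a") first and then ("a", "b"), the same first merges exact BPE would find here, which is roughly the intuition behind the question: the counts that matter may stabilize long before the corpus is exhausted.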
Michael Griffiths (@msjgriffiths)
@ryderkessler The flip side is that services should also extend to the top 1% - like childcare - instead of "means testing"
1 reply · 0 reposts · 2 likes · 418 views
Ryder Kessler (@ryderkessler)
There should not be anything controversial about raising taxes on the wealthiest New Yorkers. Not only can the 1% afford to pay their fair share to support the public sector services that benefit everyone, but this level of inequality is bad for democracy.
unusual_whales (@unusual_whales)

"The top 1 percent of American households, which have a minimum net worth of $11.1 million, now collectively own about $25.6 trillion worth of stocks and mutual funds, the same amount as the remaining 99% of the country," per the Federal Reserve

27 replies · 3 reposts · 18 likes · 4.3K views
Michael Griffiths reposted
Palli Thordarson (@PalliThordarson)
Proud to have been involved with @UNSWRNA in making the mRNA-LNP for Rosie. There are nuances here that the thread below misses, but nevertheless the intersection of RNA technology, genomics & AI poses an opportunity to change the way we do medicine and make access more equitable 1/8
Greg Brockman (@gdb)

How AI empowered Paul Conyngham to create a custom mRNA vaccine to cure his dog’s cancer when she had only months to live. The first personalized cancer vaccine designed for a dog:

49 replies · 246 reposts · 1.6K likes · 215.3K views
Michael Griffiths (@msjgriffiths)
@emollick And the flip side is that competitive advantage comes from the bottlenecks. Perhaps it's in the "environment" (data generation, a.k.a. data), so it's more ML logic. But I am not sure.
0 replies · 0 reposts · 0 likes · 11 views
Michael Griffiths (@msjgriffiths)
@emollick i.e. it does for all "technical" skills what calculators do for basic maths (addition/subtraction/multiplication). Obviously those skills drop in value tremendously: then something else *must* become the bottleneck.
1 reply · 0 reposts · 0 likes · 28 views
Ethan Mollick (@emollick)
I wrote about the exponential improvement path of AI, the early signs of massive transformations in the nature of work (including software companies where nobody codes any more), and how one week in February is an omen of our future as things get weirder. open.substack.com/pub/oneusefult…
39 replies · 86 reposts · 578 likes · 87.2K views
Sherwood (@shcallaway)
@ChiragCX @LakshyAAAgrawal GEPA uses AI to generate prompt "mutations" and iteratively converge on an optimal prompt. Here, Karpathy is doing something similar to find optimal pre-training hyperparameters.
1 reply · 4 reposts · 26 likes · 2.1K views
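The mutate-and-keep-the-best loop described above can be sketched as a toy hill climber. `mutate` and `score` below are stand-ins for an LLM-driven rewriter and an eval harness; none of this is GEPA's actual API, just the shape of the idea.

```python
import random

def optimize_prompt(seed_prompt, mutate, score, iters=20, rng=None):
    """Toy hill-climbing loop in the spirit of prompt-'mutation'
    optimizers: propose a mutated prompt, keep it only if it scores
    higher than the current best.

    mutate(prompt, rng) -> candidate prompt (stand-in for an LLM rewriter)
    score(prompt) -> float                  (stand-in for an eval harness)
    """
    rng = rng or random.Random(0)
    best, best_score = seed_prompt, score(seed_prompt)
    for _ in range(iters):
        cand = mutate(best, rng)
        s = score(cand)
        if s > best_score:          # greedy acceptance; real systems
            best, best_score = cand, s  # keep a population instead
    return best, best_score
```

A population of candidates plus LLM-proposed edits (rather than the single greedy chain here) is what distinguishes evolutionary prompt optimizers from this minimal sketch.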
Michael Griffiths (@msjgriffiths)
Marx's critique of factories - the alienation of the worker from the work - is now coming to knowledge work.
Machine Learning Street Talk (@MLStreetTalk)

A masterclass from @jeremyphoward on why AI coding tools can be a trap -- and what 45 years of programming taught him that most vibe coders will never learn.
- AI coding tools exploit gambling psychology
- The difference between typing code and software engineering
- Enterprise coding AND prompt-only vibe coding are "inhumane", i.e. disconnecting humans from understanding-building
- AI tools remove the "desirable difficulty" you need to build deep mental models.
Out on MLST now!

0 replies · 0 reposts · 0 likes · 74 views
Michael Griffiths reposted
Ted Zadouri (@tedzadouri)
Asymmetric hardware scaling is here. Blackwell tensor cores are now so fast, exp2 and shared memory are the wall. FlashAttention-4 changes the algorithm & pipeline so that softmax & SMEM bandwidth no longer dictate speed. Attn reaches ~1600 TFLOPs, pretty much at matmul speed! joint work w/ Markus Hoehnerbach, Jay Shah(@ultraproduct), Timmy Liu, Vijay Thakkar (@__tensorcore__ ), Tri Dao (@tri_dao) 1/
7 replies · 132 reposts · 781 likes · 221.9K views
Michael Griffiths (@msjgriffiths)
@rickasaurus Sub agents in a single session are cool. I expect that to develop a lot over the next year.
0 replies · 0 reposts · 1 like · 27 views
Rick (@rickasaurus)
What I want is an actual manager Claude to look after my other Claudes, but to come to me for what to do. Like a CTO reporting to a product-focused CEO.
Andrej Karpathy (@karpathy)

I had the same thought so I've been playing with it in nanochat. E.g. here's 8 agents (4 claude, 4 codex), with 1 GPU each running nanochat experiments (trying to delete logit softcap without regression). The TLDR is that it doesn't work and it's a mess... but it's still very pretty to look at :)

I tried a few setups: 8 independent solo researchers, 1 chief scientist giving work to 8 junior researchers, etc. Each research program is a git branch, each scientist forks it into a feature branch, git worktrees for isolation, simple files for comms, skip Docker/VMs for simplicity atm (I find that instructions are enough to prevent interference). Research org runs in tmux window grids of interactive sessions (like Teams) so that it's pretty to look at, see their individual work, and "take over" if needed, i.e. no -p.

But ok the reason it doesn't work so far is that the agents' ideas are just pretty bad out of the box, even at highest intelligence. They don't think carefully through experiment design, they run somewhat nonsensical variations, they don't create strong baselines and ablate things properly, they don't carefully control for runtime or flops. (Just as an example, an agent yesterday "discovered" that increasing the hidden size of the network improves the validation loss, which is a totally spurious result given that a bigger network will have a lower validation loss in the infinite data regime, but then it also trains for a lot longer; it's not clear why I had to come in to point that out.) They are very good at implementing any given well-scoped and described idea but they don't creatively generate them.

But the goal is that you are now programming an organization (e.g. a "research org") and its individual agents, so the "source code" is the collection of prompts, skills, tools, etc. and processes that make it up. E.g. a daily standup in the morning is now part of the "org code".
And optimizing nanochat pretraining is just one of the many tasks (almost like an eval). Then - given an arbitrary task, how quickly does your research org generate progress on it?

2 replies · 0 reposts · 4 likes · 885 views
Michael Griffiths (@msjgriffiths)
@ModeledBehavior You mean you want people to come threaten my livelihood?? Take money away from my kids? No, that sounds like oppressor power language, real colonialist stuff. I have an inviolable right to life (of my choosing), freedom (from pain/threat), and property (the more the better!) /s
0 replies · 0 reposts · 1 like · 105 views