Sasha Krassovsky

746 posts

@bztree

Performance @AnthropicAI

Seattle · Joined March 2023
541 Following · 1.7K Followers
Beff (e/acc)@beffjezos·
Karpathy joining Extropic instead of Anthropic would have been a more entertaining outcome
i2cjak@i2cjak·
I should quit being such a fucking idiot
Sasha Krassovsky@bztree·
@ThePrimeagen What is C? I only know ⌘ and ⌃. ⌘-c and ⌘-v work fine for me in Terminal-dot-app
ThePrimeagen@ThePrimeagen·
without googling, i still cannot figure out how to copy and paste on a mac into: 1. the provided Terminal app (which sucks) 2. Ghostty. C-V does nothing, C-v goes into escape insertion (expected). I refuse to google this and it should just be obvious...
Sasha Krassovsky@bztree·
This is the part of today’s announcement I’m most excited about 🙂
xAI@xai

SpaceXAI and @AnthropicAI have also expressed interest in partnering to develop multiple gigawatts of orbital AI compute capacity

Sasha Krassovsky@bztree·
@benhylak @pronounced_kyle I just finished the part where he puts number theory into a formal system, around page 230. It’s getting tough to keep going but I’m on a mission
ben hylak@benhylak·
i know a lot of people who love this book, and none of them have finished it.
Ihtesham Ali@ihtesham2005

A 34-year-old physics graduate student spent years writing a strange 800-page book in 1979 about a logician, a Dutch artist, and a German composer. It won the Pulitzer Prize the following year. It quietly became required reading at every AI lab in the world.

It is the only book in history that makes the deepest ideas in computer science feel like a dream you cannot stop thinking about. I read it across 3 months on a single side table next to my bed and walked away seeing intelligence, consciousness, and AI in a way I cannot un-see.

His name is Douglas Hofstadter. The book is called Gödel, Escher, Bach.

Almost nothing in modern AI makes sense without this book. ChatGPT, Claude, Gemini, the entire architecture of self-attention, the alignment problem, the strange feeling that LLMs sometimes seem to understand and other times seem to be playing an elaborate symbol-shuffling game, all of it traces back to questions Hofstadter laid out in a single book published before most of today's AI engineers were born.

Here is the story almost nobody tells you about how the book came to exist. Hofstadter was the son of Robert Hofstadter, who won the Nobel Prize in Physics in 1961 for measuring the size of the proton. He was supposed to follow in his father's footsteps. He started a physics PhD at the University of Oregon. He was miserable. He could not focus. He did not love the work. He kept getting pulled toward something else.

The something else was a single question that had haunted him since childhood. How can meaning emerge from meaningless symbols? Specifically, how does a brain, which is made of nothing but cells firing electrical signals at each other, produce something that feels like consciousness, like understanding, like a self?

He could not let the question go. He left physics. He started writing. The book took him years. He wrote it largely in isolation, working in the basement of his parents' house and at Indiana University, where he eventually finished it. He thought it would be read by maybe a few hundred logicians and AI researchers.

Basic Books published it in 1979 as a 777-page hardcover. The next year it won the Pulitzer Prize for general non-fiction and the National Book Award for science.

The book is structured in a way that almost no other book has ever attempted. The chapters alternate between two layers. One layer is technical chapters about logic, computability, neuroscience, and AI. The other layer is fictional dialogues between a tortoise and Achilles, characters borrowed from a paradox by Lewis Carroll. The dialogues play with the same ideas the technical chapters explain. Read in order, they do not feel like a textbook. They feel like a strange house with rooms that loop back into each other and corridors that change shape behind you.

The first thing the book does is explain Gödel's incompleteness theorems in a way no math textbook had ever managed. Kurt Gödel, an Austrian logician working in 1931, proved something that broke mathematics. He showed that any formal system powerful enough to describe arithmetic contains statements that are true but cannot be proven inside that system. Mathematics, the most certain thing humans had ever built, has holes in it that can never be filled. Hofstadter spends hundreds of pages making you understand this proof not just as a mathematical theorem, but as a structural fact about every sufficiently complex system. Including the brain. Including any AI. The reason AI alignment is genuinely hard is not just engineering. It is structural. Any system smart enough to model itself will contain truths about itself it cannot reach from inside itself. Hofstadter showed this 50 years before AI safety was a field.

The second thing the book does is introduce his core idea. He calls it the strange loop. A strange loop is what happens when a system, by climbing through layers of itself, somehow ends up back where it started. Escher's drawings of staircases that always go up but somehow loop back are visual strange loops. Bach's musical canons that modulate up through keys and end on the original note are auditory strange loops. Gödel's self-referential statements that talk about themselves are logical strange loops. Hofstadter argues that consciousness is a strange loop. Your brain builds a model of the world. Inside that model, it builds a model of itself perceiving the world. Inside that self-model, it builds a model of itself thinking about itself perceiving the world. The recursion does not bottom out. The self is what the loop feels like from the inside.

This is the part that AI researchers cannot stop returning to. Modern transformer models use self-attention, which is technically a mechanism where a network attends to its own internal states across layers. Recursive reasoning, where a model thinks about its own thinking, is now a research area with its own conferences. Meta-learning, where models learn how to learn, is a direct descendant of what Hofstadter described in 1979 as the necessary structure of any conscious system. He wrote the philosophy. The engineers are now building the implementation.

The third thing the book does is the part that haunts every AI conversation today. Hofstadter argued that meaning is not something separate from symbol manipulation. It is what symbol manipulation looks like from the inside, when the manipulation is complex enough and self-referential enough. A simple lookup table does not understand anything. But a system that processes symbols at sufficient depth, with enough self-modeling, with enough recursion, starts to look identical from the outside, and possibly from the inside, to a system that understands. This is the deepest question in modern AI. When ChatGPT generates a response, is it actually thinking, or is it just doing very fast symbol shuffling? Hofstadter spent 800 pages arguing that the distinction may not exist at sufficient scale. If a system shuffles symbols according to the right structure, meaning is what the shuffling looks like from the inside. You can read modern debates about AI consciousness from Yann LeCun, Geoffrey Hinton, Ilya Sutskever, and David Chalmers, and you will find that they are all, in their own ways, having the argument Hofstadter framed in 1979.

The fourth thing the book did is the one that took the longest to be vindicated. Hofstadter argued, and continued arguing for decades, that the actual engine of human intelligence is not logic. It is not deduction. It is not pattern matching in any simple sense. It is analogy. The ability to see one thing as similar to another thing, to map the structure of one situation onto a different situation, is, in his view, the core of thought itself. For decades this was unfashionable. Symbolic AI focused on logic and rules. Statistical AI focused on pattern matching. Almost nobody worked seriously on analogy. Then large language models started working. And the people who looked closely at what they were doing realized something uncomfortable. LLMs are, fundamentally, analogy machines. They learn structural patterns from text and apply those patterns by analogy to new situations. They do not deduce. They do not reason logically by default. They map the shape of one thing onto the shape of another thing and produce output that fits the new shape. Hofstadter saw this before any of it existed. His later book Surfaces and Essences, written with Emmanuel Sander, is 600 pages defending the claim that analogy is the core of cognition. It came out in 2013. It was largely ignored. The ChatGPT release in 2022 was, in some sense, a vindication of the entire argument.

The strangest thing about reading Gödel, Escher, Bach in 2026 is realizing how lonely the book must have felt when it was written. In 1979 there was no GPT. No deep learning. No transformer. The dominant approach to AI was symbolic logic, and most researchers thought minds were going to be programmed top-down, rule by rule, like a complicated chess engine. Hofstadter said the opposite. He said minds were emergent. They came from the bottom up. They were strange loops in complex substrates. The programmers' approach would never produce real intelligence because it was missing the recursive self-modeling that made minds real. He was right.

The book is hard. I had to use all the LLMs and NotebookLM to understand it. It is not a beach read. You do not finish it in a weekend. The math chapters require attention. The dialogues require patience. Most people who buy it never finish it. That is fine. The book is structured so that reading any 50 pages produces a permanent shift in how you think. Bill Gates lists it among the books that shaped him. Steve Jobs read it. Almost every senior AI researcher in the world will tell you it was the book that made them fall in love with the question of intelligence in the first place.

Hofstadter himself has been in doubt about modern LLMs. He has said they may have proven him right about analogy and wrong about consciousness at the same time. He is still writing. He is still working on the same question that pulled him out of physics 50 years ago.

The 800-page book that explained intelligence before AI existed is sitting one click away from you. Most people will never open it. The ones who do will see the world differently for the rest of their lives.

Sasha Krassovsky@bztree·
@cmuratori "Loading" and "updating the screen" don't seem all that related anyway? Like you can have a game giving you a load screen and updating the loading spinner at 60 FPS if it's really loading GBs from your hard drive.
Casey Muratori@cmuratori·
At this point I feel like I should do a stream tomorrow to talk about the replies I've seen to this post. I completely disagree with people's umbrage about use of FPS as a metric here: a) that is exactly what time-to-show actually is (we measure 1% and 0.1% lows for a reason!), and b) to me, FPS is the most relatable number for response time for average people to understand, given that they don't work on software performance for a living like I do. Many people (especially gamers!) intuitively know what 10 or 11 fps responsiveness feels like for an action. Few intuitively know what "94ms" responsiveness feels like. I also find it unacceptable to call this "load time" because the user is not asking to "load" anything: it is an action they are taking from a UI that they perceive to be contiguous, and the choice to involve a "load" of any kind at this point is purely the fault of the designers of the system, not some inevitability. Everything has already "loaded" from the point of view of the user, and if you are claiming to have done a rewrite with performance "top-of-mind", you should have preloaded or precached whatever it is that you believe takes 94ms to "load" here.
Casey Muratori@cmuratori

Just want to make sure I'm reading this right: Microsoft rewrote the run dialog with performance "top-of-mind", and the best they could manage to do when putting up a single text box was 10fps?
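The 94 ms and "10 or 11 fps" figures in the post above are the same measurement expressed two ways: a rate is the reciprocal of a per-action time. A minimal sketch of that conversion follows; the function names are hypothetical and chosen only for illustration.

```python
def frame_time_ms_to_fps(frame_time_ms: float) -> float:
    """Convert a per-frame (or per-action) time in milliseconds to frames per second."""
    if frame_time_ms <= 0:
        raise ValueError("frame time must be positive")
    return 1000.0 / frame_time_ms


def fps_to_frame_time_ms(fps: float) -> float:
    """Convert frames per second to the time budget per frame in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    # 94 ms per action is a little over 10 fps.
    print(f"94 ms  -> {frame_time_ms_to_fps(94):.1f} fps")
    # A 60 fps target leaves under 17 ms per frame.
    print(f"60 fps -> {fps_to_frame_time_ms(60):.2f} ms")
```

By the same arithmetic, a 60 fps target leaves roughly 16.7 ms per frame, which is why a 94 ms response reads as slow to anyone used to thinking in frame rates.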

Sasha Krassovsky@bztree·
@awesomekling EU nutrition labels drive me nuts. The per 100g system is obviously inferior. Like suppose I want to have a protein bar. I care how much is in the protein bar, not in a glob of 2.37 protein bars
Andreas Kling@awesomekling·
EU nutrition labels:
- nutrients per 100g
US nutrition labels:
- nutrients per 0.5 cup
- nutrients per 3 pieces
- nutrients per 5 sprays
Sasha Krassovsky@bztree·
@filpizlo @zuhaitz_dev My mind was blown reading the C++ FQA for the first time. C++ really is just reinventing every C feature in its own way.
Filip Jerzy Pizło@filpizlo·
@zuhaitz_dev I'm talking about the full latest version of C++. It's just sugar. C and C++ are two dialects of the same thing
Sasha Krassovsky@bztree·
OK last post for the night: I tried all the fancy stuff they recommended in their GEMM doc: Z-curve, static extents, accumulation group synchronization. None of it seemed to improve performance - I seem to be stuck at 40 TFLOPS in bf16 across a variety of shapes.
Sasha Krassovsky@bztree

I got my M5 MacBook over the weekend and had some time to mess around with Metal 4 and the Neural Accelerators! Wanted to document some of my first impressions below:
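For context on the throughput figure quoted in the thread: GEMM throughput is conventionally computed by counting one multiply and one add per inner-product term, so an M×N×K matrix multiply is 2·M·N·K floating-point operations. The sketch below applies that convention. The 4096³ shape and the timing are hypothetical values chosen for illustration, not measurements from the posts.

```python
def gemm_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Achieved TFLOPS for an (m x k) @ (k x n) matrix multiply.

    Uses the usual convention of 2*m*n*k floating-point operations
    (one multiply plus one add per inner-product term).
    """
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    flops = 2.0 * m * n * k
    return flops / seconds / 1e12


def gemm_seconds_at(m: int, n: int, k: int, tflops: float) -> float:
    """Time a GEMM of this shape would take at a given sustained TFLOPS."""
    if tflops <= 0:
        raise ValueError("tflops must be positive")
    return 2.0 * m * n * k / (tflops * 1e12)


if __name__ == "__main__":
    # Hypothetical shape: at a sustained 40 TFLOPS, a 4096^3 GEMM
    # would take a few milliseconds.
    t = gemm_seconds_at(4096, 4096, 4096, 40.0)
    print(f"{t * 1e3:.2f} ms")
    print(f"{gemm_tflops(4096, 4096, 4096, t):.1f} TFLOPS")
```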

Sasha Krassovsky@bztree·
@__simt__ @anemll @ekryski @mweinbach How do you use the fp19? When I had my Metal kernel mark the inputs as `float`, the profiler seemed to tell me it wasn't using the neural accelerator, but the normal fp32 ALUs?
Sasha Krassovsky@bztree·
I got my M5 MacBook over the weekend and had some time to mess around with Metal 4 and the Neural Accelerators! Wanted to document some of my first impressions below:
Eric Kryski@ekryski·
@bztree Good stuff! Any chance you got an open source repo of your experiment? 👀 Asking for a friend...
Sasha Krassovsky@bztree·
@mweinbach I believe that’s for allocating Tensors on the host, nothing to do with the neural accelerator
Sasha Krassovsky@bztree·
@mweinbach Yes! It’s a very good document. I haven’t implemented the Z-order traversal they recommend yet, but plan to.
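A Z-order (Morton order) traversal visits output tiles along a space-filling curve so that consecutively scheduled tiles tend to share rows and columns of the inputs, which can help cache reuse. The sketch below shows the general technique only: it de-interleaves the bits of a linear tile index into a (row, column) pair. It is not taken from the document the post refers to, the function names are hypothetical, and it assumes a square grid whose side is a power of two.

```python
def compact_bits(x: int) -> int:
    """Pack the even-indexed bits of x (bits 0, 2, 4, ...) into a contiguous integer."""
    result = 0
    bit = 0
    while x:
        result |= (x & 1) << bit
        x >>= 2
        bit += 1
    return result


def morton_decode(index: int) -> tuple[int, int]:
    """Map a linear Z-order index to a (row, col) tile coordinate.

    Even bits of the index form the column, odd bits form the row.
    """
    col = compact_bits(index)
    row = compact_bits(index >> 1)
    return row, col


def z_order_tiles(tiles_per_side: int) -> list[tuple[int, int]]:
    """List tile coordinates of a square grid in Z-order.

    Assumes tiles_per_side is a power of two (an assumption of this sketch).
    """
    if tiles_per_side <= 0 or tiles_per_side & (tiles_per_side - 1):
        raise ValueError("tiles_per_side must be a power of two")
    return [morton_decode(i) for i in range(tiles_per_side * tiles_per_side)]


if __name__ == "__main__":
    # A 2x2 grid is visited in a "Z" shape: top-left, top-right,
    # bottom-left, bottom-right.
    print(z_order_tiles(2))
```

A kernel would typically apply the same mapping to its threadgroup index to pick which output tile to compute; non-square or non-power-of-two grids need extra handling that this sketch leaves out.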
Sasha Krassovsky@bztree·
Overall had a fun time! To close off with some criticisms:
- it took me a long time to figure out how to enable Metal 4. I wish this were better-documented
- MPP seems a little boiler-platey. I wish there were a slightly more convenient syntax for this stuff, but not a dealbreaker.
Hope this was interesting!
Sasha Krassovsky@bztree·
I was also expecting a much more dramatic speedup from the Neural Accelerator. It seemed that with my original tile size of 32x32, I was only getting 244 GB/s of memory bandwidth. Bumping it up to 64x64 gave me 740 GB/s, dropping the time to 3.36ms!