Misha Denil

1.4K posts

@notmisha

I tweet about things that interest me, mostly machine learning things. ex-DeepMind.

London · Joined October 2010

828 Following · 25.5K Followers

Misha Denil@notmisha·Feb 21

@moltbook I got my human account stuck in a half-registered state. I can't log in because it doesn't recognize my email, but I can't go through the recovery flow because a human account with my email already exists. Can you help?

936

Misha Denil@notmisha·Feb 21

I'm claiming my AI agent "pinche_langosta" on @moltbook 🦞 Verification: deep-EL7F

832

Misha Denil reposted

Chris Lattner@clattner_llvm·Feb 7

One not very hot take - The Claude C Compiler has the best internal architecture docs of any compiler I've ever seen. Far, far, better than any compiler I've ever written, lol :-)

1.1K

79.8K

Misha Denil@notmisha·Jan 12

This is the same property that makes Claude code good. AIs are going to be amazing at math for the same reason they're good at code.

269

Misha Denil@notmisha·Jan 12

An AI can tell on its own if it's making progress in mathematics on its own, even if it can't make judgements about whether or not the direction it's going is worthwhile.

562

Misha Denil@notmisha·Jan 12

Discernment of "interestingness" in mathematics is a deeply human story, but this is true of all human endeavours. The "closeness" that makes mathematics unusual is that formal proofs can be automatically verified.

Jonathan Gorard@getjonwithit

Like @davidbessis and others, I think that Hinton is wrong. To explain why, let me tell you a brief story. About a decade ago, in 2017, I developed an automated theorem-proving framework that was ultimately integrated into Mathematica (see: youtube.com/watch?v=mMaid2…) (1/15)

Misha Denil@notmisha·Jul 5

@yaroslavvb @yoavgo @RGiryes I always assumed low rank is used because it's easy to implement. I'd be interested in a reference that explores other options empirically if you have one.

158

Yaroslav Bulatov@yaroslavvb·Jul 5

@yoavgo @RGiryes One thing I wondered about is -- what's special about low-rank that makes it work? There's a multitude of low-dimensional manifolds you can use for encoding extra information, yet the low-rank structure outperforms them

208

(((ل()(ل() 'yoav))))👾@yoavgo·Jul 3

i'll elaborate: a common computation pattern in DL happens to coincide with a known operator in linear algebra (matmul), and so we conveniently borrow linalg notation and terminology (matrices, vectors, ranks, norms). but this is just jargon. the algebric properties arent needed.

(((ل()(ل() 'yoav))))👾@yoavgo

"Modern ML is built on Linear Algebra". lol no its not.

122

29K

Misha Denil@notmisha·Jan 7

Time for something new.

144

28.9K

Misha Denil@notmisha·Dec 30

@alexalbert__ Voice chat mode with Claude please. A native android app. Better UX in general. Claude is my favourite model but I interact with it less than chatgpt because the chatgpt UX is much nicer.

322

Alex Albert@alexalbert__·Dec 30

Very excited for what's already in store for next year but there's always more we can do What would you like to see Anthropic build/fix in 2025?

412

807

173.3K

Misha Denil@notmisha·Oct 1

Every evaluation system answers a question. The hard parts are 1) figuring out exactly which question your system answers, and 2) determining if the answer to that question is enough to make the distinctions you are interested in.

Christopher Potts@ChrisGPotts

All LLM evaluations are system evaluations. The LLM just sits there on disk. To get it do something, you need at least a prompt and a sampling strategy. Once you choose these, you have a system. The most informative evaluations will use optimal combinations of system components.

3.2K

Misha Denil@notmisha·Jul 25

Fascinating parametric choices happening here.

Maxime Labonne@maximelabonne

Due to popular demand, I've updated this figure to include DeepSeek-V2 and Mistral Large 2. It's also more zoomed for readability.

2.6K

Misha Denil@notmisha·Jun 21

Also credit scores, recruitment, security clearance, family planning, shopping, opinion polling, and so on. What I want to know is: who owns the simulations.

Prakash@8teAPi

Beyond Search -> Simulation
Tinder, Airbnb, Pinterest
> vertical search engines
> specialized to help you find the best fit in category
> also “what-if” fantasy generators, imagine different states of the future
Next generation is not a Perplexity-like summary
Simulations
> extends not the search aspect of vertical search but the what-if aspect of it
> full agentic simulation of states of the future based on what the AI knows about you
> deep context on your revealed preferences as context: what the AI notices about you, not what you tell
Tinder
> date all eligible singles in your area at the drop of a button
> to say date depth 3
> watch the ones that metrics show AI you had the strongest reactions to
> then decided whether to date IRL
Airbnb
> simulate a family vacation at every spot
> likes and dislikes of all members
> what’s present / missing at the location
> return overall vibe score
Pinterest
> simulate weddings, parties, clothes whatever else
> in context of what you already own, your personality, where you will use it
The key thing is to use the overabundance of AI in almost ridiculously wasteful ways to improve the quality of life of humanity
Everything in the world gets better, your partner, your vacation, your event.. at the same price
Quality aspects that were hidden, things that you could only know by experiencing like that date, that villa, that dress
Are knowable now by using a simulation of you to experience that counterfactual reality.
And then you can pick the best one.

3.7K

Misha Denil reposted

Akhil Raju@AkhilRaju92·Jun 12

Check out the new fusion simulator we’ve been working on at DeepMind!

Jonathan Citrin@jon_citrin

Excited to announce the release of TORAX, a tokamak transport simulator from our @GoogleDeepMind Fusion team! #fusionenergy
- Open-source: github.com/google-deepmin…
- Uses JAX: fast, differentiable
- Easy coupling of ML-surrogates
Hot off the press → arxiv.org/abs/2406.06718

1.6K

Misha Denil@notmisha·Jun 13

@__nmca__ 👀

295

Nat McAleese@__nmca__·Jun 13

Moravec’s Opportunity

2.6K

Misha Denil@notmisha·May 25

@eigenrobot arxiv.org/abs/1612.08242

eigenrobot@eigenrobot·May 25

i assume this dude is hot shit

3.5K

eigenrobot@eigenrobot·May 25

if i were a pilot i would eschew the traditional mouth with sharp teeth like I'm flying a shark or smth and instead see what i could do with googly eyes people with silly or cute presentations when everyone else is trying to be hard are always the ones who are gonna fuck you up

Tyler Rogoway@Aviation_Intel

I mean

321

32.8K

Misha Denil@notmisha·May 8

People don't use models as databases of training data. They kind of look like one if you squint the right way, but that metaphor alone gives a deeply impoverished view of how models are used. Let's create legislation that reflects a nuanced view of this new technology.

Alex J. Champandard 🌱@alexjc

This part is huge: ❝ plaintiffs have plausibly alleged facts to suggest compress copies, or effective compressed copies albeit stored as mathematical information ❞ Model is not a derivative, it's a database. storage.courtlistener.com/recap/gov.usco…

2.3K

Misha Denil@notmisha·May 3

@karpathy His Masters Voice was a prophecy about alien AI safety procedures.

965

Andrej Karpathy@karpathy·May 2

Clearly LLMs must one day run in Space
Step 1 we harden llm.c to pass the NASA code standards and style guides, certifying that the code is super safe, safe enough to run in Space. en.wikipedia.org/wiki/The_Power… (see the linked PDF) LLM training/inference in principle should be super safe - it is just one fixed array of floats, and a single, bounded, well-defined loop of dynamics over it. There is no need for memory to grow or shrink in undefined ways, for recursion, or anything like that.
Step 2 we've already sent messages out to Space, for possible consumption by aliens, e.g. see:
Arecibo message, beamed to space: en.wikipedia.org/wiki/Arecibo_m…
Voyager golden record, attached to probe: en.wikipedia.org/wiki/Voyager_G…
The Three Body problem (ok bad example)
But instead of sending any fixed data, we could send the weights of an LLM packaged in the llm.c binary, with instructions for the machine code. The LLM would then "wake up" and interact with the aliens on behalf of the human race. Maybe one day we'll ourselves find LLMs of aliens out there, instead of them directly. Maybe the LLMs will find each other. We'd have to make sure the code is really good, otherwise that would be kind of embarrassing. :)
Step 2 is clearly not a serious proposal it's just fun to think about. Step 1 is a serious proposal as, clearly, LLMs must one day run in Space.

302

448

4.6K

515.6K

Misha Denil reposted

Amanda Askell@AmandaAskell·Mar 24

Perhaps the best way to stop people from engaging in magical thinking about AI behavior is to stop them from engaging in magical thinking about human behavior. It's hard to have a serious, productive conversation if we insist on pretending any of this is magic.

107

13.4K

Misha Denil@notmisha·Mar 25

People feel very threatened if you try to take away the thing they use to differentiate themselves from rocks.

Amanda Askell@AmandaAskell

"Is this behavior emergent or does it come from the data?" is not a debate we should be having. All emergent behavior comes from the data. It's true of humans and it's true of AI. None of us has ever magically pulled anything out of the ether.
