David Liu

55 posts

@davidliuneuro

PhD ML & Comp Neuro @Cambridge_Eng | BA/MSci Theoretical Physics @DeptofPhysics | Prev: Meta Reality Labs, G-research London

Joined October 2021

129 Following · 100 Followers

David Liu reposted

Spencer Baggins@bigaiguy·6 Nov

🚨 MIT just humiliated every major AI lab and nobody’s talking about it.

They built a new benchmark called WorldTest to see if AI actually understands the world… and the results are brutal. Even the biggest models Claude, Gemini 2.5 Pro, OpenAI o3 got crushed by humans.

Here’s what makes it different: WorldTest doesn’t check how well an AI predicts the next word or frame. It measures if it can build an internal model of reality and use that to handle new situations.

They built AutumnBench 43 interactive worlds, 129 tasks where AIs must:
• Predict hidden parts of the world (masked-frame prediction)
• Plan multi-step actions to reach goals
• Detect when the rules of the environment suddenly change

Then they tested 517 humans vs the top models. Humans dominated every category. Even massive compute scaling barely helped.

The takeaway is wild: Today’s AIs don’t understand environments they just pattern-match inside them. They don’t explore, revise beliefs, or experiment like humans do.

WorldTest might be the first benchmark that actually measures understanding, not memorization. And the gap it reveals isn’t small it’s the next grand challenge in AI cognition.

(Comment “Send” and I’ll DM you the paper 👇)

257

676

2.6K

190.9K

David Liu reposted

AI at Meta@AIatMeta·14 Aug

Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense prediction tasks. Learn more about DINOv3 here: ai.meta.com/blog/dinov3-se…

343

754

4.4K

899K

David Liu reposted

Andrej Karpathy@karpathy·21 Oct

I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter.

The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language person) is whether pixels are better inputs to LLMs than text. Whether text tokens are wasteful and just terrible, at the input.

Maybe it makes more sense that all inputs to LLMs should only ever be images. Even if you happen to have pure text input, maybe you'd prefer to render it and then feed that in:
- more information compression (see paper) => shorter context windows, more efficiency
- significantly more general information stream => not just text, but e.g. bold text, colored text, arbitrary images.
- input can now be processed with bidirectional attention easily and as default, not autoregressive attention - a lot more powerful.
- delete the tokenizer (at the input)!! I already ranted about how much I dislike the tokenizer. Tokenizers are ugly, separate, not end-to-end stage. It "imports" all the ugliness of Unicode, byte encodings, it inherits a lot of historical baggage, security/jailbreak risk (e.g. continuation bytes). It makes two characters that look identical to the eye look as two completely different tokens internally in the network. A smiling emoji looks like a weird token, not an... actual smiling face, pixels and all, and all the transfer learning that brings along. The tokenizer must go.

OCR is just one of many useful vision -> text tasks. And text -> text tasks can be made to be vision -> text tasks. Not vice versa.

So maybe the User message is images, but the decoder (the Assistant response) remains text. It's a lot less obvious how to output pixels realistically... or if you'd want to.

Now I have to also fight the urge to side quest an image-input-only version of nanochat...

vLLM@vllm_project

🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support. 🧠 Compresses visual contexts up to 20× while keeping 97% OCR accuracy at <10×. 📄 Outperforms GOT-OCR2.0 & MinerU2.0 on OmniDocBench using fewer vision tokens. 🤝 The vLLM team is working with DeepSeek to bring official DeepSeek-OCR support into the next vLLM release — making multimodal inference even faster and easier to scale. 🔗 github.com/deepseek-ai/De… #vLLM #DeepSeek #OCR #LLM #VisionAI #DeepLearning

559

1.6K

13.3K

3.3M

David Liu reposted

AI at Meta@AIatMeta·5 Apr

Today is the start of a new era of natively multimodal AI innovation. Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality. Llama 4 Scout • 17B-active-parameter model with 16 experts. • Industry-leading context window of 10M tokens. • Outperforms Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 across a broad range of widely accepted benchmarks. Llama 4 Maverick • 17B-active-parameter model with 128 experts. • Best-in-class image grounding with the ability to align user prompts with relevant visual concepts and anchor model responses to regions in the image. • Outperforms GPT-4o and Gemini 2.0 Flash across a broad range of widely accepted benchmarks. • Achieves comparable results to DeepSeek v3 on reasoning and coding — at half the active parameters. • Unparalleled performance-to-cost ratio with a chat version scoring ELO of 1417 on LMArena. These models are our best yet thanks to distillation from Llama 4 Behemoth, our most powerful model yet. Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks. We’re excited to share more details about it even while it’s still in flight. Read more about the first Llama 4 models, including training and benchmarks ➡️ go.fb.me/gmjohs Download Llama 4 ➡️ go.fb.me/bwwhe9

824

2.4K

12.8K

3.7M

David Liu reposted

Matthew Berman@MatthewBerman·1 Apr

We knew very little about how LLMs actually work...until now. @AnthropicAI just dropped the most insane research paper, detailing some of the ways AI "thinks." And it's completely different than we thought. Here are their wild findings: 🧵

1.3K

10.3K

1.5M

David Liu@davidliuneuro·28 Mar

@Innerdevcrypto 😂

Innerdevcrypto@Innerdevcrypto·27 Mar

I hear some voices coming out of the blissful darkness…. “Hello…hello…are you ok…should we call an ambulance?”…three uniformed supermarkt-workers stand next to me I slowly open my eyes…. ´Uhm…hello…no…i mean yes, i am ok…´ “But you have been standing still here for half an hour in front of the milk, another customer warned us, are you ok?” ´Sorry, just lost track of time, all ok, thanks´ I start laughing They look at me as if am totally insane Guess i should remind myself to not turn inside in public spaces, or enter stores dressed like the big lewbowski blissfully drooling with my eyes closed like some happy zombie, before you know it they take me to an insane asylum. Did not have to play the autism card this time so that was nice 😉 Good night

127

9.2K

David Liu reposted

David D. Baek@dbaek__·4 Feb

1/9 🚨 New Paper Alert: Cross-Entropy Loss is NOT What You Need! 🚨 We introduce harmonic loss as alternative to the standard CE loss for training neural networks and LLMs! Harmonic loss achieves 🛠️significantly better interpretability, ⚡faster convergence, and ⏳less grokking!

519

3.9K

1.2M

David Liu reposted

James Campbell@jam3scampbell·24 Jan

The Road to AGI along with @Emiliano_GLopez (who's awesome, go follow), I built an interactive timeline of everything in AI the past few years we're living through the most exciting time in history and this site hopes to document it! go visit: ai-timeline dot org (link below)

561

92.4K

David Liu reposted

AK@_akhaliq·13 Mar

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

240

1.6K

160.5K

David Liu reposted

Akira Yoshiyama ⁂@yoshiyama_akira·6 Mar

Happy to announce we outperformed @OpenAI o1 with a 7B model :) We released two self-improvement methods for verifiable domains in our preliminary paper -->

104

245

3.6K

505.5K

David Liu reposted

Bindu Reddy@bindureddy·8 Mar

Mercury Is The First Diffusion LLM! AI simply groks the patterns of the universe. Diffusion LLMs literally manifest the LLM response and are so next generation This is Mercury! The world’s first diffusion LLM

371

49.3K

David Liu reposted

Miles Cranmer@MilesCranmer·26 Feb

Why 'I don’t know' is the true test for AGI—it’s a strictly harder problem than text generation! This magnificent 62-page paper (arxiv.org/abs/2408.02357) formally proves AGI hallucinations are inevitable, with 50 pages (!!) of supplementary proofs.

133

921

91.1K

David Liu reposted

David Duvenaud@DavidDuvenaud·27 Feb

LLMs have complex joint beliefs about all sorts of quantities. And my postdoc @jamesrequeima visualized them! In this thread we show LLM predictive distributions conditioned on data and free-form text. LLMs pick up on all kinds of subtle and unusual structure: 🧵

197

1.6K

193.6K

David Liu reposted

Aleksander Madry@aleks_madry·6 Feb

Do current LLMs perform simple tasks (e.g., grade school math) reliably? We know they don't (is 9.9 larger than 9.11?), but why? Turns out that, for one reason, benchmarks are too noisy to pinpoint such lingering failures. w/ @josh_vendrow @EdwardVendrow @sarameghanbeery 1/5

237

48.9K

David Liu reposted

Brian S. Kim@itchdoctor·20 Feb

Cancer neuroimmunology is real. Nociceptive neurons promote gastric tumour progression via a CGRP–RAMP1 axis | Nature nature.com/articles/s4158…

177

21.2K

David Liu reposted

Guillaume Bellec@BellecGuill·7 Jun

Pre-print: machine learning for neuroscience We build interpretable biological network reconstructions from electrode recordings with ML and optimal transport. Towards models of mechanisms driving behavior, we focus on single-trial neural activity and trial variability 1/6

305

67.8K

David Liu reposted

Darshan 🦖@darshan·27 Dec

The most misunderstood condition: Brain fog. It's not just fatigue. It's not just stress. Here's what's really happening inside your body:

122

1.4K

9.3K

1.9M

David Liu reposted

Kevin Patrick Murphy@sirbayes·9 Dec

I am happy to announce that the first draft of my RL tutorial is now available. arxiv.org/abs/2412.05265

724

4.4K

320.7K

David Liu reposted

Scott Linderman@scott_linderman·9 Dec

I'm excited to share our #NeurIPS2024 paper, "Modeling Latent Neural Dynamics with Gaussian Process Switching Linear Dynamical Systems" 🧠✨ We introduce the gpSLDS, a new model for interpretable analysis of latent neural dynamics! 🧵 1/10

136

16.7K

David Liu reposted

Laura Ruis@LauraRuis·20 Nov

How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️

208

987

197.7K
