David Liu

55 posts

@davidliuneuro

PhD ML & Comp Neuro @Cambridge_Eng | BA/MSci Theoretical Physics @DeptofPhysics | Prev: Meta Reality Labs, G-Research London

Joined October 2021
129 Following · 100 Followers
David Liu reposted
Spencer Baggins @bigaiguy
🚨 MIT just humiliated every major AI lab and nobody’s talking about it.

They built a new benchmark called WorldTest to see if AI actually understands the world… and the results are brutal. Even the biggest models (Claude, Gemini 2.5 Pro, OpenAI o3) got crushed by humans.

Here’s what makes it different: WorldTest doesn’t check how well an AI predicts the next word or frame. It measures whether it can build an internal model of reality and use that to handle new situations.

They built AutumnBench: 43 interactive worlds and 129 tasks where AIs must:
• Predict hidden parts of the world (masked-frame prediction)
• Plan multi-step actions to reach goals
• Detect when the rules of the environment suddenly change

Then they tested 517 humans vs the top models. Humans dominated every category. Even massive compute scaling barely helped.

The takeaway is wild: Today’s AIs don’t understand environments; they just pattern-match inside them. They don’t explore, revise beliefs, or experiment like humans do.

WorldTest might be the first benchmark that actually measures understanding, not memorization. And the gap it reveals isn’t small; it’s the next grand challenge in AI cognition.

(Comment “Send” and I’ll DM you the paper 👇)
Replies 257 · Reposts 676 · Likes 2.6K · Views 190.9K
David Liu reposted
AI at Meta @AIatMeta
Introducing DINOv3: a state-of-the-art computer vision model trained with self-supervised learning (SSL) that produces powerful, high-resolution image features. For the first time, a single frozen vision backbone outperforms specialized solutions on multiple long-standing dense prediction tasks. Learn more about DINOv3 here: ai.meta.com/blog/dinov3-se…
Replies 343 · Reposts 754 · Likes 4.4K · Views 899K
David Liu reposted
Andrej Karpathy @karpathy
I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter.

The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language person) is whether pixels are better inputs to LLMs than text. Whether text tokens are wasteful and just terrible, at the input.

Maybe it makes more sense that all inputs to LLMs should only ever be images. Even if you happen to have pure text input, maybe you'd prefer to render it and then feed that in:
- more information compression (see paper) => shorter context windows, more efficiency
- significantly more general information stream => not just text, but e.g. bold text, colored text, arbitrary images.
- input can now be processed with bidirectional attention easily and as default, not autoregressive attention - a lot more powerful.
- delete the tokenizer (at the input)!! I already ranted about how much I dislike the tokenizer. Tokenizers are ugly, separate, not end-to-end stage. It "imports" all the ugliness of Unicode, byte encodings, it inherits a lot of historical baggage, security/jailbreak risk (e.g. continuation bytes). It makes two characters that look identical to the eye look as two completely different tokens internally in the network. A smiling emoji looks like a weird token, not an... actual smiling face, pixels and all, and all the transfer learning that brings along. The tokenizer must go.

OCR is just one of many useful vision -> text tasks. And text -> text tasks can be made to be vision -> text tasks. Not vice versa.

So maybe the User message is images, but the decoder (the Assistant response) remains text. It's a lot less obvious how to output pixels realistically... or if you'd want to.

Now I have to also fight the urge to side quest an image-input-only version of nanochat...
vLLM @vllm_project

🚀 DeepSeek-OCR, the new frontier of OCR from @deepseek_ai, exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G), powered by vllm==0.8.5 for day-0 model support.
🧠 Compresses visual contexts up to 20× while keeping 97% OCR accuracy at <10×.
📄 Outperforms GOT-OCR2.0 & MinerU2.0 on OmniDocBench using fewer vision tokens.
🤝 The vLLM team is working with DeepSeek to bring official DeepSeek-OCR support into the next vLLM release, making multimodal inference even faster and easier to scale.
🔗 github.com/deepseek-ai/De…
#vLLM #DeepSeek #OCR #LLM #VisionAI #DeepLearning

Replies 559 · Reposts 1.6K · Likes 13.3K · Views 3.3M
David Liu reposted
AI at Meta @AIatMeta
Today is the start of a new era of natively multimodal AI innovation.

Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick, our most advanced models yet and the best in their class for multimodality.

Llama 4 Scout
• 17B-active-parameter model with 16 experts.
• Industry-leading context window of 10M tokens.
• Outperforms Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 across a broad range of widely accepted benchmarks.

Llama 4 Maverick
• 17B-active-parameter model with 128 experts.
• Best-in-class image grounding with the ability to align user prompts with relevant visual concepts and anchor model responses to regions in the image.
• Outperforms GPT-4o and Gemini 2.0 Flash across a broad range of widely accepted benchmarks.
• Achieves comparable results to DeepSeek v3 on reasoning and coding, at half the active parameters.
• Unparalleled performance-to-cost ratio with a chat version scoring ELO of 1417 on LMArena.

These models are our best yet thanks to distillation from Llama 4 Behemoth, our most powerful model yet. Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks. We’re excited to share more details about it even while it’s still in flight.

Read more about the first Llama 4 models, including training and benchmarks ➡️ go.fb.me/gmjohs
Download Llama 4 ➡️ go.fb.me/bwwhe9
Replies 824 · Reposts 2.4K · Likes 12.8K · Views 3.7M
David Liu reposted
Matthew Berman @MatthewBerman
We knew very little about how LLMs actually work...until now. @AnthropicAI just dropped the most insane research paper, detailing some of the ways AI "thinks." And it's completely different than we thought. Here are their wild findings: 🧵
Replies 81 · Reposts 1.3K · Likes 10.3K · Views 1.5M
Innerdevcrypto @Innerdevcrypto
I hear some voices coming out of the blissful darkness…

“Hello… hello… are you OK? Should we call an ambulance?” Three uniformed supermarket workers stand next to me.

I slowly open my eyes…

“Uhm… hello… no… I mean yes, I am OK…”

“But you have been standing still here for half an hour in front of the milk. Another customer warned us. Are you OK?”

“Sorry, just lost track of time. All OK, thanks.”

I start laughing. They look at me as if I am totally insane.

Guess I should remind myself not to turn inside in public spaces, or enter stores dressed like the Big Lebowski, blissfully drooling with my eyes closed like some happy zombie. Before you know it, they take me to an insane asylum.

Did not have to play the autism card this time, so that was nice 😉

Good night
Replies 14 · Reposts 1 · Likes 127 · Views 9.2K
David Liu reposted
David D. Baek @dbaek__
1/9 🚨 New Paper Alert: Cross-Entropy Loss is NOT What You Need! 🚨 We introduce harmonic loss as an alternative to the standard CE loss for training neural networks and LLMs! Harmonic loss achieves 🛠️ significantly better interpretability, ⚡ faster convergence, and ⏳ less grokking!
Replies 73 · Reposts 519 · Likes 3.9K · Views 1.2M
David Liu reposted
James Campbell @jam3scampbell
The Road to AGI

Along with @Emiliano_GLopez (who's awesome, go follow), I built an interactive timeline of everything in AI the past few years.

We're living through the most exciting time in history, and this site hopes to document it!

Go visit: ai-timeline dot org (link below)
Replies 32 · Reposts 62 · Likes 561 · Views 92.4K
David Liu reposted
AK @_akhaliq
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Replies 23 · Reposts 240 · Likes 1.6K · Views 160.5K
David Liu reposted
Akira Yoshiyama ⁂ @yoshiyama_akira
Happy to announce we outperformed @OpenAI o1 with a 7B model :) We released two self-improvement methods for verifiable domains in our preliminary paper -->
Replies 104 · Reposts 245 · Likes 3.6K · Views 505.5K
David Liu reposted
Bindu Reddy @bindureddy
Mercury Is The First Diffusion LLM!

AI simply groks the patterns of the universe. Diffusion LLMs literally manifest the LLM response and are so next generation.

This is Mercury! The world’s first diffusion LLM
Replies 70 · Reposts 54 · Likes 371 · Views 49.3K
David Liu reposted
Miles Cranmer @MilesCranmer
Why 'I don’t know' is the true test for AGI—it’s a strictly harder problem than text generation! This magnificent 62-page paper (arxiv.org/abs/2408.02357) formally proves AGI hallucinations are inevitable, with 50 pages (!!) of supplementary proofs.
Replies 46 · Reposts 133 · Likes 921 · Views 91.1K
David Liu reposted
David Duvenaud @DavidDuvenaud
LLMs have complex joint beliefs about all sorts of quantities. And my postdoc @jamesrequeima visualized them! In this thread we show LLM predictive distributions conditioned on data and free-form text. LLMs pick up on all kinds of subtle and unusual structure: 🧵
Replies 30 · Reposts 197 · Likes 1.6K · Views 193.6K
David Liu reposted
Aleksander Madry @aleks_madry
Do current LLMs perform simple tasks (e.g., grade school math) reliably? We know they don't (is 9.9 larger than 9.11?), but why? Turns out that, for one reason, benchmarks are too noisy to pinpoint such lingering failures. w/ @josh_vendrow @EdwardVendrow @sarameghanbeery 1/5
Replies 12 · Reposts 46 · Likes 237 · Views 48.9K
David Liu reposted
Brian S. Kim @itchdoctor
Cancer neuroimmunology is real. Nociceptive neurons promote gastric tumour progression via a CGRP–RAMP1 axis | Nature nature.com/articles/s4158…
Replies 2 · Reposts 44 · Likes 177 · Views 21.2K
David Liu reposted
Guillaume Bellec @BellecGuill
Pre-print: machine learning for neuroscience

We build interpretable biological network reconstructions from electrode recordings with ML and optimal transport. Towards models of mechanisms driving behavior, we focus on single-trial neural activity and trial variability. 1/6
Replies 1 · Reposts 71 · Likes 305 · Views 67.8K
David Liu reposted
Darshan 🦖 @darshan
The most misunderstood condition: Brain fog. It's not just fatigue. It's not just stress. Here's what's really happening inside your body:
Replies 122 · Reposts 1.4K · Likes 9.3K · Views 1.9M
David Liu reposted
Scott Linderman @scott_linderman
I'm excited to share our #NeurIPS2024 paper, "Modeling Latent Neural Dynamics with Gaussian Process Switching Linear Dynamical Systems" 🧠✨ We introduce the gpSLDS, a new model for interpretable analysis of latent neural dynamics! 🧵 1/10
Replies 2 · Reposts 17 · Likes 136 · Views 16.7K
David Liu reposted
Laura Ruis @LauraRuis
How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this: Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢 🧵⬇️
Replies 24 · Reposts 208 · Likes 987 · Views 197.7K