Taiming Lu

18 posts

@TaimingLu

Ph.D. student at @Princeton | Formerly @JohnsHopkins ’25, @HopkinsDSAI @JHUCompSci @jhuclsp @CCVLatJHU | AI/ML/NLP/CV

Princeton, NJ · Joined June 2024
601 Following · 358 Followers
Muhan Gao @muhan_gao
🤖 We often talk about “context rot”: LLMs get worse as context grows. But once distracting information enters, is it just “a bit more noise → a bit worse performance”? Our #ICML2026 paper finds: no! 🤯 Instead, we reveal a striking "First Drop of Ink" effect: the first few hard distractors do almost all of the damage, just as one drop of ink clouds clear water. Paper link: arxiv.org/abs/2605.10828
6
15
45
10K
Kenton Murray @kentonmurray
I'm excited to announce that this Fall I will be joining the Computer Science Department at George Mason University as an Asst. Prof. I'll be expanding my lab and looking for PhD students to work on multilingual AI problems across text, video, and speech. cs.gmu.edu
28
23
210
16.6K
Taiming Lu @TaimingLu
Teacher–student compatibility matters more than raw teacher strength. This changes how you pick a teacher: both for frontier training (where the best available teacher is often a prior generation) and for efficient small models, where "bigger teacher is better" isn't the right rule. Thanks @liuzhuang1234 for the support! arxiv: arxiv.org/abs/2605.23857 code: github.com/zlab-princeton…
1
2
10
961
Taiming Lu @TaimingLu
Distillation improves generalization more readily than in-domain fit. Out-of-distribution perplexity and downstream accuracy improve more consistently than in-domain perplexity: some configurations help OOD and downstream metrics while doing nothing for in-domain perplexity.
1
2
7
1K
Taiming Lu @TaimingLu
Knowledge doesn't always flow downhill. We find that in LLM pretraining, a weaker teacher can improve a stronger student, and pushing the teacher further can actually hurt. New paper: Strong Teacher Not Needed? On Distillation in LLM Pretraining.
7
57
348
46.4K
Daniel Khashabi 🕊️ @DanielKhashabi
Very honored and excited to receive the NSF CAREER Award! HUGE thank you to my amazing students, collaborators, mentors, and advisors, who helped make this happen. And to my family who are the real heroes in my story! ♥️
24
10
198
11.1K
Taiming Lu retweeted
Zhuang Liu @liuzhuang1234
Stronger Normalization-Free Transformers – new paper. We introduce Derf (Dynamic erf), a simple point-wise layer that lets norm-free Transformers not only work, but actually outperform their normalized counterparts.
19
175
1.1K
166.2K
Taiming Lu retweeted
Jieneng Chen @jieneng_chen
🤯 Think better visuals mean better world models? Think again. 💥 Surprise: Agents don’t need eye candy— they need wins. Meet World-in-World, the first open benchmark that ranks world models by closed-loop task success, not pixels. We uncover 3 shocks: 1️⃣ Visuals ≠ utility 2️⃣ Action data > bigger models 3️⃣ Scaling test-time compute = more success 🤗 huggingface.co/papers/2510.18… 🌍 world-in-world.github.io 📄 arxiv.org/abs/2510.18135 github.com/World-In-World…
2
38
153
42.5K
Taiming Lu retweeted
Zhuang Liu @liuzhuang1234
Excited to share our lab’s first open-source release: LLM-Distillation-JAX supports practical knowledge distillation configurations (distillation strength, temperature, top-k/top-p), built on MaxText, designed for reproducible JAX/Flax training on both TPUs and GPUs.
4
30
222
20.7K
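The "distillation strength" and "temperature" settings named in the release announcement above can be illustrated with the standard textbook distillation loss. This is a minimal pure-Python sketch under stated assumptions: the function names `softmax` and `distillation_loss` and the parameters `alpha` and `temperature` are hypothetical and are not taken from LLM-Distillation-JAX or MaxText, and the loss shown is the common temperature-scaled KL formulation, not necessarily the one the library uses. Top-k/top-p truncation is not covered.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Blend hard-label cross-entropy with a soft-label KL term.

    alpha (hypothetical name) plays the role of distillation strength:
    0.0 ignores the teacher, 1.0 ignores the ground-truth label.
    """
    # Soft targets: teacher and student distributions at the same temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student_soft = softmax(student_logits, temperature)

    # KL(teacher || student) over the softened distributions.
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student_soft)
        if pt > 0.0
    )

    # Hard-label cross-entropy at temperature 1.
    ce = -math.log(softmax(student_logits)[label])

    # The temperature**2 factor is the usual rescaling that keeps the
    # soft-term gradients comparable across temperatures.
    return (1.0 - alpha) * ce + alpha * temperature ** 2 * kl


if __name__ == "__main__":
    student = [2.0, 0.5, -1.0]
    teacher = [0.1, 3.0, 0.2]
    print(distillation_loss(student, teacher, label=0))
```

Setting `alpha=0.0` reduces the loss to plain cross-entropy, and identical student and teacher logits make the KL term vanish, which gives two quick sanity checks for any implementation of this kind.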
Taiming Lu retweeted
Jieneng Chen @jieneng_chen
Thrilled to introduce GenEx: Generating an Explorable World. ✨ GenEx takes a single image 🖼️ and creates a 3D generative world 🌍: you can dive in for interactive exploration, and so can an embodied AI agent. Follow our X for more demos: x.com/genex_world Paper on huggingface: huggingface.co/papers/2412.09… Tech details: genex.world (1/n)
2
31
103
10.1K
Taiming Lu retweeted
GenEx @genex_world
Introducing GenEx: Turn any image into a 3D world adventure! 1️⃣ Create a fully explorable 360° world in 3D from just a single image! 2️⃣ Explore interactively or with GPT assistance. 3️⃣ Advance embodied AI with this imagined world! Check out our website: genex.world
2
11
37
10.5K
Taiming Lu retweeted
Jieneng Chen @jieneng_chen
Introducing Genex: Generative World Explorer. 🧠 Humans mentally explore unseen parts of the world, revising their beliefs with imagined observations. ✨ Genex replicates this human-like ability, advancing embodied AI in planning with partial observations. (1/6)
6
49
164
37K
Taiming Lu retweeted
Muhan Gao @muhan_gao
🤖LLMs know more long-context information than they show! 🔍Probing reveals higher accuracy than generation output. #LLMs know but don't tell.🤐 The earlier relevant information is learned within the layers, the higher the final output accuracy! 📈 (arxiv.org/abs/2406.14673)
5
6
14
2.2K