Victoria X Lin

1.4K posts

@VictoriaLinML

MTS @thinkymachines | Native Multimodal Intelligence Prev: @AIatMeta @SFResearch • PhD @uwcse

San Francisco Bay Area · Joined December 2010

1K Following · 4K Followers

Pinned Tweet

Victoria X Lin@VictoriaLinML·12 May

✨We are showing some experiments with interaction models @ThinkyMachines: models that could see and hear continuously while processing tasks in the background and generating responses in real-time. Interaction models offer a glimpse into a future where people collaborate with AI the same way we do with other people. Read our announcement post to explore the capabilities this model unlocks.

Thinking Machines@thinkymachines

People talk, listen, watch, think, and collaborate at the same time, in real time. We've designed an AI that works with people the same way. We share our approach, early results, and a quick look at our model in action. thinkingmachines.ai/blog/interacti…

5.6K

Victoria X Lin retweeted

Steven Feng@stevenyfeng·2 Apr

We’re bringing back Stanford’s CS25 Transformers course tomorrow! 🤖 It’s open to everyone (in-person + online). Weekly talks (every Thursday) from top AI researchers. One of Stanford’s most popular AI seminar courses. Don’t miss out! More info below 👇 (1/7)

645

58.2K

Victoria X Lin retweeted

Thinking Machines@thinkymachines·5d

We are offering grants of $100,000 + Tinker credits to researchers advancing the field of human-AI interactivity. Submit your proposals by June 19th! thinkingmachines.ai/news/interacti…

192

1.6K

564.3K

Victoria X Lin retweeted

Thinking Machines@thinkymachines·11 May

459

1.9K

15.6K

7.6M

Victoria X Lin retweeted

Alexander Kirillov@_alex_kirillov_·11 May

Working on the interaction models is a lot of fun at TML! I can't imagine doing that in a turn-based world. Building it from scratch makes a lot of things so much easier. I am very excited about the future of natively multi-modal, multi-stream, multi-task models.

178

21.4K

Victoria X Lin@VictoriaLinML·30 Apr

Could @Waymo roll out a “working pod” series where you can comfortably set up your laptop and get work done during the ride?

1.6K

Victoria X Lin retweeted

Yuandong Tian@tydsh·24 Apr

My solo paper is accepted in ICLR'26 in Brazil. It discovers the training dynamics of grokking behaviors (phase transition memorization -> generalization) in basic settings, and derives provable scaling laws that enable such dynamics to happen. Unfortunately I won't be able to come to Brazil and present. Here is the poster I made, if people are interested to check: yuandong-tian.com/posters/poster… Will mention that paper in the upcoming invited workshop talks as well. Enjoy~

Yuandong Tian@tydsh

🚨New work: Provable Scaling Laws of Feature Emergence from Learning Dynamics of Grokking (arxiv.org/abs/2509.21519) In this work we propose a mathematical framework, named Li2, that explains the dynamics of grokking (i.e., delayed generalization) in 2-layer nonlinear networks. Specifically, it 1️⃣ Tells exactly what features will emerge during training. 2️⃣ Gives provable scaling laws of generalization/memorization, i.e. O(M log M) data samples suffice for generalization behavior of group arithmetic task of order M group. 3️⃣ Provides a more fundamental explanation for the popular empirical hypothesis that "generalization circuits learn slower but is more efficient than memorization circuits". So how?

417

65.1K

Victoria X Lin retweeted

Lisha@lishali88·23 Apr

x.com/i/article/2046…

252

138.2K

Victoria X Lin@VictoriaLinML·22 Apr

@weiyaow1 @thinkymachines Welcome Weiyao! 🙌

301

weiyaow@weiyaow1·21 Apr

After 8 years at Meta (FAIR/MSL) working on multi-modal perception and generations — Gradient-Blending, UVO, SAM3D — I've joined @thinkymachines this week to keep working on multi-modal. Excited for what's ahead.

337

37.7K

Victoria X Lin@VictoriaLinML·22 Apr

@ysu_nlp @NeoCognition Congratulations @ysu_nlp 🎉

392

Yu Su@ysu_nlp·21 Apr

Introducing @NeoCognition, the agent lab for specialized intelligence. Everyone needs experts, but human expertise does not scale. Backed by $40M seed funding, we build self-learning agents that specialize across domains to make expertise abundant.

135

875

182.8K

Victoria X Lin retweeted

Akari Asai@AkariAsai·17 Apr

Not many PhD students know about compute grants, but they can make a huge difference. During my PhD, I got access to Stability AI's HPC cluster through a small proposal and used it for Self-RAG training. Great practical post by @_emliu!

Emmy Liu@_emliu

wrote a guide on getting compute grants as a student, something I wish I did more at the beginning of my PhD. It's honestly one of the highest ROI things you can do as a student (we've gotten 100k+ gpu hrs for roughly 2 weeks of work writing). nightingal3.github.io/blog/2026/04/1…

441

82.5K

Victoria X Lin retweeted

Stonehenge U.K@ST0NEHENGE·14 Apr

Stunning crescent moon rising above Stonehenge this morning 🤩😍🌙 Photo credit Nick Bull 🙏 #moon #crescentmoon #sunrise #april #spring

372

22.9K

Victoria X Lin@VictoriaLinML·11 Apr

@tydsh Thank you @tydsh for your fantastic collaboration and support 🙌

629

Yuandong Tian@tydsh·10 Apr

Our work on post-training models for parallel thinking (ThreadWeaver) is now open sourced! Our Data Gen/SFT/RL recipes are now fully open 😀. The idea is to 1️⃣ rewrite the sequential thinking traces to be parallel with LLMs, 2️⃣ design efficient kernels for training/inference and 3️⃣ smartly design the reward signal for RL. Thanks @LongTonyLian and @VictoriaLinML for the great work!

Long Lian@LongTonyLian

Our parallel reasoning project ThreadWeaver is now open-sourced 🎉! Check out our Data Gen/SFT/RL recipe at github.com/facebookresear… In case you don't know, ThreadWeaver 🧵⚡️ is the first parallel reasoning method to achieve comparable reasoning performance to widely-used sequential long-CoT LLMs, with up to 3x speedup across 6 challenging tasks.

244

32.2K

Victoria X Lin retweeted

Long Lian@LongTonyLian·8 Apr

AK@_akhaliq

ThreadWeaver Adaptive Threading for Efficient Parallel Reasoning in Language Models

127

55.7K

Victoria X Lin retweeted

Mira Murati@miramurati·10 Mar

Grateful to Jensen and @nvidia team for their support. Together, we’re working to deploy at least 1GW of Vera Rubin systems, bringing adaptable collaborative AI to everyone. thinkingmachines.ai/nvidia-partner…

168

284

3.9K

558.7K

Victoria X Lin retweeted

Thinking Machines@thinkymachines·10 Mar

We are partnering with @nvidia to power our frontier model training and platforms delivering customizable AI. thinkingmachines.ai/news/nvidia-pa…

101

166

2.4K

660.7K

Victoria X Lin@VictoriaLinML·8 Mar

☕ Society will reward tremendously those who can effortlessly spot mistakes made by autonomous agents.

3.7K

Victoria X Lin retweeted

Tri Dao@tri_dao·26 Feb

This was a wild bug hunt, weeks of effort from @MayankMish98 to track down. The wrong init of Mamba2 in many reimplementations causes the layer to decay its states too quickly, focusing on short context instead. Pretraining is mostly about getting these little things right

Mayank Mishra@MayankMish98

We identified an issue with the Mamba-2 🐍 initialization in HuggingFace and FlashLinearAttention repository (dt_bias being incorrectly initialized). This bug is related to 2 main issues: 1. init being incorrect (torch.ones) if Mamba-2 layers are used in isolation without the Mamba2ForCausalLM model class (this has been already fixed: github.com/fla-org/flash-…). 2. Skipping initialization due to meta device init for DTensors with FSDP-2 (github.com/fla-org/flash-… will fix this issue upon merging). The difference is substantial. Mamba-2 seems to be quite sensitive to the initialization. Check out our experiments at the 7B MoE scale: wandb.ai/mayank31398/ma… Special thanks to @kevinyli_, @bharatrunwal2, @HanGuo97, @tri_dao and @_albertgu 🙏 Also thanks to @SonglinYang4 for quickly helping in merging the PR.
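A minimal sketch of why a `dt_bias` of all ones would make states decay quickly, as the thread above describes. It is not the code from either repository. It assumes the commonly used scheme in which the step size is `softplus(dt_bias)` and a target step is drawn log-uniformly, then stored through the inverse of softplus. The bounds `dt_min`, `dt_max`, `dt_floor`, the scalar `a = -1.0`, and all helper names are illustrative assumptions.

```python
import math
import random


def softplus(x: float) -> float:
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def inverse_softplus(y: float) -> float:
    # Solves softplus(x) = y for y > 0: x = y + log(1 - exp(-y)).
    return y + math.log(-math.expm1(-y))


def sample_dt_bias(rng, dt_min=1e-3, dt_max=1e-1, dt_floor=1e-4):
    # Assumed scheme: draw dt log-uniformly in [dt_min, dt_max],
    # then store the pre-softplus value so softplus(dt_bias) == dt.
    log_dt = rng.uniform(math.log(dt_min), math.log(dt_max))
    dt = max(math.exp(log_dt), dt_floor)
    return inverse_softplus(dt)


def per_step_decay(dt_bias: float, a: float = -1.0) -> float:
    # Fraction of state kept per step in a simplified scalar SSM
    # (a = -1.0 is an illustrative value, not a real model parameter).
    return math.exp(softplus(dt_bias) * a)


rng = random.Random(0)
good_decays = [per_step_decay(sample_dt_bias(rng)) for _ in range(1000)]
bad_decay = per_step_decay(1.0)  # dt_bias initialized to ones

print(f"log-uniform init keeps {min(good_decays):.3f} to {max(good_decays):.3f} per step")
print(f"ones init keeps {bad_decay:.3f} per step")
```

Under these assumptions the ones-initialized bias gives a step size of about 1.31, so the state keeps only about 27% of its value per step, while the log-uniform scheme keeps more than 90%.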

374

32.3K

Victoria X Lin retweeted

Boris Cherny@bcherny·1 Feb

I'm Boris and I created Claude Code. I wanted to quickly share a few tips for using Claude Code, sourced directly from the Claude Code team. The way the team uses Claude is different than how I use it. Remember: there is no one right way to use Claude Code -- everyone's setup is different. You should experiment to see what works for you!

924

5.9K

50.9K

9.2M

Victoria X Lin retweeted

Long Lian@LongTonyLian·27 Jan

Love seeing parallel thinking & subagents pushing efficiency and performance on Kimi K2.5! 🚀 Also nice to see shared takeaways with our parallel reasoning work ThreadWeaver: 1️⃣ an auxiliary parallelization reward prevents collapse, and 2️⃣ the critical path is the key🔑

Kimi.ai@Kimi_Moonshot

🥝 Meet Kimi K2.5, Open-Source Visual Agentic Intelligence. 🔹 Global SOTA on Agentic Benchmarks: HLE full set (50.2%), BrowseComp (74.9%) 🔹 Open-source SOTA on Vision and Coding: MMMU Pro (78.5%), VideoMMMU (86.6%), SWE-bench Verified (76.8%) 🔹 Code with Taste: turn chats, images & videos into aesthetic websites with expressive motion. 🔹 Agent Swarm (Beta): self-directed agents working in parallel, at scale. Up to 100 sub-agents, 1,500 tool calls, 4.5× faster compared with single-agent setup. - 🥝 K2.5 is now live on kimi.com in chat mode and agent mode. 🥝 K2.5 Agent Swarm in beta for high-tier users. 🥝 For production-grade coding, you can pair K2.5 with Kimi Code: kimi.com/code - 🔗 API: platform.moonshot.ai 🔗 Tech blog: kimi.com/blogs/kimi-k2-… 🔗 Weights & code: huggingface.co/moonshotai/Kim…

5.3K
