
𝗣𝗿𝗶𝘃𝗮𝘁𝗲 𝘀𝘆𝗻𝘁𝗵𝗲𝘁𝗶𝗰 𝘁𝗲𝘅𝘁 𝗴𝗲𝗻𝗲𝗿𝗮𝘁𝗶𝗼𝗻 has had the same problem for a while: privacy, quality, or efficiency - pick two 😵💫 We think 𝐄𝐏𝐒𝐕𝐞𝐜 changes that 🚀 Paper: arxiv.org/abs/2602.21218
Deqing Fu

@DeqingFu
PhD-ing @CSatUSC. Alum @UChicago, B.S. '20, M.S. '22. Interpretability of LLMs; DL Theory; NLP | prev research intern @MetaAI @Google




1+1=3 2+2=5 3+3=? Many language models (e.g., Llama 3 8B, Mistral v0.1 7B) will answer 7. But why? We dig into the model internals, uncover a function induction mechanism, and find that it’s broadly reused when models encounter surprises during in-context learning. 🧵
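The two surprising in-context examples are consistent with a simple induced rule, "sum plus one", which is one way to see why the models answer 7. A minimal sketch of that hypothesized rule (the function name is illustrative, not from the paper):

```python
# In-context examples that deviate from ordinary addition:
examples = [((1, 1), 3), ((2, 2), 5)]

def induced(a, b):
    # Hypothesized function the model induces from the surprises: a + b + 1.
    return a + b + 1

# The rule fits both examples and predicts the models' answer for 3+3:
assert all(induced(a, b) == y for (a, b), y in examples)
print(induced(3, 3))  # → 7
```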



In our recent NeurIPS 2024 paper (openreview.net/forum?id=i4Mut…), we find pretrained LLMs use Fourier features to add numbers (some have recently called it a helix). Is this representation so powerful that LLMs naturally prefer it? Introducing FoNE (Fourier Number Embedding): one token is all you need to encode any number, precisely. 🖇️Blog post: fouriernumber.github.io
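A minimal numpy sketch of the Fourier-features idea: encode a number as (cos, sin) pairs whose periods are powers of ten, then read each decimal digit back off the phases. Function names and the exact choice of periods are assumptions for illustration, not the paper's implementation:

```python
import numpy as np

def fone_encode(x, num_digits=5):
    # One (cos, sin) pair per decimal position; the pair for position k
    # has period 10**(k+1), so it encodes x modulo 10, 100, 1000, ...
    feats = []
    for k in range(num_digits):
        angle = 2 * np.pi * x / 10 ** (k + 1)
        feats.extend([np.cos(angle), np.sin(angle)])
    return np.array(feats)

def fone_decode(feats, num_digits=5):
    # Recover the number digit by digit from the phase of each pair.
    x = 0
    for k in range(num_digits):
        c, s = feats[2 * k], feats[2 * k + 1]
        phase = np.arctan2(s, c) % (2 * np.pi)
        x_mod = phase * 10 ** (k + 1) / (2 * np.pi)  # x mod 10**(k+1)
        digit = int(round((x_mod - x) / 10 ** k)) % 10
        x += digit * 10 ** k
    return x

print(fone_decode(fone_encode(31415)))  # → 31415
```

The round trip is exact for any integer with at most `num_digits` digits, which is the "one token, any number, precisely" property in miniature.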

Presenting Zebra-CoT: A large-scale dataset to teach models intrinsic multimodal reasoning: interleaving text and natively-generated images like a zebra's stripes. It moves beyond the limitations of external tool-based visual CoT. 🔗arxiv.org/abs/2507.16746 🤗huggingface.co/datasets/multi…

Helpful update for students: you can now take full practice SATs for free in the @GeminiApp. It uses vetted content from @ThePrincetonRev and gives you feedback straight away. Starting with the SAT today, but more tests are on the way!

Today marks the first-ever release of Cross-Layer Transcoders for Qwen3. BluelightAI has trained CLTs for Qwen3-0.6B and 1.7B, creating an explorable set of interpretable features that capture how Qwen3 represents concepts and transforms information across its layers. The Qwen3 Explorer allows you to examine these features directly, identify structure in the model’s representations, and use this understanding to analyze behavior, diagnose failures, and guide adaptations of Qwen3-based systems.
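A schematic sketch of what a cross-layer transcoder looks like architecturally: a single sparse feature dictionary read from one point in the residual stream, with a separate decoder predicting the MLP output at each of several layers. The class name, sizes, and details below are assumptions based on the general CLT setup, not BluelightAI's released code:

```python
import torch
import torch.nn as nn

class CrossLayerTranscoder(nn.Module):
    """Maps residual-stream activations to a shared sparse feature code,
    then decodes that one code into predicted MLP outputs for several
    downstream layers (one linear decoder per target layer)."""

    def __init__(self, d_model, n_features, n_target_layers):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoders = nn.ModuleList(
            nn.Linear(n_features, d_model) for _ in range(n_target_layers)
        )

    def forward(self, resid):
        # ReLU keeps feature activations nonnegative and sparse-ish;
        # each decoder reconstructs one layer's MLP output from them.
        code = torch.relu(self.encoder(resid))
        return code, [dec(code) for dec in self.decoders]

clt = CrossLayerTranscoder(d_model=8, n_features=32, n_target_layers=3)
code, mlp_preds = clt(torch.randn(4, 8))
```

Because one feature writes to several layers at once, the learned features describe how a concept propagates across the model rather than what a single layer does in isolation.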


Presenting VisualLens on Wednesday 11–2 #4804 at NeurIPS, with @DeqingFu. We show how personal photo libraries can power task-agnostic personalization, no domain-specific data needed. We'll talk about two new benchmarks for task-agnostic visual recommendation. Stop by to chat!



So is the formula to just name the most famous institutions and call it an X paper? Neither the first nor the last author is from Anthropic or Stanford. I get that reputation matters for publicity, but it does seem a little disrespectful.


