Lequn Chen
@abcdabcd987

49 posts

Faster and cheaper LLM inference.

Seattle, WA · Joined January 2012
630 Following · 1.5K Followers
Lequn Chen @abcdabcd987
@tskaerobot @Yuchenj_UW Upload all tax documents. Prompt "prepare my 2025 tax" and your information (like location, single or married, ...). Same as what you would send to a CPA. (If you don't know which docs are needed, just ask it.)
1 reply · 1 repost · 16 likes · 2.3K views
tsk @tskaerobot
@abcdabcd987 @Yuchenj_UW Wow. Can you recommend a tutorial? I paid a CPA $2000 and I think he didn't do a great job…
2 replies · 0 reposts · 0 likes · 2.5K views
Yuchen Jin @Yuchenj_UW
Anthropic killed this, Anthropic killed that, why can't Anthropic kill TurboTax?
179 replies · 135 reposts · 4.9K likes · 306.7K views
Lequn Chen @abcdabcd987
@iamup @AravSrinivas I uploaded all tax documents and also equity contracts. Same as what I sent to my CPA previously.
1 reply · 0 reposts · 0 likes · 29 views
@iamup
@AravSrinivas Does one have to upload full tax documents showing SSN, or just add the income etc. and get the tax return prepared? @abcdabcd987
1 reply · 0 reposts · 0 likes · 36 views
Aravind Srinivas @AravSrinivas
Perplexity Computer is more reliable than a CPA for filing taxes.
Lequn Chen @abcdabcd987
@Yuchenj_UW Perplexity Computer saved me $14k in tax. It found 2 double-taxation errors and 2 form-filling errors in my $2000 CPA's draft, which the CPA fully agreed with. In another thread, I let it compute the tax from scratch. It was correct to the cent.
35 replies · 36 reposts · 691 likes · 107.5K views
Lequn Chen @abcdabcd987
Wrote a blog post on why collective communication feels awkward for newer LLM workloads (disaggregated inference, RL weight update, MoE), why people don’t just use raw RDMA, how we approached it, and some behind-the-scenes stories. le.qun.ch/en/blog/2025/1…
4 replies · 29 reposts · 229 likes · 21.6K views
Lequn Chen @abcdabcd987
We divide the weight transfer process into pipeline stages to enable overlapped execution over different hardware resources (CPU->GPU memcpy, GPU computation, RDMA, Ethernet).
1 reply · 0 reposts · 3 likes · 436 views
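[Editor's note: a minimal sketch of this kind of stage pipeline, my illustration rather than the actual implementation. Each hardware resource gets its own worker thread, and weight shards flow through bounded queues, so shard i can be on the wire while shard i+1 is still being copied to the GPU. The stage functions (h2d_copy, gpu_transform, rdma_send) are hypothetical placeholders.]

import threading
import queue

def run_stage(work, inbox, outbox):
    # One worker per stage; each stage owns one hardware resource,
    # so the three stages execute concurrently on different shards.
    while True:
        item = inbox.get()
        if item is None:            # shutdown signal, forwarded downstream
            if outbox is not None:
                outbox.put(None)
            return
        out = work(item)
        if outbox is not None:
            outbox.put(out)

# Hypothetical stage bodies. Real ones would issue an async CPU->GPU memcpy,
# a GPU transform (e.g., dtype conversion), and an RDMA/Ethernet send.
def h2d_copy(shard):      return shard
def gpu_transform(shard): return shard
def rdma_send(shard):     pass

q1 = queue.Queue(maxsize=2)   # bounded queues provide backpressure
q2 = queue.Queue(maxsize=2)
src = queue.Queue()

threads = [
    threading.Thread(target=run_stage, args=(h2d_copy, src, q1)),
    threading.Thread(target=run_stage, args=(gpu_transform, q1, q2)),
    threading.Thread(target=run_stage, args=(rdma_send, q2, None)),
]
for t in threads:
    t.start()
for shard in range(8):        # 8 dummy weight shards
    src.put(shard)
src.put(None)                 # signal end of stream
for t in threads:
    t.join()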
Lequn Chen @abcdabcd987
We recently achieved 1.3-second cross-machine parameter update for Kimi-K2 (1T parameters), as opposed to a few minutes in popular frameworks.
1 reply · 2 reposts · 5 likes · 775 views
Lequn Chen retweeted
vLLM @vllm_project
How does @deepseek_ai Sparse Attention (DSA) work? It has 2 components: the Lightning Indexer and Sparse Multi-Latent Attention (MLA). The indexer keeps a small key cache of 128 per token (vs. 512 for MLA). It scores incoming queries against these keys and passes the top-2048 tokens to Sparse MLA.
DeepSeek @deepseek_ai
🚀 Introducing DeepSeek-V3.2-Exp — our latest experimental model! ✨ Built on V3.1-Terminus, it debuts DeepSeek Sparse Attention (DSA) for faster, more efficient training & inference on long context. 👉 Now live on App, Web, and API. 💰 API prices cut by 50%+! 1/n
11 replies · 107 reposts · 699 likes · 102.4K views
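[Editor's note: a rough PyTorch sketch of the selection step described in the tweet above, based on my reading of it. The function name, shapes, and dot-product scoring are illustrative assumptions, not DeepSeek's actual kernels: the indexer's compact per-token keys score the incoming query, and only the top-2048 token indices are handed to the sparse attention.]

import torch

def select_tokens(q_idx, indexer_keys, k=2048):
    # q_idx:        [d_idx]           indexer projection of the new query
    # indexer_keys: [seq_len, d_idx]  small per-token key cache (e.g. 128-dim,
    #                                 vs. the 512-dim latent kept for MLA)
    scores = indexer_keys @ q_idx              # [seq_len] relevance scores
    k = min(k, indexer_keys.shape[0])          # short prompts keep everything
    return torch.topk(scores, k).indices       # token ids fed to Sparse MLA

# Toy usage: 8192 cached tokens, 128-dim indexer keys.
keys = torch.randn(8192, 128)
kept = select_tokens(torch.randn(128), keys)
print(kept.shape)   # torch.Size([2048])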
Lequn Chen retweeted
Perplexity @perplexity_ai
Introducing Perplexity Search API
We've built a search index of billions of webpages to provide real-time, quality information from the web. Now developers have access to the full power of our index, providing the most accurate results in milliseconds. perplexity.ai/hub/blog/intro…
98 replies · 244 reposts · 2.2K likes · 635.5K views
Lequn Chen retweeted
Anyscale @anyscalecompute
Just got a sneak peek of the breakout sessions lineup for #RaySummit2025 – and it's 🔥 Sessions from:
🔹 @character_ai on Scaling LLM Post-Training
🔹 The State of @vllm_project in 2025
🔹 @Roblox on Training 3D Foundation Models with Ray
🔹 @xai on Scaling Image + Video Processing
🔹 @zoox on Reliable, Multimodal LLM Serving
🔹 @perplexity_ai on RDMA P2P for KvCache + MoE
Looking forward to learning from the teams actually building these systems. Come join us. Save 25% with code ANYJOIN25 → anyscale.com/ray-summit/202…
1 reply · 3 reposts · 12 likes · 5.5K views
Lequn Chen @abcdabcd987
@LigengZhu Glad that you enjoyed it! To be precise, it's EP64 on the inference side, around 30GB per inference GPU. So it's around 30GB / 1.3s = 23 GB/s.
0 replies · 2 reposts · 7 likes · 737 views
Ligeng Zhu @LigengZhu
Every RL infra researcher should read @abcdabcd987's blog. 1T / 1.3s / 16 nodes = 49GB/s, nearly reaching the peak of the IB bandwidth! For Kimi-K2 (1T params), with 256 GPUs in BF16 training and 128 GPUs in FP8 inference, weight updates take less than 1.3 seconds.
2 replies · 0 reposts · 24 likes · 1.6K views
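[Editor's note: a back-of-envelope check of the numbers in this thread. The arithmetic is mine; the 1 byte/param assumption comes from the FP8 inference weights mentioned above.]

# Kimi-K2: ~1T parameters; at FP8 (1 byte/param) that's roughly 1e12 bytes.
total_bytes = 1e12
seconds = 1.3
nodes = 16                  # 128 inference GPUs / 8 GPUs per node

per_node_gbps = total_bytes / seconds / nodes / 1e9
print(f"{per_node_gbps:.1f} GB/s per node")   # ~48 GB/s (tweet rounds to 49)

# Per-GPU view at EP64: ~30 GB of weights per inference GPU.
per_gpu_gbps = 30e9 / seconds / 1e9
print(f"{per_gpu_gbps:.1f} GB/s per GPU")     # ~23 GB/s, matching the reply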
Lequn Chen @abcdabcd987
@vwxyzjn Haha. Glad that you enjoy it :)
0 replies · 0 reposts · 1 like · 120 views
Lequn Chen @abcdabcd987
1.5 seconds is enough to transfer model weights from training nodes to RL rollout nodes (as opposed to 100s). Here's the full story of how I got there (not just the final solution): le.qun.ch/en/blog/2025/0…
8 replies · 91 reposts · 470 likes · 63.9K views