Manqi Cheng 程曼祺

24 posts

@ChengManqi

Reporting on China's AI from the front lines. Host of the LateTalk podcast @LatePost: https://t.co/PvCUnRNEvZ What's actually happening vs. what you've heard

Joined February 2021

76 Following 430 Followers

Manqi Cheng 程曼祺@ChengManqi·1d

@yifan_zhang_ Wait, Jane Street does this kind of research?? A friend met their people in NYC late last year and came back absolutely shaken — asked me to guess their 2025 earnings 😅

218

Yifan Zhang@yifan_zhang_·4d

A fascinating blog from Jane Street about our recent work: Group Representational Position Encoding! blog.janestreet.com/using-group-th… ArXiv: arxiv.org/abs/2512.07805

603

92.2K

Manqi Cheng 程曼祺@ChengManqi·2d

Full episode with @GenAI_is_real and @YIFENGLIU_AI: podcast.latepost.com/163 Written version on LatePost: mp.weixin.qq.com/s/GBaPrVWMGpV7… (both in Chinese)

1.2K

Manqi Cheng 程曼祺@ChengManqi·2d

A clean way to frame the US-China model divergence right now:
- Chinese labs push engineering to its limits: sparse activation, extreme cost efficiency, systems-level coupling under compute constraints.
- US labs chase new capability frontiers first and optimize cost later. They can afford to.

1.5K

Manqi Cheng 程曼祺@ChengManqi·2d

My latest LateTalk episode breaks down DeepSeek V4. The most illuminating part: V4 became a lens into the entire Chinese open-source ecosystem, and into how much these teams are learning from each other.
- ByteDance Seed proposed HC. DeepSeek took it further with mHC. Kimi developed Attention Residuals from a related direction.
- Both Kimi and DeepSeek are pushing the Muon optimizer.
- DeepSeek is now running deep on TileLang, an open-source kernel language started by Professor Yang Zhi and his team at Peking University.
On this episode: Zhao Chenyang @GenAI_is_real (SGLang core dev at RadixArk, ex-Amazon AGI SF Lab & Seed) and Liu Yifeng @YIFENGLIU_AI (optimizer & model arch researcher, ex-Kimi & Seed). 🧵

222

32.5K

Manqi Cheng 程曼祺@ChengManqi·28 Apr

One more shift: DeepSeek is now seeking outside investment for the first time. The main reason: to put a formal valuation on the company — so researchers finally know what their equity is actually worth. For a lab that never needed funding, that's a significant change.

976

Manqi Cheng 程曼祺@ChengManqi·28 Apr

The clearest new signal: DeepSeek is now hiring for Agent products — seeking people who've deeply used Claude Code, Manus, and similar tools. First time they've named specific competitor products in a job listing. They're moving toward applications.

1.2K

Manqi Cheng 程曼祺@ChengManqi·28 Apr

DeepSeek doesn't do overtime. Liang Wenfeng's belief: no one can sustain high-quality work beyond 6-8 hours per day. Exhausted judgment wastes compute. More on what's happening and shifting inside DeepSeek lately in my full report (in Chinese): latepost.com/news/dj_detail…

119

31K

Manqi Cheng 程曼祺@ChengManqi·28 Apr

170

Manqi Cheng 程曼祺@ChengManqi·28 Apr

Liang's AGI vision is quietly different from the industry's. He's not just chasing benchmarks. He's investing in original directions others won't touch — and building on domestic Chinese chips by design. This creates tension when the world expects constant blockbusters.

185

Manqi Cheng 程曼祺 reposted

Zihan "Zenus" Wang ✈️ ICLR@wzenus·13 Mar

In Agent RL, models suffer from Template Collapse. They generate vast, diverse outputs (High Entropy) that lose all meaningful connection to the input prompt (Low Mutual Information). In other words, agents learn different ways to say nothing. 🚀 Introducing RAGEN-v2 -- Here's how we define and fix such silent failure modes in Agent RL. 🧵

252

172.5K

Manqi Cheng 程曼祺@ChengManqi·28 Nov

@seclink @goocarlos The former

119

Manqi Cheng 程曼祺 reposted

Luyu Zhang@goocarlos·27 Nov

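Does this product called Coze look familiar? It was built by a "high-level" team at a certain big tech company that is currently laying people off. The product is actually quite polished; it is heavily funded, after all. But people in the community keep asking me about it, because it really is too similar to Dify, down to copying wholesale the English terms we chose poorly. Briefly: what we have always done is keep innovating and keep delivering high-quality products to the community. We cannot compete with a company of that size on capital, but we can stay open, neutral, and transparent.

What Dify does for developers comes down to three things:
1. Democratization of AI: techniques around LLMs such as RAG and fine-tuning are complex and need to be simplified.
2. Cross-Functional Collaboration in AI: we believe non-technical people need to take part in defining AI applications.
3. Data-Driven Feedback: improving AI applications depends on feedback from production data, and we want to speed up that loop for everyone.

How will we respond?
1. As far as I know, Dify has more than ten thousand private deployments, and many developers have already made money, raised funding, and avoided detours by building on us (even though we have barely charged anything). The community is our most important strength and asset.
2. We will soon open up the product roadmap and publish detailed contribution guidelines.
3. Where the big-company imitation has valuable features, we will build them better and then "contribute" them to the open-source community.
4. We will deepen cooperation with partners across the ecosystem; we have great communication with all of the domestic model companies.
5. We will share a portion of our future revenue with contributors.

How should startups deal with big companies? In hot sectors, almost every big company in China approaches founders through strategic investment, exchanges, competitions, and the like, then takes the intelligence back and funds in-house development. The wave of hackathons some vendors ran this year is one example; the goal is not simply to raise visibility.

How should big companies treat startup projects?
1. Come out and make friends properly; there is no shame in it.
2. Publicly credit the open-source projects you borrowed from, even if you did not use their source code directly.
3. Contribute to the open-source community.

That's all. If you want to take part in Dify's product in any form, email luyu@dify.ai. Do it for you.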

302

284.7K
