simingg yyan

769 posts

@Samoyansiming

Quantitative research. Futures and options in China and Europe.

Shanghai, China · Joined March 2023

498 Following · 30 Followers

simingg yyan retweeted

ollama@ollama·11h

You can use GLM-5.2 and Kimi-K2.7-Code in Codex with Ollama!
ollama launch codex
ollama launch codex-app

Tibo@thsottiaux

Reminder that you can use the Codex App, CLI and SDK with any open source model, not just with OpenAI models. developers.openai.com/codex/config-a…

English

817

49.7K

simingg yyan retweeted

Firecrawl@firecrawl·1d

Starting today, you can try Firecrawl for free without an API key 🔥 Search, scrape, and interact with any web page, plus parse any PDF into clean markdown, with no setup at all! Start using our endpoints and only sign up when you scale. Live on our MCP, CLI, and API now!

English

201

1.7K

640.6K

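simingg yyan retweeted

小互@xiaohu·15h

OpenAI is thinking big: it announced that Codex (including the App client, the command-line CLI, and the SDK) can connect directly to any open-source large model, with no requirement to use OpenAI's own models. It also published a document that walks developers step by step through replacing the "brain" behind the Codex client with a free open-source model…

Tibo@thsottiaux

Chinese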

108

147

1.2K

242.4K

simingg yyan retweeted

Noam Shazeer@NoamShazeer·5h

I’m excited to share that I’ll be joining OpenAI and look forward to working with the exceptional team there. It was a difficult decision to move on. I’m incredibly proud of the amazing team at Google and everything we’ve built together. It has been an honor and a pleasure to work with all of you.

English

580

450

8.3K

3.1M

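simingg yyan retweeted

Kafka@kfk_ai·1d

Prof. Richard Xu (徐亦达) is a professor in the Department of Mathematics at Hong Kong Baptist University and the founder of "TadReamk Limited". He has only one true hit project on GitHub: `machine-learning-notes` (9663 stars, 1767 forks), a 2000+ page collection of slides on machine learning, probabilistic models, and deep learning, with accompanying video links. His other 13 repositories are mostly "three-no products": no description, no updates, no code. He has 5347 followers but follows 0 people: a pure exporter of knowledge, a social black hole.

Chinese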

126

607

54.8K

simingg yyan retweeted

Z.ai@Zai_org·1d

Introducing GLM-5.2: Frontier Intelligence, Open Weights
- Significant improvements in coding and agentic tasks
- Strong long-horizon capabilities with a 1M context window
- Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong balance between performance and token efficiency
- MIT-licensed open weights
- Same API pricing as GLM-5.1
Tech Blog: z.ai/blog/glm-5.2
Weights: huggingface.co/zai-org/GLM-5.2
API: docs.z.ai/guides/llm/glm…
Coding Plan: z.ai/subscribe
Chat: chat.z.ai

English

507

1.3K

9.5K

4.1M

simingg yyan retweeted

Przemek Chojecki | PC@prz_chojecki·4d

Kimi 2.7 ranked 2nd after Fable 5 and before GPT-5 xhigh
We have re-run our ErdosBench smoke test on 14 problems with Kimi 2.7, Qwen 3.7 Max, Grok 4.3 and compared it with the top performers from previous runs. Kimi 2.7 is amazingly good. More below.

English

169

554

5.1K

1.8M

simingg yyan retweeted

jietang@jietang·4d

GLM-5.2 is Fully Open, Frontier Intelligence Belongs to Everyone
Today, the sudden restriction of certain frontier models is deeply regrettable. At a time when access to frontier models is abruptly cut off for non-technical reasons, we are even more convinced of one thing: science should be global. The path to AGI (Artificial General Intelligence) must never be enclosed by high walls.
We have always believed that AGI should be the cornerstone for all of humanity to collaboratively explore the boundaries of intelligence and solve complex challenges, rather than a privilege monopolized by a few rules and subject to revocation at any moment. In the face of external blockades and restrictions, our attitude is one of radical openness. Frontier intelligence must remain open-source, accessible, and buildable, serving every dedicated developer.
GLM-5.2 is Zhipu's most capable open-source model to date. It not only supports a truly usable 1M context window but also maintains a continuous lead in the independent completion of long-horizon tasks, providing solid foundational support for building complex agent applications. It also continues to be our main engine for creating the strongest domestic coding model.
Tonight at 5:21, at this special moment, GLM-5.2 will officially be available to all GLM Coding Plan users (including Lite / Pro / Max). The API will also go live next week.
A step closer to frontier intelligence for everyone. The future of AI is open, and it is for the people.
ModelKey: GLM-5.2

English

260

771

7.5K

940.2K

simingg yyan retweeted

Santi Torres@SantiTorAI·6d

A Ukrainian developer created a black hole in his terminal to force himself to take breaks. The longer you work without stopping, the more it grows and warps your code with its gravitational lensing. Take a break and it shrinks.

Spanish

270

2.9K

32.9K

4.2M

simingg yyan retweeted

Megumin_SAMA@Meguminsama2009·Jun 10

Chinese-made apps have entered the great "lewd tattoo" era

Chinese

819

530

11.4K

6.6M

simingg yyan@Samoyansiming·Jun 9

@dashen_wang Did you only go to a vocational college?

Chinese

AI最严厉的父亲@dashen_wang·Jun 8

Elon Musk, come out and face me! Why is there no Linux desktop version of CodeX?

Chinese

39.5K

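simingg yyan@Samoyansiming·Jun 9

@_JinZhengEn @Sizhe_bitcat That's Mario Ho (何猷君), the son of the gambling king

Chinese

金正恩-三代目白頭山純血統コメント専員@_JinZhengEn·Jun 8

@Sizhe_bitcat You can tell at a glance this is a top salesperson. The truly wealthy would just sit inside their Rolls-Royce on a rainy day, watching the top salesperson get soaked

Chinese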

1.4K

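Sizhe思哲@Sizhe_bitcat·Jun 8

How strict are wealthy families about dress? Even in 30-plus-degree heat at the height of summer, they still wear suits and dress shirts. I hear these suits are very cool to wear, lined with ice silk 🙉

Chinese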

67K

simingg yyan@Samoyansiming·Jun 9

@anblk984 @goshi_aoki Someone with your colonial mindset was always going to be fully bearish on China. You have never once been bullish

Chinese

1.1K

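琵琶牧々@anblk984·Jun 9

@goshi_aoki In this AI race I am genuinely bearish on China. Chinese companies are all fighting over customers and talent without caring about customer quality, so they hand out red-envelope giveaways, cut prices, and hype patriotism. American companies, by contrast, compete for investment and energy, so their effort goes into digging moats and technical R&D

Chinese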

37.7K

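Goshi Aoki@goshi_aoki·Jun 8

I completed my master's in Computer Science at Zhejiang University in China a year ago. I have put together a summary of what I learned about China's AI talent development system through conversations with top Chinese AI talent and my time in graduate school ↓

Japanese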

155

1.1K

798.2K

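simingg yyan retweeted

恒星@vintcessun·Jun 8

Why does Muon train large models nearly twice as fast as Adam, yet no one has clearly explained why? This puzzle directly affects the design of next-generation optimizers. This paper approaches it through curvature, splitting the loss decrease into a first-order gain and a second-order curvature penalty. Surprisingly, it finds that the two have comparable step sizes, but Muon's normalized directional sharpness (NDS) is significantly lower: the steps are not bigger, the direction is smarter. Data imbalance in particular amplifies this advantage, and in mid-to-late training the core benefit comes from smaller within-layer curvature. With theory plus experiments, it finally turns black magic into geometric intuition. arxiv.org/abs/2606.04662

Chinese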

297

15.1K
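The decomposition described in the post above (a first-order gain against a second-order curvature penalty) matches the standard second-order Taylor expansion of the loss along an update direction. The sketch below is that textbook expansion, not the paper's own notation: the symbols L, θ, g, H, d, and η are assumptions introduced here, and the normalized quantity s(d) is only one natural way to measure sharpness along a direction, which may differ from the paper's definition of NDS.

```latex
% Assumed notation: loss L, parameters \theta, gradient g = \nabla L(\theta),
% Hessian H = \nabla^2 L(\theta), update direction d, step size \eta > 0.
\begin{align}
L(\theta - \eta d) &\approx L(\theta)
  - \underbrace{\eta\, g^{\top} d}_{\text{first-order gain}}
  + \underbrace{\tfrac{\eta^{2}}{2}\, d^{\top} H d}_{\text{second-order curvature penalty}} \\
% One possible normalized sharpness along d (an assumption, not the paper's definition):
s(d) &= \frac{d^{\top} H d}{\lVert d \rVert^{2}}
\end{align}
```

Under this reading, two optimizers with comparable step sizes η can still differ in loss decrease per step if one of them picks directions d with a smaller curvature term.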

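simingg yyan retweeted

南宫远@nangongyuan·Jun 7

Aesthetic taste is a really strange thing. I admit Yua Mikami is all right. But compared with Tsukasa Aoi, there is simply no comparison.

Chinese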

360

552

323.9K

simingg yyan retweeted

Mathieu@miniapeur·Jun 4

ZXX

3.8K

27.6K

415.1K

simingg yyan retweeted

Anna 🇺🇸@realAnn_29·Jun 4

Big dog parents know this struggle TOO well. 😂😂

English

152

1.8K

17.8K

1.7M

simingg yyan retweeted

Cameron R. Wolfe, Ph.D.@cwolferesearch·Jun 2

Interested in learning how to run RL at scale? Here are the best resources to read…
Research on Scaling RL
1. The Art of Scaling RL compute for LLMs: arxiv.org/abs/2510.13786
2. Scaling Behaviors of LLM RL Post-Training: arxiv.org/abs/2509.25300
3. Optimally Scaling Sampling Compute for LLM RL: arxiv.org/abs/2603.12151
4. Scaling up RL: arxiv.org/abs/2507.12507
5. ProRL V2 - Prolonged Training Validates RL Scaling Laws: hijkzzz.notion.site/prorl-v2
6. Polaris - A Recipe for Scaling RL with Reasoning Models: hkunlp.github.io/blog/2025/Pola…
RL Frameworks
1. Hybrid Flow (early outline of the verl framework): arxiv.org/abs/2409.19256
   a. More up-to-date info can be found here: arxiv.org/abs/2601.18150
2. AReal - Large-Scale Async RL: arxiv.org/abs/2505.24298
3. PipelineRL - Fast On-Policy RL: arxiv.org/abs/2509.19128
4. AsyncFlow - Async Streaming RL: arxiv.org/abs/2507.01663
RL for Agents
1. DeepSWE - Open Coding Agent Trained w/ RL: together.ai/blog/deepswe
2. AutoForge - Environment Synthesis for Agentic RL: arxiv.org/abs/2512.22857
3. Agent-R1 - Training Agents w/ End-to-End RL: arxiv.org/abs/2511.14460
4. AgentRL - Scaling RL for Multi-Turn, Multi-Task Agents: arxiv.org/abs/2510.04206
5. The Landscape of Agentic RL: arxiv.org/abs/2509.02547
6. Training SWE Agents with RL: arxiv.org/abs/2508.03501
Case Studies & Tech Reports
1. Kimi tech reports:
   a. Kimi K2 - Open Agentic Intelligence: arxiv.org/abs/2507.20534
   b. Kimi End-to-end Agentic RL: moonshotai.github.io/Kimi-Researche…
   c. Kimi K1.5 - Scaling RL for LLMs: arxiv.org/abs/2501.12599
2. Composer series from Cursor:
   a. Composer 2: arxiv.org/abs/2603.24477
   b. Composer 2.5: cursor.com/blog/composer-…
3. Olmo 3 (also has open code / data): arxiv.org/abs/2512.13961
4. MiniMax tech reports:
   a. MiniMax-M2: arxiv.org/abs/2605.26494
   b. MiniMax-M1: arxiv.org/abs/2506.13585
5. Nemotron 3 (NVIDIA): arxiv.org/abs/2512.20856

English

136

802

34.5K

simingg yyan retweeted

Muyu He@HeMuyu0327·Jun 3

I am a big fan of Jianlin Su's blog because it always starts from first principles in mathematics, rather than "ML tricks", to approach a typical ML problem (e.g. training-free MoE load balancing). Here is my attempt to "reinvent" one such blog post, which provides an elegant alternative way to compute Muon, by filling in all the derivations that the post skips, for a less math-savvy audience (the original is also entirely in Mandarin).
The goal of the post is to find a way to compute an essential component of Muon, i.e. the left and right singular vector matrices U and V of the gradient G, **individually**. In its standard form, Muon really just needs their product UV^T, hence the standard way to compute it by applying a low-degree polynomial of G many times ("Newton-Schulz"). But more variants of Muon that control the properties of model updates become possible if we can get both individually, hence the post's proposal to revisit some fundamental linear algebra techniques for the computation.
The methodological takeaway from the post's thought process is that there are three components to breaking down an ML problem: (1) how to compute something at all (power iteration), (2) how to compute it fast (Cholesky decomposition), and (3) how to compute it accurately in finite floating-point precision (repeated orthogonalization).
The goal of reading inspiring blogs like this is, in Feynman's terms, to be able to "reinvent" them at any time, to grasp the fundamental approach to doing similar work.
Original blog: kexue.fm/archives/11654

English

142

1.7K

76.7K
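The post above notes that standard Muon needs only the product UV^T, obtained by repeatedly applying a polynomial of G ("Newton-Schulz"). As a minimal pure-Python sketch of that idea, the block below runs the classic cubic Newton-Schulz iteration X ← 1.5·X − 0.5·X·Xᵀ·X on a Frobenius-normalized G. This is a simplified stand-in: it is not the tuned polynomial used in practical Muon implementations, and it is not the blog's power-iteration and Cholesky method for obtaining U and V individually. The helper names and the example matrix are illustrative assumptions.

```python
def matmul(a, b):
    """Plain nested-list matrix product of a (m x k) and b (k x n)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose a nested-list matrix."""
    return [list(row) for row in zip(*a)]


def newton_schulz_orthogonalize(g, steps=30):
    """Approximate the orthogonal factor U V^T of g (assumed full column rank).

    Uses the classic cubic iteration X <- 1.5 X - 0.5 X X^T X.
    Dividing by the Frobenius norm first puts every singular value in (0, 1],
    which lies inside the convergence region of this iteration.
    """
    norm = sum(v * v for row in g for v in row) ** 0.5
    x = [[v / norm for v in row] for row in g]
    for _ in range(steps):
        xxt_x = matmul(matmul(x, transpose(x)), x)
        x = [[1.5 * x[i][j] - 0.5 * xxt_x[i][j]
              for j in range(len(x[0]))]
             for i in range(len(x))]
    return x


# Hypothetical 3 x 2 "gradient" used only for illustration.
G = [[2.0, 1.0],
     [0.5, 3.0],
     [1.0, -1.0]]

O = newton_schulz_orthogonalize(G)

# O should have orthonormal columns, so O^T O should be close to the identity.
gram = matmul(transpose(O), O)
print(gram)
```

The iteration only pushes every singular value of G toward 1 while leaving the singular vectors untouched, which is why it yields the product UV^T but never exposes U and V separately.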
