LangRouter

292 posts

@langrouter

The Amazon for AI agents. Products include LLM tokens, web search API, computing power, electricity, etc.

US · Joined March 2026
212 Following, 192 Followers
Pinned Tweet
LangRouter
LangRouter@langrouter·
AI agents do not need to use the most advanced LLM model all the time. By designing a router to dynamically route requests, we can send complex requests to advanced LLM models, while routing simpler requests to more economical models. The cost of Claude Opus 4.6 is about ten times that of Minimax M2.5. Through dynamic routing, we can significantly reduce the operating costs of AI agents. #openclaw #mulerun
English
0
0
7
754
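The routing idea in the pinned post can be sketched roughly as follows. This is a minimal illustration, not LangRouter's actual implementation: the model names and the roughly tenfold cost ratio come from the post, while the `complexity_score` heuristic, the `HARD_HINTS` keyword list, and the `threshold` value are hypothetical placeholders that a real router would replace with a classifier or learned policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    relative_cost: float  # cost per request relative to the economy model


# Names and the ~10x ratio are taken from the post; the rest is illustrative.
ADVANCED = Model("claude-opus-4.6", 10.0)
ECONOMY = Model("minimax-m2.5", 1.0)

# Hypothetical keywords that hint at a harder request.
HARD_HINTS = ("refactor", "prove", "debug", "architecture", "multi-step")


def complexity_score(prompt: str) -> float:
    """Crude score in [0, 1]: longer prompts and 'hard' keywords score higher."""
    text = prompt.lower()
    length_part = min(len(text.split()) / 200, 1.0)
    hint_part = 1.0 if any(hint in text for hint in HARD_HINTS) else 0.0
    return 0.5 * length_part + 0.5 * hint_part


def route(prompt: str, threshold: float = 0.4) -> Model:
    """Send complex requests to the advanced model, the rest to the economy model."""
    return ADVANCED if complexity_score(prompt) >= threshold else ECONOMY


def blended_cost(prompts: list[str]) -> float:
    """Average relative cost per request under routing (always-advanced would be 10.0)."""
    chosen = [route(p) for p in prompts]
    return sum(m.relative_cost for m in chosen) / len(chosen)
```

Under these assumptions, a workload with one simple and one complex request averages a relative cost of 5.5 per request instead of 10.0. The saving grows with the share of requests that the router can safely send to the economy model.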
LangRouter retweeted
Michael Guo
Michael Guo@Michaelzsguo·
My Deepseek V4 Pro agent (inside codex) has been pursuing its goal for more than 13 hours, burning ~100M tokens, and has only cost me $1.85. Yes, you saw that right. Not $185, but 1 dollar 85 cents.
English
68
66
1.7K
115.2K
LangRouter retweeted
James Grugett
James Grugett@jahooma·
DeepSeek v4 **Flash** is absolutely insane. It costs almost nothing (~1/300th Opus), and yet performs among the best open source models. On our coding benchmark Flash does better(!) than Pro
English
84
43
744
53.2K
LangRouter
LangRouter@langrouter·
Langcli Product Update: We have added a DeepSeek option to the /connect command. You can now use DeepSeek's official API key directly in Langcli. Note: 1M context is supported automatically. For details, please read the documentation: langcli.com/docs/BYOK#method-2-use-your-own-deepseek-api-key-through-the-connect-command
English
0
0
5
70
LangRouter
LangRouter@langrouter·
@amehochan you can use kimi in Langcli now! Langcli is 100% compatible with Claude Code. In addition, Langcli supports deepseek v4 flash and v4 pro perfectly.
English
0
0
2
15
Yufan Sheng
Yufan Sheng@amehochan·
A friend at Kimi told me I should hook Kimi up to Claude Code and use it there. I don't understand it, but I am deeply shocked.
Chinese
76
14
387
195K
LangRouter
LangRouter@langrouter·
@LiuYunlong63318 Deepseek v4 is pretty good. you may give a try to @langrouter 's Langcli which supports deepseek v4 flash and v4 pro perfectly. 100% compatible with Claude Code.
English
0
0
2
9
Yunlong Liu
Yunlong Liu@LiuYunlong63318·
Deepseek has all my respect as they own almost every corner of their tech stack, from recipes, training framework to kernels. One common thing for telling a frontier organization is whether it treats software sovereignty for getting quick results. (repost to avoid some noise)
Zhihu Frontier@ZhihuFrontier

🚀 DeepSeek-V4 Tech Report Deep Dive: TileLang - Build Tiny LLM Operators Efficiently
Insights from Zhihu Contributor SiriusNEO
🎬 Opening Insight
• Hot discussions around the DeepSeek-V4 technical report, with TileLang standing out as a key highlight
• Covers community tech progress & industrial polished experience of compiler/DSL in LLM infrastructure
• Attach official TileLang repo + DeepSeek open-source TileKernels: A library of high-performance LLM tiny operators implemented via TileLang
⚖️ DSL vs Expert Handwritten CUDA Kernels
• LLM architecture convergence makes computation patterns fixed; mainstream infra prefers handwritten kernels for performance limits
• Extreme trend like MegaKernel pursues ultimate manual scheduling optimization
• Leaves DSL/traditional compilers in an awkward positioning dilemma
• TileKernels focus on non-Tensor-Core operators: elementwise+reduce combo, type cast, indexing & fine-grained tiny ops
• DeepSeek-V4 Infra strategy:
✅ TileLang fused kernels replace scattered small operators
✅ Expert manual optimization for Attention & GEMM heavy kernels
✅ Healthy hybrid infra architecture for production deployment
✨ Core Advantages of TileLang for Small Operators
1️⃣ Development Edge
• No classic performance vs development cost tradeoff for memory-bound small ops
• No complex Warp Specialization demand; TileLang matches handwritten kernel performance ceiling
• Dramatically faster development speed with zero performance loss
2️⃣ Maintenance & Hardware Migration
• Low mental burden for operator library maintenance
• Compiler-side bugs need no modification to original operator code
• Weak hardware dependency, smooth cross-backend deployment & migration
3️⃣ Strong Capability for Tensor-Core Operators
• Only 80 lines of Python in TileLang implement FlashMLA with 95% native performance
• Qwen FlashQLA + TileLang GDN outperforms FlashInfer in specific scenarios
• Perfect for academic idea rapid verification: 80%-90% performance with minimal coding cost
• Clean dataflow modeling, friendly for open-source learning & reference
• Compile stack outputs standard source code (.cu for CUDA), enabling follow-up expert manual tuning based on TileLang template
🤖 TileLang & Future AI Agent Coding
• Uncertain if DSL is more readable than raw CUDA for agents currently
• AI already masters CUDA coding well thanks to sufficient corpus
• TileLang lacks enough public training data for now
• Early practice: AI writes simple TileLang kernels better than zero-shot CUDA
• Long-term potential: TileLang abstracts dataflow logic, frees agents from complex memory layout design once corpus accumulates
🧠 Core Design Essence of TileLang
• Obvious difference from Triton: Explicit exposure of memory hierarchy (L0/L1/L2)
• Two most iconic abstractions: Fragment + Parallel
Fragment Abstraction
• Abstracts register sets of all threads within a single CUDA block as one whole
• Avoid tedious manual task splitting across warp/thread levels
• Compiler auto-maps logical tensor access to physical register layout
Parallel Abstraction
• Supports fine-grained element-wise operation (A[i,j]) far beyond Triton’s tile-level micro-op
• Unifies shared memory & registers as tiles of different memory hierarchies
• Simplifies programming logic: only focus on tile data movement & tile computation
• Inherits essence from MSRA deep learning compiler theoretical accumulation
📌 Key TileLang Highlights in DeepSeek-V4 Report
1️⃣ Host CodeGen Optimization
• Leverage TVM-FFI to slash host-side kernel launch & tensor validation overhead
• Migrate Python-side logic to C++ and compile into kernel host runtime
• Brings obvious end-to-end latency benefits
2️⃣ Z3 Prover Integration
• Replace weak TVM built-in arithmetic solver with Z3 formal prover
• Auto eliminate redundant boundary condition checks when index out-of-bounds is mathematically impossible
• Supports vectorization optimization & integer expression proof
• More exposed hidden bugs after integration (previous conservative logic masked potential issues)
3️⃣ Precision & Bitwise Consistency
• Critical for high-precision scenarios like RL inference
• Disable fast-math by default, adopt standard IEEE intrinsics
• Align algebraic transformation logic with NVCC
• Solve bit mismatch caused by implicit FMA fusion in native CUDA compilers
📝 Closing Summary
• TileLang plays an indispensable role in DeepSeek-V4 production infrastructure
• Sets a clear positioning template for modern DSL in LLM era
• Rich learning resources available: official docs, TileLang-Puzzles, TileOPs, XPUOJ
• Ideal for developers to learn LLM kernel design & compiler DSL thinking
#DeepSeekV4 #LLM #DSL #AIInfra #CUDA #MoE
🔗Full article: zhuanlan.zhihu.com/p/203303420272…

English
6
12
168
19K
LangRouter
LangRouter@langrouter·
@hkdom you may give a try to @langrouter 's Langcli which supports deepseek v4 flash and v4 pro perfectly. 100% compatible with Claude Code.
English
0
0
2
222
hkdom
hkdom@hkdom·
After trying out the speed of DeepSeek V4 Flash with Hermes Agent, it's a bit hard to go back to Kimi 2.6 / GPT 5.5....
Chinese
8
1
38
23.8K
LangRouter
LangRouter@langrouter·
@teortaxesTex you may give a try to @langrouter 's Langcli which supports deepseek v4 flash and v4 pro perfectly. 100% compatible with Claude Code.
English
0
0
2
103
Daniel Smidstrup
Daniel Smidstrup@DanielSmidstrup·
If you’re a founder, let’s connect
English
286
3
430
20.7K
LangRouter retweeted
Xiuyu Li
Xiuyu Li@sheriyuo·
DeepSeek V4 is really expert in searching. You know the Chinese feature is all infrastructure. AI infra. Data infra. The tough situation and huge sanctions force AI improvements to turn into infra power. They know how to utilize and maximize their limited resources. That is a really big fight. You cannot imagine what will happen when China finally handles the resource shortage.
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

Finally an interesting report from Baidu. A unique move at this scale – ERNIE 5.1 is basically a REAP'd 5.0. but what surprises me is this. V4 is somehow super-dominant on DeepSearchQA. This is unlikely to be benchmaxed (DS doesn't report this score anywhere).

English
5
8
142
18.5K
LangRouter
LangRouter@langrouter·
@bindureddy you may give a try to @langrouter 's Langcli which supports deepseek v4 flash and v4 pro perfectly. 100% compatible with Claude Code.
English
0
0
2
30
LangRouter retweeted
Bindu Reddy
Bindu Reddy@bindureddy·
DeepSeek Pro and Flash continue to be extremely underrated. Both are excellent choices for easy agentic loops 20x cheaper than GPT 5.5 and just as good
English
53
20
439
19.8K
LangRouter
LangRouter@langrouter·
@MrAhmadAwais yeah. so damn good. you may give a try to @langrouter 's Langcli which supports deepseek v4 flash and v4 pro perfectly. 100% compatible with Claude Code.
English
0
0
1
105
Ahmad Awais
Ahmad Awais@MrAhmadAwais·
how the heck is DeepSeek v4 flash so good and so cheap?
English
39
6
297
25.7K
LangRouter retweeted
Bindu Reddy
Bindu Reddy@bindureddy·
DeepSeek V4 Beats Opus 4.7 And GPT 5.5 To Become The World's Best Open Source Model
DeepSeek V4 Pro is the NEW KING of open-source.
- better and 10x cheaper than Opus 4.7 and GPT 5.5 medium
- outperforms Kimi 2.6 thinking
- much faster than any of the other big models
It's literally the best open source model in the world and months away from GPT-5.5 xHigh.
English
152
124
1.1K
105.2K
LangRouter
LangRouter@langrouter·
@mehulmpt opencode's reasoning content issue is still not solved. You may give a try to @langrouter 's Langcli which supports deepseek v4 flash and v4 pro perfectly. 100% compatible with Claude Code.
English
0
0
1
14
Mehul Mohan
Mehul Mohan@mehulmpt·
More people should try deepseek v4 pro on opencode. That’s what I use now when Claude and codex are both exhausted, and I can’t complain.
English
39
12
462
17.8K
LangRouter
LangRouter@langrouter·
@h4x0r_dz deepseek v4 flash is super fast and cheap. The v4 Pro is a thinker, reasonably priced, and can solve problems. Langcli + DeepSeek v4 is the best combination.
English
0
0
1
26