LangRouter

292 posts

@langrouter

The Amazon for AI agents. Products include LLM tokens, web search API, computing power, electricity, etc.

US · Joined March 2026

212 Following · 192 Followers

Pinned Tweet

LangRouter@langrouter·17 Mar

AI agents do not need to use the most advanced LLM model all the time. By designing a router to dynamically route requests, we can send complex requests to advanced LLM models, while routing simpler requests to more economical models. The cost of Claude Opus 4.6 is about ten times that of Minimax M2.5. Through dynamic routing, we can significantly reduce the operating costs of AI agents. #openclaw #mulerun
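
A minimal sketch of the routing idea in the post above, not LangRouter's actual implementation. The keyword heuristic, the threshold, and the model identifier strings are all hypothetical; only the roughly 10x cost ratio between the two tiers comes from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str             # hypothetical identifier, for illustration only
    relative_cost: float  # cost per request relative to the economy tier


# The 10x ratio is the figure quoted in the post; everything else is assumed.
ECONOMY = ModelTier("minimax-m2.5", 1.0)
ADVANCED = ModelTier("claude-opus-4.6", 10.0)

# Hypothetical keywords that hint at a harder request.
COMPLEX_HINTS = ("refactor", "prove", "debug", "architecture", "multi-step")


def complexity_score(prompt: str) -> int:
    """Crude heuristic: count hint keywords, plus one point for long prompts."""
    text = prompt.lower()
    score = sum(1 for hint in COMPLEX_HINTS if hint in text)
    if len(text.split()) > 200:
        score += 1
    return score


def route(prompt: str, threshold: int = 1) -> ModelTier:
    """Send requests at or above the threshold to the advanced tier."""
    return ADVANCED if complexity_score(prompt) >= threshold else ECONOMY


def relative_spend(prompts: list[str]) -> float:
    """Routed cost as a fraction of sending every request to the advanced tier."""
    routed = sum(route(p).relative_cost for p in prompts)
    return routed / (ADVANCED.relative_cost * len(prompts))


if __name__ == "__main__":
    workload = ["Summarize this sentence."] * 9 + ["Debug this race condition."]
    print(route(workload[0]).name)
    print(route(workload[-1]).name)
    print(f"{relative_spend(workload):.2f}")
```

With nine simple requests and one complex one, the routed workload costs (9 × 1 + 1 × 10) / (10 × 10) = 0.19 of the always-advanced baseline. A production router would likely replace the keyword heuristic with a classifier or a cheap model call.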

English

754

LangRouter retweeted

Michael Guo@Michaelzsguo·1d

My DeepSeek V4 Pro agent (inside Codex) has been pursuing a goal for more than 13 hours, burning ~100M tokens, and has cost me only $1.85. Yes, you saw it right. Not $185, but 1 dollar 85 cents.
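
As a rough check on the figures quoted in the post above, the blended rate can be worked out directly. This is only arithmetic on the post's own numbers (about 100M tokens, $1.85, 13 hours); it is not a published price, and the blend presumably mixes input, output, and cached tokens. The function name is hypothetical.

```python
def cost_per_million_tokens(total_cost_usd: float, total_tokens: int) -> float:
    """Blended cost in USD per one million tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd / (total_tokens / 1_000_000)


# Figures as quoted in the post: ~100M tokens for $1.85 over 13+ hours.
blended = cost_per_million_tokens(1.85, 100_000_000)
print(f"${blended:.4f} per million tokens")  # $0.0185 per million tokens

# The run lasted "more than 13 hours", so this is an upper bound per hour.
print(f"${1.85 / 13:.2f} per hour")          # $0.14 per hour
```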

English

1.7K

115.2K

LangRouter retweeted

James Grugett@jahooma·3d

DeepSeek v4 **Flash** is absolutely insane. It costs almost nothing (~1/300th Opus), and yet performs among the best open source models. On our coding benchmark Flash does better(!) than Pro

English

744

53.2K

LangRouter@langrouter·5d

Langcli Product Update: We have added a DeepSeek option to the /connect command. You can now use DeepSeek's official API key directly in Langcli. Note: 1M context is supported automatically. For details, please read the documentation: langcli.com/docs/BYOK#meth…

English

LangRouter@langrouter·5d

@amehochan you can use kimi in Langcli now! Langcli is 100% compatible with Claude Code. In addition, Langcli supports deepseek v4 flash and v4 pro perfectly.

English

Yufan Sheng@amehochan·6d

A friend at Kimi told me I should hook Kimi up to Claude Code and use it there. I don't understand it, but I am deeply shocked.

Chinese

387

195K

LangRouter@langrouter·6d

@LiuYunlong63318 DeepSeek V4 is pretty good. You may want to try @langrouter's Langcli, which supports DeepSeek V4 Flash and V4 Pro perfectly. It is 100% compatible with Claude Code.

English

Yunlong Liu@LiuYunlong63318·6d

Deepseek has all my respect as they own almost every corner of their tech stack, from recipes, training framework to kernels. One common thing for telling a frontier organization is whether it treats software sovereignty for getting quick results. (repost to avoid some noise)

Zhihu Frontier@ZhihuFrontier

🚀 DeepSeek-V4 Tech Report Deep Dive: TileLang — Build Tiny LLM Operators Efficiently
Insights from Zhihu Contributor SiriusNEO
🎬 Opening Insight
• Hot discussions around the DeepSeek-V4 technical report, with TileLang standing out as a key highlight
• Covers community tech progress & industrial polished experience of compiler/DSL in LLM infrastructure
• Attach official TileLang repo + DeepSeek open-source TileKernels: A library of high-performance LLM tiny operators implemented via TileLang
⚖️ DSL vs Expert Handwritten CUDA Kernels
• LLM architecture convergence makes computation patterns fixed; mainstream infra prefers handwritten kernels for performance limits
• Extreme trend like MegaKernel pursues ultimate manual scheduling optimization
• Leaves DSL/traditional compilers in an awkward positioning dilemma
• TileKernels focus on non-Tensor-Core operators: elementwise+reduce combo, type cast, indexing & fine-grained tiny ops
• DeepSeek-V4 Infra strategy:
✅ TileLang fused kernels replace scattered small operators
✅ Expert manual optimization for Attention & GEMM heavy kernels
✅ Healthy hybrid infra architecture for production deployment
✨ Core Advantages of TileLang for Small Operators
1️⃣ Development Edge
• No classic performance vs development cost tradeoff for memory-bound small ops
• No complex Warp Specialization demand; TileLang matches handwritten kernel performance ceiling
• Dramatically faster development speed with zero performance loss
2️⃣ Maintenance & Hardware Migration
• Low mental burden for operator library maintenance
• Compiler-side bugs need no modification to original operator code
• Weak hardware dependency, smooth cross-backend deployment & migration
3️⃣ Strong Capability for Tensor-Core Operators
• Only 80 lines of Python in TileLang implement FlashMLA with 95% native performance
• Qwen FlashQLA + TileLang GDN outperforms FlashInfer in specific scenarios
• Perfect for academic idea rapid verification: 80%-90% performance with minimal coding cost
• Clean dataflow modeling, friendly for open-source learning & reference
• Compile stack outputs standard source code (.cu for CUDA), enabling follow-up expert manual tuning based on TileLang template
🤖 TileLang & Future AI Agent Coding
• Uncertain if DSL is more readable than raw CUDA for agents currently
• AI already masters CUDA coding well thanks to sufficient corpus
• TileLang lacks enough public training data for now
• Early practice: AI writes simple TileLang kernels better than zero-shot CUDA
• Long-term potential: TileLang abstracts dataflow logic, frees agents from complex memory layout design once corpus accumulates
🧠 Core Design Essence of TileLang
• Obvious difference from Triton: Explicit exposure of memory hierarchy (L0/L1/L2)
• Two most iconic abstractions: Fragment + Parallel
Fragment Abstraction
• Abstracts register sets of all threads within a single CUDA block as one whole
• Avoid tedious manual task splitting across warp/thread levels
• Compiler auto-maps logical tensor access to physical register layout
Parallel Abstraction
• Supports fine-grained element-wise operation (A[i,j]) far beyond Triton’s tile-level micro-op
• Unifies shared memory & registers as tiles of different memory hierarchies
• Simplifies programming logic: only focus on tile data movement & tile computation
• Inherits essence from MSRA deep learning compiler theoretical accumulation
📌 Key TileLang Highlights in DeepSeek-V4 Report
1️⃣ Host CodeGen Optimization
• Leverage TVM-FFI to slash host-side kernel launch & tensor validation overhead
• Migrate Python-side logic to C++ and compile into kernel host runtime
• Brings obvious end-to-end latency benefits
2️⃣ Z3 Prover Integration
• Replace weak TVM built-in arithmetic solver with Z3 formal prover
• Auto eliminate redundant boundary condition checks when index out-of-bounds is mathematically impossible
• Supports vectorization optimization & integer expression proof
• More exposed hidden bugs after integration (previous conservative logic masked potential issues)
3️⃣ Precision & Bitwise Consistency
• Critical for high-precision scenarios like RL inference
• Disable fast-math by default, adopt standard IEEE intrinsics
• Align algebraic transformation logic with NVCC
• Solve bit mismatch caused by implicit FMA fusion in native CUDA compilers
📝 Closing Summary
• TileLang plays an indispensable role in DeepSeek-V4 production infrastructure
• Sets a clear positioning template for modern DSL in LLM era
• Rich learning resources available: official docs, TileLang-Puzzles, TileOPs, XPUOJ
• Ideal for developers to learn LLM kernel design & compiler DSL thinking
#DeepSeekV4 #LLM #DSL #AIInfra #CUDA #MoE
🔗 Full article: zhuanlan.zhihu.com/p/203303420272…

English

168

19K

LangRouter@langrouter·6d

Ring 2.6 1T from @TheInclusionAI is live in Langcli, free for a limited time. Give it a try now!

Ant Ling@AntLingAGI

We are launching Ring-2.6-1T, a trillion-parameter flagship thinking model engineered for real-world complex tasks and production env: 🚀 - Adjustable Thinking Effort: dynamic compute mechanism to flexibly balance cognitive depth, token cost, and execution speed; - Agent-Optimized: Built for high-frequency workflows, delivering rapid multi-step execution and tool orchestration with SOTA stability; - Deep Thinking: Unlocks the model's maximum capability ceiling for rigorous mathematical logic and scientific research;

English

108

LangRouter@langrouter·6d

@matteocollina @antirez You may want to try @langrouter's Langcli, which supports DeepSeek V4 Flash and V4 Pro perfectly. It is 100% compatible with Claude Code.

English

Matteo Collina@matteocollina·8 May

Truly impressed by the work done by @antirez (again). If you have a 128GB Mac, the age of local AI is almost there. github.com/antirez/ds4

English

301

16.4K

LangRouter@langrouter·6d

@hkdom You may want to try @langrouter's Langcli, which supports DeepSeek V4 Flash and V4 Pro perfectly. It is 100% compatible with Claude Code.

English

222

hkdom@hkdom·9 May

After trying out how fast DeepSeek V4 Flash is with Hermes Agent, it is hard to go back to Kimi 2.6 / GPT 5.5....

Chinese

23.8K

LangRouter@langrouter·6d

@teortaxesTex You may want to try @langrouter's Langcli, which supports DeepSeek V4 Flash and V4 Pro perfectly. It is 100% compatible with Claude Code.

English

103

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·9 May

V4 Flash is a very interesting model. It's in the weight class of MiMo V2.5, Step 3.5 or such (<20B active, ≈300B total), it's 2x slower than them but 2-3x faster than flagships, has much better/longer context, and for many tasks it's de facto a cheaper V4.

hkdom@hkdom

After trying out how fast DeepSeek V4 Flash is with Hermes Agent, it is hard to go back to Kimi 2.6 / GPT 5.5....

English

188

14K

LangRouter@langrouter·6d

@DanielSmidstrup Hi @DanielSmidstrup We built @langrouter and Langcli. Langcli is something like claude code and supports deepseek v4 flash and v4 pro perfectly. 100% compatible with Claude Code. Hoping to connect!

English

Daniel Smidstrup@DanielSmidstrup·9 May

If you’re a founder, let’s connect

English

286

430

20.7K

LangRouter retweeted

Xiuyu Li@sheriyuo·9 May

DeepSeek V4 is really expert in searching. You know the Chinese feature is all infrastructure. AI infra. Data infra. The tough situation and huge sanctions force AI improvements to turn into infra power. They know how to utilize and maximize their limited resources. That is a really big fight. You cannot imagine what will happen when China finally handles the resource shortage.

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex

Finally an interesting report from Baidu. A unique move at this scale – ERNIE 5.1 is basically a REAP'd 5.0. but what surprises me is this. V4 is somehow super-dominant on DeepSearchQA. This is unlikely to be benchmaxed (DS doesn't report this score anywhere).

English

142

18.5K

LangRouter@langrouter·9 May

@bindureddy You may want to try @langrouter's Langcli, which supports DeepSeek V4 Flash and V4 Pro perfectly. It is 100% compatible with Claude Code.

English

LangRouter retweeted

Bindu Reddy@bindureddy·8 May

DeepSeek Pro and Flash continue to be extremely underrated. Both are excellent choices for easy agentic loops 20x cheaper than GPT 5.5 and just as good

English

439

19.8K

LangRouter@langrouter·7 May

We have added a guide that shows you how to use your own DeepSeek API key in Langcli. For more info, see langcli.com/docs/BYOK #deepseek-v4

English

LangRouter@langrouter·7 May

@MrAhmadAwais Yeah, so damn good. You may want to try @langrouter's Langcli, which supports DeepSeek V4 Flash and V4 Pro perfectly. It is 100% compatible with Claude Code.

English

105

Ahmad Awais@MrAhmadAwais·6 May

how the heck is DeepSeek v4 flash so good and so cheap?

English

297

25.7K

LangRouter retweeted

Bindu Reddy@bindureddy·4 May

DeepSeek V4 Beats Opus 4.7 And GPT 5.5 To Become The World's Best Open Source Model. DeepSeek V4 Pro is the NEW KING of open source. - better and 10x cheaper than Opus 4.7 and GPT 5.5 medium - outperforms Kimi 2.6 thinking - much faster than any of the other big models. It's literally the best open source model in the world and months away from GPT-5.5 xHigh.

English

152

124

1.1K

105.2K

LangRouter@langrouter·4 May

If you had used Langcli vibe coding for this project yesterday, you would be a millionaire now. #Langcli #vibe_coding #sato

English

133

LangRouter@langrouter·4 May

@mehulmpt opencode's reasoning content issue is still not solved. You may want to try @langrouter's Langcli, which supports DeepSeek V4 Flash and V4 Pro perfectly. It is 100% compatible with Claude Code.

English

Mehul Mohan@mehulmpt·3 May

More people should try deepseek v4 pro on opencode. That’s what I use now when Claude and codex are both exhausted, and I can’t complain.

English

462

17.8K

LangRouter@langrouter·4 May

@h4x0r_dz deepseek v4 flash is super fast and cheap. The v4 Pro is a thinker, reasonably priced, and can solve problems. Langcli + DeepSeek v4 is the best combination.

English

H4x0r.DZ 🇰🇵@h4x0r_dz·3 May

deepseek-v4-pro 😮

English

344

31K
