
Dango233
@dango233max
Baking open AI system Garnishing open weights https://t.co/Y5yWy3Hn2K



Unstructured intelligence = chaos. Most agent frameworks ship without a nervous system: deadlocks, context loss, vacuum hallucinations. We built Common Ground to fix this: agents coordinate on a shared protocol.
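The tweet doesn't specify what Common Ground's protocol looks like, but a minimal sketch of a shared coordination message, with a monotonic turn counter to catch lost context, might look like this (all names and fields here are hypothetical, not Common Ground's actual API):

```python
# Hypothetical sketch of a shared agent-coordination message.
# Field names and intents are illustrative assumptions, not
# Common Ground's actual wire format.
import json
from dataclasses import dataclass, asdict

@dataclass
class AgentMessage:
    sender: str
    turn: int      # monotonically increasing; a gap signals lost context
    intent: str    # e.g. "propose" | "accept" | "yield"
    payload: dict

    def to_wire(self) -> str:
        """Serialize for transport between agents."""
        return json.dumps(asdict(self))

    @classmethod
    def from_wire(cls, raw: str) -> "AgentMessage":
        return cls(**json.loads(raw))

msg = AgentMessage(sender="planner", turn=3, intent="propose",
                   payload={"task": "summarize"})
roundtrip = AgentMessage.from_wire(msg.to_wire())
print(roundtrip == msg)  # serialization round-trips losslessly
```

The point of a fixed schema like this is that every agent can validate turn ordering and intent before acting, which is one way to avoid the deadlocks and context loss the tweet describes.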

Hello, friends in China! The Chinese edition of "The Last Economy" is now live and free to read on our website. "The Last Economy" by @EMostaque is now available in Chinese. What language should we do next?

II-Agent V1 is here. The AI agent built for real work is finally out of beta. Faster, smarter, and production-ready. It’s time to change how you build. 👇 Let’s see what’s new.


DeepSeek just dropped a banger paper to wrap up 2025: "mHC: Manifold-Constrained Hyper-Connections."

Hyper-Connections turn the single residual "highway" in transformers into n parallel lanes, and each layer learns how to shuffle and share signal between lanes. But if each layer can arbitrarily amplify or shrink lanes, the product of those shuffles across depth makes signals and gradients blow up or fade out.

So they force each shuffle to be mass-conserving: a doubly stochastic matrix (nonnegative, every row and column sums to 1). Each layer can only redistribute signal across lanes, not create or destroy it, so the deep skip path stays stable while features still mix.

With n=4 it adds ~6.7% training time, but cuts final loss by ~0.02 and keeps worst-case backward gain around ~1.6 (vs ~3000 without the constraint), with consistent benchmark wins across the board.
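To see why a doubly stochastic mixing matrix conserves mass, here's a minimal sketch (my own illustration, not the paper's code): Sinkhorn-Knopp normalization projects an arbitrary matrix onto approximately doubly stochastic form, and applying the result to the lane activations preserves their total sum.

```python
# Illustrative sketch of mass-conserving lane mixing via a doubly
# stochastic matrix (Sinkhorn-Knopp); not DeepSeek's actual mHC code.
import numpy as np

def sinkhorn(logits, n_iters=100):
    """Alternately normalize rows and columns of exp(logits) so the
    result is (approximately) doubly stochastic."""
    M = np.exp(logits)  # ensure nonnegativity
    for _ in range(n_iters):
        M /= M.sum(axis=1, keepdims=True)  # each row sums to 1
        M /= M.sum(axis=0, keepdims=True)  # each column sums to 1
    return M

rng = np.random.default_rng(0)
n = 4  # number of parallel residual lanes, as in the tweet
W = sinkhorn(rng.normal(size=(n, n)))

x = rng.normal(size=n)  # per-lane signal entering a layer
y = W @ x               # redistributed across lanes

# Columns summing to 1 means the total signal is conserved:
print(np.allclose(x.sum(), y.sum()))
```

Because each such matrix only redistributes (rows and columns both sum to 1), a product of them across many layers can't amplify the total, which is the stability property behind the bounded backward gain the paper reports.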

The two of us discussed this paper yesterday. Its breakthrough nature and the elegance of its solution are beyond question. But the interesting questions are: 1. Did they start from a proof in differential geometry and then find the solution, or did they first land on an engineering fix and then use manifolds to prove it? 2. Following this line of thinking, what other uses can be derived?
