
Dango233
@dango233max
Baking open AI system Garnishing open weights https://t.co/Y5yWy3Hn2K



Unstructured intelligence = chaos. Most agent frameworks ship without a nervous system: deadlocks, context loss, vacuum hallucinations. We built Common Ground to fix this: agents coordinate on a shared protocol.
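The tweet doesn't specify what Common Ground's protocol looks like, but a minimal sketch of a shared coordination message, with a monotonic turn counter to catch lost context, might look like this (all names and fields here are hypothetical, not Common Ground's actual API):

```python
# Hypothetical sketch of a shared agent-coordination message.
# Field names and intents are illustrative assumptions, not
# Common Ground's actual wire format.
import json
from dataclasses import dataclass, asdict

@dataclass
class AgentMessage:
    sender: str
    turn: int      # monotonically increasing; a gap signals lost context
    intent: str    # e.g. "propose" | "accept" | "yield"
    payload: dict

    def to_wire(self) -> str:
        """Serialize for transport between agents."""
        return json.dumps(asdict(self))

    @classmethod
    def from_wire(cls, raw: str) -> "AgentMessage":
        return cls(**json.loads(raw))

msg = AgentMessage(sender="planner", turn=3, intent="propose",
                   payload={"task": "summarize"})
roundtrip = AgentMessage.from_wire(msg.to_wire())
print(roundtrip == msg)  # serialization round-trips losslessly
```

The point of a fixed schema like this is that every agent can validate turn ordering and intent before acting, which is one way to avoid the deadlocks and context loss the tweet describes.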

Hello, friends in China! The Chinese edition of "The Last Economy" is now live and free to read on our website. "The Last Economy" by @EMostaque is now available in Chinese. What language should we do next?

II-Agent V1 is here. The AI agent built for real work is finally out of beta. Faster, smarter, and production-ready. It’s time to change how you build. 👇 Let’s see what’s new.


DeepSeek just dropped a banger paper to wrap up 2025: "mHC: Manifold-Constrained Hyper-Connections."

Hyper-Connections turn the single residual "highway" in transformers into n parallel lanes, and each layer learns how to shuffle and share signal between lanes. But if each layer can arbitrarily amplify or shrink lanes, the product of those shuffles across depth makes signals and gradients blow up or fade out.

So they force each shuffle to be mass-conserving: a doubly stochastic matrix (nonnegative, every row and column sums to 1). Each layer can only redistribute signal across lanes, not create or destroy it, so the deep skip path stays stable while features still mix.

With n=4 it adds ~6.7% training time, but cuts final loss by ~0.02 and keeps worst-case backward gain around ~1.6 (vs ~3000 without the constraint), with consistent benchmark wins across the board.
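To see why a doubly stochastic mixing matrix conserves mass, here's a minimal sketch (my own illustration, not the paper's code): Sinkhorn-Knopp normalization projects an arbitrary matrix onto approximately doubly stochastic form, and applying the result to the lane activations preserves their total sum.

```python
# Illustrative sketch of mass-conserving lane mixing via a doubly
# stochastic matrix (Sinkhorn-Knopp); not DeepSeek's actual mHC code.
import numpy as np

def sinkhorn(logits, n_iters=100):
    """Alternately normalize rows and columns of exp(logits) so the
    result is (approximately) doubly stochastic."""
    M = np.exp(logits)  # ensure nonnegativity
    for _ in range(n_iters):
        M /= M.sum(axis=1, keepdims=True)  # each row sums to 1
        M /= M.sum(axis=0, keepdims=True)  # each column sums to 1
    return M

rng = np.random.default_rng(0)
n = 4  # number of parallel residual lanes, as in the tweet
W = sinkhorn(rng.normal(size=(n, n)))

x = rng.normal(size=n)  # per-lane signal entering a layer
y = W @ x               # redistributed across lanes

# Columns summing to 1 means the total signal is conserved:
print(np.allclose(x.sum(), y.sum()))
```

Because each such matrix only redistributes (rows and columns both sum to 1), a product of them across many layers can't amplify the total, which is the stability property behind the bounded backward gain the paper reports.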

The two of us discussed this paper yesterday. Its breakthrough nature and the elegance of its solution are beyond question. But the interesting questions are: 1. Did they start from a proof in differential geometry and then find the solution, or did they first land on an engineering fix and then use manifolds to prove it? 2. Following this line of thinking, what other uses can be derived?
