DataLearner

1.2K posts

DataLearner

@DataLearnerAI

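Following data science, the tech industry, artificial intelligence, and every new technology that makes human life better. List of mainstream industry LLMs: https://t.co/H4FUDd7Gfb State of China's open-source LLM ecosystem: https://t.co/q5KU9WhPuE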

Hefei, China · Joined March 2023

823 Following · 549 Followers

Pinned Tweet

DataLearner@DataLearnerAI·20 May

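Looking only at Gemini 3.5's price against the previous-generation Gemini 3, it has tripled. Compared with Sonnet 4.6, though, it has the edge on benchmark results and is also priced below Sonnet 4.6. The underlying model has probably grown considerably in size, perhaps to roughly the scale of the previous-generation Gemini 3.1 Pro. That said, Google's models tend to impress at launch and then start to feel off the more you use them.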

DataLearner@DataLearnerAI·1d

Sharing for anyone who needs it.

Jiayuan (JY) Zhang@jiayuan_jy

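MiniMax M3 is about to be released, and we would like to invite contributors from the Chinese open-source community to evaluate it. 阿岛 @SkylerMiao7 has set up a Feishu group where you can try it first! Applicants should have some open-source contribution experience (having contributed to an open-source project or maintaining one of their own); just note it in the verification message.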

DataLearner@DataLearnerAI·1d

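Opus has already moved two generations from 4.6 to Opus 4.8, yet Sonnet is still at 4.6 and Haiku is still at 4.5. Is iteration slowing down out of concern that the cheaper models are already strong enough? Very strange.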

DataLearner@DataLearnerAI

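Anthropic says the default thinking mode for Opus 4.8 is high, the best balance of quality and cost. On OS-World-Verified, it achieves better results with slightly fewer tokens than Opus 4.7's Max mode and slightly more tokens than Opus 4.7 xhigh. In actual use, though, perhaps because the 5-hour limit has shrunk, I ran 2 information-search tasks that also generated md output and API requests, and each one consumed 10% of the 5-hour limit!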

DataLearner@DataLearnerAI·3d

@RyanLeeMiniMax When M3?

RyanLee@RyanLeeMiniMax·3d

Recently, we took time to consolidate all of the work behind M2 and published it here: our M2 paper on arXiv.

It’s been just over six months since we first open-sourced M2 on December 23 last year. During that time, a number of our ideas and systems have been broadly adopted by the open-source community, including CISPO, Forge RL System, Self-Evolution.

Over the past six months, we’ve felt incredible enthusiasm from the open-source community. Nearly every model release reached the #1 spot on the Hugging Face leaderboard.

Now it’s time for a new chapter. We’re getting ready for M3. MSA paper is on the road.

arxiv.org/abs/2605.26494

DataLearner@DataLearnerAI·4d

A MiniMax engineer says "sth big" is about to be released. Could M3 be on the way? 🤔

Skyler Miao@SkylerMiao7

Something BIG is coming

DataLearner@DataLearnerAI·4d

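A more reasonable approach would be to design dedicated evaluation sets for phone-side or on-device scenarios. On general-purpose benchmarks, small models end up in stark contrast with larger ones, which makes them look bad. But if the scores come out close, that easily invites suspicion.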

BuBu@BooBooCola

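I don't quite understand why small-parameter models bother with all these benchmarks. It seems to me that small models are mainly for validation, for trying out new ideas, and for grad students without GPUs to publish papers.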

DataLearner@DataLearnerAI·6d

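Codex also gets a lot of the experience right: it plays a sound when a conversation finishes or when it is waiting for input, and the status bar at the bottom shows numeric indicators. Claude Code, by contrast, often gives no signal at all when it finishes something, so it is easy to miss if you are doing something else.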

宝玉@dotey

Codex's interaction design is really well done: you can easily see the SubAgents currently running, what each SubAgent is doing, and the prompt it is using.

DataLearner@DataLearnerAI·6d

Codex has reset usage yet again! Resets have been frequent lately, so using GPT-5.5 Extra-High feels worry-free.

Tibo@thsottiaux

Some of you noticed limits drained faster in Codex, we root caused it to an optimization that we rolled back that had an impact on cache hit rates when compacting across long running sessions. We fixed this and have now reset usage limits for all accounts. Enjoy the weekend.

DataLearner@DataLearnerAI·23 May

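It feels like language may truly no longer be a barrier for LLMs. Oddly, even with Chinese input, Claude's thinking and replies are often in English, and the answer may even quote the Chinese input. I recall Anthropic once published research suggesting that LLMs most likely think internally in a non-human language yet can transfer knowledge across languages. So on the output side, the simple ability to express things in smaller languages is probably enough.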

DataLearner@DataLearnerAI·23 May

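OpenAI's Codex engineering lead says about 5% of OpenAI's production traffic comes from Pi and roughly another 5% from OpenCode, meaning these users are spending the GPT quota of their ChatGPT accounts inside those two products. I used to think OpenAI was somewhat behind and closed; compared with Anthropic now, OpenAI actually looks the more open of the two.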

Tibo@thsottiaux

A little secret. About 5% of our production traffic is on the Pi harness, about another 5% is on OpenCode. Reminder you can use your ChatGPT account in a flourishing set of other tools. We’ll continue to make Codex awesome, but you have options.

DataLearner@DataLearnerAI·22 May

@lewangx Is this a web page you built yourself, hooked up to a local model?

LE@lewangx·22 May

Looking at token prices, the GX10 is very good value.

LE@lewangx

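After a week's wait, the ASUS GX10 arrived before the weekend. After looking around, this still seemed like the right one to buy: the unified memory gives plenty of VRAM, and it is compact enough to sit on an office desk.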

DataLearner@DataLearnerAI·22 May

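The real problem is that the models are garbage and basically unusable. Many of the people who used it before were there for the Claude models. The Gemini quota is high, but going by Gemini 3.5 Flash's poor showing, that is not much of a draw.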

思维怪怪@0xLogicrw

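Varun Mohan, head of Google Antigravity (and formerly the founder of Windsurf), announced that, effective immediately, the weekly Gemini model call quota for all paid subscription plans is being raised 3x again. Combined with the previous day's 3x adjustment, the baseline quota is now 9 times the original. The current week's usage has also been reset to zero for all paid users, to give developers more compute headroom.

However, the "increase" has drawn heavy criticism. Developers in the comments pointed out that Antigravity had earlier gone through a severe quota "rug-pull", when limits were so strict that even light users who only occasionally used the sidebar chat hit them quickly, leaving the product in a completely unusable, "suffocated" state. In essence, the company is only fixing its previously unreasonable limits, yet it is packaging the fix as a generous "free perk" for marketing purposes.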

DataLearner@DataLearnerAI·22 May

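Zhipu has released GLM-5.1-Highspeed, with output speed reaching 400 tokens/s. Tang Jie says this API is very expensive, but I could not find the price on the official site, and it is unclear whether it runs on Huawei hardware or NVIDIA GPUs. Going by the timeline, though, shouldn't GLM-5.2 or some other new version like 5.5 be coming soon?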

jietang@jietang

GLM-5.1-highspeed is coming, 400 tokens per second. Very expensive, but bring a new possibility.

DataLearner@DataLearnerAI·20 May

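The backend probably reuses the existing Antigravity logic directly, so the model list may still include Claude. It will be removed sooner or later, though; after all, Opus 4.7 has been out this long and the list still has not been updated.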

LotusDecoder@LotusDecoder

The anti-gravity CLI actually includes opus-4.6, so most people will probably pick opus first.

DataLearner@DataLearnerAI·20 May

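3.5 Flash outputs more than 280 tokens per second, four times the speed of GPT-5.5 and Opus 4.7 (roughly 60–70 tokens/s). That gap has real value in real-time interaction and high-concurrency agent scheduling, so it is decent. Details: datalearner.com/ai-models/pret…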

DataLearner@DataLearnerAI·19 May

Anthropic just got even stronger. I hope that even after joining Anthropic, Karpathy will keep contributing his insight to all of humanity and not just to Anthropic.

Andrej Karpathy@karpathy

Personal update: I've joined Anthropic. I think the next few years at the frontier of LLMs will be especially formative. I am very excited to join the team here and get back to R&D. I remain deeply passionate about education and plan to resume my work on it in time.
