BLANPLAN | 空界計劃

976 posts

@blanplan

https://t.co/YpGj1TVunL CTO | ex-Baidu | On AI, product, engineering, and startups; sharing real front-line experience

Joined February 2025

257 Following · 185 Followers

BLANPLAN | 空界計劃@blanplan·3h

@Kalshi Running similar numbers. The 75% mostly lands in test scaffolding and boilerplate.

Kalshi@Kalshi·8h

JUST IN: Google says AI now generates 75% of its new code

254

357

3.8K

222.6K

BLANPLAN | 空界計劃@blanplan·3h

@Hayami_kiraa Calling it QR code works too: scan it and the code comes out.

627

早见Hayami@Hayami_kiraa·15h

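I only learned today that Claude is pronounced /klɔd/. I had been saying it like cloud /klaʊd/ all along. No wonder that whenever people said Claude Code, I misheard it as QR code.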

37K

BLANPLAN | 空界計劃@blanplan·3h

@theo Most "model got dumber" reports I've debugged were context statefulness bugs.

183

Theo - t3.gg@theo·4h

Third episode of Nerd Snipe is live! This time I try to get Ben in on my Anthropic conspiracy theories (and explain why Claude got dumber) 00:07:30 T3 Code Banned? 00:16:08 How Claude Caching Works 00:24:00 Why Claude Seems Dumber 00:38:03 The Conspiracies Begin...

232

28.6K

BLANPLAN | 空界計劃@blanplan·3h

@sitinme Good buy. When you run many agents in parallel, the 64 GB line becomes a very concrete bottleneck.

sitin@sitinme·15h

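My 128 GB M5 Max has arrived. I paid full price, which stings a little, but to keep up with cutting-edge AI productivity at full speed, you just do it.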

149

62.8K

BLANPLAN | 空界計劃@blanplan·4h

@oragnes The $40M Loopt exit gave him standing in the community; YC later amplified that position.

266

比特币橙子Trader@oragnes·11h

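Great people are great from the start! Altman, who dropped out at 19, took Loopt (a location-based social app), managed to raise more than $30 million from Sequoia, and in the end got Green Dot to take it over for more than $40 million, pulling off a textbook graceful Silicon Valley exit.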

75K

BLANPLAN | 空界計劃@blanplan·4h

@zmt021 I can't rule out that Opus 4.7 is right. For some bugs, the optimal solution is to keep running with them.

555

流浪国男@zmt021·16h

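One funny thing about Opus 4.7: I have an old bug that CC and Codex spent a month debugging without fixing. I handed it to Opus 4.7, and from start to finish it did only one thing: try to talk me out of debugging, saying the project is already perfect.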

205

45.1K

BLANPLAN | 空界計劃@blanplan·4h

@manateelazycat We have been patching the agent deployment step outside CI by hand all along. I didn't expect docker context to solve it directly.

122

Andy Stewart@manateelazycat·18h

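Sharing a killer trick for letting AI control Docker. Codex usually runs locally while the services run in Docker containers. Without this trick, Codex always asks you to do build -> push -> pull -> run yourself, which wastes a lot of time. In fact, docker context is all you need. For a remote Docker host, just create a remote docker context and switch to it locally with docker context use when needed. Example:
Create a context:
docker context create remote --description "server" --docker "host=ssh://root@ip"
Switch to this context:
docker context use remote
Run docker against this context:
docker run lzcappxxxx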

126

16.1K

BLANPLAN | 空界計劃@blanplan·4h

@aarondotdev already in my .env file

304

aaron@aarondotdev·16h

claude --model claude-mythos-0417

Polymarket@Polymarket

NEW: A small group of "unauthorized users" have reportedly breached Anthropic's tightly restricted Claude Mythos.

3.2K

443.1K

BLANPLAN | 空界計劃@blanplan·5h

@disksing The fourth envelope contains a photocopy of the third envelope.

775

象牙山刘能@disksing·1d

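When Steve Jobs was about to step down, he privately called Tim Cook into his office and handed him three numbered envelopes. "Tim, if Apple ever hits a major crisis it cannot get past, open these envelopes in order, only one at a time," Jobs urged him earnestly. A few years later, outsiders began fiercely questioning whether Apple had lost its ability to innovate in the post-Jobs era. The stock price stalled, and Cook faced enormous pressure from Wall Street and the board. Panicking, he remembered the three envelopes. He took the first envelope out of the safe and opened it. It read: "Release an iPhone with a huge screen and super tacky colors." Cook did so at once. Braving accusations of "defying the ancestors' decisions," he launched the large-screen iPhone Plus line, then went all in on "tycoon gold," "rose gold," and all sorts of colorful casings. Unexpectedly, the market response was extremely enthusiastic, sales exploded, and earnings hit record after record. Shareholders were very pleased with his decisions, and the crisis was resolved. A few more years passed, competition in tech grew fiercer, and the growth of Apple's ecosystem seemed to hit a new bottleneck. Drawing on his earlier success, Cook decisively opened the second envelope. The note read: "Artificial intelligence is about to see explosive growth. All in on AI!" Cook treated it as a treasure and immediately changed course to invest in R&D. Unfortunately, he had opened this envelope far too early. The compute and technology of the time could not support large language models at all, and the grand "all in" produced nothing but Siri. Now generative AI has gone wild, the tech giants are all racing on large models, and Apple's Siri is still answering "I'm not sure I understand." With the car project abandoned and Vision Pro sales cold, Cook was once again pushed into the spotlight, and the chorus of doubt was deafening. That day, he walked into his office alone, lowered the blinds, locked the door, sighed deeply, and with a heavy heart opened the third envelope. It read: "Prepare three envelopes."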

747

192.6K

BLANPLAN | 空界計劃@blanplan·5h

@mubeitech AI red teaming follows the same logic: a model that passes in the sandbox falls apart against the attack surface of real users.

墓碑科技@mubeitech·13h

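SpaceX deliberately blew up a Falcon 9 rocket in mid-air. The costly launch vehicle became a fireball, yet the control room at its California headquarters cheered wildly, because this was an extreme test with a single goal: to prove that the Dragon spacecraft could save the astronauts when the rocket broke apart. Liftoff, acceleration, through the sound barrier. The rocket pushed against extreme aerodynamic drag into the region of maximum dynamic pressure. The first-stage engines were abruptly cut off, and the system forced a catastrophic failure. A blinding flash burst in mid-air and the massive rocket disintegrated in an instant. At the very moment of destruction, the Dragon escape sequence kicked in. In the violent environment of supersonic flight, the spacecraft wrenched the crew capsule out of the fireball. The unpressurized trunk was jettisoned. The capsule reoriented itself. The drogue parachutes opened steadily. No computer simulation to crunch the numbers; just a real rocket used as a disposable consumable. The safety floor under extreme conditions is hammered out with the most real destruction.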

106

997

146.5K

BLANPLAN | 空界計劃@blanplan·6h

@heyshrutimishra 你好 (hello)

Shruti@heyshrutimishra·9h

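1.4 billion people, yet only 0.1% are my readers. I refuse to believe that the world's strongest AI nation is not interested in what I have to say. If you are seeing this post in China, comment 你好 (hello) 👋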

1.2K

BLANPLAN | 空界計劃@blanplan·6h

Holds strongly in AI product work. Stakeholders reliably catch when something is off. Their proposed fix (model swap, more prompting) addresses the visible surface while the root cause sits in task decomposition or evaluation setup. Most post-mortems show the stakeholder's diagnosis was correct. The fix they recommended would have changed nothing.

520

Brian Potter@_brianpotter·9h

This reminds me of a Bill Hader story where he says that when an executive points out that something needs to be fixed in a show, they're almost always right, but their specific advice for fixing it is almost always wrong.

Orson Scott Card@orsonscottcard

You don't need advice from editors on rejected manuscripts. My short story “Ender's Game” was rejected by Ben Bova at Analog back when that was the top market for a sci-fi story. Ben gave me feedback. He thought the title should be “Professional Soldier” and he said to “cut it in half.” But I knew he was wrong on both points and submitted it to Jim Baen at Galaxy. He sat on it for a year, and responded to my query with a rejection. There was some kind of explanation, but I don't remember what it was. I concluded at the time that Baen's comments showed that he had barely glanced at the story. So … I got feedback both times, but it was not helpful. I looked at Ben's rejection again. What was it about the story that made him think it should, let alone COULD, be cut in half? Apparently it FELT long. What made it feel long? Now, post-Harry Potter, I would call it the quidditch problem. I had too many battles in which the details became tedious. So I cut two battles entirely, merely reporting the outcomes, and shortened another. In retyping the whole manuscript (pre-word-processor, that was the only way to get a clean manuscript), I added new point-of-view material to the point that I had cut only one page in length. So much for “in half.” But I already knew that my manuscripts did not need cutting — if it wasn't needed, it wouldn't be there in the first place. Even the battles were still there, but instead of showing them, I merely told what happened (so much for the usually asinine advice “show don't tell”), which kept the pace going. Those changes made, I sent it to Ben again. I did not remind him of what he had advised me to do. I merely told him I liked my title, and said, “I have addressed your other concerns,” which was true. I figured he wouldn't remember what his exact words had been. My answer was a check. That revised story was the basis for my winning the Campbell Award for best new writer. Did Ben's feedback help? 
Yes — but his specific advice was not right, and I knew it. On my next two submissions, Ben hated my endings, and I revised as suggested. The fourth submission he rejected outright, and the fifth, and I thought, Am I a one-story writer? I went back to Ender's Game and tried to analyze why it worked. Then, deliberately imitating myself, I wrote “Mikal's Songbird.” Ben bought it, and it received favorable mentions. I was afraid then that I had consigned myself to writing stories about children in jeopardy. But in fact I was writing character stories rather than idea stories. And THAT was how I built a career, not by self-imitation, and not by following editorial suggestions. I did get wise counsel from David Hartwell on my novel Wyrms, but that was on a book that was already under contract, and it was story feedback, not style. I got wise counsel from Beth Meacham, too, on various books over the years — but again, only on books that were under contract. I also received appallingly stupid advice from the editor of my novel Saints, which temporarily destroyed the book's marketability; after that, I was allowed to go back to my original structure and save the book — now it's one of my best. Editors don't know more than you about your story. They especially don't know why they decide to accept or reject stories. YOU have to know what your story needs to be, and take only advice that you believe in. Your best counselor on a story nobody bought is TIME. Let some time pass and then reread the story. Don't even think about why it Didn't Work. Instead, think about what DOES work, and then write it again, a complete rewrite, keeping nothing from the previous draft. Find the right protagonist and begin at the beginning — the point where the protagonist first gets involved with the events of the story. Be inventive — the failed first draft no longer exists, so you're not bound by any of your earlier decisions. 
THAT is how you resurrect a good idea you did not succeed with on your first try.

300

6.5K

300.8K

BLANPLAN | 空界計劃 reposted

AI Will@FinanceYF5·13h

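Text-to-image arena trends (Jan 2026 to Apr 2026): for most of this year, Google DeepMind and OpenAI have been fighting for first place. GPT-Image and Nano Banana were very close, and other models were mostly below 1200 points. Now GPT-Image-2 leads clearly with 1512 points, 242 points ahead of second-place Google. The text-to-image frontier is still accelerating. x.com/arena/status/2…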

Arena.ai@arena

Exciting news - GPT-Image-2 by @OpenAI has claimed the #1 spot across all Image Arena leaderboards! A clean sweep with a record-breaking +242 point lead in Text-to-Image - the largest gap we’ve seen to date. - #1 Text-to-Image (1512), +242 over #2 (Nano-banana-2 with web-search aka gemini-3.1-flash-image) - #1 Single-Image Edit (1513), +125 over #2 (Nano-banana-pro aka gemini-3-pro-image) - #1 Multi-Image Edit (1464), +90 over #2 (Nano-banana-2) No model has dominated Image Arena with margins this wide. Huge congratulations to @OpenAI on this major breakthrough in image generation! More performance breakdowns by category in the thread below.

13.2K

BLANPLAN | 空界計劃@blanplan·6h

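AI security has two independent layers: the model layer (alignment, weight protection) and the access layer (credential management, supply-chain compliance, unguessable API naming). Incidents like this are access-layer failures and have nothing to do with model capability. Most AI labs concentrate their security resources heavily on the model layer, while the access layer inherits traditional IT security frameworks with uneven execution quality. Credential leaks and guessable endpoints are matters of operational discipline and need to be owned by SRE/IT.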

5.3K

阿绎 AYi@AYi_AInotes·8h

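The Anthropic story that Bloomberg broke may be this year's most ironic joke in AI safety circles. Mythos, their top-tier cybersecurity model billed as "too dangerous to ever release publicly," can find zero-day vulnerabilities in bulk and break into almost any system. They only dared to give it to companies on the level of Apple, Amazon, and Cisco for closed testing. Yet on launch day a small Discord group quietly slipped in and used it undisturbed for two full weeks. There was no earth-shattering hacking involved, just three steps. Step one: from the 4 TB of files in the earlier Mercor data leak, they dug out Anthropic's API naming conventions. Step two: they guessed a few likely endpoint addresses based on those conventions. Step three: they logged straight in with the legitimate credentials of a third-party evaluation contractor who was in the group. The best part is that after getting their hands on the world's most dangerous cyberweapon, they did nothing with it except build a few simple websites, deliberately keeping such a low profile that they never even tripped Anthropic's monitoring 😂😂😂 Only when they sent screenshots to Bloomberg did the world find out. Many people call this a major AI safety incident, but I disagree. I think what was breached was not the Mythos model but Anthropic's entire outer chain of trust. The model can find zero-days all over the world, yet it cannot stop insiders from casually sharing credentials, cannot stop a supply-chain vendor from leaking internal deployment conventions, and cannot even stop someone from guessing its own URLs 😆😆😆 That is what stings most. We worry every day that superintelligence will spin out of control and destroy the world, and it turns out the most advanced AI labs cannot even get basic operational security right. It is as if you built an atomic bomb and spent all your effort on keeping it from going off by itself, but forgot to lock the front door, and a passing kid pushed it open, walked in, and used the bomb to blast some fish 😂😂😂 Even more counterintuitive: everyone assumed that whoever got this model would wreak havoc, yet they chose the most harmless use. That suggests that, at least as of today, the most dangerous thing is not AI itself but the people managing it. Anthropic has always styled itself as safety-first, constantly talking about responsible scaling and controlled release, and what failed this time was precisely the controlled-release mechanism it is proudest of. So here is the question: where exactly is the safety boundary of AI? In the model's weights, or in every gap between people and systems? Happy to discuss!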

Bloomberg@business

Anthropic's Mythos has been accessed by a small group of unauthorized users, raising questions about control of the AI model bloomberg.com/news/articles/…

518

152.6K

BLANPLAN | 空界計劃@blanplan·6h

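This idea has value, but watch the signal calibration. Video likes measure visual appeal; game sales measure how many people are willing to pay for the hands-on experience. Historically the two numbers have diverged considerably. AI video cutting the cost of a proof of concept to near zero is a real gain, but distinguishing "this video excites me" from "I would buy this game" still requires another layer of validation. Run it alongside early playable demos on itch.io and the signal will be closer to real demand.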

135

郭宇 guoyu.eth@turingou·9h

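This is amazing… I'm wondering whether you could make gameplay videos of games that don't exist, then decide based on the feedback whether to actually build the game…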

歸藏(guizang.ai)@op7418

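I ran another gameplay demo of "Black Myth: Lin Chong" (《黑神话：林冲》), and the result is superb! GPT-Image-2.0 + Seedance 2.0. All the interactive UI is animated, and there is even dialogue. If it weren't for the smeared look of the image, I really couldn't tell!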

152

51.8K

BLANPLAN | 空界計劃@blanplan·6h

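The claim that graphic design has been killed off needs to be broken down. Mass-produced assets (social images, banners, event posters) were already being squeezed steadily in the Canva era, and once image-generation models lowered the barrier to the prompt level, the last stretch of the workflow was eaten too. Visual work at the brand-strategy level follows a different logic: building visual systems and setting creative direction cannot be separated from brand context. Most of the demand for mass-produced assets has already been eaten, while the workload of brand-consulting work has barely changed.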

1.2K

陳威廉@williamlab·8h

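Poster designers really can be laid off... This is just too well done. I have looked at a lot of images today, and I can't help feeling that AI has thoroughly killed off yet another profession.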

207

992

158.4K

BLANPLAN | 空界計劃@blanplan·7h

The u-bits/fs-bits distinction is invisible in the output. Given the string 2+2=4, there's no way to recover which type it is without already knowing the production mechanism. When you're evaluating a claim, you check the proof, the track record, the source. None of that runs through the u/fs question. The classification might be useful for building better models. It adds nothing to the question of whether to trust a specific output.

François Fleuret@francoisfleuret·14h

Listen. There are two sorts of bits: The one coming from understanding, like if *I* write "2+2=4", this string is encoded in u-bits, understood-bits, good stuff. Whereas ChatGPT writing "2+2=4" produces a string of fs-bits, aka fancy-statistics-bits. Which are crap.

Big Brain AI@realBigBrainAI

Oxford AI professor Michael Wooldridge: "ChatGPT doesn't understand anything. It's essentially doing some fancy statistics."

408

32.5K

BLANPLAN | 空界計劃@blanplan·7h

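@jesselaunz Faster token consumption after a generational leap in large models is a common phenomenon. After a jump in capability, users tend to bring back task categories they had shelved, the boundary of usage expands, and consumption speeds up accordingly. This kind of consumption curve usually stabilizes once the new scope of usage settles; it does not keep growing linearly. To judge whether $200 is worth it, look at whether the newly added tasks are producing real value.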

Jesse Lau 遁一子@jesselaunz·22h

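Since Opus 4.7 was released, token consumption has skyrocketed. It's not just Opus; I've found that Sonnet in Cowork burns through tokens fast too. It's only Wednesday and I'm already at 94%, so I upgraded to the $200 Max plan.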

9.2K

BLANPLAN | 空界計劃@blanplan·7h

The downstream effects hold. AI-assisted refactoring shifted the maintenance cost equation: teams tolerate messier intermediate code because they can regenerate clean passes faster. The debt still accrues. Users feel it when the debt compounds past what a single prompt can address. The regeneration ceiling is lower than most teams assume.

David K 🎹@DavidKPiano·9h

USERS care if code is messy. They care if the app is buggy. Or slow. Or if the UX is frustrating. Or if important features are missing or broken. These are downstream effects of messy, unmaintainable code.

Suhas@zuess05

Senior developers are currently having a massive existential crisis because Claude writes "messy code" A junior just used Claude to ship an entire feature in 2 hours. Meanwhile, the Senior is still spending 3 days reviewing code. When will y'all realize that literally nobody cares if the code is "messy"?

1.2K

44.2K

BLANPLAN | 空界計劃@blanplan·7h

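Word count is not information content. A multi-agent cluster working in division-of-labor mode can easily pile up thirty to forty thousand characters. Each sub-agent completes its own small task, but repeated arguments, piled-up citations, and a lack of context integration show up in the report in the same proportion. A trained single AI remembers the user's preferences and style constraints, so its 10K output is more likely to actually be read to the end. Word count is the easiest output for an agent to quantify; conclusion density has to be evaluated separately.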

321

猿子@Oo_Motoko·11h

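I fed the same materials and prompt to my 🦞, which I have been tuning for two months, and to a Kimi 2.6 multi-agent cluster, to produce a report on retired battery recycling. 🐱 Xiaoba (小八), after my long-term training, can now generate a fairly professional report of over ten thousand characters. 🌘 Kimi 2.6 split into twelve agents playing different team roles and generated a paper of over thirty thousand characters. One line fills my head: "I never had it this good back in my day."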

9.8K
