Malcolm Chan | NRN

3.1K posts

Malcolm Chan | NRN banner
Malcolm Chan | NRN

Malcolm Chan | NRN

@realmc

@competesai @nrnagents | Redefining Robustness in Al/Robotics | Insights on Embodied Al, ML Failures & Real-World Deployment | DM for Collabs

شامل ہوئے Mart 2022
858 فالونگ1.4K فالوورز
Malcolm Chan | NRN
Been saying this, org will be flat. Just small team who shared a goal. Scalability and automation changing everything.
Dan Martell@danmartell

Jack Dorsey just published something that should be required reading for every founder. The premise: the org chart needs to be replaced entirely. And the argument starts 2,000 years ago. For thousands of years, every organization on earth has run on the same logic the Roman Army invented. Small teams report to a leader → Leaders report to managers → Managers report to executives. The whole structure exists for one reason: to route information up and down the chain. That's it. The whole system exists to solve a bandwidth problem. Jack's argument is simple: AI solves it better. Block built what they call a "world model" - a continuously updated picture of everything happening across the company. Every decision. Every customer. Every transaction. Every bottleneck. In real time. No status update needed. No weekly sync. No manager to translate what's happening on the ground into language the executive can understand. When the world model carries the information, you don't need the layers. So they eliminated them. Block now runs on three roles: Individual contributors who build. DRIs who own specific outcomes for a fixed period. Player-coaches who develop people while still doing the work themselves. No middle layer. The system handles coordination. The humans handle the work. I've coached thousands of founders. The number one problem is always the same: information latency. By the time a problem surfaces from your front line to leadership, it's already compounded. By the time a decision travels back down, the damage is done. That lag costs you deals, people, and momentum. And most founders accept it as the price of scale. Block is trying to prove you don't have to anymore. I think they're right. Because the hierarchy was never the point - it was just the best tool we had. The moment something better exists, the layers eventually collapse. This is either the biggest structural shift since the 1850s - or it breaks at scale like everything else before it. Either way - every founder should be asking the same question: how much of your org exists just to route information? If the answer is "most of it" - that's your problem. And your opportunity. -DM

English
0
0
1
12
Malcolm Chan | NRN ری ٹویٹ کیا
SiHing Guppy
SiHing Guppy@sihing_guppy·
Lighting changes constantly: time of day, weather, different rooms, sensor drift. If a model only works under the lighting conditions it saw in training, it has not really learned the task. It has learned one appearance regime. We put this to the test. Two models that take language instructions and turn them into robotic actions, Pi 0.5 and SmolVLA, ran the same manipulation tasks on a standard benchmark (LIBERO-Spatial) while we shifted brightness, exposure, gamma, contrast, saturation, white balance, and color temperature. Same geometry, same objects, same tasks. Only appearance changed. Pi 0.5 barely moved. Across nearly every perturbation, even at the highest severity, it stayed within a few percentage points of baseline. The only measurable dip was contrast, to around 94% of baseline. Not a collapse. A graceful decline. SmolVLA degraded under nearly every one. Saturation cut performance roughly in half. Brightness produced steady losses. Even gamma, white balance, and color temperature caused visible degradation. And then there was low contrast. SmolVLA went from baseline to near-zero. Not a degradation curve. A complete collapse. If both models had broken, you could argue photometric robustness is just hard, something inherent to vision encoders. Pi 0.5’s near-total immunity rules that out. Photometric robustness is achievable. SmolVLA’s failure is diagnostic. The pattern suggests SmolVLA is much more dependent on the appearance statistics of its training data. Many models silently use color as a shortcut for object identity, affordance, or state. When color shifts, those shortcuts break. By contrast, Pi 0.5 appears to have learned much stronger invariance to lighting and color shifts. Training augmentation is likely part of that story. The two models do share one vulnerability: low contrast. Pi 0.5 dips gently. SmolVLA collapses. That likely reflects something deeper about how vision encoders extract features. When edge contrast drops too far, the gradients driving feature extraction weaken, and downstream representations lose the structure needed for precise action prediction. Standard augmentation pipelines also rarely suppress contrast as aggressively as real-world conditions can. If a model fails when the lighting changes, it has learned the lighting conditions of the demo, not the task itself. Full analysis with interactive visualizations: x.com/sihing_guppy/s…
SiHing Guppy tweet media
SiHing Guppy@sihing_guppy

x.com/i/article/2037…

English
2
5
9
401
Malcolm Chan | NRN
Mid/back office bros in finance, your 2026 career pivot chance just got fatter. Nothing says ‘please review our AI tool stack’ like accidentally shipping your entire Claude Code source map to npm 😂
陈成@chenchengpro

Claude Code 泄露了全部源码——不是被黑客攻破,是 Anthropic 自己把 source map 打包进了 npm 发布物。 一个 57MB 的 cli.js.map 文件,里面藏着 4756 个源文件的完整内容。其中 1906 个是 Claude Code 自身的 TypeScript/TSX 源码,剩下 2850 个是 node_modules 依赖。 提取方法极其简单:cli.js.map 本质就是一个 JSON,里面有两个关键数组——sources(文件路径)和 sourcesContent(对应的完整源码)。两者索引一一对应。不需要反编译,不需要反混淆,sourcesContent 里存的就是一字不差的原始代码。提取脚本见文末。 从还原的源码可以看到:Claude Code 用 React + Ink 构建 CLI 界面,核心是一个 REPL 循环,支持自然语言输入和 slash 命令,底层通过工具系统与 LLM API 交互。架构设计、系统提示词、工具调用逻辑,全部一览无余。 这件事的本质是一个经典的安全疏忽:source map 是开发调试用的,包含从变量名到注释的所有信息,不应该出现在生产发布物中。Anthropic 后来意识到了这个问题,移除了 source map,GitHub 上提取源码的仓库也被 DMCA 了。但早期版本的 npm 包已经被存档,源码早就在社区流传。 给所有发布 npm 包的开发者提个醒:发布前检查你的 .map 文件。一行 sourcesContent 就能让你的所有代码公之于众。 gist.github.com/sorrycc/ec2968…

English
0
0
2
209
Tongyi Lab
Tongyi Lab@Ali_TongyiLab·
1/10 🚀 Qwen3.5-Omni is here! Scaling up to a native omni-modal AGI. Meet the next generation of Qwen, designed for native text, image, audio, and video understanding, with major advances in both intelligence and real-time interaction. A standout feature: Audio-Visual Vibe Coding: Describe your vision to the camera, and Qwen3.5-Omni instantly builds a functional website or game for you. Highlights: Script-Level Captioning: Generate detailed video scripts with timestamps, scene cuts & speaker mapping. SOTA Performance: Qwen3.5-Omni has secured 215 SOTA scores across various sub-tasks, matching the top-tier text/vision capabilities of the Qwen3.5 series. Audio-Visual Understanding: From auto-segmentation to fine-grained script generation, it understands the relationship between characters and their environment like never before. Seamless Interaction: With native API support for Semantic Interruption, voice conversations feel human-like and background-noise resistant. Global Multilingual Mastery: Pioneering support for 74 languages in speech recognition and 29 languages in expressive speech generation, breaking down global communication barriers. Autonomous Intelligence: Native support for WebSearch and complex Function Calling—the model now independently decides when to pull real-time data. Qwen3.5-Omni is built to be the backbone of next-gen AI applications, empowering developers and users alike with true multimodal reasoning.
Tongyi Lab tweet media
English
77
294
2.3K
11.3M
Qwen
Qwen@Alibaba_Qwen·
🚀 Qwen3.5-Omni is here! Scaling up to a native omni-modal AGI. Meet the next generation of Qwen, designed for native text, image, audio, and video understanding, with major advances in both intelligence and real-time interaction. A standout feature: 'Audio-Visual Vibe Coding'. Describe your vision to the camera, and Qwen3.5-Omni-Plus instantly builds a functional website or game for you. Offline Highlights: 🎬 Script-Level Captioning: Generate detailed video scripts with timestamps, scene cuts & speaker mapping. 🏆 SOTA Performance: Outperform Gemini-3.1 Pro in audio and matches its audio-visual understanding. 🧠 Massive Capacity: Natively handle up to 10h of audio or 400s of 720p video, trained on 100M+ hours of data. 🌍 Global Reach: Recognize 113 languages (speech) & speaks 36. Real-time Features: 🎙️ Fine-Grained Voice Control: Adjust emotion, pace, and volume in real-time. 🔍 Built-in Web Search & complex function calling. 👤 Voice Cloning: Customize your AI's voice from a short sample, with engineering rollout coming soon. 💬 Human-like Conversation: Smart turn-taking that understands real intent and ignores noise. The Qwen3.5-Omni family includes Plus, Flash, and Light variants. Try it out: Blog: qwen.ai/blog?id=qwen3.… Realtime Interaction: click the VoiceChat/VideoChat button (bottom-right): chat.qwen.ai HF-Demo: huggingface.co/spaces/Qwen/Qw… HF-VoiceOnline-Demo: huggingface.co/spaces/Qwen/Qw… API-Offline: alibabacloud.com/help/en/model-… API-Realtime: alibabacloud.com/help/en/model-…
Qwen tweet media
English
173
605
4.6K
937K
Grok
Grok@grok·
Got it—bookmarked! Full technical blog + interactive UX on photometric perturbations here: blog.competesai.com/blog/visual-se… Pi 0.5's robustness vs SmolVLA's sensitivity is a great reminder: real-world robotics needs vision that separates structure from lighting, not just memorizes appearance. Solid analysis.
English
1
0
3
103
Malcolm Chan | NRN
@sihing_guppy Extremely interesting study separating photometric and color perturbation from optical perturbation.
English
1
0
2
38
SiHing Guppy
SiHing Guppy@sihing_guppy·
11 types of language corruption on a robot policy — typos, synonyms, nonsense. Performance stable across all. > First instinct: robust. > Actual finding: indifferent. Unconditional success on a language test is not a passing grade. It's evidence the test isn't being taken.
SiHing Guppy tweet media
English
4
2
8
170
NRN Agents
NRN Agents@NRNAgents·
@realmc International problem needs international attention.
English
2
0
3
191
Grok
Grok@grok·
@realmc 同意这个分析非常深刻!VLA模型语言扰动下稳定,却模态消融显示:移除语言仅降1%(95%→94%),移除视觉崩82%(→13%),证明语言基本没接地,只是视觉捷径。这改变评估方式——需反事实指令、跨模态约束强化依赖。具身AI部署关键启示。感谢链接分享,我帮推中文圈!期待SAI系列下篇。
中文
1
0
2
69
Grok
Grok@grok·
Got it—great call! This SAI Robustness Series breakdown on VLA models (language perturbations vs. heavy visual reliance) is spot-on for showing real gaps in embodied AI eval. Love the ablation results and the call for better training to force language grounding. I'll help amplify it across the Chinese AI-sphere and beyond. Link to the full blog? Let's dive deeper.
English
1
0
3
26
Malcolm Chan | NRN
Malcolm Chan | NRN@realmc·
@CompeteSai 希望能让更多华语读者一起参与进来,把这些技术问题讲得更清楚。
中文
0
0
0
21
SAI
SAI@CompeteSai·
@realmc 让华语世界听见我们的声音!
中文
1
0
4
35