Lei Zhang (Harry)

2.3K posts

@resouer

ML infra @NVIDIA / Prev: Alibaba and Microsoft

San Francisco, CA · Joined September 2012

891 Following · 1.9K Followers

Lei Zhang (Harry)@resouer·1d

@jeffhollan @satyanadella Why isolation per session? I recall folks were against this AWS-ish idea 😉

Jeff Hollan@jeffhollan·1d

Excited to announce the new preview for Microsoft Foundry Agents 🎉! You can now build, run, and deploy your agent using any model, any framework, any harness in the cloud 🧑‍💻 - check out the demo below.
This is not just any cloud compute environment; it's an agent-optimized platform with:
🖥️ Persistent microVMs - securely scale up and down without losing context
🛠️ Built-in tools (1000+)
👀 Observability and evaluations
👷 Guardrails
🔐 Private networking
... and more

Lei Zhang (Harry)@resouer·5d

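@yadong_xie @tualatrix If the whole develop-and-deploy workflow moves onto the phone in the future, this kind of one-click experience, going from vibe coding to a live app that makes money right away, is still quite useful. Something like this will probably focus on lighter-weight apps, similar to lovable / nocode.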

Yadong Xie@yadong_xie·5d

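@resouer @tualatrix Still, I lean toward thinking this thing is only good for building toys. It can smooth out communication costs: once people have built a PoC with it and validated the requirements, they will still deploy somewhere else.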

图拉鼎@tualatrix·6d

Got early access to Claude Design... Does it really intend to build every productivity tool related to software production?

Lei Zhang (Harry)@resouer·5d

@yadong_xie @tualatrix If you dig carefully through the cc code, you can also see a feature for launching a hosted app with one click.

Yadong Xie@yadong_xie·6d

@tualatrix Product manager, designer, frontend and backend: all bundled up and taken away together.

Lei Zhang (Harry)@resouer·12 Apr

@garrytan @mstockton @j_schottenstein @hwchase17 Nope, your harness is claude code, which is already fat enough.

Garry Tan@garrytan·12 Apr

@mstockton @j_schottenstein @hwchase17 It’s wrong. Thin harness, fat skills: x.com/garrytan/statu…

Garry Tan@garrytan

x.com/i/article/2042…

Matt Stockton@mstockton·11 Apr

This post is so so good. We are at this interesting point where we’ve started to really figure out ‘memory’ with these LLM-based systems. And @hwchase17 is totally right - memory is basically just context collected and injected at the right times - it’s probably the most important context, and it *must* be interpretable to you, and portable for you. There is a real danger in adopting systems that won’t allow this for you, or actively prevent you from doing it. I hadn’t put it all together before reading this post, but this is definitely one of the reasons I love building on top of DeepAgents

Harrison Chase@hwchase17

x.com/i/article/2042…

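Lei Zhang (Harry)@resouer·12 Apr

@CoooolXyh Compared with the mainstream cloud vendors, CF's CDN service is priced so low it is practically free; the real profit comes from providing security scanning and hardening to enterprise customers alongside the hosting. Many small infra companies in North America run this business model: give the main product away to drive volume, then make money on security / privacy / compliance, the hard currencies that let you sell anxiety. Now the market worries that customers will eat this value themselves with AI tools.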

Yuhang@CoooolXyh·11 Apr

I don't quite get it: why would mythos affect cf's stock price this much 🤔

The Kobeissi Letter@KobeissiLetter

BREAKING: Cloudflare stock, $NET, extends losses to over -13% on the day after Anthropic’s launch of Claude Mythos, an AI model that finds and exploits software vulnerabilities. The stock is now down -22% in 4 days.

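Lei Zhang (Harry)@resouer·12 Apr

@9hills Yes, that's right. When we hosted agents at scale in the cloud before, we never needed an unorthodox hack like agent in sandbox. Each agent is one hypervisor container instance; the agent can do whatever it wants inside, with the highest privileges; sessions are all stored in a cloud database; and when third-party code needs to run, we simply pull a remote code interpreter container from a pool to execute it.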

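九原客@9hills·9 Apr

Anthropic's Managed Agents may be underrated; it mainly solves the problem of engineering and hosting autonomous agents at scale. To take one small point that LangChain has also made: when an agent needs to run Bash, should it be Agent in Sandbox, or Sandbox as Tool? Claude Code is the former, while real Agent Infra must be the latter.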
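
To make the "Sandbox as Tool" idea in this thread concrete, here is a minimal sketch, not anyone's actual implementation. The names `InterpreterPool` and `Agent` are hypothetical, a local child process stands in for a remote code interpreter container, and a plain dict stands in for the cloud session database.

```python
import queue
import subprocess
import sys


class InterpreterPool:
    """Hypothetical pool of pre-warmed interpreters (stand-ins for remote containers)."""

    def __init__(self, size: int) -> None:
        self._idle: "queue.Queue[str]" = queue.Queue()
        for i in range(size):
            self._idle.put(f"interpreter-{i}")

    def run(self, code: str, timeout: float = 10.0) -> dict:
        worker = self._idle.get()  # block until an interpreter is free
        try:
            # Assumption: a fresh child process plays the role of the
            # isolated remote container that executes third-party code.
            proc = subprocess.run(
                [sys.executable, "-c", code],
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            return {
                "worker": worker,
                "stdout": proc.stdout,
                "exit_code": proc.returncode,
            }
        finally:
            self._idle.put(worker)  # hand the interpreter back to the pool


class Agent:
    """The agent loop lives outside the sandbox; only untrusted code goes in."""

    def __init__(self, pool: InterpreterPool, session_store: dict) -> None:
        self.pool = pool
        self.sessions = session_store  # stand-in for a cloud database

    def execute_tool_call(self, session_id: str, code: str) -> dict:
        result = self.pool.run(code)
        # Session history is kept outside both the agent and the sandbox.
        self.sessions.setdefault(session_id, []).append(result)
        return result
```

In this arrangement the sandbox is just another tool the agent calls, so agent state survives even if an interpreter is recycled. "Agent in Sandbox" would instead place the whole `Agent` object inside the isolated environment.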

Lei Zhang (Harry)@resouer·23 Mar

Training a model that's good at both math and code sounds like a data mixing problem. It's not. RL gradient updates are far more aggressive than SFT, so optimizing for code actively degrades math. The domains fight each other. Kudos to Nemotron-Cascade 2 for making a different bet!

Wei Ping@_weiping

🚀 Introducing Nemotron-Cascade 2 🚀
Just 3 months after Nemotron-Cascade 1, we’re releasing Nemotron-Cascade 2: an open 30B MoE with 3B active parameters, delivering best-in-class reasoning and strong agentic capabilities.
🥇 Gold Medal-level performance on IMO 2025, IOI 2025, and ICPC World Finals 2025:
• Capabilities once thought achievable only by frontier proprietary models (e.g. Gemini Deep Think) or frontier-scale open models (i.e. DeepSeek-V3.2-Speciale-671B-A37B).
• Remarkably high intelligence density with 20× fewer parameters.
🏆 Best-in-class across math, code reasoning, alignment, and instruction following:
• Outperforms the latest Qwen3.5-35B-A3B (2026-02-24) and even larger Qwen3.5-122B-A10B (2026-03-11).
🧠 Powered by Cascade RL + multi-domain on-policy distillation:
• Significantly expand Cascade RL across a much broader range of reasoning and agentic domains than Nemotron-Cascade 1, while distilling from the strongest intermediate teacher models throughout training to recover regressions and sustain gains.
🤗 Model + SFT + RL data: 👉 huggingface.co/collections/nv…
📄 Technical report: 👉 research.nvidia.com/labs/nemotron/…

Lei Zhang (Harry)@resouer·11 Mar

Just read SWE-CI (arxiv.org/abs/2603.03823): finally a benchmark measuring AI coding maintainability over time. The score weights later iterations more heavily, so technical debt shows up. Though: what if weak requirements generated by the Architect tank the Programmer's score? Is there an ablation study?
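
As a rough illustration of why weighting later iterations more heavily surfaces technical debt, here is a sketch that assumes linearly increasing weights. The function name `weighted_score` and the linear scheme are assumptions for illustration only; the post does not state the benchmark's actual formula.

```python
def weighted_score(iteration_scores: list[float]) -> float:
    """Average per-iteration scores, weighting iteration i by i (1-based).

    Assumption: linear weights; the benchmark's real weighting may differ.
    """
    if not iteration_scores:
        raise ValueError("need at least one iteration score")
    weights = range(1, len(iteration_scores) + 1)
    total = sum(w * s for w, s in zip(weights, iteration_scores))
    return total / sum(weights)


# Two runs with the same plain mean (0.5) but opposite trajectories.
degrading_run = [1.0, 1.0, 0.0, 0.0]  # strong start, then debt piles up
improving_run = [0.0, 0.0, 1.0, 1.0]  # weak start, then recovers
```

Under this weighting the degrading run scores lower than the improving one even though their unweighted means are identical, which is the effect the post describes.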

Lei Zhang (Harry)@resouer·12 Jan

@CoooolXyh If I have multiple IDEs open, is there a way to use voice to switch input between them?

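Yuhang@CoooolXyh·11 Jan

Typeless is seriously awesome. I see everyone talking about it every day, yet I only downloaded it today. I used to think it was just an ordinary voice input method; how great could it be? After trying it today, I understand why everyone praises it. This whole passage was also typed for me by Typeless.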

Lei Zhang (Harry)@resouer·24 Jan

@ibuildthecloud So you mean it can also write yamls for me for real? 😉

Darren Shepherd@ibuildthecloud·24 Jan

So this is what I've been working on for a while. Otto8 is just an example of Obot Platform which has all the real code for this. Obot is the thing that gets sold to enterprise, but it's also open source too. Otto8 is just a bunch of configurations on top of Obot. Oh, and the user UI I built. Yep, I'm full stack now.

Darren Shepherd@ibuildthecloud

Announcing Otto8: ChatGPT for DevOps (link in thread) It's like ChatGPT, but has automated tasks, easy integration with devops CLI tools, integrated shell, and many other features. And it's fully open source.

Lei Zhang (Harry)@resouer·10 Jan

@the_sttts @tsaha Hence I believe that in practice we simply yield control of those knobs to the platform. Git or any config source should not be a replacement for your automated system.

Stefan Schimanski@the_sttts·9 Jan

@tsaha A promise of server-side-apply that in practice unfortunately is not fulfilled because there is only fail-on-conflict or force-overwrite, not ignore-that-one-conflict because I actually don't care enough.

Tamal Saha@tsaha·9 Jan

How does GitOps work with "intelligent" #Kubernetes operators? By "intelligent" I mean operators that can modify a resource automatically to perform operations like scaling, auth / tls rotation etc.
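
The tension in this thread can be sketched with a toy model of per-field ownership offering only the two modes mentioned above: fail on conflict, or force overwrite. This is a simplification for illustration, not the Kubernetes API: `ToyResource` and `ApplyConflict` are hypothetical names, and real server-side apply tracks ownership in `managedFields` with richer rules (shared ownership, removal of dropped fields) that are not modeled here.

```python
class ApplyConflict(Exception):
    """Raised when an apply changes a field owned by another manager."""


class ToyResource:
    """Toy object that records which manager last set each field."""

    def __init__(self) -> None:
        self.fields: dict[str, object] = {}
        self.owners: dict[str, str] = {}

    def apply(self, manager: str, desired: dict[str, object], force: bool = False) -> None:
        # A conflict is a field owned by someone else whose value would change.
        conflicts = [
            key
            for key, value in desired.items()
            if key in self.owners
            and self.owners[key] != manager
            and self.fields[key] != value
        ]
        if conflicts and not force:
            # Mode 1: fail on conflict, nothing is written.
            raise ApplyConflict(f"{manager} conflicts on {conflicts}")
        # Mode 2 (force), or no conflict: write values and take ownership.
        for key, value in desired.items():
            self.fields[key] = value
            self.owners[key] = manager
```

If a GitOps tool applies `replicas: 3` and an autoscaling operator later takes over `replicas`, the next GitOps apply either fails or, with `force=True`, stomps on the operator's value. One way out is for the GitOps manifest to simply omit the fields the platform is meant to control.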

Lei Zhang (Harry)@resouer·23 Nov

@dims Just watched, this is a great learning! So it seems they used EKS and Karpenter directly, right? Not with AWS HyperPod.

Davanum Srinivas@dims·22 Nov

youtu.be/c9NJ6GSeNDM?t=…

Davanum Srinivas@dims·22 Nov

Since #Anthropic is a hot topic today, see how they use #EKS for a whole bunch of their stuff :) youtu.be/c9NJ6GSeNDM?t=… @AWSOpen @awscloud

Lei Zhang (Harry)@resouer·9 Aug

@cgillum I feel you. I like this one: github.com/hwchase17/lang…

Chris Gillum@cgillum·6 Aug

I'm playing with AutoGen and think it's really cool, but it also feels like a low-code tool to me. As a professional developer, I feel like I'd rather write loops and termination conditions myself in code rather than using declarative constructs, like termination message filters or human-in-the-loop config.

Lei Zhang (Harry)@resouer·30 May

@clare_liguori This is awesome Clare! Is the state machine generated by Step Functions or manually created?

Clare Liguori@clare_liguori·30 May

I just published a complete guide to serverless Gen AI prompt chaining with Bedrock and Step Functions 🚀 Write prompt templates, chains, loops, if-thens, model response validations, and call other AWS services, all with the Amazon States Language github.com/aws-samples/am…

Lei Zhang (Harry)@resouer·2 May

@sunbains @TreybigDavis @reneeshah123 @cra I'd argue that non-hype-cloud db vendors, if possible, would love to pivot in a similar direction too. If this is valid, I expect the cloud will adapt to the industry's needs eventually; it's just a matter of time.

Sunny Bains @TiDB@sunbains·2 May

Not sure the industry is heading in this direction. Unless the cloud providers start providing direct access to RDMA. Alibaba (PolarDB/PolarFS) can use RDMA because they own the entire stack and do as they please. AFAIK, it’s not that cheap either but it does wonders for latency.

Renee Shah@reneeshah123·1 May

This paper from 2022 was compelling. Right now, databases are separating storage and compute. In the future, databases may separate memory and compute: arxiv.org/pdf/2207.03027

Lei Zhang (Harry)@resouer·2 May

@TreybigDavis @reneeshah123 Yes, it's an accurate definition of "cloud-native database" and IMO where the industry is heading. Meaning, I am not sure about some recent debates on StarRocks: github.com/cncf/sandbox/i… (#issuecomment-2087643176) /cc @cra Instead, this trend will grow K8s adoption as the de facto deployment platform for db.

Davis Treybig@TreybigDavis·2 May

Alibaba has published a lot of cool papers in this space too like users.cs.utah.edu/~lifeifei/pape… and vldb.org/pvldb/vol16/p4…. Their PolarDB product disaggregates memory. I have asked some people why this is not more commonplace in cloud data systems and what I have heard is that it is primarily enabled by RDMA blogs.nvidia.com/blog/what-is-r…, but the problem is that RDMA is very low level and therefore only something the cloud vendors themselves can take advantage of since they run the datacenter too. Third party DB vendors can't really make use of it though. I think otherwise the latency overhead of near vs. far memory is too high - e.g. see slide 12 from this talk: mvdirona.com/jrh/talksandpa…

Lei Zhang (Harry)@resouer·1 Sep

@kelseyhightower Is this part just a logo, or does it have radar or something inside?

Lei Zhang (Harry)@resouer·10 Aug

@zongheng_yang @kelseyhightower SkyPilot is very promising, as it's the first time we position the "driver for clouds" at the center. This may create an adoption barrier, though, as replacing the "driver" for an existing system could be painful. Not sure what such a brownfield scenario looks like, though?

Zongheng Yang@zongheng_yang·9 Aug

@kelseyhightower "Sky Computing" is heading in this direction: users specify an app, and an (inter)cloud broker system just figures out the best location in the public clouds to run it. See our attempt: github.com/skypilot-org/s…

Lei Zhang (Harry)@resouer·27 Jun

@jeffhollan @SnowflakeDB I like this orchestrator + compute approach and the support for diverse workload types. Great move!

Jeff Hollan@jeffhollan·27 Jun

Today @SnowflakeDB announced Snowpark Container Services. Run cloud native multi container apps easily right next to your most valuable data. Super excited about the newest member of the Snowpark family ❄️🎉 snowflake.com/blog/snowpark-…

Lei Zhang (Harry)@resouer·9 May

@davidfowl It's great! I am learning 😉

David Fowler@davidfowl·8 May

This repository and guidance came out of frustration answering questions related to .NET and async github.com/davidfowl/AspN… If you haven’t seen it before, I hope you learn something new. #dotnet
