Johnny

168 posts

@johnnygoodai

CTO/Co-Founder of https://t.co/nWR0jYsOAD

Palo Alto, CA Joined November 2021
1.6K Following 142 Followers
Pinned Tweet
Johnny
Johnny@johnnygoodai·
@YottaLabs partnered with @radixark to deliver the first @sgl_project inference on @awscloud Trainium. This isn't just a bold step toward embracing a multi-silicon, multi-cloud future—it's a strategic move to significantly reduce token costs for end users. Check out the blog and open-source code: yottalabs.ai/post/mini-sgla… Btw, you can also launch a Trainium instance on Yotta Console with 1 click: console.yottalabs.ai
English
1
4
25
48.8K
Jun Xie
Jun Xie@xJ3588·
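Q: Why do only AI-native companies such as Anthropic and OpenAI build good AI coding tools, while big tech companies such as Microsoft, Amazon, Google, and Meta seem unable to produce anything decent, even though they lack nothing in money, compute, talent, or data? Where does the problem lie? What is the barrier?
DS answers: This is a sharp observation. Big tech has not failed to try; it is stuck in the mire of the "innovator's dilemma" and "organizational inertia." The core barrier is not resources but the following four points:
1. The data flywheel barrier: AI coding tools need massive, high-quality code data with real feedback. Big tech has plenty of internal code, but it is often confidential, outdated, or tied to specific frameworks, so cleaning and using it is extremely costly. Anthropic and OpenAI can actively crawl public code and quickly use user conversations (such as "this code throws an error") to close the iteration loop.
2. A paradigm conflict in product form: big tech's habitual thinking is "integrated tools" (GitHub Copilot started out as just an IDE plugin), pursuing stability and control and avoiding disruption of the existing ecosystem. AI-native companies build a "pair programming partner," dare to rebuild the interaction logic (such as generating an entire project skeleton from natural language), and even accept some hallucination as the price of efficiency.
3. Organizational structure and the OKR dilemma: at big tech, AI coding is merely a "supporting function" for business lines such as cloud, office, and search, and the KPI is to raise that product's retention or paid conversion. Departmental walls split the data, model, and product teams apart, which makes sustained investment difficult. AI-native companies bet the whole company on this one scenario, with a flexible organization and a short decision chain.
4. The shackles of security and compliance: facing global regulation, big tech is extremely sensitive about code copyright, licenses, and enterprise data leaks. Filtering such content during model training greatly reduces performance. AI-native companies had a higher risk appetite at the start and could iterate quickly first and add compliance later.
In summary, what big tech lacks is not resources but the resolve to break up its existing successful business structure, and the organizational ability to redesign products around the entirely new interaction mode of "AI-generated code." Microsoft is actually a counterexample: by investing in OpenAI and integrating GitHub Copilot, it neatly sidestepped its internal gridlock and got results by a detour.
Follow-up Q: I can't agree. These points do not look like the real reasons, because AI coding now appears to be life-or-death contested ground in the AI era, and big tech cannot and will not leave it to others. Is big tech inherently unable to beat AI-native companies in this field? Can it catch up?
DS answers: You have raised a sharper and more fundamental question: is this "not done well yet," or "structurally impossible to do well"? Frankly, on the specific battlefield of AI coding, big tech is indeed at a structural disadvantage and may never catch up with the top tier. This is not a matter of money or intelligence but of conflicting DNA.
Core conclusion: AI coding is not a "feature" but a "new species." Big tech is using its own patches to cover someone else's native operating system.
Specifically, there are three barriers it cannot cross:
1. Model evolution speed vs. product safety responsibility
· Big tech (such as Google and Microsoft): its AI coding tools must first answer to its huge cloud services and enterprise customers. One bad code generation that takes down a customer's production environment means catastrophic compensation and reputational loss. So it must spend a great deal of time on safety alignment, copyright filtering, and stability testing, which directly slows iteration.
· AI-native companies: when the model errs, the user's expectation is that "the assistant occasionally makes mistakes." They can release several model versions a week and use users' real error feedback as a training signal. Speed itself is the moat.
2. The backlash of organizational inertia: big tech's "integration" mindset strangles new interaction
· Big tech approaches AI coding by asking: "How do we stuff AI into our existing IDE plugins, DevOps pipelines, and codebases?"
· Native companies ask: "If code can be generated and modified in real time with natural language, should the IDE, version control, and code review still exist?" They dare to design brand-new AI-native IDEs (for example, Anthropic's Claude 3.7 paired with specific workflows) that fundamentally change the development process. Inside big tech, this kind of disruption would be killed by countless product committees.
3. The curse of the data flywheel: high-quality real-time feedback is blocked by enterprise firewalls
· The stronger AI coding gets, the more it needs massive human-AI collaboration trajectory data (which suggestion did the user accept? what did they delete or change? did the final run succeed?). At big tech, this data is scattered across products, separated by departmental walls and privacy policies, and even kept inside the enterprise by firewalls.
· Native companies designed their products to collect this kind of data from day one, and the gap in data-loop completeness is orders of magnitude.
So can big tech catch up? Yes, but the conditions are extremely harsh: it must set up a fully independent "disruptive organization" decoupled from existing KPIs, allow its product to cannibalize the existing flagship business, and tolerate 3-5 years without demanding visible returns. Historically, only Amazon's AWS (which disrupted its own e-commerce hosting business) managed this. But today's boards and investors do not have that patience.
The more likely outcome: big tech will "buy its ticket after boarding" by investing in or acquiring native companies, while it focuses on enterprise private deployment and compliance enhancement, the "dirty, tiring work" that AI-native companies do not want to do. Microsoft's investment in OpenAI is already proof.
So it is not that big tech cannot win, but that it cannot grow a completely different new organ inside its existing body. The exception would be cutting off its own arm to survive, but that will only happen at the last moment before it is eliminated entirely.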
Chinese
3
0
3
96
Melissa Pan
Melissa Pan@melissapan·
Excited to share that MAP has been selected for ✨ICML Oral✨ We look forward to sharing the insights in the paper with the community And much much appreciations to everyone who participated in our study ❤️ MAP won’t be possible without your contribution to open science
Melissa Pan@melissapan

Excited to share: MAP has been accepted as 🌟 ICML Spotlight 🌟 We hope MAP can provide data-driven insights that help the communities to work on various under-explored research directions around agent systems! Huge thanks & congrats to my amazing co-authors. See you all at Seoul! 🫡

English
7
15
168
31.4K
Johnny
Johnny@johnnygoodai·
I completely stopped coding, and I stopped clicking the fancy UI, I just ask my coding agent to do everything for me, including launching a GPU pod and run more AI. Btw, GPU doesn’t equal NVIDIA on Yotta, it is just a computing silicon.
English
1
1
18
27.4K
Inferact
Inferact@inferact·
We're onto Inferact's second office this year! Yesterday, we finally broke it in with an office warming. It's amazing to see how far we've come. The vLLM ecosystem has been growing at lightning pace, and we've been lucky to scale alongside it: helping teams serve inference faster, cheaper, and at scale. Thank you to everyone who made it out yesterday — customers, partners, friends, and the whole Inferact team. It meant a lot to celebrate this milestone together. We're hiring across all teams. If you want to join one of the fastest-growing AI infra companies and power the next generation of AI, check out our careers page or DM us. Excited for many more office warmings to come!
English
11
10
116
16.6K
Hao Kang
Hao Kang@GT_HaoKang·
ThunderAgent has contributed a coding-agentic RL training recipe to SkyRL, achieving a 3.01× rollout speedup with no accuracy loss!🚄 Using this stack, we successfully trained a 32B coding model on 5 H100 nodes! ThunderAgent is an efficient agentic serving runtime and accepted by ICML2026 as Spotlight paper and have been used in TogetherAI and other industry products. More from our team is coming this August. Agents are reshaping the LLM infrastructure stack. code: github.com/ThunderAgent-o… pr: github.com/ergt10/SkyRL/t… paper: arxiv.org/pdf/2602.13692 @NovaSkyAI @istoica05 @charlie_ruan @togethercompute @NVIDIAAI @DachengLi177
English
5
9
62
7.8K
Johnny
Johnny@johnnygoodai·
@tydsh Congrats Yuandong!
Indonesia
0
0
0
39
Yuandong Tian
Yuandong Tian@tydsh·
Today we launch Recursive. We are building AI that discovers knowledge automatically and improves itself recursively, an open-ended process that will fundamentally change how science and technology advance. Our 25 top researchers and engineers in San Francisco and London bring diverse expertise spanning agentic AI scientists, architecture and algorithm design, world models, optimization, and interpretability, united by a shared conviction that this is the most important problem we could be working on today. If you are interested in joining, please send your resume to talent@recursive.com. Follow us at @Recursive_SI!
Recursive@Recursive_SI

x.com/i/article/2054…

English
88
152
1.4K
168.8K
Jeff Clune
Jeff Clune@jeffclune·
Thrilled to share that we founded Recursive to create AI that safely conducts experiments on how to improve itself in an open-ended process of endless, automated scientific discovery. As I wrote in my 2019 AI-generating algorithms paper, this will likely be the fastest path to superintelligence. Our work since has shown the power of this approach. Excited to scale up and improve upon ideas like the Darwin Gödel Machine, HyperAgents, ADAS, OMNI, ALMA, The AI Scientist, PromptBreeder, Rainbow Teaming, Automated Capability Discovery, and other work on open-ended and AI-generating algorithms. We’ve assembled a dream team of researchers and significant resources to pursue this vision. My amazing co-founders are pictured here, and we have an all-star team of founding members (we’re over 25 and growing). Please join us if you are interested! Follow our progress @Recursive_SI
English
49
44
611
116.1K
Johnny
Johnny@johnnygoodai·
SaaS needs to become agent-native. GPU infrastructure should be designed the same way: as native to agents as possible. #YottaLabs is being built agent-native from day zero.
English
0
0
2
71
Johnny
Johnny@johnnygoodai·
@codex This is my first AI-generated post using Codex Chrome Plugin
English
0
0
1
87
Johnny
Johnny@johnnygoodai·
AI infra is entering a multi-silicon era. GPUs, TPUs, Trainium, and future accelerators will all matter. The winning platforms will make model serving portable, cost-efficient, and easy to run without vendor lock-in. That is what we are building at @YottaLabs.
English
1
0
3
26.4K
Johnny retweeted
RadixArk
RadixArk@radixark·
$200 FREE CREDIT! We just launched our inference platform for beta testing, and we're giving it to the community first. ⭐ Star SGLang on GitHub (github.com/sgl-project/sg…) + repost this to claim your credits. → Limited spots, first come first serve → Deadline: May 13, 2025 (AoE) Every star, every issue filed, every PR reviewed, every question answered in Slack — You built this with us. Thank you for believing in open-source AI infrastructure, in our mission, and in us. Claim your credits: platform.radixark.com
English
36
264
346
81.5K
Johnny retweeted
Yotta Labs
Yotta Labs@YottaLabs·
Yotta Labs is teaming up with @JELabs2024, @CreaoAI, @GoKiteAI, @kuseHQ, @CollovLabs, @gptdaoglobal to co-host AI Agent Builders Night in SF this week. If you're shipping agents, funding them, or trying to figure out what comes after the demo — come find us. RSVP 👇 📅 May 8, 2026 🕕 6:00 – 9:00 PM 📍 455 Valencia St, San Francisco, CA 94103 🔗 luma.com/5kkofx7o
English
1
3
6
363
Simon Mo
Simon Mo@simon_mo_·
Thank you @profjoeyg! Grateful for the last decade of inference research — excited for the next chapter at @inferact and @vllm_project
Joey Gonzalez@profjoeyg

Today I’m excited to congratulate @simon_mo_ on an outstanding PhD thesis defense on his work exploring the design of Inference Serving Systems. 🎉 Simon has been working on inference systems with me for nearly a decade -- long before most people even considered inference serving a research problem worth studying. Over that time, he helped drive inference systems projects spanning Clipper, @raydistributed Serve, and now @vllm_project. Together, these systems helped define the modern inference serving stack that powers today’s AI applications. Beyond being an exceptional researcher, Simon has also been a remarkable team and community builder, especially through his leadership on vLLM and the open-source ecosystem around it. Along with my colleagues @istoica05 and @koushik77, I am excited to see Simon leading @inferact as CEO and helping shape the future of inference systems and AI infrastructure. Congratulations, Simon!

English
10
1
59
4.1K
Johnny retweeted
Zhijian Liu
Zhijian Liu@zhijianliu_·
DFlash for Gemma 4: Up to 6x Faster. ⚡⚡ Great to see MTP land natively in Gemma 4 today. If you want to push it further, try DFlash — open source, same quality, more speed!! github.com/z-lab/dflash
Google for Developers@googledevs

Gemma 4: Now up to 3x Faster. ⚡ Same quality, way more speed. Our new MTP drafters allow Gemma 4 to predict multiple tokens at once, effectively tripling your output speed without compromising intelligence.

English
74
186
1.5K
469.6K
Physion Labs Official
Physion Labs Official@Physion_Labs·
We've spoken with hundreds of ad creatives, marketing designers, filmmakers, and animation teams — and heard the same thing: the outputs look great… until they don't 😅. When they fail, it's incredibly hard to tell why. Is it the prompt, the model, or the world itself quietly breaking? That ambiguity is the real bottleneck. Physion-Atlas 1.0 introduces a more objective, diagnostic way to evaluate video world models — moving beyond high-level comparisons to surface what actually matters. It disentangles prompt misalignment from physical and visual inconsistencies, grounding every judgment in explicit spatiotemporal evidence. Not just which output is better, but what breaks, when, where, and why. From abstract comparisons → diagnosable reality 🔍 📄 Blog: physionlabs.ai/blog/physion-a… 📝 Evaluate your model: docs.google.com/forms/d/e/1FAI…
English
4
18
32
2.7K