Caleb Eom

327 posts

@calebfoundry

CalebWritesCode YT // Google Developer Expert

USA · Joined March 2025

149 Following · 1.3K Followers

Caleb Eom@calebfoundry·1d

huge fan of @bycloudai ever since I started my AI YT channel. i feel so honoured to finally meet him in person.. thanks for inspiring me!

Caleb Eom@calebfoundry·3d

Nemotron 3 Full Breakdown

With the help of Joey Conway from @NVIDIAAI, getting into the specifics of why Nemotron 3 is kind of a big deal.

The biggest headlines with Nemotron 3 are the Hybrid Mamba Transformer, Latent MoE, and MTP.

The Hybrid Mamba Transformer attacks the attention mechanism directly to make the overhead sub-quadratic, but rather than quantizing the KV cache or swapping out attention heads, NVIDIA chose Mamba-2.

Latent MoE further optimizes sparsity by down-projecting the dimensions, so you're doing less math and moving less memory between HBM and SRAM. You save a ton, and NVIDIA made a conscious choice to spend that surplus on more experts.

Finally, MTP, or multi-token prediction: the model is trained to predict several future tokens, which makes training more expressive and gives the option of using it for speculative decoding during inference.

Oh, and the model adopts the new OpenMDW 1.1 License.
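
A back-of-the-envelope sketch of the Latent MoE point in the post above: running experts at a smaller latent width frees up compute that can be spent on more experts. All sizes here (hidden 4096, latent 1024, expert FFN width 2048, 4 active experts) are made up for illustration and are not Nemotron 3's actual configuration. The cost model is also an assumption: it counts only the two matmuls per expert plus one shared down-projection and one shared up-projection.

```python
def expert_flops(width: int, ffn_dim: int, active_experts: int) -> int:
    """FLOPs per token for the routed experts.

    Each expert is modelled as two matmuls (width -> ffn_dim -> width),
    and each multiply-add counts as 2 FLOPs.
    """
    per_expert = 2 * (width * ffn_dim) + 2 * (ffn_dim * width)
    return active_experts * per_expert


def latent_moe_flops(hidden: int, latent: int, ffn_dim: int, active_experts: int) -> int:
    """Same experts, but run at a smaller latent width.

    Adds one down-projection (hidden -> latent) and one up-projection
    (latent -> hidden), shared across all experts.
    """
    projections = 2 * (hidden * latent) + 2 * (latent * hidden)
    return projections + expert_flops(latent, ffn_dim, active_experts)


# Hypothetical sizes, chosen only to make the arithmetic easy to follow.
HIDDEN, LATENT, FFN, ACTIVE = 4096, 1024, 2048, 4

baseline = expert_flops(HIDDEN, FFN, ACTIVE)
latent_total = latent_moe_flops(HIDDEN, LATENT, FFN, ACTIVE)

# How many latent-width experts fit inside the baseline FLOP budget?
projection_cost = latent_moe_flops(HIDDEN, LATENT, FFN, 0)
experts_in_budget = (baseline - projection_cost) // expert_flops(LATENT, FFN, 1)

print(f"baseline MoE : {baseline:,} FLOPs/token")
print(f"latent MoE   : {latent_total:,} FLOPs/token")
print(f"latent experts affordable at baseline cost: {experts_in_budget}")
```

Under these toy numbers the latent variant costs well under half the baseline per token, and the same budget would cover 14 active experts instead of 4. The real trade-off depends on the actual dimensions and on memory traffic, which this sketch does not model.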

Caleb Eom@calebfoundry·5d

MiniMax M3 ditched full attention and adopted sparse attention.

This is yet another sign of the trend: more labs are focusing on token efficiency and inference throughput, which the M3 architecture demonstrates cleverly in how it processes KV.

I'm personally impressed by the I/O between HBM and SRAM and how tokens are read in contiguous tiles, so no operations are wasted. Great work @MiniMax_AI

Caleb Eom@calebfoundry·6 Jun

@Midnight_Captl you can buy 1 share of NVDA

Midnight Capital@Midnight_Captl·6 Jun

Biggest payout yet 🥳

Caleb Eom@calebfoundry·5 Jun

in case the first message wasn't clear

Caleb Eom@calebfoundry·4 Jun

Pi: the framework that built OpenClaw

So many coding agents these days look and feel the same. Pi goes against the current by shedding weight rather than gaining it.

And harnesses keep changing with the ebb and flow, which means being minimalist adds durability.

Caleb Eom@calebfoundry·1 Jun

Thank you for hosting me. And huge thanks to greg and tilde for the interview 🐐

Google Cloud Tech@GoogleCloudTech

AI is a five-layer cake with an application, model, infrastructure, chip, and energy layer. While most focus on agents, the biggest bottleneck down the line might actually be the energy layer. Hear how @calebfoundry breaks down the full AI stack → goo.gle/3PWdfVn

Caleb Eom@calebfoundry·28 May

Typical day: Gemini, ChatGPT, Claude for research

Caleb Eom@calebfoundry·27 May

California weather is not friendly to my hair but here's a quick interview about me and my journey as a content creator covering AI. Shout out to Greg and Tilde for the interview! youtu.be/TjPr_-X0Mko?si…

Caleb Eom@calebfoundry·27 May

@Compute_King Hold my beer, let me just....

Compute King@Compute_King·25 May

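Thinking further: Huawei did not discuss heat dissipation among the challenges, which surprised me. With today's two-layer stacks I think there are still workable cooling solutions, but at three or more layers of Active Logic Stack, heat dissipation turns from an engineering problem into the primary architectural problem. The two-layer stacking technologies popular today, AMD V-Cache, Intel Foveros and TSMC SoIC, still stack a cold cache on hot logic: SRAM draws little power and has low heat density, so it can serve as the top die and cooling remains acceptable. The structure looks like this:

SRAM
||||
CPU

But Huawei's paper describes Logic-on-Logic, that is:

Active Logic
||||
Active Logic
||||
Active Logic

That is completely different. With multiple layers of active logic, heat cannot spread laterally, so the middle logic die effectively becomes an oven, and conventional cooling cannot cope at all. Once you stack three or more layers of active logic, you have to enter the era of active cooling! Coolant has to enter the package itself. It becomes:

Active Logic + Microfluidic Channel
||||
Active Logic + Microfluidic Channel
||||
Active Logic + Microfluidic Channel

Liquid cooling, liquid cooling, liquid cooling is the key! Important enough to say three times. Future chip design will need a Thermal Topology Architect, because the thermal path itself will determine the layout. Yes, my judgment is: on Huawei's future LogicFolding path at three layers and beyond, thermal will be the biggest unsolved problem, even harder than EDA!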

Compute King@Compute_King

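More thoughts from the paper: AI compute clusters consume enormous amounts of electricity, and 80% of that power and 70% of the cost go not to computation but to "Data Move" and data "Load/Save". To compress these overheads at the macro scale, Huawei's paper mentions three things:

1. Unified Bus: we have discussed this at length before. UB abandons the traditional complex protocol stacks (PCIe, NVLink, Ethernet, etc.) in favor of direct low-level interconnect with memory semantics. This cuts end-to-end remote access latency from tens of microseconds to about 100 ns (a reduction of orders of magnitude), achieving "the system is the chip" at the scale of multiple racks or even a whole machine room.

2. Hi-ONE (near-package optical engine): a single optical I/O module provides 8 Tb/s of bandwidth, cuts the reach required of traditional electrical SerDes from 100 cm to about 5 cm, and extends rack-to-rack interconnect distance to 100 m, safeguarding high-density compute at the physical layer.

3. 3D Folding: in traditional 2.5D packaging, compute grows with die size but is also limited by die size. Remember CoWoS-S, and the CoWoS-L used for GB300? Huawei's 3D Folding forcibly moves power delivery (backside power delivery network), high-speed memory and optical I/O from the chip's "edge" to its vertical "surface". This is where it gets interesting: all of them gain the ability to scale in 3D, letting bandwidth and compute scale in lockstep.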

Caleb Eom@calebfoundry·25 May

@kimmonismus amazing!! lol

Chubby♨️@kimmonismus·24 May

This is hilarious. This is what AI was made for. I love it. 100% accurate.

Caleb Eom@calebfoundry·24 May

@krishdotdev I think the aggregate demand for inference is bigger than what DeepSeek is offering at such prices. Certainly a huge feat for DeepSeek but I don't buy the narrative that it's an end all for the US inference market.

Kr$na@krishdotdev·24 May

DeepSeek just popped the American AI bubble.

DeepSeek V4 Pro:
Input: $0.435 per 1M tokens
Output: $0.87 per 1M tokens

OpenAI GPT-5.5:
Input: $5.00
Output: $30.00

Claude Opus 4.7:
Input: $5.00
Output: $25.00

Claude Sonnet 4.6:
Input: $3.00
Output: $15.00
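
To put the list prices quoted in the post above on a common footing, here is a minimal sketch that prices one workload across all four models. The prices are copied from the post and not independently verified; the workload (1B input tokens and 200M output tokens) and the helper `workload_cost` are hypothetical, chosen only for illustration.

```python
# USD per 1M tokens as (input, output), copied from the post above.
PRICES = {
    "DeepSeek V4 Pro": (0.435, 0.87),
    "OpenAI GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.7": (5.00, 25.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a workload at the quoted per-million-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 1B input tokens, 200M output tokens.
INPUT_TOKENS = 1_000_000_000
OUTPUT_TOKENS = 200_000_000

costs = {m: workload_cost(m, INPUT_TOKENS, OUTPUT_TOKENS) for m in PRICES}
cheapest = costs["DeepSeek V4 Pro"]

for model, cost in costs.items():
    print(f"{model:<18} ${cost:>10,.2f}  ({cost / cheapest:.1f}x DeepSeek)")
```

The multiple depends heavily on the input/output mix, since the output-price gap in the post is much wider than the input-price gap.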

Caleb Eom@calebfoundry·24 May

@deepseek_ai Who would ACTUALLY change providers from US providers to DeepSeek because of this?

DeepSeek@deepseek_ai·22 May

We are making our discount permanent! 🎉 Enjoy building with DeepSeek-V4-Pro and bring your innovative ideas to life! 🚀

DeepSeek@deepseek_ai

The DeepSeek-V4-Pro discount has been extended until May 31, 2026, 15:59 UTC!

Caleb Eom@calebfoundry·24 May

interesting question.. i think memory will eventually be served like an app. think of g-suite like calendar, gmail, keep, etc. but now memory. so similar to my inbox containing thousands of emails, i think memory will be locked based on my cloud account (with the ability to export) and sync via other ecosystems. there will be companies that solve it differently but i think it'll all converge into one standard at some point.

iiviie@iiviieee·24 May

Caleb, I'm really interested in your thoughts on context engineering, or just memory in short. I know a lot of major labs are trying to tackle this problem, and everyone has their own novel architecture for it. What do you think about having a standard for memory, kind of like how Anthropic made MCP a standard for agentic tool calling?

Caleb Eom@calebfoundry·22 May

Brief history of harness engineering

In case the buzzwords around what a harness even is, and why we're talking about harnesses at all, have you confused, here's a video explaining how the industry evolved from prompt, to context, and now to harness engineering. Enjoy.

Caleb Eom@calebfoundry·23 May

@deepseek_ai The closest neocloud pricing is 3X more expensive. I feel bad for inference providers having to compete with a subsidized plan from DeepSeek on this.

Caleb Eom@calebfoundry·22 May

@kimmonismus It was a pleasure talking with you @kimmonismus. Thanks for being so authentic!

Chubby♨️@kimmonismus·22 May

I’m flying back to Germany now, carrying with me so much optimism and a real sense of momentum from San Francisco and the U.S. There is something incredibly energizing about being here, surrounded by people who genuinely believe the future can be built, improved, and accelerated. I hope I can bring some of that optimism back home to Germany. Thanks, everyone! @itsolelehmann, @itsPaulAi, @Futurenvesting, Fawzi and so many more amazing people I’ve met! Thank you!

Caleb Eom@calebfoundry·21 May

@jayrodge15 @Sam_Witteveen Here till 6pm!

Jay Rodge@jayrodge15·20 May

@calebfoundry @Sam_Witteveen Hey Caleb, are you still at the event? Would love to say hi 👋

Caleb Eom@calebfoundry·20 May

Great chat with @Sam_Witteveen last night. Sam is the 🐐 with insane insights into the AI industry.

Caleb Eom@calebfoundry·20 May

Assuming a YC team of 2-3 people, you're trading around 3 years of running Codex (~70 TPS at $14/M tokens, running 24/7) for equity. I wouldn't take the deal.

Tyler Bosmeny@bosmeny

A mic drop moment @ycombinator tonight

@sama just offered $2M in OpenAI tokens to EVERY YC startup in the current batch in exchange for equity

Just like Yuri Milner offering to invest in every startup back when Sam was a YC partner

I can't wait to see what's unlocked when you let the most driven, creative and formidable founders tokenmaxx
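
The figures in this thread ($2M in tokens, ~70 TPS, $14 per 1M tokens, 24/7, about 3 years) can be tied together with some quick arithmetic. The reply does not say how many Codex sessions run in parallel, so this sketch solves for that number rather than assuming it. Treating 70 TPS as a per-session rate and using a 365-day year are both assumptions.

```python
BUDGET_USD = 2_000_000        # token grant quoted in the post
PRICE_PER_M_TOKENS = 14.0     # USD per 1M tokens, from the reply
TOKENS_PER_SECOND = 70        # assumed to be per session, from the reply
YEARS = 3                     # duration quoted in the reply

SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Total tokens the grant buys at the quoted price.
total_tokens = BUDGET_USD / PRICE_PER_M_TOKENS * 1_000_000

# Cost of one session generating tokens non-stop for a year.
tokens_per_session_year = TOKENS_PER_SECOND * SECONDS_PER_YEAR
cost_per_session_year = tokens_per_session_year / 1_000_000 * PRICE_PER_M_TOKENS

# How many session-years the grant covers, and how many sessions that is
# when spread over the quoted duration.
session_years = BUDGET_USD / cost_per_session_year
concurrent_sessions = session_years / YEARS

print(f"tokens bought            : {total_tokens:,.0f}")
print(f"cost per session-year    : ${cost_per_session_year:,.2f}")
print(f"session-years covered    : {session_years:.1f}")
print(f"sessions in parallel, {YEARS}y : {concurrent_sessions:.1f}")
```

Under these assumptions the grant covers roughly 65 session-years, or about 22 sessions running around the clock for three years. For a team of 2-3 people, that works out to somewhere between 7 and 11 always-on sessions each.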
