Andrei Stan

459 posts

@andreiofstan

Joined January 2016

933 Following · 44 Followers

Andrei Stan@andreiofstan·1d

@KairosPraxis CXL will not work for inference for at least 3 years; 64 GB/s theoretical maximum bandwidth is dog shit. The complexity and cost are way worse than SSDs and flash.

Kairos@KairosPraxis·2d

Kimi's paper went over my head, so I asked my research assistant to connect the dots and find stonks that'll benefit from these trends. The answer will totally NOT surprise you:
1. $ALAB (and perhaps $PENG) - Kimi moves prompt history out of scarce GPU memory into larger CPU memory. Astera disclosed a 2027 design win for this using a customized Leo CXL controller.
2. Phison ($8299.TWO) - Kimi uses NVMe as an overflow memory. Phison turns SSDs into managed AI memory, extending available memory to as much as 8 TB.
3. Interconnects ($SMTC/$CRDO) - Larger models require faster data transfer within a rack and across racks. We will see denser racks and more demand for AECs, ACCs, retimers, redrivers, etc.
4. $FROG - Kimi created 51 million sandboxes from 1.5 million software images. JFrog sells p2p distribution to launch thousands of environments quickly.
5. $RMBS - K3's external cache pool increases the bandwidth of ordinary server memory. Rambus supplies the control and buffer chips installed on high-bandwidth server-memory modules.
6. $NET - Virtual computers and sandboxes.
7. $LITE and OCS - K3-like inference creates large, "predictable" cluster migrations. OCS lets you connect two clusters via optics.

Andrei Stan@andreiofstan·2d

@StockMKTNewz More like 140B with compute as well

Evan@StockMKTNewz·2d

Meta Platforms $META and BlackRock $BLK will establish a joint venture to build and operate a 1-gigawatt data center complex in Texas. Total development costs for the project, spanning buildings, power, cooling and connectivity, will come to ~$14B. The campus will go online in 2028. BlackRock funds will hold an 80% interest in the project, while Meta retains the remaining 20%. - Bloomberg

Andrei Stan@andreiofstan·2d

@ryzerth @OwenBrakes Transceivers are not

Ryzerth 🐲@ryzerth·2d

@andreiofstan @OwenBrakes fiber is already cheaper and more reliable than DAC.

Owen Brake@OwenBrakes·4d

Datacenter interconnects are moving faster and faster. 224Gbps-PAM4 interconnects switch at 56 GHz, across a 1m copper wire yielding 36 dB attenuation! (4,000x drop in power). Your eye diagrams are useless here, you have to reconstruct signals from the noise.
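
The figures in the post above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming the standard decibel definition for power ratios and that PAM4 carries 2 bits per symbol; the function names are illustrative only.

```python
def db_to_power_ratio(db: float) -> float:
    """Convert an attenuation in dB to a linear power ratio (10^(dB/10))."""
    return 10 ** (db / 10)


def pam4_nyquist_ghz(bit_rate_gbps: float) -> float:
    """PAM4 encodes 2 bits per symbol; the Nyquist frequency is half the symbol rate."""
    symbol_rate_gbaud = bit_rate_gbps / 2
    return symbol_rate_gbaud / 2


power_ratio = db_to_power_ratio(36)   # roughly 3,981x, i.e. the "4,000x" in the post
nyquist_ghz = pam4_nyquist_ghz(224)   # 112 GBd symbol rate, 56 GHz Nyquist

print(f"36 dB attenuation -> {power_ratio:,.0f}x drop in power")
print(f"224 Gbps PAM4 -> {nyquist_ghz:.0f} GHz Nyquist frequency")
```

So the quoted 56 GHz corresponds to the Nyquist frequency of a 112 GBd symbol stream, and 36 dB rounds to the quoted 4,000x power drop.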

Andrei Stan@andreiofstan·2d

@ryzerth @OwenBrakes Also, these are usually used for <2 m connections, at most rack scale

Andrei Stan@andreiofstan·2d

@ryzerth @OwenBrakes Reliability and price. Maybe when CPO goes mainstream

Andrei Stan@andreiofstan·18 Jul

@zephyr_z9 8 MI355X can too

Zephyr@zephyr_z9·18 Jul

Huawei SuperPoDs or NVL72s

François Fleuret@francoisfleuret

What hardware can run Kimi 3 well? I read 8 H100s but that seems not much for ~3T parameters, no?

Andrei Stan@andreiofstan·9 Jul

@rcx86 What is the tps?

Mr. Rc@rcx86·9 Jul

Grok 4.5 is too fast for how good it is!

Andrei Stan@andreiofstan·9 Jul

@SemiAnalysis_ 7 hops is bonkers

SemiAnalysis@SemiAnalysis_·9 Jul

With the upcoming TPUv8i BroadFly topology, it can scale up to 1,024 chips within a 7-hop radius, compared with the traditional 3D torus 16-hop radius. This enables lower latency, making it easier to overlap communication with compute without incurring the cost of a dedicated switch. 3/4🧵
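
The 16-hop figure for a 3D torus can be reproduced with a short sketch. The 8 x 8 x 16 shape below is an assumption, chosen only because it multiplies out to 1,024 chips; the post does not state the torus dimensions, and nothing here models the BroadFly topology itself.

```python
from collections import deque


def torus_diameter(dims):
    """Worst-case hop count in a wraparound torus: sum of floor(d / 2) per axis."""
    return sum(d // 2 for d in dims)


def torus_diameter_bfs(dims):
    """Brute-force check via BFS from node 0 (every torus node is equivalent)."""
    start = tuple(0 for _ in dims)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for axis, size in enumerate(dims):
            for step in (-1, 1):
                neighbor = list(node)
                neighbor[axis] = (neighbor[axis] + step) % size  # wraparound link
                neighbor = tuple(neighbor)
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
    return max(dist.values())


dims = (8, 8, 16)  # assumed shape, not stated in the thread
chips = dims[0] * dims[1] * dims[2]
print(chips, torus_diameter(dims), torus_diameter_bfs(dims))
```

Under that assumed shape, the farthest pair of chips is 4 + 4 + 8 = 16 hops apart, which is consistent with the "16-hop radius" quoted for the traditional torus.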

SemiAnalysis@SemiAnalysis_·9 Jul

Successfully training models on TPUs has been demonstrated by Anthropic through the past five-plus successful Claude releases. This is positive for the ML community, as Google TPUs continue to gain market share outside of internal Google workloads, giving frontier AI labs a viable alternative for training. 1/4🧵

Andrei Stan@andreiofstan·3 Jul

@TrendSpider Going short at a 6 forward P/E is insane

TrendSpider@TrendSpider·3 Jul

BREAKING: Michael Burry just disclosed a new short position. It is... Micron $MU

TrendSpider@TrendSpider

BREAKING: Michael Burry has disclosed new positions. He is short: Nvidia $NVDA, Tesla $TSLA, Caterpillar $CAT, Applied Materials $AMAT

Andrei Stan@andreiofstan·2 Jul

@zephyr_z9 @vikramskr What do you see as the prime use for CXL in Google's case? Yes, HBM is hard to get, but is DRAM that much easier? Especially fast DRAM? You still need the memory for CXL.

Zephyr@zephyr_z9·2 Jul

@vikramskr No, CXL is for something else

Vikram Sekar@vikramskr·2 Jul

Why did Google have a change of heart?

Zephyr@zephyr_z9

"GFHK: MRVL is expected to generate $2.8B / $13.0B in revenue in 2027 / 2028, respectively, driven by Google CXL-related business." Did u listen to us, anon?? I'm pretty sure we were the first to break it down

Andrei Stan@andreiofstan·30 Jun

@z4y5f3 @teortaxesTex What they are doing is two-tier scale-up; that's such a waste

Yunfan Zhang@z4y5f3·30 Jun

@teortaxesTex These are almost definitely Huawei Ascend 910C. 910C superpod has 48 machines with 8 processors each. Each processor has two physical dies that could function as two logical processors. In that mode, each die has 64GB HBM and 200 Gbps RDMA networking so everything matches.
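
The totals implied by the figures in this reply are easy to tally. This is only arithmetic on the numbers quoted above (48 machines, 8 processors each, 2 dies per processor, 64 GB HBM and 200 Gbps per die); the variable names are illustrative.

```python
machines_per_superpod = 48
processors_per_machine = 8
dies_per_processor = 2        # each die can act as one logical processor
hbm_gb_per_die = 64
rdma_gbps_per_die = 200

processors = machines_per_superpod * processors_per_machine
logical_processors = processors * dies_per_processor
total_hbm_gb = logical_processors * hbm_gb_per_die
total_rdma_tbps = logical_processors * rdma_gbps_per_die / 1000

print(f"{processors} processors, {logical_processors} logical processors")
print(f"{total_hbm_gb:,} GB HBM, {total_rdma_tbps} Tbps aggregate RDMA")
```

That gives 384 physical processors, or 768 logical accelerators, each with 64 GB of HBM, which fits the "below 80 GB HBM per accelerator" description in the quoted post.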

Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)@teortaxesTex·29 Jun

Damn, that's right. So: "≈48B" (thanks to N-Gram embedding, variable) active, 35T tokens. V4-tier, ≈8e24? Would be the biggest Chinese pretraining on domestic hardware.
Some strange "Superpods":
> "our accelerators"
> "up to 48 machines each"
> below 80 GB HBM per accelerator
> "The device offers limited HBM bandwidth but a relatively large L2 cache"
> "built-in 200 Gbps network adapter within the accelerator"
what is this stuff
Was the first truly big Chinese model trained on domestic compute also using some extremely obscure piece of hardware?

tphuang@tphuang

@teortaxesTex @zugzwangg_ it said 35T+ tokens
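
A rough cross-check of the "≈8e24" compute estimate, using the common rule of thumb that training FLOPs are about 6 x active parameters x tokens. This is a sketch under that assumption; the thread does not say how the estimate was derived, and the "≈48B" active count is itself described as variable.

```python
import math

active_params = 48e9      # "≈48B" active parameters, from the post
training_tokens = 35e12   # "35T tokens", from the post

# Rule-of-thumb estimate: ~6 FLOPs per active parameter per training token.
estimated_flops = 6 * active_params * training_tokens
quoted_flops = 8e24       # the "≈8e24" guess in the post

print(f"6ND estimate: {estimated_flops:.3e} FLOPs")
print(f"quoted / estimate: {quoted_flops / estimated_flops:.2f}")
```

The rule of thumb lands at about 1.0e25 FLOPs, the same order of magnitude as the quoted figure and roughly 25% above it.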

Andrei Stan@andreiofstan·30 Jun

@degentradingLSD You cannot memory pool without memory. DRAM is still the biggest bottleneck

degentrading@degentradingLSD·29 Jun

I have said it in my notes for the last week, but I will say it again: memory pooling and compute will lead the next rally. Memory pooling - $ALAB, $CRDO, $PENG, $MRVL. $CRDO was hit hard by the rebalancing flow last Friday; I think these levels offer good value.

degentrading@degentradingLSD

Pre Mkt Thoughts - 29 Jun 26
Asia markets had some early jitters before coming back to almost unchanged. KOSPI and NKY both traded with very strong correlation, selling off early in the morning before inching back into the close.
Let's talk about the memory names. $285A had a dip to the 82.5k level before ending about -4% for the day. This represents a high r/r level for longs. $285A has also triggered some of the prior patterns I noticed in its price action.
Meanwhile, on the other side, news broke on $SAMSUNG and $SKYNIX planning to build 2 new massive chip fab sites in South Korea's southwest region as part of a "national project". Let's run it back to Leopold's Situational Awareness article - the Manhattan Project. The early innings are coming through. Countries are treating this as existential. The sovereign buildout thesis is manifesting in real time.
As an aside, the direct implication for Korea is that the rally in KOSPI should increasingly broaden out. I am keeping my eye out for Korea small caps. I believe that for small accounts, these can turn out to be some of the best opportunities. Personally, I am afraid of the illiquidity in those tickers and will probably sit it out.
Coming into the US session, I indicated in my Sunday note that for end-of-quarter rebalancing, there is about 165b of equities to be sold into the close. Some have asked how I would allocate into such a situation. There is no right answer. It hinges on 2 key pivots. 1 - What is your risk appetite? 2 - How much do you think flows have been front-run? For those who are aggressive, that could mean allocating in one clip right into the close on Tuesday. For others, that could mean a 2-day TWAP into the markets.
I have been banging the drum in the last few notes that I see the best technical setup into July for longs. Let me take the time here to list it out again. First of all, the fundamental picture looks extremely strong. We had 2 articles dropping over the week.
First was news that $AAPL was lobbying for the White House to allow them to buy from $CXMT. While this might sound bearish for memory, please remember that the memory market is global. Paraphrasing SemiAnalysis's Dylan - "It literally doesn't matter? CXMT cannot satisfy Chinese demand this decade". Next, let me emphasize that $CXMT is selling at a ~5% discount. After you account for quality differences, there is almost no difference. So why is $AAPL doing this? It's not about price, it's that the supply is simply not there. If anything, this is a direct affirmation that indeed... supply is simply not there.
Next, an article that $GOOG is capping $META's use of Gemini AI. How insane is this? $META cannot get enough compute, so it is sourcing it from $GOOG. Crazier is that... $GOOG simply cannot meet the demand.
Look, the hyperscalers will not stop capex any time soon, even as their equity prices take a hit, because they see the revenues on the horizon. CAPEX hits the bottom line at t0; revenues and backlog hit the bottom line in the future. As these revenues arrive, the hyperscalers will start making money hand over fist. Remember... Anthropic's gross margins on inference are ~70%. By building now, they are securing a moat for the future.
Where does this lead? Again, the 2 themes I keep rehashing are compute (neos) and memory pooling. While I think memory will continue to be strong, the easy part of the rally and the valuation gap has been covered.
On compute, $NBIS stands out as the leader of the pack. $SHAZ is an interesting up-and-coming player backed by Situational Awareness. I am expecting a 13G to hit the timeline tonight.
On memory pooling, tickers include $ALAB, $CRDO, $PENG, $MRVL. On $CRDO, Friday's sell-off was very much influenced by the Russell rebalancing (as per my Sunday note); I think the 240s area offers exceptional value for risk/reward.
In my notes last week, I talked about $BABA - I strongly think that now that even Burry has sold $BABA, sentiment cannot get worse than now. These levels also offer good risk/reward. Sep calls especially are cheap. Good luck!

Andrei Stan@andreiofstan·28 Jun

@jukan05 @__bag_h_dad Isn't Samsung the most advanced on PIM tech?

Jukan@jukan05·28 Jun

@__bag_h_dad There has been talk for a while that Qualcomm intends to use PIM. Personally, I'm keeping a close eye on it. zdnet.co.kr/view/?no=20260…

Jukan@jukan05·28 Jun

Yes, MU is not Nvidia. But going forward, it may become even more important than Nvidia. Think about it. Inference is now directly tied to money. But inference does not get better simply by adding more Nvidia GPUs. In fact, GPUs are often underutilized in inference, sitting idle due to memory bottlenecks. For inference, adding more memory is far more valuable. Ultimately, the ROI of inference depends less on GPUs and more on memory. So why are people still looking at Micron through Nvidia’s framework? Think bigger. Inference is memory.

emini tic@TicTocTick

MU is not NVDA!! MU goon crash to 700 soon (now 1200). Remember we had this at 80!!! RAM is NOT GPU fools !!!

Andrei Stan@andreiofstan·28 Jun

Expected this to appear. It depends on how they implement it, but it could be very cool if the NIC has direct access to the whole NVLink domain. GPUDirect could then work between any NIC and any GPU. This would also allow posting RDMA ops from any GPU/CPU to any NIC in the NVLink domain

Vengineerの妄想@Vengineer

Something called ConnectX-10 NVLink-C2C has appeared. Vera's PCIe is Gen6, so ConnectX-10, which I assume to be 1.6 Tbps, would be Gen7. In that case, a BlueField-5 built from Vera + ConnectX-10 cannot carry 1.6 Tbps. So is the idea to reach 1.6 Tbps by adding NVLink-C2C on the ConnectX-10 side?
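
The bandwidth argument in the quoted post can be sketched numerically. The per-lane rates below are the published PCIe signalling rates (64 GT/s for Gen6, 128 GT/s for Gen7); the x16 link width is an assumption, and encoding and protocol overhead are ignored, so the results are upper bounds.

```python
# Raw per-lane signalling rate in GT/s for each PCIe generation.
PCIE_GT_PER_LANE = {5: 32, 6: 64, 7: 128}


def raw_link_gbps(generation: int, lanes: int = 16) -> int:
    """Raw one-direction link rate in Gbps, ignoring encoding and protocol overhead."""
    return PCIE_GT_PER_LANE[generation] * lanes


target_gbps = 1600  # the 1.6 Tbps the post assumes for ConnectX-10

for gen in (6, 7):
    rate = raw_link_gbps(gen)  # x16 is an assumed link width
    verdict = "enough" if rate >= target_gbps else "not enough"
    print(f"PCIe Gen{gen} x16: {rate} Gbps raw, {verdict} for {target_gbps} Gbps")
```

Even before overhead, a Gen6 x16 link tops out at 1,024 Gbps per direction, short of 1.6 Tbps, while Gen7 x16 reaches 2,048 Gbps. That matches the post's reasoning for why a different path such as NVLink-C2C would be needed on a Gen6 host.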

Andrei Stan@andreiofstan·27 Jun

@will_whang They use the Lattice ECP5 for DSP

will whang🌻@will_whang·27 Jun

But it is interesting to see another use of MIPI CSI for things other than image data. Though I'm not ... exactly sure how much DSP work the RPi 5 can handle

will whang🌻@will_whang·27 Jun

crowdsupply.com/scale-rf/quadrf I must say that because only a kit is available, the cost is a bit higher than I expected at $499 (Complete Kit)

Andrei Stan@andreiofstan·25 Jun

@Qualcomm @Modular This is their last chance

Qualcomm@Qualcomm·24 Jun

Hardware plus software defines leadership in the AI era. Today we announced an agreement to acquire @Modular, advancing our evolution as a developer-first AI solutions company delivering generative and agentic AI from edge to cloud. bit.ly/44rBpe5

Andrei Stan@andreiofstan·24 Jun

@jukan05 Marginally better than saying you are running 8x7b lmao

Jukan@jukan05·24 Jun

Is it just me, or does this feel bearish for Cerebras?

Andrei Stan@andreiofstan·24 Jun

Nobody cares about TOP500 supercomputers. They are not even supercomputers anymore, just clusters. The submitted No. 1 would probably not even make the top 50 if all clusters were considered

Andrei Stan@andreiofstan·19 Jun

@PolymarketSport This was the easiest bet ever, fucking missed it, literally free money

Polymarket Sports@PolymarketSport·19 Jun

🚨JUST IN: A trader put $2.7k on Ronaldo to cry during the World Cup. This pays out $3,800.00 on Polymarket
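
For context on the odds, the stake and payout above imply a market price. This is a sketch that assumes the $3,800 payout includes the returned stake and that each share pays $1 if the outcome resolves "yes"; the post does not spell out either detail.

```python
stake_usd = 2700.0
payout_usd = 3800.0   # assumed to be the total returned, stake included

implied_probability = stake_usd / payout_usd   # average price paid per $1 share
profit_usd = payout_usd - stake_usd
return_on_stake = profit_usd / stake_usd

print(f"implied probability: {implied_probability:.1%}")
print(f"profit: ${profit_usd:,.0f} ({return_on_stake:.1%} return)")
```

Under those assumptions the market was pricing the outcome at roughly 71%, so the bet returns about 41% on the stake if it resolves in the trader's favor.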

Andrei Stan@andreiofstan·18 Jun

@DreadyBear @mzuhair123 If Intel buys Tenstorrent

DreadyBear@DreadyBear·17 Jun

@mzuhair123 My money is on Jim Keller.

Muhammad Zuhair@mzuhair123·17 Jun

"hiring top CPU architect soon" I might know exactly who this is.

Alex@Alex_Intel_

Great tidbits from Lip-Bu in the pod:
1) Almost became CEO in 2021, he declined
2) Lip-Bu boosting CAPEX (he said before that's tied to external customers)
3) Intel building an ARM CPU or making ARM AGI?
4) Lip-Bu hiring top CPU architect soon
5) 10 year plan, big company
$INTC
