UshakF

743 posts

@UshakF23

Utah, USA Joined December 2023
12 Following 31 Followers
UshakF
UshakF@UshakF23·
@mofmof_investor HDD stocks have more than triple the forward PE of memory stocks. This is expected by the market.
もふ社長@もふもふ不動産
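A year ago, I predicted that explosive AI demand would cause an HBM shortage. After that, I kept saying flash would be the next to run short. And now Kioxia's earnings are staggering. I think HDDs will be the next to run short. Seagate, Western Digital, and Toshiba have the market to themselves, and no other company can make them. The fire hasn't been lit yet, but I'm waiting for the day sales explode. Behind the eye-catching AI chips, unglamorous storage stocks may be about to enter their next phase. My experience in semiconductor R&D is what makes this kind of foresight possible.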
TheValueist
TheValueist@TheValueist·
$NVDA $MU $SNDK $LITE If you listened to the last $AEHR conference call, you’d know HBF is much closer to commercialization than the market comprehends. trendforce.com/news/2026/05/2…

EXECUTIVE INVESTMENT VIEW

The TrendForce report is strategically important because it reframes NVIDIA’s Vera Rubin cycle as more than a GPU, HBM4, and NVLink transition. The report points to a possible architectural shift in which the GPU becomes a more direct orchestrator of storage and near-memory resources, potentially allowing NAND-based high-bandwidth flash to move from peripheral storage into the AI memory hierarchy. The highest-conviction conclusion is not that HBF replaces HBM, but that the AI server memory stack is broadening into a tiered hierarchy: HBM remains the latency-critical, bandwidth-dense working memory; HBF or HBF-like NAND becomes a high-capacity read-mostly tier for model weights, cold experts, long-context state, and potentially selected KV-cache use cases; SSDs and networked storage remain lower-cost capacity tiers. This is directionally positive for NVIDIA’s platform control, positive for NAND vendors with credible HBF roadmaps, neutral-to-positive for HBM leaders over the medium term, and incrementally negative for CPU-centric data movement architectures and undifferentiated storage vendors.

The key caveat is that the specific GPU-Initiated Direct Storage Access architecture described by TrendForce remains reported, not formally disclosed by NVIDIA as a named Vera Rubin feature. Official NVIDIA materials already confirm the broader strategic direction: Vera Rubin NVL72 integrates 72 Rubin GPUs, 36 Vera CPUs, ConnectX-9 SuperNICs, BlueField-4 DPUs, NVLink 6, and storage-oriented BlueField-4 STX/CMX infrastructure, with NVIDIA describing accelerated storage and context memory as part of the Vera Rubin platform.
The distinction matters: official NVIDIA disclosures validate that storage is becoming a first-class AI-factory subsystem, while the TrendForce report extends that thesis into a more aggressive version in which the GPU initiates and controls storage access more directly. That incremental command-path shift would be technically material if confirmed.

The investment significance is concentrated in 3 areas. 1st, NVIDIA’s moat would expand from accelerator and interconnect leadership into memory-tier orchestration, making the CUDA/runtime/software layer even more central to AI cost-per-token economics. 2nd, NAND may gain an AI-specific premium growth vector after years of cyclicality, particularly for SanDisk/Kioxia, SK hynix, Samsung, and Micron if their roadmaps can support high-bandwidth, high-endurance, thermally stable products. 3rd, HBM demand is not impaired in the investable 2026-2027 window; rather, HBF is more likely to reduce the severity of model-capacity bottlenecks and expand the use cases that justify high-end GPU clusters. The HBM cannibalization debate becomes more relevant in 2028-2030, when software, standards, packaging, and qualification can mature enough for HBF to become a mainstream architectural tier rather than a prototype or niche inference accelerator.

WHAT THE TRENDFORCE REPORT ACTUALLY SAYS

TrendForce reports that NVIDIA and Amazon are advancing storage architectures that allow GPUs to directly control storage devices such as SSDs, and states that NVIDIA is said to plan GPU-Initiated Direct Storage Access, or GIDS, beginning with Vera Rubin. The article contrasts GIDS with existing GPUDirect Storage, under which CPUs still issue data requests before data is transferred to GPUs. Under the reported GIDS model, GPUs would access storage directly, bypassing CPUs and DRAM.
The report also cites Yonsei University professor Song Ki-hwan to argue that CPU thread limits are increasingly mismatched with GPU-scale parallelism, and that GPU-HBM data transfer may account for roughly half of system power, supporting interest in NAND-based high-bandwidth flash placed closer to GPUs. The article’s most important claim is the capacity math around HBF. TrendForce states that NAND has roughly 30x higher bit density than DRAM and that replacing a conventional all-HBM package with a hybrid configuration of 6 HBF units and 2 HBM units could raise memory capacity from 192GB to 3,120GB, or 16.25x. That math is broadly consistent with SanDisk’s public HBF fact sheet, which describes a 1st-generation 16-die HBF stack with 512GB capacity and 1.6 TB/s read bandwidth. However, the comparison is not a full system-performance comparison. It compares capacity, not latency, endurance, random-access behavior, software scheduling complexity, or total cost of ownership. It also uses a 192GB baseline that maps to an 8-stack, 24GB-per-stack HBM configuration, whereas official NVIDIA Vera Rubin material points to 20.7 TB of HBM4 per 72-GPU NVL72 rack, or roughly 288GB per Rubin GPU. The 16x headline is therefore directionally useful but should not be applied mechanically to Vera Rubin economics without adjusting for HBM4 stack density and system configuration. The article also correctly highlights the central limitation of NAND-based memory: endurance. TrendForce cites around 100,000 program/erase cycles for NAND versus DRAM’s effectively far higher write tolerance, and therefore frames HBF as better suited to AI model parameters that are largely read-only during inference. That framing is technically important. HBF is most compelling when data is large, reused, bandwidth-sensitive, and not frequently rewritten. 
HBF is less compelling for optimizer states, activation scratchpads, high-frequency KV-cache writes, and latency-critical random access unless software can amortize NAND latency through prefetching, batching, and locality-aware scheduling.

TECHNICAL INTERPRETATION

Existing GPUDirect Storage already reduces the traditional CPU-memory bottleneck by enabling direct DMA transfers between storage and GPU memory, reducing CPU overhead and avoiding unnecessary CPU copies. NVIDIA’s documentation frames GDS as a way to move large amounts of data efficiently between storage and GPUs with lower latency, higher throughput, and fewer CPU resources. The reported GIDS architecture would be a deeper change: not merely a faster data path, but a more GPU-native control path. In practical terms, the difference is that GDS can still rely on CPU-side orchestration and file-system mediation, while GIDS implies that GPUs can issue or schedule storage requests directly enough to remove the CPU and system DRAM from a larger part of the I/O loop.

That distinction matters because AI inference is increasingly bottlenecked by memory movement rather than peak FLOPS. Large language model decode, particularly for large dense models and trillion-parameter MoE systems, repeatedly streams weights and reads KV-cache state while generating relatively small amounts of compute per token. In agentic systems, the problem worsens because multi-step tool use, long context, and multi-agent workflows create growing state footprints and unpredictable memory reuse. NVIDIA’s own Vera Rubin materials emphasize trillion-parameter MoE models, long-context windows, accumulated KV cache, and high-concurrency serving as core platform targets, with Vera Rubin NVL72 delivering 20.7 TB of HBM4 and 1.6 PB/s of memory bandwidth per rack. This makes the storage-memory interface strategically relevant rather than peripheral.
The most plausible implementation path is not literal GPU random access to commodity SSDs at HBM-like latency. The more realistic architecture is a software-managed hierarchy in which HBM is the hot working set, HBF is a near-memory read-mostly extension, BlueField or equivalent infrastructure manages security, virtualization, and data services, and the GPU runtime/compiler schedules prefetches based on model execution. The important product question is whether NVIDIA can hide NAND latency behind predictable model execution and large-scale parallelism. If model weights, MoE experts, or context blocks can be staged ahead of use, HBF can behave like a capacity-rich bandwidth tier. If access is fine-grained, random, and data-dependent, NAND latency will show through and HBF will degrade utilization.

This also explains why HBF is more likely to matter first in inference than in training. Training has heavier write traffic, optimizer updates, activation checkpointing, gradient synchronization, and more stringent memory-consistency requirements. Inference has more read-dominant model-weight traffic and can tolerate more explicit placement if the serving stack is optimized. SanDisk explicitly positions HBF for AI inferencing and states that HBF can deliver 8-16x HBM capacity at similar bandwidth and similar cost, with simulated performance within 2.2% of unlimited-capacity HBM for reading 8-bit pretrained weights on Llama 3.1 405B. That benchmark is favorable but narrow: it is an internal simulation focused on read-only pretrained weights, and it does not prove general-purpose DRAM substitution.

HBF VERSUS HBM

HBF should be viewed as a complement to HBM rather than a replacement through at least 2027. HBM remains indispensable for low-latency, high-bandwidth, high-write-endurance operations. It holds the hottest model shards, activations, attention state, and latency-sensitive KV-cache segments.
HBF would instead expand the addressable memory footprint for read-heavy data that cannot economically fit in HBM. The best analogy is not “NAND replaces DRAM,” but “NAND becomes a new tier between HBM and SSD.” SK hynix and SanDisk explicitly describe HBF as a new memory layer between ultra-fast HBM and high-capacity SSDs, designed to bridge HBM performance and SSD capacity for AI inference while improving scalability and power efficiency. The relative economics are potentially attractive because HBM scaling is constrained by DRAM die supply, TSV capacity, advanced packaging, yield, power, and customer qualification. HBF leverages NAND density and could create much higher capacity per package footprint. SanDisk states that 1st-generation HBF reaches 512GB per 16-die stack at 1.6 TB/s read bandwidth, while projected Gen 2 and Gen 3 products exceed 2 TB/s and 3.2 TB/s read bandwidth, respectively, with capacities up to 1 TB and 1.5 TB per stack and lower power consumption versus Gen 1. These projections, if achieved, would create a credible high-capacity memory tier for inference, but still not erase the latency and endurance gap versus HBM. The potential bear case for HBM is therefore long-dated and conditional. If HBF becomes standardized, production-qualified, and broadly supported by NVIDIA/AMD runtimes, future systems may require less HBM per parameter served, especially for sparse MoE inference where cold experts can reside off-HBM. However, larger models and longer contexts usually consume any memory efficiency dividend quickly. Historically, memory relief in AI tends to enable larger workloads rather than reduce total high-end memory spend. In the base case, HBF reduces the binding constraint that caps model scale and improves GPU utilization, thereby increasing total AI infrastructure return on invested capital and preserving HBM attach as the hot tier. The more immediate risk is not HBM unit displacement; it is HBM bargaining power. 
If NVIDIA can credibly supplement HBM capacity with HBF and storage-class context memory, NVIDIA’s dependence on any one HBM supplier is reduced at the margin. That would not remove HBM scarcity, but it could slightly weaken the long-term strategic leverage of HBM vendors if HBF becomes a standardized alternative for capacity expansion. Near term, this is outweighed by continued HBM4 demand for Vera Rubin and competing platforms. NVIDIA’s official Q1 FY27 release states that Data Center revenue reached $75.2 billion, up 92% YoY, and the company guided to $91.0 billion in Q2 FY27 revenue while not assuming any Data Center compute revenue from China; these figures indicate that near-term demand remains constrained by high-end AI platform supply rather than by insufficient end demand.

WORKLOAD FIT

The cleanest HBF use case is storing model parameters for inference, especially for large MoE models. MoE architectures activate only a subset of experts per token, creating a large inactive parameter pool that does not need to reside entirely in HBM if the active experts can be prefetched and staged quickly. HBF could materially reduce the number of GPUs required to host a frontier model or allow a larger model to run within a given rack footprint. The benefit is less clear for dense models, where the full weight set is read repeatedly and HBF bandwidth must be high enough to avoid throttling every token. Dense models can still benefit from larger memory capacity, but they are less forgiving if HBF becomes part of the critical decode path without excellent prefetching.

Long-context inference is the 2nd major use case. Multi-100K and million-token contexts create large KV-cache footprints. NVIDIA’s CMX context memory platform is explicitly designed to hold latency-sensitive, reusable inference context and prestage it to increase GPU utilization, with NVIDIA claiming up to 5x higher tokens per second and 5x better power efficiency than traditional storage.
This is highly aligned with the TrendForce thesis even if the exact GIDS mechanism is not confirmed. CMX/STX demonstrates that NVIDIA is already productizing context memory as a separate tier in Vera Rubin-era AI factories.

RAG, vector search, recommender systems, and data-intensive training-adjacent workflows are additional beneficiaries. These workloads often involve large external corpora, embedding tables, sparse feature lookups, or retrieval steps that do not map neatly into HBM. GPU-directed storage could reduce CPU overhead, reduce data-copy latency, and make GPU clusters more efficient at mixed inference plus retrieval pipelines. However, latency variance and tail behavior are critical. An architecture that improves average bandwidth but worsens p99 latency would be less attractive for premium agentic services. NVIDIA’s broader Vera Rubin messaging is heavily focused on low-latency, long-context, high-throughput agentic inference, suggesting that any storage-tier innovation must be evaluated on end-to-end token latency and utilization, not raw bandwidth alone.

CREDIBILITY AND TIMING

The report has medium credibility as a directional technology signal and lower credibility as a fully specified NVIDIA product disclosure. The directional credibility is supported by 4 independent data points: NVIDIA’s existing GDS documentation, NVIDIA’s official BlueField-4 STX/CMX announcements, SanDisk/SK hynix HBF standardization activity, and multiple NAND vendors’ work on high-bandwidth or low-latency flash for AI. The lower-confidence element is the precise claim that Vera Rubin introduces GIDS in the form described by TrendForce and The Elec, because official NVIDIA materials reviewed do not use that specific GIDS nomenclature in the same way.

Commercial timing is unlikely to be binary. The 1st monetization layer is already visible in high-performance SSDs, DPUs, NICs, and AI storage reference architectures such as BlueField-4 STX.
The 2nd layer is prototype and sample-stage HBF in 2026-2027. TrendForce reported that SanDisk is moving to establish an HBF prototype production line, with prototypes targeted for 2H26, pilot operation around year-end, and commercialization targeted for 2027. SK hynix and SanDisk have launched an OCP workstream for HBF standardization, but SK hynix also states that demand for complex memory solutions including HBF is expected to pick up around 2030. This points to a staged adoption curve: early samples and hyperscaler qualification in 2026-2027, specialized deployments in 2027-2028, and broader standardization later if software support and production economics validate. The path is technically non-trivial. GPU-initiated storage must address command submission, memory protection, virtual addressing, multi-tenant isolation, file-system/object semantics, error handling, wear leveling, encryption, telemetry, and orchestration across thousands of GPUs. NAND page sizes and SSD optimal access sizes are not naturally aligned with GPU warp-level fine-grained memory operations. HBF can overcome some of this through massive parallelism, TSV-style stacking, controller logic, prefetching, and software-managed placement, but the runtime stack must be tier-aware. This is exactly the type of co-design problem NVIDIA is structurally advantaged in, but it also means adoption will likely be limited to NVIDIA-optimized serving stacks before becoming broadly portable.
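
The capacity arithmetic discussed in the post above can be checked with a short Python sketch. The 512GB HBF stack, the 192GB baseline, the 6-HBF-plus-2-HBM hybrid, and the 20.7 TB per 72-GPU rack figure come from the post. The 8-stack layout for a Rubin GPU and the resulting per-stack HBM4 capacity are assumptions made here for illustration, not disclosed specifications, and the function names are hypothetical.

```python
# Capacity-only comparison; it says nothing about latency, endurance, or cost.
HBF_STACK_GB = 512  # first-generation HBF stack capacity quoted in the post


def package_capacity_gb(hbf_stacks: int, hbm_stacks: int, hbm_stack_gb: float) -> float:
    """Total capacity of one package mixing HBF and HBM stacks."""
    return hbf_stacks * HBF_STACK_GB + hbm_stacks * hbm_stack_gb


def hybrid_gain(hbm_stack_gb: float, total_stacks: int = 8, hbf_stacks: int = 6) -> float:
    """Capacity multiple of a hybrid package over an all-HBM package."""
    baseline = package_capacity_gb(0, total_stacks, hbm_stack_gb)
    hybrid = package_capacity_gb(hbf_stacks, total_stacks - hbf_stacks, hbm_stack_gb)
    return hybrid / baseline


# TrendForce baseline: 8 stacks x 24GB = 192GB of HBM.
trendforce_gain = hybrid_gain(24)

# Vera Rubin NVL72: 20.7 TB of HBM4 across 72 GPUs.
per_gpu_hbm4_gb = 20.7 * 1000 / 72

# ASSUMPTION: 8 HBM4 stacks per GPU, so per-stack capacity is per-GPU / 8.
assumed_hbm4_stack_gb = per_gpu_hbm4_gb / 8
rubin_gain = hybrid_gain(assumed_hbm4_stack_gb)

print(f"TrendForce hybrid gain: {trendforce_gain:.2f}x")
print(f"HBM4 per Rubin GPU: {per_gpu_hbm4_gb:.1f} GB")
print(f"Hybrid gain on the assumed Rubin baseline: {rubin_gain:.2f}x")
```

Under these assumptions the headline multiple shrinks once the denser HBM4 baseline is used, which is the adjustment the post says is needed before applying the 16x figure to Vera Rubin.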
CK7
CK7@UtdMUFC92·
@Yujerino @graciehartie It’s correct, BTO owners pretty much 100% made $ and upgraded to condo. The problem now is, new launches are almost same price tag as resale, with better layout and superior finishes. Resale market is dead because new launch is simply the more attractive option now.
Negligible Capital
Negligible Capital@negligible_cap·
*ANTHROPIC EXPECTS REVENUE RUN RATE TO TOP $50B NEXT MONTH Jeez man. Mind-bending growth
UshakF
UshakF@UshakF23·
@DRTnky Housing and cars form a far larger portion of expenses than food. I'd rather have more expensive food and a cheaper house and car than the other way around.
DarylTanky
DarylTanky@DRTnky·
This. I’ve always been a huge advocate for the “Singapore has very fair prices” notion.
Funnily enough, people who comment about Singapore being expensive to live in are usually expats and foreigners who made their deduction from watching “Crazy Rich Asians”.
Of course, if we were to take cars and housing into consideration, then yes, Singapore IS expensive.
But I think a more accurate index to evaluate “living” would be what we need, which is food.
Now, assuming an average meal cost of $10 (which is a heavy overestimate), the average monthly meal cost would be ~SGD$1000, or a weekly cost of ~USD$200.
That’s less than in almost all tier-1 US cities, with way more safety.
Food for thought (pun unintended).
Bluebird@abigbluebird

That’s why I’ve never got the whole ‘Singapore is expensive’ stereotype, especially when it comes from locals. Singapore is expensive for expats, given that they have to rent private housing, are subject to expensive international school fees for their children etc. But for locals? In relation to the average salaries, it’s far cheaper than our neighbours and other global cities.

UshakF
UshakF@UshakF23·
@Jsevillamol Logic isn’t the limiting factor, memory is. Idle compute is wasteful without sufficient memory feeding it. Also, a larger portion of their customer base is non-AI. If they hike prices, they simply get less demand.
UshakF
UshakF@UshakF23·
@RHouseResearch When you desperately need compute, you take whatever is available.
UshakF
UshakF@UshakF23·
@JimmyButlerCap The value and scale of inference compute will likely grow 10-100x over the next 10 years as agentic AI proliferates across the economy. Memory prices can rise 5-10x and inference margins can still expand.
UshakF
UshakF@UshakF23·
@MikeFritzell Its largest cap companies are trading at single digit forward PE.
UshakF
UshakF@UshakF23·
@0x_ZHUANG It provides excellent employment for the 10 top AI researchers in Singapore. For other Singaporeans, basically nothing changes.
Zhuang, 庄
Zhuang, 庄@0x_ZHUANG·
OpenAI Commits $234M for New AI Lab in Singapore 🇸🇬 making it the first applied AI lab outside of the U.S. Curious what this means for Singapore & everyday Singaporeans?
rachael de foe@unprofeshme

FORWARD DEPLOYED NATION 🇸🇬 @OpenAI 's first applied ai lab outside of the US is in singapore if you aren't bullish on that pink ic yet, you should be nowww 🩷

UshakF
UshakF@UshakF23·
@illyquid High-teens % is basically stationary in the world of AI.
Illiquid
Illiquid@illyquid·
I talk about the demand for NAND a lot, and that is often met with the admonishment that you have to look at supply for cyclical industries. Kioxia says that supply will grow high-teens % this year. So in light of all the incredible things being released this year, what do we think the % growth in demand is? 🫢
Sal Goodarzi
Sal Goodarzi@macrotides·
Noticed these S&P / Real Rates divergences? Dot-com, GFC, 2015, COVID... 2026? Is it just me, or does it feel like this whole 2026 rally shouldn't have happened?
UshakF
UshakF@UshakF23·
@0x_ZHUANG Posts like this are exactly what one would see only during a bubble. Singapore is a country built without natural resources, mainly via human intelligence, which is going to get deeply disrupted by AI. Also difficult to pivot to AI due to expensive energy and limited land.
Zhuang, 庄
Zhuang, 庄@0x_ZHUANG·
surprised this blew up with so many differing takes, some saying Singapore 🇸🇬 property is a bubble
Singapore private / landed is structurally destined to be up only
> severely scarce land
> safe capital haven
> strong rule of law
> confidence in government
> constant wealth inflow
> and if demand falls, the govt can unwind cooling measures to bring demand back
it’s only a bubble if prices skyrocket with no fundamentals
Singapore property has fundamentals, policy support, and land scarcity, all primed for an up only direction
Zhuang, 庄@0x_ZHUANG

easiest trade a Singaporean 🇸🇬 can take right now
stockpile $SGD, downpay a private apartment / landed property, take the remaining 75% as a loan
hold until SGD:USD moves to 1:1, while your land quietly appreciates 2–3% net per year
and if you calculate the gain against your down payment instead of the full property value, the return starts looking a lot crazier
Currency + Real estate appreciation

Connor Bates
Connor Bates@ConnorJBates_·
Semis is close to 20% of all hedge fund net exposure.
Jukan
Jukan@jukan05·
KYE-HYUN KYUNG, ADVISOR TO SAMSUNG ELECTRONICS: MEMORY PRICES ARE EXPECTED TO DECLINE STARTING IN THE SECOND HALF OF NEXT YEAR.
The Transcript
The Transcript@TheTranscript_·
$SAP CEO: “AI agents don't work without a brain. The brain is SAP.”
UshakF
UshakF@UshakF23·
@dnystedt @jukan05 We started this supercycle with simple chatbots. We will end with long duration AI agents scaling across workflows powering the economy. The gap between demand and supply will only widen the next 5-10 years. Supply will never scale fast enough to catch demand.
Dan Nystedt
Dan Nystedt@dnystedt·
@jukan05 We started this supercycle primarily with 3 memory giants. We will end with at least 5 (incl. CXMT and YMTC), so yes, the next downturn is going to be bad.
Jukan
Jukan@jukan05·
Memory companies need to raise prices in a disciplined way… if prices go up too fast, they tend to come down just as fast.