Tail Risk

3.6K posts

Tail Risk

@_tailrisk

curious

Joined September 2018
1K Following · 373 Followers
Tail Risk retweeted
The Kobeissi Letter @KobeissiLetter
The AI investment cycle is only accelerating:

Global data center CapEx driven by AI is projected to reach $5.2 trillion by 2030, according to McKinsey. IT equipment would represent ~$3.3 trillion of that total, followed by data center infrastructure at ~$1.6 trillion and power generation at ~$300 billion.

This assumes 125 incremental gigawatts of new AI data center capacity added between 2025 and 2030, requiring as much electricity as ~125 nuclear reactors to power. In an accelerated demand scenario, total CapEx could rise to $7.9 trillion, with 205 incremental gigawatts of capacity added. A constrained scenario would require $3.7 trillion, with 78 incremental gigawatts added.

The investment is expected to be driven by mass adoption of generative AI, enterprise integration across industries, competition between mega-cap tech and other firms, and governments investing heavily in AI infrastructure.

The AI buildout is set to reach unprecedented scale.
135 replies · 264 reposts · 1.4K likes · 234.6K views
Tail Risk @_tailrisk
All roads lead to $AMZN
Jukan @jukan05

The most interesting comment I read in today's report:

All of Anthropic's unusual moves around the GPT-5.5 launch ultimately converge on a single conclusion: in order to secure compute, Anthropic must bind itself far more deeply — and far more dependently — to those who possess these physical resources. This conclusion was crystallized in the $100B deal formalized a few days ago. The most representative example is the deal with Amazon. The key terms are as follows:

- Amazon will inject an additional $5B into Anthropic immediately, with up to $20B more in follow-on investment to come. Combined with the existing $8B, cumulative investment reaches up to $33B.
- Up to 5GW of compute capacity secured, of which roughly 1GW is scheduled to come online by year-end 2026.
- Anthropic has committed to spending $100B+ on AWS technology over the next 10 years (including usage requirements for AWS silicon such as Trainium and Graviton).
- Purchase rights secured for Trainium2/3/4 and future Trainium generations. Trainium2 is currently sold out, and Trainium3 is largely booked as well.
- The Claude Platform will be offered natively on AWS.

Just how large is the 5GW Anthropic has committed to? It is equivalent to roughly five nuclear power plants. Given that Microsoft's total global data center footprint in 2024 is estimated at around 5–6GW, this means Anthropic alone is locking in incremental capacity — for AI training and serving — that rivals the entirety of MSFT's historical physical infrastructure.

Coupled with Anthropic's announcement of having reached $30B ARR, the market is reading this deal as "Anthropic pre-signing Amazon's invoice in order to keep its growth going."

It is also worth noting Dario Amodei's (Anthropic CEO) remarks accompanying the announcement: "Users tell us that Claude is becoming increasingly essential to how they work. We need to build infrastructure to keep up with rapidly growing demand." Last month's blog post from Anthropic — directly admitting that "surging enterprise/developer demand is placing unavoidable strain on our infrastructure, affecting reliability and performance" — was effectively a teaser for this deal.

What matters is that the structure of this deal is more favorable to Amazon than to Anthropic. Amazon has already invested in OpenAI as well (up to $50B). In other words, the more fiercely OpenAI and Anthropic compete to eat each other's lunch, the more Amazon benefits simultaneously along three axes: cloud usage fees from both, adoption rates for its in-house silicon (Trainium as XPU, Graviton as CPU), and visibility on the recovery of data center CapEx.

This is structurally almost identical to the valuation premium Google historically enjoyed under the "full-stack player" framing. The market, however, has not yet fully come around to recognizing Amazon in this light. (Sentiment has admittedly improved compared to two weeks ago.) Amazon is still trading near a 10-year low on CY26 EV/EBITDA. But as the rivalry between the two AI labs escalates and Amazon begins to collect its "toll-gate revenue," we believe the market is likely to gradually move toward a re-rating.

(Excerpt from Mirae Asset Securities' AI Hot Issue report, dated April 24, 2026)

$AMZN

0 replies · 0 reposts · 0 likes · 15 views
Tail Risk @_tailrisk
Man, $INTC trading at a 57 forward P/E is nuts… A little too much for my liking. I like the narrative, but getting in now doesn't have the same risk/reward.
0 replies · 0 reposts · 0 likes · 35 views
Tail Risk @_tailrisk
Tan said the CPU-to-GPU ratio is shifting: "it used to be 1:8, and now it's 1:4." $INTC
0 replies · 0 reposts · 0 likes · 34 views
Tail Risk retweeted
Oguz Erkan @oguzerkan
$AMZN CEO: "If you are building a big inference business and want decent margins, not having your own custom silicon is a structural disadvantage."

Customers don't just want better performance, they want better price/performance. Currently, we can't deploy or use AI as expansively as we want because inference is still expensive. AI workloads run mostly on expensive $NVDA chips.

The whole industry will eventually move away from $NVDA to better price/performance alternatives. This will accelerate when gains from training plateau and most workloads shift to inference.

$AMZN is well positioned to offer the best price/performance, as it uses its custom CPUs and XPUs at scale, and the largest AI labs, Anthropic and OpenAI, are already its customers. Cloud growth is already expected to exceed 30% this year, proving that the strategy works.

The market is not bullish enough on $AMZN.
29 replies · 43 reposts · 470 likes · 102.8K views
Tail Risk retweeted
Patrick OShaughnessy @patrick_oshag
Every conversation I have with @dylan522p, I'm really just trying to understand the supply and demand of tokens. This episode is unique in that it's entirely dedicated to both sides of that equation.

We discuss:
- The infinite demand for the newest models
- @SemiAnalysis_ going from $10K on AI spend to $7M
- Mythos and Anthropic's compute problem
- Why TSMC spending $100B on CapEx could cause a shortage
- Robotics as the next demand wave
- Why memory prices will double again

This is my second conversation with Dylan, and I find myself needing to speak with him more and more often to make sense of it all. Enjoy!

Timestamps:
0:00 Intro
1:00 Surging AI Spend
10:27 Token Demand
16:21 When Ideas Are Cheap and Execution is Easy
20:46 Model Hoarding
22:34 Robotics
27:03 The Compute Bottleneck
30:26 The AI Permanent Underclass
31:39 Supply Chain Reality
37:47 CPUs
42:54 Predictions: Public Backlash
34 replies · 117 reposts · 1.1K likes · 844.3K views
Tail Risk retweeted
Andy @andyyy
Wow, KelpDAO comes out and says:

> 2 of LayerZero's RPCs were hacked
> it was a LayerZero internal compromise that led to the exploit
> they took fast action to prevent another $75M vulnerability
> the 1/1 DVN was the suggested setup from LayerZero, and even after they asked further about it during the transition to L2s, it was kept the same
> blames LZ for the setup

My goodness. Absolutely no one taking any responsibility, and still no real detail on the loss socialization for Aave users. I think we are all underestimating how long the WETH & stablecoin pools may be frozen.
Kelp @KelpDAO

x.com/i/article/2046…

43 replies · 46 reposts · 508 likes · 69.4K views
Tail Risk @_tailrisk
@MikeFritzell Why do you say it's a bull market? Just from the tick up in prices? Just curious to get your thoughts.
1 reply · 0 reposts · 0 likes · 21 views
Tail Risk retweeted
Fishy Catfish @CatfishFishy
I'm dropping a thread of all the protocols that had to freeze their interop because of LayerZero being compromised. Let's go:
133 replies · 307 reposts · 1.6K likes · 324.4K views
Tail Risk retweeted
Omar (mainnet arc) @acceleratooooor
Here's how to triage:

1. Go to admin.google.com
2. Security → Access and data control → API controls → App access control → Manage Third-Party App Access
3. Search for client ID: 110671459871-30f1spbu0hptbs60cb4vsmv79i7bbvqj; if found → revoke / block

(A programmatic sweep of the same check is sketched below the quoted bulletin.)
Vercel @vercel

Our investigation has revealed that the incident originated from a third-party AI tool with hundreds of users whose Google Workspace OAuth app was compromised. We recommend that Google Workspace Administrators check for usage of this app immediately. vercel.com/kb/bulletin/ve…
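For admins who want to sweep an entire domain rather than click through the console, here is a minimal sketch using the Google Admin SDK Directory API's tokens resource (tokens.list / tokens.delete), assuming google-api-python-client and an admin credential already authorized with the admin.directory.user and admin.directory.user.security scopes. Everything beyond the client ID quoted above is an illustrative assumption, not part of the original posts.

```python
# Sketch: sweep a Google Workspace domain for OAuth grants to the
# compromised client and revoke them. Assumes admin credentials with the
# admin.directory.user (read) and admin.directory.user.security scopes.
from googleapiclient.discovery import build

# Client ID quoted from the post above.
BAD_CLIENT_ID = "110671459871-30f1spbu0hptbs60cb4vsmv79i7bbvqj"

def revoke_compromised_grants(creds):
    service = build("admin", "directory_v1", credentials=creds)
    page_token = None
    while True:
        # Page through every user in the domain ("my_customer" = own domain).
        resp = service.users().list(
            customer="my_customer", maxResults=500, pageToken=page_token
        ).execute()
        for user in resp.get("users", []):
            email = user["primaryEmail"]
            # List the third-party OAuth grants this user has issued.
            tokens = service.tokens().list(userKey=email).execute()
            for tok in tokens.get("items", []):
                if tok.get("clientId") == BAD_CLIENT_ID:
                    # Revoke this user's grant to the compromised app.
                    service.tokens().delete(
                        userKey=email, clientId=BAD_CLIENT_ID
                    ).execute()
                    print(f"revoked grant for {email}")
        page_token = resp.get("nextPageToken")
        if not page_token:
            break
```

Note that revoking the token only kills the app's ongoing access; blocking the app under App access control (step 2 above) is still needed to stop users from re-granting it.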

31 replies · 285 reposts · 2.3K likes · 563.1K views
Tail Risk retweeted
Ben Pouladian @benitoz
The chip is dead. Long live the factory.

NVIDIA no longer ships a GPU. It ships a factory that gets better after it leaves the loading dock.

2.7x more productive in 6 months. On the same silicon. Without changing the hardware.

Wrote up the full thesis $NVDA bepresearch.substack.com/p/the-chip-is-…
12 replies · 13 reposts · 139 likes · 19.8K views
Tail Risk retweeted
LADE BACKK @LadeBackk
The Strait of Hormuz is open M-F 9:30-4pm ET.
518 replies · 3.4K reposts · 28.8K likes · 1M views
Tail Risk @_tailrisk
Clearly he hasn't experienced an exploit or been hacked yet… Too green.

Making money in crypto was never hard. It's keeping it lol
Game @game_for_one

Listened to a pretty interesting podcast; the guest is a mid-frequency trader describing all the stupidity in the market and the edges he gets from it. His core argument is simple: crypto has the worst counterparties in the world, by design.

In equities, humanity's full effort goes into correct pricing. Best math kids, best unis, best training, million $ salaries, multi-million bonuses. Competing against that as a trend follower gets you Sharpe 0.2 on a good day. Barely worth doing. And in crypto you're choosing between the XTX autist from Holland or the guy with an ape pfp in a boomer Facebook group who thinks Bitcoin replaces fiat. His words: "There's no second-worst counterparties than crypto."

Then 3 structural reasons the dumb money stays dumb:

1) Sticky capital. Money comes in, goes up 20-30%, and now you've got a bunch of guys on house money playing loose at the casino. That money sloshes around within crypto but almost never leaves. Poll 100 friends who are in crypto and ask how many have an off-ramp plan. The vast majority don't.

2) Siloed capital within chains. Once you're in the Phantom wallet on Solana, you're not bridging back to MetaMask and paying ETH fees. That capital is trapped in the ecosystem, sloshing between whatever horses are running, creating massive reflexive swings.

3) Price-insensitive buyers and sellers on both sides. Bitcoin cultists buying at $120k because today is always the best day: that's your edge on the long side. VCs who got in at a $5m valuation and are sitting on a $400m coin, slowly bleeding exit liquidity into thin markets for 90 days: that's your edge on the short side. North Korea, having just hacked a bridge and needing to sell before anyone freezes the funds, doesn't care what price they get. Short that too.

Now his edges. Just simple stuff I think most of us know here, but the majority probably doesn't execute on it well or systematically.

- Top 20 momentum: Buy anything in the top 20 by market cap within 5 days of it making a 20-day high. Sell when it goes 5 days without a new 20-day high. Equal weighted. Sharpe 1.3 through bull and bear. (A code sketch of this rule follows below.)

- Stack three things together: that trend system, plus cross-sectional momentum (rank everything, long the top 50%, short the bottom 50%, market neutral), plus carry (long the highest funding-rate coins, short the most expensive to hold). Equal weighted. Daily execution from a spreadsheet. Comes out around Sharpe 2.

- Volume predicts price (the volume-attention-price loop): Rank all coins by volume after stripping market noise. Long increasing volume, short decreasing volume. "Well over Sharpe 2." A known, reflexive, statistically provable effect: higher volume predicts higher prices, lower volume predicts lower prices. Sounds like nothing.

- Short small caps that pump: Momentum works on large caps but flips negative by the third or fourth decile. When something in the bottom 20% of Binance perps makes a 20-day high, short it. Strong edge because it's the market-maker Dubai pump-and-dump lifecycle playing out mechanically every single time.

- New Binance listing short: The market-maker contract is 90 days, and the strike price is set off the 7-day VWAP after launch. From day 7, delta hedging mechanically pushes the coin down. Short it for 90 days. Every time. The edge comes entirely from understanding how the game is structured, not from any signal at all.

Another case of simplicity winning. Will drop the podcast in the replies, worth listening.
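To make the first rule concrete, here is a minimal pandas sketch of the "top 20 momentum" system as described above. The `prices` DataFrame, its layout (daily closes, one column per coin, already filtered to the top 20 by market cap), and the fee-free daily rebalancing are all illustrative assumptions, not anything specified in the podcast.

```python
# Minimal sketch of the "top 20 momentum" rule, assuming `prices` is a
# DataFrame of daily closes with one column per top-20 coin by market cap.
import pandas as pd

def top20_momentum_weights(prices: pd.DataFrame) -> pd.DataFrame:
    # A coin makes a new 20-day high when today's close equals the
    # rolling 20-day maximum (window includes today).
    new_high = (prices >= prices.rolling(20).max()).astype(float)

    # Hold while a 20-day high occurred within the last 5 days; this encodes
    # both the entry (buy within 5 days of a high) and the exit (sell after
    # 5 days without a new high) in one rolling window.
    holding = new_high.rolling(5).max() > 0

    # Equal-weight whatever is currently held; all-cash days stay at zero.
    positions = holding.astype(float)
    return positions.div(positions.sum(axis=1), axis=0).fillna(0.0)

# Usage (ignoring fees and slippage); weights are lagged one day so the
# signal trades on the next day's return, avoiding lookahead:
# weights = top20_momentum_weights(prices)
# strat_returns = (weights.shift(1) * prices.pct_change()).sum(axis=1)
```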

0 replies · 0 reposts · 0 likes · 43 views
Tail Risk retweeted
Andrew Feldman @andrewdfeldman
How does memory determine inference speed? What are the types of memory? Why did Nvidia buy SRAM-based Groq?

What do AI processors do?

Processors do three things:
1. They calculate results (Core).
2. They store the results (Memory).
3. They move results to where they are needed (I/O, communication).

What determines inference speed for a processor?

Understanding speed is just understanding the bottleneck in each architecture. It's a sort of engineering "Where's Waldo." The calculation in AI inference is trivial: it's mostly multiplication and addition, and processors in general are very good at multiplication and addition. Calculation is not the bottleneck. In fact, graphics processing units, GPUs, are so fast at calculation that they are often sitting idle waiting for the data required for the next calculation. This is an important hint. The reason GPUs can't do fast inference is that the GPU cores can't get data fast enough from memory. This isn't the core's problem. It's the memory's problem.

What are the types of memory?

There are two types of memory: memory that can store a lot but is slow at moving data, and memory that is fast but can't store very much. The former is called DRAM (or HBM) and the latter is SRAM.

GPUs use DRAM. Why? Because graphics was the perfect use case for DRAM. In graphics, data moves from memory to compute occasionally, and the calculation phase takes a long time. So it didn't really matter that the movement of data was slow; the calculations took so long that the time to move data got lost in the wash.

But AI inference has the exact opposite characteristic. The compute time is small, while data moves constantly from memory to compute. In AI inference, the time it takes to move data from memory to compute is the lion's share of the total time. And this is exactly where DRAM is slow.

Let's see DRAM/HBM in action. To generate one word in an answer, all of the weight data needs to move from memory to compute. And the weights are very large. Like hundreds of HD movies' worth of data. Here is the kicker: the process is serial. For the next word, all the weight data needs to be moved from memory to compute again. And again and again. For every single word in the answer, hundreds of HD movies' worth of data need to move from memory to compute. Because DRAM/HBM is slow, moving data is time consuming. While the DRAM/HBM is moving data, the GPU is waiting. It is sitting idle. Pulling power. Doing no work. This is why GPUs can't do fast inference.

SRAM, on the other hand, moves data extremely quickly. It is good exactly where DRAM/HBM is bad: it can deliver data extremely quickly to compute. This is why all high-speed inference solutions use SRAM. But... and there is always a but. SRAM can't store much, and the weight data is very large and needs to be stored near the processing core. There are two options...

What are the 2 options to solve DRAM/HBM's weakness?

If you want to use SRAM you really have two choices:
1. Use hundreds or thousands of little chips, each with a little bit of SRAM and a little bit of compute. This was Groq's strategy.
2. Use very big chips. This was Cerebras's strategy.

If you build an SRAM solution with little chips, you need to break up the weight data and put it in the SRAM of thousands of little chips. You divide and conquer. But then you need to link the chips together so they can communicate. And this is where I/O and communication rear their heads, because chip-to-chip communication is slow. Thousands of little chips communicating over cables and through switches is slower and more power hungry than if all that traffic stayed on a big chip, or even several big chips. Communication between chips is slow, and communication on-chip is fast. So some of the gain won by using SRAM is lost as the hundreds or thousands of little chips communicate with each other, hop by hop by hop.

This is why wafer scale was the holy grail. It provides the benefits of SRAM while mitigating SRAM's weaknesses. By building a chip the size of a dinner plate, a chip that is 58x larger than the largest GPU, the designers could stuff it to the gills with SRAM. One can't make SRAM store more data per square millimeter, but one can provide more square millimeters by building a bigger chip. And by building a bigger chip, one avoids nearly all of the cost (in time and power) of communicating across chips.

This is what we achieved at @cerebras, why we are faster than Groq, and why we chose to build a wafer-scale processor. And while we still sometimes need to communicate across chips, it is only across 4 or 8 or 12 chips rather than hundreds or thousands. The result is that Cerebras is the fastest AI processor in production.
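To put rough numbers on the argument that generation speed is bounded by how fast the weights stream out of memory, here is a back-of-the-envelope sketch. The model size, precision, and bandwidth figures are illustrative assumptions, not numbers from the post.

```python
# Back-of-the-envelope: decode speed when inference is memory-bandwidth bound.
# Every figure below is an illustrative assumption, not a number from the post.

params = 70e9           # a 70B-parameter dense model (assumed)
bytes_per_param = 2     # FP16/BF16 weights
weight_bytes = params * bytes_per_param      # ~140 GB of weights

# Each generated token streams essentially all weights from memory to compute,
# so when memory-bound: tokens/sec ~= memory bandwidth / weight bytes.
memories = {
    "HBM (~3.35 TB/s, roughly one H100 stack)": 3.35e12,
    "aggregate on-chip SRAM (~100 TB/s, order of magnitude)": 100e12,
}
for name, bandwidth in memories.items():
    print(f"{name}: ~{bandwidth / weight_bytes:.0f} tokens/sec")
# Same arithmetic, faster memory: ~24 tokens/sec vs ~714 tokens/sec.
```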
15 replies · 24 reposts · 170 likes · 17.9K views