Sanjay Pamnani

75 posts

@PamnaniSanjay

Learn/Decode. Act/Invest. Reflect/Reevaluate. Repeat.

New York, USA Joined March 2012
3.2K Following 264 Followers
Sanjay Pamnani @PamnaniSanjay
@BenBajarin Where you come out depends on your time horizon. In the short term, curtailing compute for internal developers hurts productivity. However, a large-scale build-out of your proprietary chip ecosystem will pay huge dividends in the long run (full-stack approach).
English
0
0
0
29
Pedro Domingos @pmddomingos
Give this man a Turing Award.
English
26
29
822
73K
Sanjay Pamnani retweeted
Gilfoyle @wangleineo
It has theoretical value - it proves recursive transformer blocks can do some deep reasoning. But I don't see the real-world use cases for such models. Accuracy-wise, it is worse than search algorithms (100% accuracy on solvable problems like sudoku), so LLMs are better off using these algorithms as tools, not HRM/TRM.
English
1
1
9
1.5K
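To make the "search algorithms" point in the post above concrete, here is a minimal sketch of plain backtracking search on sudoku. The post names no specific algorithm, so backtracking is an assumption chosen for illustration, and the puzzle is just a sample grid. For any solvable grid, this kind of exhaustive search finds a valid completion, which is the 100% accuracy the post refers to.

```python
# Sample puzzle (illustrative only); "." marks an empty cell.
PUZZLE = [
    "53..7....",
    "6..195...",
    ".98....6.",
    "8...6...3",
    "4..8.3..1",
    "7...2...6",
    ".6....28.",
    "...419..5",
    "....8..79",
]


def parse(rows):
    """Turn the string rows into a 9x9 grid of ints, with 0 for empty."""
    return [[0 if ch == "." else int(ch) for ch in row] for row in rows]


def candidates(grid, r, c):
    """Digits not yet used in the row, column, or 3x3 box of cell (r, c)."""
    used = set(grid[r]) | {grid[i][c] for i in range(9)}
    br, bc = 3 * (r // 3), 3 * (c // 3)
    used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return [d for d in range(1, 10) if d not in used]


def solve(grid):
    """Fill the grid in place; return True if a full valid solution exists."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in candidates(grid, r, c):
                    grid[r][c] = d
                    if solve(grid):
                        return True
                    grid[r][c] = 0  # undo and try the next digit
                return False  # no digit fits here: backtrack
    return True  # no empty cells left


grid = parse(PUZZLE)
solved = solve(grid)
for row in grid:
    print("".join(map(str, row)))
```

The trade-off the post hints at: this solver is exact but only knows sudoku, whereas a learned model is approximate but trained from examples.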
Sanjay Pamnani retweeted
Jackson Atkins @JacksonAtkinsX
My brain broke when I read this paper. A tiny 7 million parameter model just beat DeepSeek-R1, Gemini 2.5 Pro, and o3-mini at reasoning on both ARC-AGI 1 and ARC-AGI 2. It's called Tiny Recursive Model (TRM) from Samsung. How can a model 10,000x smaller be smarter? Here's how it works:
1. Draft an Initial Answer: Unlike an LLM that writes word-by-word, TRM first generates a quick, complete "draft" of the solution. Think of this as its first rough guess.
2. Create a "Scratchpad": It then creates a separate space for its internal thoughts, a latent reasoning "scratchpad." This is where the real magic happens.
3. Intensely Self-Critique: The model enters an intense inner loop. It compares its draft answer to the original problem and refines its reasoning on the scratchpad over and over (6 times in a row), asking itself, "Does my logic hold up? Where are the errors?"
4. Revise the Answer: After this focused "thinking," it uses the improved logic from its scratchpad to create a brand new, much better draft of the final answer.
5. Repeat until Confident: The entire process, draft, think, revise, is repeated up to 16 times. Each cycle pushes the model closer to a correct, logically sound solution.
Why this matters:
Business Leaders: This is what algorithmic advantage looks like. While competitors are paying massive inference costs for brute-force scale, a smarter, more efficient model can deliver superior performance for a tiny fraction of the cost.
Researchers: This is a major validation for neuro-symbolic ideas. The model's ability to recursively "think" before "acting" demonstrates that architecture, not just scale, can be a primary driver of reasoning ability.
Practitioners: SOTA reasoning is no longer gated behind billion-dollar GPU clusters. This paper provides a highly efficient, parameter-light blueprint for building specialized reasoners that can run anywhere.
This isn't just scaling down; it's a completely different, more deliberate way of solving problems.
English
341
2K
11.8K
2.2M
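The draft / think / revise loop described in the post above can be sketched as control flow. This is a toy under stated assumptions, not the paper's implementation: the functions `draft`, `think`, `revise`, and `confident` are hypothetical stand-ins for the learned network, and the "problem" here is just estimating a square root. Only the loop shape (6 scratchpad updates per cycle, up to 16 cycles) comes from the post.

```python
def refine(x, draft, think, revise, confident, inner_steps=6, max_cycles=16):
    """Nested refinement loop: update a latent scratchpad z several times,
    then use it to revise the answer y; stop early once confident."""
    y = draft(x)   # step 1: quick complete first guess
    z = 0.0        # step 2: latent "scratchpad", starts empty
    cycles = 0
    for cycles in range(1, max_cycles + 1):
        for _ in range(inner_steps):   # step 3: refine the scratchpad
            z = think(x, y, z)
        y = revise(y, z)               # step 4: produce a better answer
        if confident(x, y):            # step 5: repeat until confident
            break
    return y, cycles


# Hypothetical toy stand-ins: the task is "find y with y * y == x".
def draft(x):
    return 1.0


def think(x, y, z):
    # Move the scratchpad halfway toward a Newton-style correction.
    target = (x - y * y) / (2.0 * y)
    return 0.5 * z + 0.5 * target


def revise(y, z):
    return y + z


def confident(x, y):
    return abs(y * y - x) < 1e-6


answer, cycles = refine(2.0, draft, think, revise, confident)
print(answer, cycles)
```

In the real model the same small network would play the roles of `think` and `revise`; the toy only shows why repeated cheap passes can stand in for depth.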
Andrej Karpathy @karpathy
And an example of some of the summary metrics produced by the $100 speedrun in the report card to start. The current code base is a bit over 8000 lines, but I tried to keep them clean and well-commented. Now comes the fun part - of tuning and hillclimbing.
English
23
47
880
189.6K
Andrej Karpathy @karpathy
Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI. It weighs ~8,000 lines of imo quite clean code to:
- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use.
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Efficient inference the model in an Engine with KV cache, simple prefill/decode, tool use (Python interpreter in a lightweight sandbox), talk to it over CLI or ChatGPT-like WebUI.
- Write a single markdown report card, summarizing and gamifying the whole thing.
Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About ~12 hours surpasses GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc. My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed).
I think it also has potential to grow into a research harness, or a benchmark, similar to nanoGPT before it. It is by no means finished, tuned or optimized (actually I think there's likely quite a bit of low-hanging fruit), but I think it's at a place where the overall skeleton is ok enough that it can go up on GitHub where all the parts of it can be improved. Link to repo and a detailed walkthrough of the nanochat speedrun is in the reply.
English
688
3.4K
24.2K
5.8M
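The cost figures in the post above imply an hourly rate that is easy to back out. A small sketch of that arithmetic follows; the rates are inferred from the post's own numbers (~$100 for ~4 hours, ~$1000 for ~41.6 hours, 8 GPUs per node), not quoted cloud prices, and the 12-hour cost is an extrapolation the post does not state.

```python
GPUS_PER_NODE = 8  # the post's "8XH100 node"


def implied_rates(cost_usd, hours, gpus=GPUS_PER_NODE):
    """Return (dollars per node-hour, dollars per GPU-hour) implied by a run."""
    node_rate = cost_usd / hours
    return node_rate, node_rate / gpus


speedrun = implied_rates(100, 4)     # the ~$100 tier
larger = implied_rates(1000, 41.6)   # the ~$1000 tier

# Extrapolation (assumption): price the ~12 hour "surpasses GPT-2 CORE" run
# at the node rate implied by the $1000 tier.
twelve_hour_cost = 12 * larger[0]

print(f"speedrun: ${speedrun[0]:.2f}/node-hour, ${speedrun[1]:.3f}/GPU-hour")
print(f"larger:   ${larger[0]:.2f}/node-hour, ${larger[1]:.3f}/GPU-hour")
print(f"12 hours at the larger-run rate: about ${twelve_hour_cost:.0f}")
```

Both tiers land near $24 to $25 per node-hour, or roughly $3 per GPU-hour, so the "~$100" figure reads as a rounding of about 4 hours at that rate.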
Sanjay Pamnani @PamnaniSanjay
The AI Capex Disparity: Why GPU Obsolescence Creates a $700B Balance Sheet Risk for OpenAI
Google's Structural Moat: The $700B Difference Between Internal TPUs and the External Nvidia Tax
The AI infrastructure race is revealing a massive structural difference in financial risk between companies that use merchant silicon (OpenAI) and those that build their own (Google). Our revised model, using conservative estimates (40% of CapEx for compute, 5:1 TPU efficiency), shows the financial gap by 2030 is over $700 Billion.
THE COMPUTE SPEND GAP (Market Price vs. Internal Cost)
Year | OpenAI (GPU Market Price) | Google (TPU Internal Cost) | Annual Delta
2026 | $60 B | $12 B | $48 B
2030 | $320 B | $64 B | $256 B
CUMULATIVE | $780 B | $156 B | $704 B
Impending Balance Sheet Crisis (Depreciation)
The core issue is that high-end AI chips are not infrastructure in the traditional sense; they are rapidly depreciating assets.
Accelerated Obsolescence: While a data center building depreciates over 30 years, an Nvidia GPU can be functionally obsolete in 1-3 years due to generational jumps (H100 → Blackwell) and high failure rates under constant training load.
The Mismatch: OpenAI must account for this fast depreciation, rapidly writing down multi-billion dollar CapEx as an expense against their income statement. Google's internal cost is 5x lower, so their depreciation hit for the equivalent compute is also 5x smaller.
Structural Financial Gearing
The $704B delta is not just a cost difference; it's a difference in risk exposure and strategic flexibility:
OpenAI's Burden: To be on par with Google, OpenAI will need to generate $704B in revenue over five years just to compensate for the higher CapEx (without accounting for the company's higher ROI threshold as they are primarily VC-backed) and the impending depreciation of obsolete hardware. This creates extreme financial gearing on their business model.
Google's Moat: Google, by avoiding the Nvidia Tax, retains that $704 billion to subsidize AI services, fund next-gen R&D, or simply absorb a slower revenue ramp without catastrophic balance sheet consequences. In Short: OpenAI is betting heavily that rapid adoption of AI will lead to a revenue explosion large enough to outrun its massive, accelerating depreciation schedule. Google is simply building the future at cost. We note that while Google does and will continue to buy NVDA GPUs, those purchases are largely a function of demand from end customers of Google Cloud and are commensurate with customer workloads. So Google isn't spending a significant amount of its capex to purchase NVDA GPUs for the buildout of its core AI services. #AICapex #Nvidia #OpenAI #Google #TPUvsGPU #BalanceSheet #TechFinance
English
2
1
3
80
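The compute spend table in the post above can be checked against its own 5:1 assumption with a few lines of arithmetic. The sketch below uses only the figures quoted in the post (in billions of dollars); the helper name `check` is illustrative. The two annual rows reconcile, but the cumulative row as quoted does not: $780B minus $156B is $624B, not $704B. A $704B gap at 5:1 would instead correspond to $880B versus $176B. The post does not make clear which figure is the slip.

```python
# Figures quoted in the post, in $ billions: (OpenAI, Google, quoted delta).
ROWS = {
    "2026": (60, 12, 48),
    "2030": (320, 64, 256),
    "CUMULATIVE": (780, 156, 704),
}


def check(openai, google, quoted_delta):
    """Recompute the cost ratio and delta for one row of the table."""
    delta = openai - google
    return {
        "ratio": openai / google,
        "delta": delta,
        "matches": delta == quoted_delta,
    }


report = {label: check(*row) for label, row in ROWS.items()}

# If the $704B delta is the intended figure, a 5:1 cost ratio implies
# delta = X - X/5 = 4X/5, so X = 5 * delta / 4.
implied_openai = 5 * 704 / 4
implied_google = implied_openai / 5

for label, result in report.items():
    print(label, result)
print("implied by a $704B delta at 5:1:", implied_openai, implied_google)
```

Either reading leaves the post's qualitative claim intact; only the headline number moves between roughly $624B and $704B.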
Sanjay Pamnani retweeted
Andrej Karpathy @karpathy
Finally had a chance to listen through this pod with Sutton, which was interesting and amusing. As background, Sutton's "The Bitter Lesson" has become a bit of biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea is sufficiently "bitter lesson pilled" (meaning arranged so that it benefits from added computation for free) as a proxy for whether it's going to work or worth even pursuing. The underlying assumption being that LLMs are of course highly "bitter lesson pilled" indeed, just look at LLM scaling laws where if you put compute on the x-axis, number go up and to the right. So it's amusing to see that Sutton, the author of the post, is not so sure that LLMs are "bitter lesson pilled" at all. They are trained on giant datasets of fundamentally human data, which is both 1) human generated and 2) finite. What do you do when you run out? How do you prevent a human bias? So there you have it, bitter lesson pilled LLM researchers taken down by the author of the bitter lesson - rough! In some sense, Dwarkesh (who represents the LLM researchers viewpoint in the pod) and Sutton are slightly speaking past each other because Sutton has a very different architecture in mind and LLMs break a lot of its principles. He calls himself a "classicist" and evokes the original concept of Alan Turing of building a "child machine" - a system capable of learning through experience by dynamically interacting with the world. There's no giant pretraining stage of imitating internet webpages. There's also no supervised finetuning, which he points out is absent in the animal kingdom (it's a subtle point but Sutton is right in the strong sense: animals may of course observe demonstrations, but their actions are not directly forced/"teleoperated" by other animals). 
Another important note he makes is that even if you just treat pretraining as an initialization of a prior before you finetune with reinforcement learning, Sutton sees the approach as tainted with human bias and fundamentally off course, a bit like when AlphaZero (which has never seen human games of Go) beats AlphaGo (which initializes from them). In Sutton's world view, all there is is an interaction with a world via reinforcement learning, where the reward functions are partially environment specific, but also intrinsically motivated, e.g. "fun", "curiosity", and related to the quality of the prediction in your world model. And the agent is always learning at test time by default, it's not trained once and then deployed thereafter. Overall, Sutton is a lot more interested in what we have in common with the animal kingdom instead of what differentiates us. "If we understood a squirrel, we'd be almost done". As for my take... First, I should say that I think Sutton was a great guest for the pod and I like that the AI field maintains entropy of thought and that not everyone is exploiting the next local iteration of LLMs. AI has gone through too many discrete transitions of the dominant approach to lose that. And I also think that his criticism of LLMs as not bitter lesson pilled is not inadequate. Frontier LLMs are now highly complex artifacts with a lot of humanness involved at all the stages - the foundation (the pretraining data) is all human text, the finetuning data is human and curated, the reinforcement learning environment mixture is tuned by human engineers. We do not in fact have an actual, single, clean, actually bitter lesson pilled, "turn the crank" algorithm that you could unleash upon the world and see it learn automatically from experience alone. Does such an algorithm even exist? Finding it would of course be a huge AI breakthrough. Two "example proofs" are commonly offered to argue that such a thing is possible. 
The first example is the success of AlphaZero learning to play Go completely from scratch with no human supervision whatsoever. But the game of Go is clearly such a simple, closed, environment that it's difficult to see the analogous formulation in the messiness of reality. I love Go, but algorithmically and categorically, it is essentially a harder version of tic tac toe. The second example is that of animals, like squirrels. And here, personally, I am also quite hesitant whether it's appropriate because animals arise by a very different computational process and via different constraints than what we have practically available to us in the industry. Animal brains are nowhere near the blank slate they appear to be at birth. First, a lot of what is commonly attributed to "learning" is imo a lot more "maturation". And second, even that which clearly is "learning" and not maturation is a lot more "finetuning" on top of something clearly powerful and preexisting. Example. A baby zebra is born and within a few dozen minutes it can run around the savannah and follow its mother. This is a highly complex sensory-motor task and there is no way in my mind that this is achieved from scratch, tabula rasa. The brains of animals and the billions of parameters within have a powerful initialization encoded in the ATCGs of their DNA, trained via the "outer loop" optimization in the course of evolution. If the baby zebra spasmed its muscles around at random as a reinforcement learning policy would have you do at initialization, it wouldn't get very far at all. Similarly, our AIs now also have neural networks with billions of parameters. These parameters need their own rich, high information density supervision signal. We are not going to re-run evolution. But we do have mountains of internet documents. Yes it is basically supervised learning that is ~absent in the animal kingdom. 
But it is a way to practically gather enough soft constraints over billions of parameters, to try to get to a point where you're not starting from scratch. TLDR: Pretraining is our crappy evolution. It is one candidate solution to the cold start problem, to be followed later by finetuning on tasks that look more correct, e.g. within the reinforcement learning framework, as state of the art frontier LLM labs now do pervasively. I still think it is worth being inspired by animals. I think there are multiple powerful ideas that LLM agents are algorithmically missing that can still be adapted from animal intelligence. And I still think the bitter lesson is correct, but I see it more as something platonic to pursue, not necessarily to reach, in our real world and practically speaking. And I say both of these with double digit percent uncertainty and cheer the work of those who disagree, especially those a lot more ambitious bitter lesson wise. So that brings us to where we are. Stated plainly, today's frontier LLM research is not about building animals. It is about summoning ghosts. You can think of ghosts as a fundamentally different kind of point in the space of possible intelligences. They are muddled by humanity. Thoroughly engineered by it. They are these imperfect replicas, a kind of statistical distillation of humanity's documents with some sprinkle on top. They are not platonically bitter lesson pilled, but they are perhaps "practically" bitter lesson pilled, at least compared to a lot of what came before. It seems possible to me that over time, we can further finetune our ghosts more and more in the direction of animals; that it's not so much a fundamental incompatibility but a matter of initialization in the intelligence space. But it's also quite possible that they diverge even further and end up permanently different, un-animal-like, but still incredibly helpful and properly world-altering. It's possible that ghosts:animals :: planes:birds. 
Anyway, in summary, overall and actionably, I think this pod is solid "real talk" from Sutton to the frontier LLM researchers, who might be gear shifted a little too much in the exploit mode. Probably we are still not sufficiently bitter lesson pilled and there is a very good chance of more powerful ideas and paradigms, other than exhaustive benchbuilding and benchmaxxing. And animals might be a good source of inspiration. Intrinsic motivation, fun, curiosity, empowerment, multi-agent self-play, culture. Use your imagination.
Dwarkesh Patel @dwarkesh_sp

.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training phase - the agent just learns on-the-fly - like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete. I did my best to represent the view that LLMs will function as the foundation on which this experiential learning can happen. Some sparks flew. 0:00:00 – Are LLMs a dead-end? 0:13:51 – Do humans do imitation learning? 0:23:57 – The Era of Experience 0:34:25 – Current architectures generalize poorly out of distribution 0:42:17 – Surprises in the AI field 0:47:28 – Will The Bitter Lesson still apply after AGI? 0:54:35 – Succession to AI

English
414
1.2K
9.5K
2M
Sanjay Pamnani retweeted
Andy Constan @dampedspring
RIP Charlie Munger
English
19
29
630
57.4K
Sanjay Pamnani @PamnaniSanjay
@josephwang It's both a balance sheet & cash flow story. HH still have decent built-up savings, and low-interest mortgages have buttressed homeowner cash flows. Add the ability to WFH plus a strong labor mkt, and there are few motivated/forced sellers who would otherwise need to mark down the sale price.
English
1
0
1
588
Joseph Wang @josephwang
This is why homebuilders are soaring: Toll Brothers call notes that 35% of homes on the market are new construction. Historically that number is 10 to 15%. Resale inventory is very low as homeowners don't want to sell their homes and give up their low-rate mortgages.
English
35
54
547
237.7K
Sanjay Pamnani @PamnaniSanjay
@tarun_kalra Cloud stocks could buckle tomorrow because of the sharp slowdown AWS saw in April. ECI and PCE reports out in the am so could see a sharp move in rates if they come in hotter like some folks are expecting.
English
0
0
0
333
Tarun Kalra @tarun_kalra
April AWS slowdown mentioned on earnings call that would not be seen in the numbers
English
1
0
1
115
Tarun Kalra @tarun_kalra
AMZN opens at 107, closes at 110, pops to 123 AH, then sells to 109 & looks like it may want more down. Qs closed at 320, jumped to 323 AH and is now back at 320. The tomfoolery of it all... I mentioned earlier: all we have is a failed breakdown unless buyers take out swings
English
1
0
2
215
Sanjay Pamnani @PamnaniSanjay
@tarun_kalra Long time. Enjoy reading your views on market positioning and technicals.
English
0
0
1
28
Tarun Kalra @tarun_kalra
Flexibility, adaptability, cutting losing trades early, keeping open mind. Moment I let bias in, mkt sends me extra sized helping of humble pie - silly rabbit thinking you are smarter than us. Hear you. So I evolve. I am a maniacal student of processing mkt generated clues
English
1
0
1
171
Sanjay Pamnani @PamnaniSanjay
@saxena_puru What do you make of the strength from homebuilders? For now, the housing mkt still seems to be hanging in there.
English
5
0
0
307
Sanjay Pamnani @PamnaniSanjay
@maccabeecap @BodhiTreeCIO Most likely it's a preference by MMFs to buy 1-month T-bills ahead of a potential government default. twitter.com/dampedspring/s…
Andy Constan @dampedspring

The low yield of 1M bills is inspiring the FUD masters, particularly acolytes of @JeffSnider_AIP (who I respect fwiw), to pronounce a "collateral" shortage indicative of a "bank event". It's just normal debt ceiling drama prep. The implied debt ceiling date is what's priced. 1/n

English
0
0
0
83
JL @jakelevison
@BodhiTreeCIO What does it mean for T-bills to decline this rapidly? That the market is calling the Fed’s bluff and expecting a pivot?
English
2
0
0
37
Sanjay Pamnani @PamnaniSanjay
@BodhiTreeCIO @BernieSanders True that free market liberalization has done more to fight global poverty, but let's not forget that along the way we had the New Deal in the 1930s and the Great Society reforms of the 1960s. Point being that unfettered free markets would not have achieved the results that you cite.
English
0
0
1
126
Bernie Sanders @BernieSanders
There's something profoundly wrong in the global economy when 350 million people are "marching toward starvation" according to the United Nations, while the top 1% now own 5,500 yachts that are at least 100-feet long. We need a wealth tax to combat this grotesque inequality.
English
702
731
4.1K
580.8K
Sanjay Pamnani @PamnaniSanjay
@KrishnaMemani Economy not close to breaking - in fact the big unknown for Fed is what will it take to cool the economy so as to bring down demand & inflation. Labor mkt strong, UE close to multidecade lows. Consumer spending decent. Long and variable lags is all we have to go on for now
English
0
0
0
21
Krishna Memani @kkmaway
But the other assumption is that a broken economy can be fixed relatively quickly.... because....I don't know.... Didn't work out that way after GFC..I think in this go around, it is this assumption that will get tested....
English
3
0
6
1.3K
Krishna Memani @kkmaway
One thing is quite clear from March 2023... policy makers are far more concerned about breaking the banking system than they are about breaking the real economy... That MO is based on certain assumptions....whether those assumptions are correct or not, only time will tell
English
2
1
9
3.6K
Sanjay Pamnani @PamnaniSanjay
@TheStalwart Not sure if this data takes into account the massive growth in high-frequency trading & quant funds as well as HFs in general. If not, it's hard to make an apples-to-apples comparison for retail & institutional (MFs/pensions) investor behavior.
English
0
0
0
41
Joe Weisenthal @TheStalwart
Like, is everyone becoming a manic daytrader or is everyone just buying the whole stock market and ignoring what individual companies are doing?
English
24
0
50
20K
Joe Weisenthal @TheStalwart
I'm not sure what to make of this chart. But one thing I find interesting is how it seems to be, in a sense, in opposition to the other popular concern out there that there's so much passive investing going on, that nobody is putting in the work on individual security selection.
Alexis Ohanian 🗽 @alexisohanian

Did y’all know that in the 1960s and 70s, most shareholders would hodl a company’s stock for at least 3 years? Today, it's closer to 17 weeks! 🤯 Reminds me what we learned at @Reddit : In the short-run, any market is a voting machine. In the long-run, it is a weighing machine.

English
19
8
97
75.8K
Sanjay Pamnani retweeted
Katie Roof @Katie_Roof
What is the term for a startup that’s no longer a unicorn 🦄 but still exists? Horse? 🐴
English
365
76
1.2K
416.5K