Neel
134 posts

Neel
@neelkon
quant @ p72, berkeley eecs, all opinions my llm's
Joined October 2021
128 Following · 68 Followers

as @modal has scaled from 15 people to over 100 i have been repeatedly instructed that i need to "give away my LEGOs"


To actually benefit from prefix caching in a multi-GPU setup, the next turn has to land on the worker that already holds the cached prefix. Otherwise you miss the local KV cache, recompute the repeated prompt from scratch, and only then cache it redundantly on another worker.


Hamza Elshafie@hamzaelshafie
Visual walkthrough of prefix caching in vLLM on a multi-turn chat example for lower TTFT.
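The routing point above can be illustrated with a minimal sketch of prefix-aware routing. This is not vLLM's API: `PrefixAwareRouter`, `block_hashes`, and `BLOCK_SIZE` are hypothetical names, the router's view of each worker's cache is an assumption (a real deployment would need workers to report it), and cache eviction is ignored.

```python
import hashlib
from typing import Dict, List, Sequence, Set

BLOCK_SIZE = 4  # tokens per cache block; hypothetical value chosen for illustration


def block_hashes(tokens: Sequence[int], block_size: int = BLOCK_SIZE) -> List[str]:
    """Chain-hash each full block so a hash identifies the entire prefix up to it."""
    hashes: List[str] = []
    prev = ""
    full_len = len(tokens) - len(tokens) % block_size  # ignore the partial tail block
    for start in range(0, full_len, block_size):
        block = tokens[start:start + block_size]
        payload = prev + "|" + ",".join(map(str, block))
        prev = hashlib.sha256(payload.encode()).hexdigest()
        hashes.append(prev)
    return hashes


class PrefixAwareRouter:
    """Toy router: send each request to the worker holding the longest cached prefix."""

    def __init__(self, num_workers: int) -> None:
        # Assumed bookkeeping: which block hashes each worker has cached, and its load.
        self.cached: Dict[int, Set[str]] = {w: set() for w in range(num_workers)}
        self.load: Dict[int, int] = {w: 0 for w in range(num_workers)}

    def matched_blocks(self, worker: int, hashes: Sequence[str]) -> int:
        """Count leading blocks of the request already cached on this worker."""
        count = 0
        for h in hashes:
            if h not in self.cached[worker]:
                break
            count += 1
        return count

    def route(self, tokens: Sequence[int]) -> int:
        hashes = block_hashes(tokens)
        # Prefer the longest cached prefix, then the least-loaded worker, then lowest id.
        best = max(
            self.cached,
            key=lambda w: (self.matched_blocks(w, hashes), -self.load[w], -w),
        )
        self.cached[best].update(hashes)  # the chosen worker now caches this prefix
        self.load[best] += 1
        return best


if __name__ == "__main__":
    router = PrefixAwareRouter(num_workers=2)
    turn1 = list(range(8))
    first = router.route(turn1)
    router.route(list(range(100, 108)))  # unrelated chat goes to the idle worker
    turn2 = turn1 + [50, 51, 52, 53]     # next turn repeats the earlier prompt
    print(first, router.route(turn2))    # both turns land on the same worker
```

A plain round-robin balancer would send the second turn to whichever worker is next in line, which is the cache miss described above; keying the decision on the chained block hashes keeps a conversation on the worker that already holds its prefix.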
Neel reposted

Ex-Point72 Proprietary Research Head Kirk McKeown on building edge, alpha decay, & why everything that happened on Wall Street is about to happen on Main Street.
Kirk McKeown (8.5 years @ Point72 under Steve Cohen | Built primary research at Glenview under Larry Robbins | Now founder of Carbon Arc @CarbonArcAI)
"Alpha rewards those who value assets in a cold way. You want to get it right — not be right."
We cover:
- How alpha creation differs across multi-manager vs. concentrated shops
- The 3 vectors every middle office function must move to justify its existence
- Why he worked 6-hour Sundays from 2006-2020 — and the math behind it
- The TSMC call that signaled semiconductor cancellations before anyone else knew
- What the quant revolution on Wall Street tells us about the AI economy today
- His framework: 4 market structures, 9 business models, & why they have rules
- The MIT beer game & why every business problem is really an inventory problem
- His hot take: a top hedge fund launches an enterprise AI lab in 2026
Highlights:
00:00 Intro
04:47 Tutor vs Glenview vs Point72: how edge differs
12:29 How to build “lift” for PMs: at-bats, hit-rate, sizing
18:44 Building research edge: outwork, read, fieldwork
27:16 Personal moat in 2026: analogs, history, decision trees
40:08 “Main Street becomes Wall Street”: what that actually means
44:30 Carbon Arc thesis: “decimalization” of data market structure
46:43 Why the edge migrates to data plus domain context
51:00 How to win in commoditized research: sample size beats anecdotes
01:03:26 Factorizing everything: themes, market structure, business models
01:08:37 Pruning decision trees: signals, scale points, inventory dynamics
01:14:18 Contrarian 2026 take: hedge funds launching enterprise AI labs
01:23:32 Final question: one habit to build career alpha
Neel reposted

Let me get this straight:
Anthropic refused to work with DoW unless they could promise their tech wasn't used for surveillance or killing.
DoW said that they need full capabilities.
Anthropic declined to give full access.
OpenAI stood by Anthropic for ensuring AI safety.
Trump then cancelled all Anthropic usage across the government, including a $200m contract.
OpenAI then submits a bid to replace Anthropic.
Neel reposted

Peter Thiel just told Silicon Valley it’s automating away its own cognitive moat.
Nobody there is paying attention.
Thiel: “It is striking to me how bad Silicon Valley is at talking about these sorts of things.”
The industry is either arguing over 20% improvements in the next transformer model or jumping straight to simulation theory.
They’re missing the massive real-world shift happening right in the middle.
Thiel: “My intuition would be it’s going to be quite the opposite, where it seems much worse for the math people than the word people.”
For decades, Silicon Valley worshipped quantitative intelligence. Math and coding were the ultimate safety nets.
Thiel: “Within three to five years, the AI models will be able to solve all the US Math Olympiad problems.”
Once a machine instantly solves the hardest math problems on earth, the economic value of being a human calculator doesn’t just decline.
It disappears.
And the historical irony is brutal.
The societal bias toward math over verbal ability started during the French Revolution. Not because math was more valuable. Because verbal ability ran in aristocratic families, and math was elevated as the great equalizer to break nepotism.
A 200-year-old political accident became the foundation of Silicon Valley’s entire hiring philosophy.
AI is about to snap it back.
The people who built the models that can now outperform them mathematically spent their careers optimizing for the wrong skill.
The future belongs to the word people.
The engineers didn’t see it coming because they were too busy calculating.

What can half of GPT-1 do? We trained a 42M transformer called SONIC to control the body of a humanoid robot. It takes a remarkable amount of subconscious processing for us humans to squat, turn, crawl, sprint. SONIC captures this "System 1" - the fast, reactive whole-body intelligence - in a single model that translates any motion command into stable, natural motor signals. And it's all open-source!!
The key insight: motion tracking is the one, true scalable task for whole body control. Instead of hand-engineering rewards for every new skill, we use dense, frame-by-frame supervision from human mocap data. The data itself encodes the reward function: "configure your limbs in any human-like position while maintaining balance".
We scaled humanoid motion RL to an unprecedented scale: 100M+ mocap frames and 500,000+ parallel robots across 128 GPUs. NVIDIA Isaac Lab allows us to accelerate physics at 10,000x faster tick, giving robots many years of virtual experience in only hours of wall clock time. After 3 days of training, the neural net transfers zero-shot to the real G1 robot with no finetuning. 100% success rate across 50 diverse real-world motion sequences.
One SONIC policy supports all of the following:
- VR whole-body teleoperation
- Human video. Just point a webcam to live stream motions.
- Text prompts. "Walk sideways", "dance like a monkey", "kick your left foot", etc.
- Music audio. The robot dances to the beat, adapting to tempo and rhythm.
- VLA foundation models. We plugged in GR00T N1.5 and achieved 95% success on mobile tasks.
We open-source the code and model checkpoints!! Deep dive in thread:
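To make the "data itself encodes the reward" idea concrete, here is a minimal sketch of a dense, per-frame tracking reward. The exponential form, the `sigma` parameter, and the function name `tracking_reward` are assumptions borrowed from common motion-imitation practice; the post does not state which reward SONIC actually uses.

```python
import math
from typing import Sequence


def tracking_reward(
    joint_pos: Sequence[float],
    ref_joint_pos: Sequence[float],
    sigma: float = 0.5,  # assumed tolerance; smaller values penalize errors more sharply
) -> float:
    """Reward in (0, 1] that peaks when the robot pose matches the mocap frame."""
    if len(joint_pos) != len(ref_joint_pos):
        raise ValueError("pose and reference must have the same number of joints")
    # Squared error between the robot's joints and the reference frame's joints.
    sq_err = sum((q - r) ** 2 for q, r in zip(joint_pos, ref_joint_pos))
    return math.exp(-sq_err / (sigma ** 2))


if __name__ == "__main__":
    reference = [0.0, 0.4, -0.8]                       # one mocap frame (made-up values)
    print(tracking_reward(reference, reference))        # perfect tracking
    print(tracking_reward([0.1, 0.3, -0.6], reference))  # small deviation, lower reward
```

Because every mocap frame supplies its own target pose, the same function scores any motion clip, with no per-skill reward engineering.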

We wrote a book on inference and we’re giving away free copies today!

Philip Kiely@philipkiely
Inference Engineering launches today. baseten.com/inference-engi…

congrats to @ereborbank @PalmerLuckey on the fastest bank charter approval ever
the first stablecoin native bank incredibly bullish


@andrewchen Some friends @ Berkeley are building this, one is also founding Ramp's stablecoin team. Would love to intro :)









