Charlotte X.
@xia_char

11 posts

Scientist-turned-VC investor @fusionfundvc || AI/data infra, robotics, and healthcare || prev @Formlabs, @Stanford, @Umich || Opinions my own

SF Bay Area · Joined July 2023
207 Following · 48 Followers

Pinned Tweet
Charlotte X. @xia_char
Frontier labs such as @OpenAI have shifted to prioritizing large-scale #RL training, and they expect RL compute to significantly exceed what has been spent on pre-training compute to date. In this blog post, I take a deep dive into why RL has become one of their major focuses. I also share some thoughts on the 𝗶𝗻𝗳𝗿𝗮𝘀𝘁𝗿𝘂𝗰𝘁𝘂𝗿𝗲 𝘀𝗵𝗶𝗳𝘁 required to support modern RL training workloads, and why 𝗰𝗼𝗻𝘁𝗶𝗻𝘂𝗮𝗹 𝗹𝗲𝗮𝗿𝗻𝗶𝗻𝗴 and 𝘁𝗲𝘀𝘁-𝘁𝗶𝗺𝗲 𝘁𝗿𝗮𝗶𝗻𝗶𝗻𝗴 might be the key to solving long-tail and out-of-distribution problems. At @FusionFundVC, we back early-stage startups building next-generation RL infrastructure and tackling challenging problems that new RL advances can unlock. If you're building in this space, I'd love to hear from you! charlottexia.substack.com/p/scaling-rl-2… #reinforcementlearning #RLinfra #ContinualLearning @grok
0 replies · 0 reposts · 3 likes · 93 views
Charlotte X. reposted
Thore @buergelt_
Can frontier LLMs predict clinical trial outcomes? We went deep, and here's what we found! tl;dr:
- current frontier LLMs are weak biological alpha generators
- performance is driven by memorization
- the more recent the biology, the weaker the performance
blog.pheiron.com/p/llm-bench
3 replies · 3 reposts · 26 likes · 8.6K views
Lunjun Zhang @LunjunZhang
New work💡: "EMA Policy Gradient: Taming Reinforcement Learning for LLMs with EMA Anchor and Top-k KL" Two ideas, both minimal, both effective: 🚀 use a target network (EMA) for reference policy 🚀 Top-k KL that works like knowledge distillation but remains unbiased at any k
4 replies · 10 reposts · 91 likes · 4.3K views
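The "EMA anchor" idea in the tweet replaces the usual frozen reference policy in the KL penalty with an exponential moving average of the trained policy. A minimal sketch of that moving-average update, based on my reading of the tweet rather than the paper's code; the `decay` value and flat parameter list are illustrative assumptions (the top-k KL estimator is not sketched here):

```python
# EMA reference update: the anchor drifts slowly toward the current
# policy instead of staying frozen at an old checkpoint.
def ema_update(ref_params, policy_params, decay=0.99):
    """Move each reference parameter a (1 - decay) step toward the policy."""
    return [decay * r + (1.0 - decay) * p
            for r, p in zip(ref_params, policy_params)]

policy = [1.0, -2.0]   # stand-in for trained policy weights
ref = [0.0, 0.0]       # stand-in for the EMA anchor
for _ in range(3):     # after each policy optimization step
    ref = ema_update(ref, policy)
```

The intended effect is that the KL term pulls the policy toward a recent, slowly moving anchor rather than an increasingly stale snapshot, which is the "taming" the title refers to.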
Tom Yeh @ProfTomYeh
I wrote a short story to explain to my students the evolution of PPO, DPO, GRPO, to GDPO (NVIDIA's new paper). 👇 This story is based on my own personal RL journey to become the family chef. 🍳 (when my wife was my girlfriend)

𝗣𝗣𝗢
I wanted to cook a new dish for our next date. I hired an expert (my buddy) to try it and tell me whether it might be good. If not, I tweaked my cooking method and tried again. This loop repeated. My girlfriend (the user) was not involved during training time, until the date (aka inference time).

𝗗𝗣𝗢
I stopped cooking new dishes for a moment. Instead, I went back through all my past meals — successes and disasters. I replayed those scenarios in my head and asked: "What would I have done differently?" I updated my 𝘥𝘦𝘤𝘪𝘴𝘪𝘰𝘯-𝘮𝘢𝘬𝘪𝘯𝘨, not by trying new dishes, but by learning from comparisons. (we got married)

𝗚𝗥𝗣𝗢
I was out of past examples, and I didn't want to hire an expert anymore (I no longer lived with my buddies). So I cooked a bunch of dishes at once and simply watched: which one did my wife eat more than the others? No expert judgment — just relative preference from outcomes. (now we have 2 kids)

𝗚𝗗𝗣𝗢
Then life got more interesting. Now we have two kids. That's two additional reward functions. Do I combine all their preferences into a single objective? Do I keep them separate? How do I optimize without one child dominating the signal? This is where multi-reward alignment starts to matter.

#ppo #dpo #grpo #gdpo #aibyhand
21 replies · 111 reposts · 715 likes · 39.7K views
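The GRPO step of the story ("cook a bunch of dishes at once, watch which one is preferred") maps to group-relative advantages: sample a group of completions for one prompt, score them, and use each reward's deviation from the group mean, normalized by the group's standard deviation, as the advantage. A minimal sketch with made-up reward numbers:

```python
# Group-relative advantages, the core of GRPO: no learned critic,
# just comparison against the other samples in the same group.
def grpo_advantages(rewards):
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5 or 1.0   # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

# Four "dishes": the two that were eaten (reward 1.0) get positive
# advantage, the two that weren't (reward 0.0) get negative.
adv = grpo_advantages([1.0, 0.0, 0.0, 1.0])
```

Each completion's log-probabilities are then reweighted by its advantage, so preference is purely relative to the other outcomes in the group, just as in the story.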
Charlotte X. reposted
Olivia Moore @omooretweets
Crazy to see how far AI video has come in the last few months. The handling of shadows and light here feels like a new standard 👇 (from @endlesstaverns, made with Runway Gen-3 + Suno)
14 replies · 117 reposts · 620 likes · 251K views
Charlotte X. @xia_char
As we all see the surging demand for GPUs, I've been taking some time to reflect on the current market and technology trends in the AI infrastructure space. I'd like to share a few interesting observations:

- In the era of #Transformers, the growth rate of #training compute demand significantly exceeds the extrapolation of Moore's Law.
- The two distinct phases of LLM inference can lead to potential overspending on GPUs due to the #memory bandwidth bottleneck.
- AI #inference compute characteristics show interesting heterogeneity, as seen with #AlphaFold in computational biology.
- AI demands a transformation in #datacenter networks and #edge computing.

A more detailed thought piece here: substack.com/home/post/p-14… If you're interested in brainstorming ideas and discussing startups in this space, feel free to reach out to me. I am at #CVPR2024 this week! #GenAI #infra
0 replies · 0 reposts · 0 likes · 197 views
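The memory-bandwidth point above has a simple back-of-envelope form: in the decode phase, generating each token reads all model weights from memory once, so single-stream throughput is roughly bandwidth divided by weight bytes, regardless of how much compute the GPU has. A sketch under illustrative assumptions (H100-class ~3.35 TB/s HBM, a 70B-parameter model at 2 bytes per weight, batch size 1, ignoring KV-cache traffic):

```python
# Decode is memory-bound: tokens/s per GPU is capped near
# (memory bandwidth) / (bytes of weights read per token).
def decode_tokens_per_s(bandwidth_bytes_per_s, n_params, bytes_per_param=2):
    return bandwidth_bytes_per_s / (n_params * bytes_per_param)

tps = decode_tokens_per_s(3.35e12, 70e9)   # roughly ~24 tokens/s
```

Prefill, by contrast, processes the whole prompt in parallel and is compute-bound, which is why sizing GPU fleets to one phase can mean overspending on the other.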
Charlotte X. reposted
Yangqing Jia @jiayq
"Are LLM APIs losing money?" A very interesting question. Here's my unpopular opinion, and I look forward to hearing yours.

=========
1. If you are leading in high-workload benchmarks, congrats, you are burning VC money.
=========

LLM inference public API capacity is like running a restaurant: there are cooks, and you estimate a certain amount of customer traffic. Hiring cooks costs money. Latency and throughput are basically "how fast you can cook meals for customers."

For a reasonable business, you want a "reasonable" number of cooks. In other words, you want enough capacity to host normal traffic, but not sudden bursts of Epicureans on seconds' notice. A sudden burst in traffic means there will be a wait. Otherwise, you have cooks goofing around. In the AI world, those are GPU machines.

Benchmark loads are bursty. Under low workloads, benchmark load blends into normal traffic, and the measurement is an accurate representation of how the service performs under its current workload. The high-service-load scenario is interesting because it introduces a disruption: benchmarks run only a few times per day or week, so they are not the regular traffic one should expect. Imagine flooding 100 people into the local restaurant to check how fast the cook makes meals - the result will be pretty off. This is the "observer effect," borrowing a term from quantum physics. The stronger the disturbance (i.e., the larger the sudden load), the less accurate the measurement becomes.

In other words: if you give a service a sudden, high load and find that it responds very fast, you know the service otherwise has quite a bit of idle capacity. As an investor, when you see this, you should ask your portfolio company: are you burning money responsibly?

=========
2. People eventually reach the same performance.
=========

Everyone loves fights, and that's also very fun. I personally love competitions too. We had early fun with @soumithchintala's convnet-benchmark. The conclusion is that everyone converges to the same solution very quickly (and @nvidia was always the eventual winner, because GPUs). The only requirement: you do not screw up the compiler flags for your runtime. This is thanks to great open-source projects, vLLM being an excellent example. It means that, as a provider, if your performance is particularly worse than others', you can easily catch up by looking at open-source solutions and applying good engineering.

=========
3. "As a customer, I don't care about the provider's cost."
=========

This is totally true. I would say we are living in a really great world for AI application builders: there are always API providers willing to burn money, and while you can enjoy free high-quality steaks, go for it - nothing wrong with that. It's an interesting phenomenon that happened in other fields as well, like ride sharing - burn money to get traffic, and worry about profit later. The fact that AI is doing the same makes me somewhat more confident that it is going to be a real business, because there is interest :)

=========
4. Our gratitude to benchmarkers
=========

Benchmarking is a tedious and sometimes error-prone job, and for better or worse, it usually happens that winners praise you and losers blame you. The last round of convolutional neural net benchmarks was similar. Been there, done that. As a result, I deeply appreciate the benchmarking efforts from @withmartian, @ArtificialAnlys, and earlier @anyscalecompute. It's not an easy job, but such effort is going to help us get the next 10x gain in AI infra.

=========
5. We @LeptonAI help you find the best AI infra strategy
=========

This is really a self-promoting ending. Because AI is so nascent and there are so many knobs to consider, if you are a startup or an enterprise trying to find YOUR OWN AI infra strategy in addition to public APIs, you might find some friendly help useful. We are a group of veterans in the AI framework and cloud infra world, and we would love to work with you on such needs.
Quoted tweet: Chief AI Officer @chiefaioffice
LLM inference leaderboard from @withmartian. How long will VC subsidies last?
8 replies · 22 reposts · 142 likes · 45.9K views
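The "observer effect" argument above ("a service that absorbs a sudden burst with low latency must have idle capacity") is basic queueing theory. A textbook M/M/1 sketch makes the point: mean time in system grows sharply with utilization, so low latency under a burst implies the server is lightly loaded. This is a simplifying model chosen for illustration, not how real GPU serving stacks are actually sized:

```python
# M/M/1 queue: mean time in system W = 1 / (mu - lambda), where
# lambda is the arrival rate and mu the service rate.
def mm1_mean_latency(arrival_rate, service_rate):
    """Mean time in system for a stable M/M/1 queue."""
    assert arrival_rate < service_rate, "queue is unstable"
    return 1.0 / (service_rate - arrival_rate)

low = mm1_mean_latency(10, 100)    # 10% utilized: ample idle capacity
high = mm1_mean_latency(90, 100)   # 90% utilized: ~9x higher latency
```

Latency blows up near full utilization, so a provider that stays fast under a benchmark flood is, as the thread argues, paying for cooks who are mostly standing around.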