pranav @_pranavnt

1.2K posts

robot learning @uwcse • prev @morph_labs @atlasfellow

sf / seattle · Joined January 2021
969 Following · 1.4K Followers
pranav retweeted
Runtime @RuntimeBRT
🚨 Bengaluru-based @Airbound_Aero has conducted 700 flights for Narayana Health since January 2026 with a zero failure rate.
abinaya @abinayaaaa
for PI day, i'm thrilled to share that I'll be joining @physical_int in a few weeks to work on accelerating robot learning research through fast, observable runtimes! i started playing with robots relatively late - my sophomore year of college - and have learned almost entirely through personal projects. it's a dream come true to collaborate with the brilliant folks at PI to make better robot software. stay tuned for more!!!
pranav retweeted
Jesse Zhang @Jesse_Y_Zhang
A reward model that works, zero-shot, across robots, tasks, and scenes? Introducing Robometer: Scaling general-purpose robotic reward models with 1M+ trajectories. Enables zero-shot: online/offline/model-based RL, data retrieval + IL, automatic failure detection, and more! 🧵 (1/12)
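A minimal sketch of what "zero-shot reward" could look like in an online RL loop: a frozen, pretrained reward model scores each transition in place of the environment's reward. The `RewardModel` interface, the `score()` signature, and all names here are hypothetical illustrations, not Robometer's actual API.

```python
import numpy as np

class RewardModel:
    """Stand-in for a frozen, general-purpose robotic reward model."""
    def score(self, observation: np.ndarray, goal: str) -> float:
        # A real model would embed the observation and the goal text and
        # return predicted task progress; this placeholder returns 0.0.
        return 0.0

def collect_episode(env, policy, reward_model, goal, horizon=200):
    """Roll out one episode, relabeling rewards with the frozen model."""
    obs = env.reset()
    trajectory = []
    for _ in range(horizon):
        action = policy(obs)
        next_obs, _, done, _ = env.step(action)      # env reward is ignored
        reward = reward_model.score(next_obs, goal)  # zero-shot reward
        trajectory.append((obs, action, reward, next_obs))
        obs = next_obs
        if done:
            break
    return trajectory
```

The same scored trajectories could also serve the tweet's other listed uses: ranking logged data for retrieval and imitation, or flagging episodes whose predicted progress never rises as failures.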
pranav retweeted
TBPN @tbpn
Standard Intelligence's @devanshpandey responds to @tszzl's tweet that "text is the universal interface," and explains why their new foundation model is trained on video:

"At some point in the arbitrarily long future, if we only use text models, we could force most things to be text. But I think there are just a lot of things that are much more native when done from a computer-use [perspective]."

"GUIs are designed for humans to use. We have this massive long tail of things on the internet that are entirely undoable by LLMs."

"For example, when I do ML engineering most of my time is spent doing the grunt work of engineering. It's a lot of looking at graphs, analyzing, and comparing loss curves. You can do this in text, but it's a much larger pain than doing it in the native interface."

"There's a reason humans don't interact with a computer purely through text, it would kind of suck."
roon @tszzl

text is the universal interface

pranav retweeted
galen @G413N
computer use is too important to relegate to post-training. this has been many months in the making; I'm super proud of what we've achieved as a team and excited to scale!
Standard Intelligence @si_pbc

Computer use models shouldn't learn from screenshots. We built a new foundation model that learns from video like humans do. FDM-1 can construct a gear in Blender, find software bugs, and even drive a real car through San Francisco using arrow keys.

pranav retweeted
Standard Intelligence @si_pbc
Computer use models shouldn't learn from screenshots. We built a new foundation model that learns from video like humans do. FDM-1 can construct a gear in Blender, find software bugs, and even drive a real car through San Francisco using arrow keys.
pranav retweeted
Standard Intelligence @si_pbc
We’ve made two main advances: the ability to train on our 11M+ hour computer-action dataset, and to understand long-context video. Our video encoder can fit nearly two hours of 30 FPS, high-resolution video into a 1M-token context window, ~50x more efficient than existing SOTA.
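The context claim is easy to sanity-check. A back-of-the-envelope sketch, taking "nearly two hours," 30 FPS, and the 1M-token window at face value from the tweet:

```python
# Assumptions: exactly 2 hours of video at 30 FPS into a 1M-token window.
frames = 2 * 3600 * 30                    # 216,000 frames
tokens_per_frame = 1_000_000 / frames     # ~4.6 tokens per frame
baseline = 50 * tokens_per_frame          # implied by the ~50x claim
print(f"{frames:,} frames -> ~{tokens_per_frame:.1f} tokens/frame "
      f"(implied baseline: ~{baseline:.0f} tokens/frame)")
```

So the encoder would spend roughly 4-5 tokens per high-resolution frame, against an implied baseline of ~230 tokens per frame.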
pranav retweeted
gavin leech (Non-Reasoning) @g_leech_
improve AI discourse about 5% just by renaming evals accurately:
Humanity's Last Exam: PubQuizFromHell
MATH: RemedialMath
FrontierMath: QuarterFrontierMath
SWE-Bench: DjangoBench
MMLU Virology: NoiseBench
Terminal Bench 2: NoiseBench
METR HCAST: GreenfieldCodeGigworkBench
pranav retweeted
Seattle Seahawks @Seahawks
SUPER BOWL LX CHAMPIONS ‼️
pranav retweeted
NFL @NFL
SEAHAWKS ARE THE CHAMPIONS!
pranav retweeted
Seattle Seahawks @Seahawks
Came out to play.
pranav retweeted
Kushal Thaman @kushal1t
I spent a bunch of time a year ago thinking about the data wall. A blackpill at the time for me was when I realized that the total stock of natural text data is depleting much faster than Chinchilla's infamous 20-tokens-per-param compute-optimal ratio suggested. Here is a naive BOTEC from back then:

Famously, Chinchilla showed that using about 20 tokens per param was compute optimal, measured at 6*10^23 FLOPs. It turns out that even though MoEs are more compute-efficient than dense models, training them compute-optimally needs a lot more data! In fact, at 1:32 (97%) sparsity an MoE uses ~6x more tokens per active param (see [1]). The Llama 3 405B report measured 40 tokens per param to be optimal with their data at 4*10^25 FLOPs. And for a 1:32-sparse MoE model such as DeepSeek v3, this suggests 240 tokens per param could well end up being optimal!

At this ratio, things break down. A 4*10^27 FLOPs model (a pretraining run that might be planned, e.g., for 2026) will need 400T tokens. A 5*10^28 FLOPs model would require O(1400T) tokens. These are insane numbers, and they only get worse into the 2030s! The totally unfiltered Common Crawl is about 240T tokens. People have been offsetting this to some extent by training for multiple epochs, i.e. repeating the same data, a la "Scaling Data-Constrained Language Models" by Muennighoff et al. (2023).

Of course, this is a naive BOTEC, and I'm happy to dive into more details, e.g. how much compute might be put to other uses, such as long-horizon RLVR, which could well require a lot of those 5*10^28 FLOPs. But we are casually talking about hundreds of trillions to over a quadrillion tokens as compute-optimal! It makes one question whether these numbers are actually necessary for the kind of capability gains we want. We are working on this question at @flappyairplanes, and we're excited to be advised by @karpathy.

I will end here with this @ilyasut quote from the @dwarkesh_sp episode with him:

"The data is very clearly finite. What do you do next? Either you do some kind of souped-up pre-training, a different recipe from the one you’ve done before, or you’re doing RL, or maybe something else. But now that compute is big, compute is now very big, in some sense we are back to the age of research. [...] Up until 2020, from 2012 to 2020, it was the age of research. Now, from 2020 to 2025, it was the age of scaling—maybe plus or minus, let’s add error bars to those years—because people say, “This is amazing. You’ve got to scale more. Keep scaling.” The one word: scaling. But now the scale is so big. Is the belief really, “Oh, it’s so big, but if you had 100x more, everything would be so different?” It would be different, for sure. But is the belief that if you just 100x the scale, everything would be transformed? I don’t think that’s true. So it’s back to the age of research again, just with big computers."

[1] arXiv:2501.12370
Andrej Karpathy @karpathy

A conventional narrative you might come across is that AI is too far along for a new, research-focused startup to outcompete and outexecute the incumbents of AI. This is exactly the sentiment I listened to often when OpenAI started ("how could the few of you possibly compete with Google?") and 1) it was very wrong, and then 2) it was very wrong again with a whole other round of startups who are now challenging OpenAI in turn, and imo it still continues to be wrong today.

Scaling and locally improving what works will continue to create incredible advances, but with so much progress unlocked so quickly, with so much dust thrown up in the air in the process, and with still a large gap between frontier LLMs and the example proof of the magic of a mind running on 20 watts, the probability of research breakthroughs that yield closer to 10X improvements (instead of 10%) imo still feels very high - plenty high to continue to bet on and look for.

The tricky part ofc is creating the conditions where such breakthroughs may be discovered. I think such an environment comes together rarely, but @bfspector & @amspector100 are brilliant, with a (rare) full-stack understanding of LLMs from top (math/algorithms) to bottom (megakernels/related), and they have a great eye for talent; I think they will be able to build something very special. Congrats on the launch and I look forward to what you come up with!

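The tweet's headline token counts follow from the standard dense-transformer approximation C ≈ 6·N·D with D = r·N, where r is the tokens-per-param ratio. A minimal sketch reproducing them (the constant 6 and the ratio r = 240 are the tweet's assumptions, not new measurements):

```python
import math

def optimal_tokens(compute_flops, tokens_per_param, flops_per_pt=6):
    """Solve C = 6*N*D with D = r*N for D, the compute-optimal token count."""
    n_params = math.sqrt(compute_flops / (flops_per_pt * tokens_per_param))
    return tokens_per_param * n_params

for c in (4e27, 5e28):
    d = optimal_tokens(c, 240)  # 1:32-sparse MoE ratio from the tweet
    print(f"C = {c:.0e} FLOPs -> ~{d / 1e12:,.0f}T tokens")
# ~400T tokens at 4e27 FLOPs and ~1,400T at 5e28, against a ~240T-token
# unfiltered Common Crawl.
```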
pranav retweeted
aidan @aidanmantine
There might be fast takeoff at SFO, but people are forgetting about it in AI. We're building Flapping Airplanes to train models radically differently and fly over the data wall. We can’t wait to show you what we’ve been working on soon.
Flapping Airplanes @flappyairplanes

Announcing Flapping Airplanes! We’ve raised $180M from GV, Sequoia, and Index to assemble a new guard in AI: one that imagines a world where models can think at human level without ingesting half the internet.

pranav retweeted
Ethan Shen @ethnlshn
Today, we release SERA-32B, an approach to coding agents that matches Devstral 2 for just $9,000. It is fully open-source, and you can easily train your own model, at 26x the efficiency of using RL. Paper: allenai.org/papers/opencod… Here’s how 🧵
Ai2 @allen_ai

Introducing Ai2 Open Coding Agents—starting with SERA, our first-ever coding models. Fast, accessible agents (8B–32B) that adapt to any repo, including private codebases. Train a powerful specialized agent for as little as ~$400, & it works with Claude Code out of the box. 🧵
