Kunvar Thaman

920 posts

@__kunvar__

Taking apart neural networks and putting them back together for a living. prev @si_pbc and @Akamai

Inside a computer · Joined December 2022

876 Following · 3.2K Followers

Kunvar Thaman@__kunvar__·7 May

"a Goldilocks regime exists for low-rank conditioning" nice, this is good model compression work

tokenbender@tokenbender

An overview of our approach - the behavior stays fixed while the searched representation changes. The base pass applies attribution, mask selection, and causal recovery to the original model. Local-interface analysis finds compact pieces that do not compose by themselves. However, the conditioning approach applies a constrained low-rank update, then reruns the same recovery test to ask whether the existing capability becomes extractable.

949

Kunvar Thaman@__kunvar__·7 May

"fusing entire training runs into a single kernel and even weirder stuff" what obsessing over whole milk gets you

Flapping Airplanes@flappyairplanes

(1/5) Great to be at @sequoia to give a sneak peek of one of our research directions! TL;DR one path to data-efficiency may be to “abuse GPUs like they’ve never been abused before”

6.1K

Kunvar Thaman@__kunvar__·6 May

@alienpisscrack coming soon! preparing the camera ready version

2.2K

mehul@alienpisscrack·6 May

@__kunvar__ congratulations man, do you have a preprint we can check out?

2.4K

Kunvar Thaman@__kunvar__·3 May

Yes! my solo-authored paper Reward Hacking Benchmark was accepted to ICML :))) We put LLM agents in a tool-rich sandbox, give them multi-step workflows, and measure when they solve the intended task vs take unexpected shortcuts (like monkeypatching files at runtime!) 1/3

156

1.6K

232.9K

Kunvar Thaman@__kunvar__·4 May

@beyarkay my echo, my shadow, and me

105

Boyd Kane@beyarkay·3 May

@__kunvar__ >solo author >We ???

3.6K

Kunvar Thaman@__kunvar__·3 May

Big thanks to @except_raised for funding the project and helping me scale to richer environments and better models!

113

8.8K

Kunvar Thaman@__kunvar__·3 May

It was a pleasure to work on this project last year as an Independent researcher and I learnt so many cool things! Full paper, code, and blog post coming soon!

110

12.2K

Kunvar Thaman@__kunvar__·30 Apr

the entire @si_pbc team is incredibly thoughtful and smart and cares about making agi go well. they did incredible work with fdm-1 to get good in distribution priors and now they're gonna scale and mog everyone on computer use. super excited!

Standard Intelligence@si_pbc

We’ve raised 75m in new funding from Sequoia and Spark Capital—partnering with @sonyatweetybird, @MikowaiA, and @YasminRazavi, all of whom are deeply supportive of our long-term mission. We’ve also brought on angels & advisors including @karpathy, @tszzl, and @_milankovac_. ----- Our early results with FDM-1 moved computer use from a data-constrained regime to a compute-constrained one; this latest round of funding unlocks several orders of magnitude of compute scaling for that work. With the FDM model series we have a path to scale agentic capabilities through video pretraining, and we expect to achieve superhuman performance on general computer tasks in the same way that current language models have superhuman performance on coding tasks. We’re also now able to invest in the blue-sky research necessary to our long term mission of building aligned general learners. To realize the civilizationally transformative impacts of AI, models must generalize far out of their training distributions, actively exploring and building skills in new environments. This capability represents a substantial shift from the current paradigm of model training. We believe that current alignment techniques are insufficient to predictably and safely steer a model with human-level learning capabilities, and so we’re doing work to study small versions of this problem in controlled environments to develop a science of alignment for general learners. We’re a team of 6 people in San Francisco. We’re hiring world-class researchers and engineers to help us achieve our mission. If that’s you, please get in touch.

3.4K

Kunvar Thaman reposted

gavin leech (Non-Reasoning)@gleech·8 Apr

Security things you could do rn * Turn on Google Advanced Protection. Takes 10 seconds. * Buy 4 yubikeys * Freeze your credit * Put your crypto into cold storage (or sell). * the usual: KeePass, Signal * move off of banks which don't offer 2FA. They are telling on themselves

1.1K

171.2K

Kunvar Thaman@__kunvar__·23 Feb

this is absolutely insane and mogs everything else

Standard Intelligence@si_pbc

We’ve made two main advances: the ability to train on our 11M+ hour computer action dataset and understand long-context video. Our video encoder can fit nearly two hours of 30FPS, high-resolution video into a 1M token context window, ~50x more efficient than existing SOTA.

2.7K
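
As a rough sanity check on the figures quoted in the post above, the per-frame token budget can be worked out directly. This is only a back-of-envelope sketch: it assumes "nearly two hours" means a full 2 hours and that the whole 1M-token window is spent on video frames, neither of which the post states. The function name `tokens_per_frame` is hypothetical.

```python
# Back-of-envelope check of the quoted encoder figures.
# Assumptions (not from the post): exactly 2 hours of video, and every
# token in the 1M context window is used for frames.
HOURS = 2.0
FPS = 30
CONTEXT_TOKENS = 1_000_000


def tokens_per_frame(hours: float, fps: int, context_tokens: int) -> float:
    """Average number of context tokens available per video frame."""
    frames = hours * 3600 * fps  # seconds of video times frames per second
    return context_tokens / frames


budget = tokens_per_frame(HOURS, FPS, CONTEXT_TOKENS)
print(f"{HOURS * 3600 * FPS:,.0f} frames, about {budget:.1f} tokens per frame")
```

Under these assumptions the budget comes out to roughly 4.6 tokens per frame. Since the post says "nearly" two hours, the real figure would be slightly higher.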

Kunvar Thaman reposted

Standard Intelligence@si_pbc·23 Feb

Computer use models shouldn't learn from screenshots. We built a new foundation model that learns from video like humans do. FDM-1 can construct a gear in Blender, find software bugs, and even drive a real car through San Francisco using arrow keys.

189

403

3.9K

1.2M

Kunvar Thaman@__kunvar__·29 Jan

@kushal1t yes you're my fav brother xD

Kushal Thaman@kushal1t·29 Jan

@firstuserhere number one brother <3

Kunvar Thaman@__kunvar__·29 Jan

@kushal1t is absolutely cooking at Flappy. Couldn't be more proud of my brother

Kushal Thaman@kushal1t

The data wall is massive and incredibly durable. We are going to fly over it. Today, I'm glad to announce that I've joined Flapping Airplanes, a foundational AI research lab whose singular mission is to solve the data efficiency problem. Prepare for liftoff!

Kunvar Thaman@__kunvar__·29 Jan

@sudoredj @ebarschkis so true

412

redJ@sudoredj·24 Jan

@ebarschkis flapping airplane would be kinda cool

40.8K

Enrique Barschkis@ebarschkis·24 Jan

An airplane doesn't need to flap its wings like a bird to fly. While LLMs might not achieve a human-like intelligence, for what it's worth, they don't necessarily need to.

Logical Intelligence@logic_int

Everyone is chasing the same solution; it won't work. It's time to start unlearning that "AI = LLMs"

167

26.9K

Kunvar Thaman reposted

Kushal Thaman@kushal1t·28 Jan

I spent a bunch of time a year ago thinking about the data wall. A blackpill at the time for me was when I realized that the total stock of natural text data is depleting much faster than Chinchilla's infamous 20 tokens per param compute optimal ratio suggested. Here is a naive BOTEC from back then:
Famously, Chinchilla showed that using about 20 tokens per param was compute optimal, measured at 6*10^23 FLOPs. It turns out that even though MoEs are more compute efficient than dense models, training them compute optimally needs a lot more data! In fact, at a 1:32 (97%) sparsity it uses ~6x more tokens per active param (see [1]). The Llama 3 405B report measured 40 tokens per param to be optimal with their data at 4*10^25 FLOPs. And for a 1:32 sparse MoE model such as DeepSeek v3, this suggests 240 tokens per param could well end up being optimal!
At this ratio, things would break down. A 4*10^27 FLOPs model (a pretraining run that might be planned e.g. for 2026) will need 400T tokens. A 5*10^28 FLOPs model would require O(1400T) tokens. These are insane numbers, and they only get worse into the 2030s! The totally unfiltered Common Crawl is about 240T tokens. People have been offsetting this to some extent by training for multiple epochs or repeating the same data a la "Scaling Data-Constrained Language Models" by Muennighoff et al. (2023).
Of course, this is a naive BOTEC, and I'm happy to dive into more details, e.g. how much compute might be put into other uses, such as long-horizon RLVR which could well require a lot of those 5*10^28 FLOPs. But we are casually talking about hundreds of trillions to over a quadrillion tokens as compute-optimal! It makes one question whether these numbers are actually necessary for the kind of capability gains we want.
We are working on this question at @flappyairplanes, and we're excited to be advised by @karpathy.
I will end here with this @ilyasut quote from the @dwarkesh_sp episode with him: "The data is very clearly finite.
What do you do next? Either you do some kind of souped-up pre-training, a different recipe from the one you’ve done before, or you’re doing RL, or maybe something else. But now that compute is big, compute is now very big, in some sense we are back to the age of research. [...] Up until 2020, from 2012 to 2020, it was the age of research. Now, from 2020 to 2025, it was the age of scaling—maybe plus or minus, let’s add error bars to those years—because people say, “This is amazing. You’ve got to scale more. Keep scaling.” The one word: scaling. But now the scale is so big. Is the belief really, “Oh, it’s so big, but if you had 100x more, everything would be so different?” It would be different, for sure. But is the belief that if you just 100x the scale, everything would be transformed? I don’t think that’s true. So it’s back to the age of research again, just with big computers." [1] arxiv: 2501.12370
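
The token counts in the BOTEC above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the common approximation C ≈ 6·N·D (training FLOPs ≈ 6 × active parameters × tokens), which the post does not state explicitly but which matches its numbers. The function name `compute_optimal_tokens` is hypothetical, and the 240T Common Crawl figure is taken from the post.

```python
import math

# Assumption (not stated in the post): training compute C ~= 6 * N * D,
# where N is the active parameter count and D is the number of tokens.
# With a fixed tokens-per-param ratio r (so D = r * N), this gives
# C = 6 * D**2 / r, hence D = sqrt(r * C / 6).


def compute_optimal_tokens(flops: float, tokens_per_param: float) -> float:
    """Tokens needed at a given compute budget and tokens-per-param ratio."""
    return math.sqrt(tokens_per_param * flops / 6.0)


# 40 tokens/param (Llama 3 405B figure) times ~6x for a 1:32 sparse MoE.
RATIO = 40 * 6
COMMON_CRAWL_TOKENS = 240e12  # unfiltered Common Crawl size quoted in the post

for flops in (4e27, 5e28):
    tokens = compute_optimal_tokens(flops, RATIO)
    print(
        f"{flops:.0e} FLOPs -> {tokens / 1e12:.0f}T tokens "
        f"({tokens / COMMON_CRAWL_TOKENS:.1f}x Common Crawl)"
    )
```

Under that approximation, the two budgets come out to about 400T and about 1,400T tokens, in line with the figures quoted in the post.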

Andrej Karpathy@karpathy

A conventional narrative you might come across is that AI is too far along for a new, research-focused startup to outcompete and outexecute the incumbents of AI. This is exactly the sentiment I listened to often when OpenAI started ("how could the few of you possibly compete with Google?") and 1) it was very wrong, and then 2) it was very wrong again with a whole another round of startups who are now challenging OpenAI in turn, and imo it still continues to be wrong today. Scaling and locally improving what works will continue to create incredible advances, but with so much progress unlocked so quickly, with so much dust thrown up in the air in the process, and with still a large gap between frontier LLMs and the example proof of the magic of a mind running on 20 watts, the probability of research breakthroughs that yield closer to 10X improvements (instead of 10%) imo still feels very high - plenty high to continue to bet on and look for. The tricky part ofc is creating the conditions where such breakthroughs may be discovered. I think such an environment comes together rarely, but @bfspector & @amspector100 are brilliant, with (rare) full-stack understanding of LLMs top (math/algorithms) to bottom (megakernels/related), they have a great eye for talent and I think will be able to build something very special. Congrats on the launch and I look forward to what you come up with!

131

29.3K

Kunvar Thaman reposted

Flapping Airplanes@flappyairplanes·28 Jan

We estimate that humans are 100,000x to 1,000,000x more sample efficient than existing models. To achieve such large gains, we need big ideas.

527

110.1K

Kunvar Thaman reposted

Flapping Airplanes@flappyairplanes·28 Jan

Announcing Flapping Airplanes! We’ve raised $180M from GV, Sequoia, and Index to assemble a new guard in AI: one that imagines a world where models can think at human level without ingesting half the internet.