Boris Power

3.1K posts

@BorisMPower

Head of Applied Research @OpenAI

San Francisco, CA · Joined July 2017
129 Following · 48.4K Followers
Pinned Tweet
Boris Power
Boris Power@BorisMPower·
At @OpenAI, we believe that AI can accelerate science and drug discovery. An exciting example is our work with @RetroBiosciences, where a custom model designed improved variants of the Nobel-prize winning Yamanaka proteins. Today we published a closer look at the breakthrough. ⬇️
Boris Power tweet media
English
160
631
3.6K
2.1M
Sir Thomas Hobbes
Sir Thomas Hobbes@SirTHobbes·
@BorisMPower @PeterDiamandis the statistic is somewhat meaningless. The human is still always supervising, which means they will turn off the system and take over in a dangerous situation. I do this myself with EAP (since FSD is not available here)
English
1
0
0
58
Peter H. Diamandis, MD
Peter H. Diamandis, MD@PeterDiamandis·
Tesla's FSD: 5.3 million miles between accidents. US driving average: 660,000. That's ~8x safer. And it's only getting better.
English
337
558
4.6K
35.3M
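The safety multiple in the tweet above can be sanity-checked with quick arithmetic, using only the two figures it quotes:

```python
# Figures quoted in the tweet above
fsd_miles_between_accidents = 5_300_000
us_average_miles_between_accidents = 660_000

ratio = fsd_miles_between_accidents / us_average_miles_between_accidents
print(f"FSD is ~{ratio:.1f}x the US average")
```

Note the quoted figures work out to roughly 8x, not 9x.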
Boris Power
Boris Power@BorisMPower·
A super fun and impactful challenge - you won’t regret participating!
Runpod@runpod

Runpod is @OpenAI 's infrastructure partner for Parameter Golf, the first challenge in the Model Craft series. Train the best language model that fits in a 16MB artifact in under 10 minutes on 8×H100s. Together with OpenAI, we’re distributing up to $1M in credits across the challenge period to support experimentation and broaden participation. Enter the challenge and request credits 👇

English
1
4
32
4.7K
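For context on the 16MB constraint above, here is a rough parameter budget under a few common weight precisions. This is an illustrative back-of-envelope only; the actual serialization format and rules are up to the challenge organizers:

```python
ARTIFACT_BYTES = 16 * 1024 * 1024  # the 16MB artifact budget from the challenge

# Hypothetical storage precisions; entrants may use any format that fits
for fmt, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    max_params = ARTIFACT_BYTES // bytes_per_param
    print(f"{fmt}: ~{max_params / 1e6:.1f}M parameters")
```

In fp16, for example, the budget caps the model at roughly 8.4M parameters before any compression.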
Boris Power
Boris Power@BorisMPower·
@gabriel1 It's hard to overstate how important a successfully scaled startup can be for the local startup ecosystem. Culture, talent, angel money, VC conviction…
English
0
0
3
1.3K
gabriel
gabriel@gabriel1·
@BorisMPower A company 4 years ago accumulated enough critical mass of people who worked hard and thought about crazy ideas. Multiple people started unicorns, and their successes cascaded into new startups and made it mainstream. Also, Sweden has a good culture of questioning authority.
English
2
1
59
4.7K
gabriel
gabriel@gabriel1·
Sweden forced me to leave all my equity behind in a startup I left before. A 1-year cliff is ILLEGAL in Sweden, to "protect employees". They tried to force me to pay 70% tax to get equity in the early startup, which is money I didn't have. Not even communism, just raw capital destruction.
English
65
55
2.1K
142.2K
Curiosity
Curiosity@CuriosityonX·
BREAKING🚨: All FIVE types of nucleic acid bases, the building blocks of LIFE's DNA and RNA, have been found in samples collected from asteroid Ryugu
Curiosity tweet media
English
543
3.4K
23.3K
3.6M
Boris Power reposted
chimp
chimp@chimpp·
Strait of Hormuz right now
English
582
13.7K
83.9K
4.2M
Saqib Banbhan
Saqib Banbhan@SaqibBanbh90290·
Brain test...!!! 99% lose, 1% win
Saqib Banbhan tweet media
English
41.1K
931
8.7K
7.6M
Boris Power
Boris Power@BorisMPower·
@DevinOlsenn I don't have any of these issues, but maybe I'm just a bad driver. The only part where I have to take over is the beginning and end of the drive
English
0
0
6
1.4K
Devin Olsen
Devin Olsen@DevinOlsenn·
Tesla FSD does basically all of my driving with a push of a button - but to be transparent, I would say on almost every one of my drives I still have to do these two things:
- change the speed profiles regularly to deal with speed control issues
- use the turn signals, either to stop the car from making stupid lane changes or to get the car into the proper lane so it doesn't miss an exit.
I still think FSD is amazing, but I really cannot wait until these issues and the few other lingering problems with FSD get solved.
English
306
35
1.3K
75.7K
Massimo
Massimo@Rainmaker1973·
The new U.S. dime design has removed the olive branches from the eagle. This change is part of the Semiquincentennial program celebrating the 250th anniversary of the United States. While traditional U.S. coinage often balances an olive branch (peace) and arrows (war/readiness), the 2026 dime specifically focuses on the Revolutionary War theme. This redesign is temporary and authorized for one year only. According to the U.S. Mint, the dime will return to its standard Roosevelt design—which features the torch, oak branch, and olive branch—in 2027.
Massimo tweet media
English
137
854
4.8K
327.1K
Boris Power reposted
Andy Masley
Andy Masley@AndyMasley·
Each frontier AI model seems to use a bit less water to train than a single square mile of farmland uses in a year. I think about this as the country having 4 square miles of farmland sectioned off to grow some of the most popular consumer products in history.
Andy Masley tweet media
English
214
480
8.2K
595.8K
Boris Power
Boris Power@BorisMPower·
This is a glimpse into the future
Andrej Karpathy@karpathy

Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually: you come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This is the bread and butter of what I have done daily for two decades. Seeing the agent do this entire workflow end-to-end, all by itself, as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real": I didn't find them manually previously, and they stack up and actually improved nanochat. Among the bigger things:

- It noticed an oversight that my parameterless QKnorm didn't have a scale multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that the AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've already done over a good amount of time. The exact commit is here, from this "round 1" of autoresearch. I am going to kick off "round 2", and in parallel I am looking at how multiple agents can collaborate to unlock parallelism. github.com/karpathy/nanoc…

All LLM frontier labs will do this. It's the final boss battle. It's a lot more complex at scale of course - you don't just have a single train.py file to tune. But doing it is "just engineering" and it's going to work. You spin up a swarm of agents, you have them collaborate to tune smaller models, you promote the most promising ideas to increasingly larger scales, and humans (optionally) contribute on the edges. And more generally, *any* metric you care about that is reasonably efficient to evaluate (or that has a more efficient proxy metric, such as training a smaller network) can be autoresearched by an agent swarm. It's worth thinking about whether your problem falls into this bucket too.

English
11
12
186
22.1K
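The first fix the agent found in the quoted tweet — parameterless QK-norm lacking a scale multiplier — can be illustrated with a minimal NumPy sketch. The function name and the scale value here are illustrative assumptions, not nanochat's actual code:

```python
import numpy as np

def qk_norm_attention(q, k, v, scale=12.0):
    # QK-norm: normalize queries and keys to unit length.
    q = q / np.linalg.norm(q, axis=-1, keepdims=True)
    k = k / np.linalg.norm(k, axis=-1, keepdims=True)
    # Without a multiplier, logits of unit vectors live in [-1, 1],
    # so softmax is nearly uniform ("too diffuse" attention).
    # The scale restores contrast between matching and non-matching keys.
    logits = scale * (q @ k.T)
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

q = np.array([[1.0, 0.0], [0.0, 1.0]])
k = q.copy()
v = np.array([[1.0, 2.0], [3.0, 4.0]])

sharp = qk_norm_attention(q, k, v, scale=12.0)  # each row attends almost
                                                # entirely to its matching key
flat = qk_norm_attention(q, k, v, scale=1.0)    # rows blur across all values
```

With `scale=12.0` the output rows are close to the matching rows of `v`; with `scale=1.0` they are blended, which is the "too diffuse" failure mode the agent reportedly corrected.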