drubinstein

134 posts

@dsrubinstein

Making models go brrr | Engineering @reflection_ai | Occasional PufferLib contributor

Joined March 2024
82 Following · 556 Followers
Pinned Tweet
drubinstein (@dsrubinstein)
Excited to finally share our progress in developing a reinforcement learning system to beat Pokémon Red. Our system successfully completes the game using a policy under 10M parameters, PPO, and a few novel techniques. Blog posted below
13
33
401
55.8K
drubinstein (@dsrubinstein)
Move over nvidia-smi, ibstat is my new best friend
2
0
2
163
drubinstein (@dsrubinstein)
Underrated: Letting a coding agent run when you're in meetings.
0
0
2
140
drubinstein (@dsrubinstein)
Had some fun helping out @kywch500 and @jsuarez simplify PufferLib's 2048 env over the last couple of weeks. 2x better results with fewer observations, fewer rewards, and a new model architecture!
2
3
17
6.6K
drubinstein (@dsrubinstein)
2048 is an interesting RL env. It can take over 20k steps to get to 65536.
0
0
5
377
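
A rough sanity check on the "over 20k steps" figure, as a sketch rather than anything from the PufferLib env itself. It relies on one observation: merges never change the sum of the tiles on the board, so the sum only grows through spawned tiles, one per move that changes the board. The spawn odds below (a 2 with probability 0.9, a 4 with probability 0.1) and the two-tile start are the classic 2048 rules and are an assumption here; the function names are hypothetical.

```python
# Lower-bound estimate of moves needed to reach a target tile in 2048.
# Assumption: classic rules (two starting tiles; each board-changing move
# spawns a 2 with p=0.9 or a 4 with p=0.1). A given RL env may differ.


def min_moves(target: int, start_sum: int = 8, max_spawn: int = 4) -> int:
    """Hard lower bound: every spawn is a 4 and the game starts with two 4s."""
    needed = max(0, target - start_sum)
    return -(-needed // max_spawn)  # ceiling division


def typical_moves(target: int, p_four: float = 0.1) -> float:
    """Estimate using the mean spawn value instead of the best case."""
    mean_spawn = 2 * (1 - p_four) + 4 * p_four  # 2.2 under classic rules
    start_sum = 2 * mean_spawn                  # two starting tiles
    return max(0.0, target - start_sum) / mean_spawn


if __name__ == "__main__":
    print(min_moves(65536))             # 16382
    print(round(typical_moves(65536)))  # about 29.8k
```

Under these assumptions the board needs a tile sum of at least 65536, which puts the floor at roughly 16k moves and a more typical figure near 30k, consistent with the tweet. Moves that leave the board unchanged spawn nothing, so an agent's step count can be higher still.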
drubinstein (@dsrubinstein)
@aturker01 Any idea what a benchmark for backward would look like?
0
0
0
17
Abdussamet (@aturker01)
Torch RMSNorm implementation on B200 is just bad. I wrote a ~70 LOC CuTe DSL kernel using TV layout and it beats torch by a large margin. I also included @tri_dao’s Quack, but its overhead is huge for small inputs
15
23
340
32.8K
Brandon Amos (@brandondamos)
An update: I have left Meta Superintelligence Labs and joined @reflection_ai in NYC!! Today is my first day.

I started in the Fundamental AI Research (FAIR) lab at Meta, then Facebook, over six (!) years ago as my first job out of the PhD. They were some formative years. The group is full of exceptionally talented people that have profoundly shaped my perspective on life and research. I am grateful for everything we have shared and proud of everything we created together.

I have decided it's time to try to build a startup and new frontier models with Reflection. Superintelligence will be one of the most significant advancements of our lifetimes, resulting in a computational reflection of ourselves. We believe it should be safe, open, and accessible to all. I am excited to be jumping into the post-training and reinforcement learning pipelines to advance capabilities and alignment.

And we are hiring! Please get in touch.
69
21
848
94K
drubinstein (@dsrubinstein)
Welcome to the team!
🇺🇦 Alex Polozov (@Skiminok)

🎉 Next week, I am excited to join @reflection_ai as a Member of Technical Staff to help build the open intelligence ecosystem of the Western world. It's the most exciting opportunity to help software builders in our time, and will shape many years of AI Engineering in the medium-term before AGI. Not just about Western vs Eastern open models, but more about how AI-driven software will look like in 2030. I spent some time articulating my thoughts about where we're going as a community and why... which became a whole blog post. Take a look, hope it interests you! (And if it really does, we are hiring in NYC, SF, and London 😉) alexpolozov.com/blog/reflectio…

0
0
4
337
drubinstein (@dsrubinstein)
It's amazing how much more you can debug and measure once you figure out the parent process's pid.
1
1
4
297
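
The tweet does not say which tooling was involved, so the following is only an illustrative sketch of the idea in Python: a process can read its parent's pid with `os.getppid()`, and that pid is what tools such as `ps`, `strace -p` or `py-spy` need in order to attach. The helper name `child_reported_parent` is hypothetical.

```python
# Minimal sketch: a child process discovering its parent's pid.
# Assumption: the child is launched directly by this interpreter, so the
# pid it reports as its parent should match os.getpid() here.
import os
import subprocess
import sys


def child_reported_parent() -> int:
    """Spawn a child interpreter and return the parent pid it observes."""
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getppid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print("this process:", os.getpid())
    print("parent seen by child:", child_reported_parent())
```

With worker pools or launchers that add an intermediate process, the pid a worker sees as its parent may be that intermediate process rather than the original script, which is worth checking before attaching a profiler.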
drubinstein (@dsrubinstein)
We didn't qualify for the Gen1OU #pokeagent competition at NeurIPS. We found some bugs way too late. Everything will be open-sourced as part of #pufferlib so you can try training Pokémon Gen 1 battling at 1M SPS!
1
2
13
2.3K
drubinstein (@dsrubinstein)
Attempting to beat the NeurIPS 2025 #PokeAgent challenge with a 2M parameter model in collaboration with @cooperunion and @jsuarez. Will it work? I hope so! Regardless, all work will be open-sourced.
1
1
13
2.8K
Casey Flint (@FlintCasey)
I am so very grateful to be part of this team and to have this news out! The world needs more open models and open science. Reflection is taking that need seriously and now we have the talent, compute and capital to make it happen.
Reflection (@reflection_ai)

Today we're sharing the next phase of Reflection. We're building frontier open intelligence accessible to all. We've assembled an extraordinary AI team, built a frontier LLM training stack, and raised $2 billion.

Why Open Intelligence Matters

Technological and scientific progress is driven by values of openness and collaboration. The internet, Linux, and the protocols and standards that underpin modern computing are all open. This isn't a coincidence. Open software is what gets forked, customized, and embedded into systems worldwide. It's what universities teach, what startups build on, what enterprises deploy. Open science enables others to learn from the results, be inspired by them, interrogate them, and build upon them in order to push the frontier of human knowledge and scientific advancement. AI got to where it is today through scaling ideas (e.g. self-attention, next token prediction, reinforcement learning) that were shared and published openly.

Now AI is becoming the technology layer that everything else runs on top of. The systems that accelerate scientific research, enhance education, optimize energy usage, supercharge medical diagnoses, and run supply chains will all be built on AI infrastructure. But the frontier is currently concentrated in closed labs. If this continues, a handful of entities will control the capital, compute, and talent required to build AI, creating a runaway dynamic that locks everyone else out. There's a narrow window to change this trajectory. We need to build open models so capable that they become the obvious choice for users and developers worldwide, ensuring the foundation of intelligence remains open and accessible rather than controlled by a few.

What We've Built

Over the last year, we've been preparing for this mission. We’ve assembled a team who have pioneered breakthroughs including PaLM, Gemini, AlphaGo, AlphaCode, AlphaProof, and contributed to ChatGPT and Character AI, among many others.

We built something once thought possible only inside the world’s top labs: a large-scale LLM and reinforcement learning platform capable of training massive Mixture-of-Experts (MoEs) models at frontier scale. We saw the effectiveness of our approach first-hand when we applied it to the critical domain of autonomous coding. With this milestone unlocked, we're now bringing these methods to general agentic reasoning. We've raised significant capital and identified a scalable commercial model that aligns with our open intelligence strategy, ensuring we can continue building and releasing frontier models sustainably. We are now scaling up to build open models that bring together large-scale pretraining and advanced reinforcement learning from the ground up.

Safety and Responsibility

Open intelligence also changes how we think about safety. It enables the broader community to participate in safety research and discourse, rather than leaving critical decisions to a few closed labs. Transparency allows independent researchers to identify risks, develop mitigations, and hold systems accountable in ways that closed development cannot. But openness also requires confronting the challenges of capable models being widely accessible. We're investing in evaluations to assess capabilities and risks before release, security research to protect against misuse, and responsible deployment standards. We believe the answer to AI safety is not “security through obscurity” but rigorous science conducted in the open, where the global research community can contribute to solutions rather than a handful of companies making decisions behind closed doors.

Join Us

There is a window of opportunity today to build frontier open intelligence, but it is closing and this may be the last. If this mission resonates, join us.

14
2
96
23K