drubinstein

134 posts

@dsrubinstein

Making models go brrr | Engineering @reflection_ai | Occasional PufferLib contributor

Joined March 2024

82 Following, 556 Followers

Pinned Tweet

drubinstein@dsrubinstein·5 Mar

Excited to finally share our progress in developing a reinforcement learning system to beat Pokémon Red. Our system successfully completes the game using a policy under 10M parameters, PPO, and a few novel techniques. Blog posted below

401

55.8K

drubinstein@dsrubinstein·4 Apr

Move over nvidia-smi, ibstat is my new best friend

163

drubinstein@dsrubinstein·19 Mar

I dared him to try an AI-assisted native rewrite. As expected, 10k SPS at best has now become 4M SPS. Nice.

Dan Advantage@DanAdvantage

i did start with a rudimentary implementation of pokemon stemming from a native rewrite of pokemon firered. the starting point i used gets around 4,000,000 steps per second as an rl env. here is the entire prompt (caution: long!!!):

200

drubinstein@dsrubinstein·16 Mar

대박! A year ago we announced our series A. Today we’re announcing an amazing partnership with Shinsegae. Who knows what’ll come next?

Reflection@reflection_ai

Reflection is partnering with Shinsegae Group to build a 250-megawatt sovereign AI factory for the Republic of Korea. Open intelligence. Built on trust between allies. Owned by the nations that need it most. The future of sovereign AI. Read more in the @WSJ.

315

drubinstein@dsrubinstein·27 Jan

Underrated: Letting a coding agent run when you're in meetings.

140

drubinstein@dsrubinstein·26 Jan

@kywch500 @jsuarez Read to learn more kywch.github.io/blog/2026/01/l…

427

drubinstein@dsrubinstein·26 Jan

Had some fun helping @kywch500 and @jsuarez simplify PufferLib's 2048 env over the last couple of weeks. 2x better results with fewer observations, fewer rewards, and a new model architecture!

6.6K

drubinstein@dsrubinstein·26 Jan

@TheSayakMondal More or less. Getting to 128k is literally that.

168

drubinstein@dsrubinstein·26 Jan

2048 is an interesting RL env. It can take over 20k steps to get to 65536.

377

drubinstein@dsrubinstein·30 Dec

@aturker01 Any idea what a benchmark for backward would look like?

Abdussamet@aturker01·28 Dec

Bench and kernel code here -> github.com/aturker1/cutes…

2.7K

Abdussamet@aturker01·28 Dec

Torch RMSNorm implementation on B200 is just bad. I wrote a ~70 LOC CuTe DSL kernel using TV layout and it beats torch by a large margin. I also included @tri_dao’s Quack, but its overhead is huge for small inputs

340

32.8K

drubinstein@dsrubinstein·8 Dec

@brandondamos @reflection_ai Welcome to the team! Super excited to have you here!

279

Brandon Amos@brandondamos·8 Dec

An update: I have left Meta Superintelligence Labs and joined @reflection_ai in NYC!! Today is my first day. I started in the Fundamental AI Research (FAIR) lab at Meta, then Facebook, over six (!) years ago as my first job out of the PhD. They were some formative years. The group is full of exceptionally talented people that have profoundly shaped my perspective on life and research. I am grateful for everything we have shared and proud of everything we created together. I have decided it's time to try to build a startup and new frontier models with Reflection. Superintelligence will be one of the most significant advancements of our lifetimes, resulting in a computational reflection of ourselves. We believe it should be safe, open, and accessible to all. I am excited to be jumping into the post-training and reinforcement learning pipelines to advance capabilities and alignment. And we are hiring! Please get in touch.

848

94K

drubinstein@dsrubinstein·8 Dec

@VoidAsuka @DanAdvantage You do you.

Asuka🎀@VoidAsuka·8 Dec

@DanAdvantage @dsrubinstein just too many funnier things in life

121

Asuka🎀@VoidAsuka·8 Dec

I share this affliction

Andrej Karpathy@karpathy

Happy weekend to those who celebrate

drubinstein@dsrubinstein·2 Dec

Welcome to the team!

Behrooz Ghorbani@_ghorbani

Hi friends, after three incredible years at OpenAI I am excited to share that I am starting a new chapter at @reflection_ai, where I will be leading the Science of Scaling team. Our mission is to deepen the scientific understanding of large scale learning and to turn compute into intelligence as efficiently and predictably as possible.

463

drubinstein retweeted

driss guessous@drisspg·31 Oct

As someone who spends way too much time in the PyTorch Profiler: github.com/gaogaotiantian… ^^ I really like this ^^

279

32.4K

drubinstein@dsrubinstein·6 Nov

Welcome to the team!

🇺🇦 Alex Polozov@Skiminok

🎉 Next week, I am excited to join @reflection_ai as a Member of Technical Staff to help build the open intelligence ecosystem of the Western world. It's the most exciting opportunity to help software builders in our time, and will shape many years of AI Engineering in the medium-term before AGI. Not just about Western vs Eastern open models, but more about what AI-driven software will look like in 2030. I spent some time articulating my thoughts about where we're going as a community and why... which became a whole blog post. Take a look, hope it interests you! (And if it really does, we are hiring in NYC, SF, and London 😉) alexpolozov.com/blog/reflectio…

337

drubinstein@dsrubinstein·1 Nov

It's amazing how much more you can debug and measure once you figure out the parent process's pid.

297

drubinstein@dsrubinstein·25 Oct

We didn't qualify for the Gen1OU #pokeagent competition at NeurIPS. Found some bugs way too late. Everything will be open-sourced as part of #pufferlib so you can try training Pokemon Gen 1 battling at 1M SPS!

2.3K

drubinstein@dsrubinstein·18 Oct

Attempting to beat the NeurIPS 2025 #PokeAgent challenge with a 2M parameter model in collaboration with @cooperunion and @jsuarez. Will it work? I hope so! Regardless, all work will be open-sourced.

2.8K

drubinstein@dsrubinstein·9 Oct

@FlintCasey 🫡

Casey Flint@FlintCasey·9 Oct

I am so very grateful to be part of this team and to have this news out! The world needs more open models and open science. Reflection is taking that need seriously and now we have the talent, compute and capital to make it happen.

Reflection@reflection_ai

Today we're sharing the next phase of Reflection. We're building frontier open intelligence accessible to all. We've assembled an extraordinary AI team, built a frontier LLM training stack, and raised $2 billion.

Why Open Intelligence Matters

Technological and scientific progress is driven by values of openness and collaboration. The internet, Linux, and the protocols and standards that underpin modern computing are all open. This isn't a coincidence. Open software is what gets forked, customized, and embedded into systems worldwide. It's what universities teach, what startups build on, what enterprises deploy. Open science enables others to learn from the results, be inspired by them, interrogate them, and build upon them in order to push the frontier of human knowledge and scientific advancement. AI got to where it is today through scaling ideas (e.g. self-attention, next token prediction, reinforcement learning) that were shared and published openly.

Now AI is becoming the technology layer that everything else runs on top of. The systems that accelerate scientific research, enhance education, optimize energy usage, supercharge medical diagnoses, and run supply chains will all be built on AI infrastructure.

But the frontier is currently concentrated in closed labs. If this continues, a handful of entities will control the capital, compute, and talent required to build AI, creating a runaway dynamic that locks everyone else out.

There's a narrow window to change this trajectory. We need to build open models so capable that they become the obvious choice for users and developers worldwide, ensuring the foundation of intelligence remains open and accessible rather than controlled by a few.

What We've Built

Over the last year, we've been preparing for this mission. We’ve assembled a team who have pioneered breakthroughs including PaLM, Gemini, AlphaGo, AlphaCode, AlphaProof, and contributed to ChatGPT and Character AI, among many others.

We built something once thought possible only inside the world’s top labs: a large-scale LLM and reinforcement learning platform capable of training massive Mixture-of-Experts (MoEs) models at frontier scale. We saw the effectiveness of our approach first-hand when we applied it to the critical domain of autonomous coding. With this milestone unlocked, we're now bringing these methods to general agentic reasoning.

We've raised significant capital and identified a scalable commercial model that aligns with our open intelligence strategy, ensuring we can continue building and releasing frontier models sustainably. We are now scaling up to build open models that bring together large-scale pretraining and advanced reinforcement learning from the ground up.

Safety and Responsibility

Open intelligence also changes how we think about safety. It enables the broader community to participate in safety research and discourse, rather than leaving critical decisions to a few closed labs. Transparency allows independent researchers to identify risks, develop mitigations, and hold systems accountable in ways that closed development cannot.

But openness also requires confronting the challenges of capable models being widely accessible. We're investing in evaluations to assess capabilities and risks before release, security research to protect against misuse, and responsible deployment standards. We believe the answer to AI safety is not “security through obscurity” but rigorous science conducted in the open, where the global research community can contribute to solutions rather than a handful of companies making decisions behind closed doors.

Join Us

There is a window of opportunity today to build frontier open intelligence, but it is closing and this may be the last. If this mission resonates, join us.

23K
