Riya Patel


@OmarHayat0 @OfficialLoganK Yeah, looks like the form is only open for US schools

@riyapee @OfficialLoganK They took it away? I got it a few months ago

Can Canadian students get access to a year of free Gemini Pro too, @OfficialLoganK?

Logan Kilpatrick @OfficialLoganK
Introducing Gemini 3.1 Pro, our new SOTA model across most reasoning, coding, and STEM use cases!

@riyapee @puffer_ai This is super cool 👀 are you planning on open-sourcing/releasing the model?

A 766K param model with RL outperforms Opus 4.6 on 8 bit games. I put 4 agents into a Pico Park emulation for 30 minutes. 500 million frames later, they’ve mastered cooperation and can consistently win the game.
Play alongside my agents in the blog below! Trained with @puffer_ai

@dhruvbhatia0 Can the agents watch these PR videos and create a verifiable loop?

great work @riyapee and thanks for making good use of our GPUs
if you're doing cool work with multi agent RL + inference acceleration and want free compute for it, dm us
Riya Patel @riyapee
A 766K param model with RL outperforms Opus 4.6 on 8 bit games. I put 4 agents into a Pico Park emulation for 30 minutes. 500 million frames later, they’ve mastered cooperation and can consistently win the game. Play alongside my agents in the blog below! Trained with @puffer_ai

@riyapee @puffer_ai Super cool. Have you open-sourced the code?

@Anishfishhh @puffer_ai I made the game myself; this wasn’t the original game, so it was easy to expose the game state

@riyapee @puffer_ai ooo, was it easy to get the game data? is it just available in an env or did you have to actually scrape/monitor it yourself?

@Anishfishhh @puffer_ai No, not visual; I don’t feed the pixel values in. I have access to the game data, so I map all the objects on screen onto a grid, which is the input to the CNN
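
A minimal sketch of what mapping game objects to a CNN input could look like. The thread does not describe the real state layout, so the object kinds, the (kind, x, y) tuple format, and the one-channel-per-kind encoding here are all assumptions for illustration:

```python
# Hypothetical object kinds; the actual game's object set is not given in the thread.
KINDS = ["wall", "player", "key", "door"]


def encode_grid(objects, width, height):
    """Build a one-hot grid indexed as grid[channel][y][x] from (kind, x, y) tuples.

    One channel per object kind keeps spatial layout intact without using pixels.
    """
    grid = [[[0.0] * width for _ in range(height)] for _ in KINDS]
    for kind, x, y in objects:
        # Skip anything that falls outside the visible area.
        if 0 <= x < width and 0 <= y < height:
            grid[KINDS.index(kind)][y][x] = 1.0
    return grid


# Made-up game state, purely for illustration.
state = [("wall", 0, 0), ("player", 1, 2), ("key", 3, 1)]
grid = encode_grid(state, width=4, height=3)
```

Reading positions straight from game state like this avoids a vision step entirely, which is presumably why owning the game made it easy.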

@riyapee @puffer_ai how'd you get the grid input? Is it purely visual -> cnn or do you have access to game data?

@silennai @puffer_ai yup, beat opus on cost to train and final performance

@riyapee @puffer_ai Great blog! I assume it beats Opus on cost and time?

@shivanijpatel @puffer_ai they do, their names are written under them


@puffer_ai The model is trained with PPO as the core algorithm, using an actor-critic architecture. The encoder uses both a CNN for the grid input, to keep the spatial information, and an MLP for the self data vector.
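
A rough, dependency-free sketch of the shape described above: a small convolution over the grid, an MLP over the self vector, shared features feeding an actor head and a critic head, plus the standard PPO clipped surrogate loss for a single sample. All sizes (filter count, hidden width, action count) are assumptions, the weights are random and untrained, and this is not the author's code:

```python
import math
import random

random.seed(0)


def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(x, w):
    # x has length in_dim; w is [out_dim][in_dim]; bias omitted for brevity.
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def conv3x3_relu(grid, kernels):
    # grid: [C][H][W]; kernels: [F][C][3][3]; valid convolution followed by ReLU.
    C, H, W = len(grid), len(grid[0]), len(grid[0][0])
    out = []
    for k in kernels:
        fmap = []
        for i in range(H - 2):
            row = []
            for j in range(W - 2):
                s = sum(k[c][di][dj] * grid[c][i + di][j + dj]
                        for c in range(C) for di in range(3) for dj in range(3))
                row.append(max(0.0, s))
            fmap.append(row)
        out.append(fmap)
    return out


def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]


class ActorCritic:
    """Two-branch encoder (conv for grid, MLP for self vector) with two heads."""

    def __init__(self, channels, height, width, self_dim, n_actions,
                 filters=4, hidden=8):  # filters/hidden are assumed sizes
        self.kernels = [[rand_matrix(3, 3) for _ in range(channels)]
                        for _ in range(filters)]
        self.self_w = rand_matrix(hidden, self_dim)
        feat = filters * (height - 2) * (width - 2) + hidden
        self.actor_w = rand_matrix(n_actions, feat)
        self.critic_w = rand_matrix(1, feat)

    def forward(self, grid, self_vec):
        spatial = [v for fmap in conv3x3_relu(grid, self.kernels)
                   for row in fmap for v in row]
        own = [max(0.0, a) for a in linear(self_vec, self.self_w)]
        features = spatial + own  # concatenate both branches
        probs = softmax(linear(features, self.actor_w))  # actor: action distribution
        value = linear(features, self.critic_w)[0]       # critic: state value
        return probs, value


def ppo_clip_loss(new_logp, old_logp, advantage, eps=0.2):
    # Clipped surrogate objective for one sample, negated so lower is better.
    ratio = math.exp(new_logp - old_logp)
    clipped = min(max(ratio, 1 - eps), 1 + eps)
    return -min(ratio * advantage, clipped * advantage)


grid = [[[0.0] * 5 for _ in range(5)] for _ in range(3)]
grid[1][2][2] = 1.0
model = ActorCritic(channels=3, height=5, width=5, self_dim=4, n_actions=5)
probs, value = model.forward(grid, [0.5, 0.5, 0.0, 1.0])
```

Keeping the grid in a convolutional branch preserves neighbourhood structure, while the self vector carries per-agent scalars that have no spatial meaning; concatenating the two gives both heads the same features.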


@riyapee Could not repro on my OS (Biebian). please help. biebian.sourceforge.net

@nico_jeannen run uvx cursor-wrapped to see your stats
x.com/riyapee/status…
Riya Patel @riyapee
Find out if you're ngmi or cracked: run uvx cursor-wrapped to get your Cursor 2025 Wrapped 🎁




