Devin White

83 posts

@DevinWhiteAI

ML Researcher @USAEOP | Reinforcement learning, RLHF & LLMs

Joined February 2024

182 Following · 20 Followers

Pinned Tweet

Devin White @DevinWhiteAI · Jan 10

Ever wanted to learn the basics of fine-tuning an LLM? I just built a complete, single-GPU-friendly, end-to-end pipeline for RLHF fine-tuning using @huggingface TRL! Here is the two-stage process I used: 1⃣ Train a "Judge" (reward model) on human preferences (a subset of the @NVIDIAAI HelpSteer3 dataset) 2⃣ Align @GoogleDeepMind Gemma 3 with the judge using RLOO! Try out the code below👇 #RLHF #ML #AI (Image generated by @NanoBanana)
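For readers new to stage 1: a reward model ("Judge") is commonly trained on preference pairs with a pairwise, Bradley-Terry style loss that pushes the score of the preferred response above the score of the rejected one. The snippet below is a minimal sketch of that loss in plain Python. It is not taken from the linked repository or from TRL, and the function name and example scores are hypothetical.

```python
import math


def pairwise_preference_loss(chosen_score: float, rejected_score: float) -> float:
    """Return -log(sigmoid(chosen_score - rejected_score)).

    Hypothetical helper for illustration: the loss is small when the
    reward model scores the human-preferred response higher.
    """
    margin = chosen_score - rejected_score
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of m
    # so that exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Made-up scores: a correctly ordered pair costs less than a reversed one.
good_pair = pairwise_preference_loss(chosen_score=2.0, rejected_score=0.0)
bad_pair = pairwise_preference_loss(chosen_score=0.0, rejected_score=2.0)
print(good_pair < bad_pair)  # True
```

In practice the scores would come from a language model with a scalar head, and the loss would be averaged over a batch of preference pairs.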

Devin White @DevinWhiteAI · Jan 15

@wesroth If true, this could be huge for research!

Wes Roth @WesRoth · Jan 15

Grok 4.20 is, by all accounts, going to be insane... Early-preview researchers out of UCI said "Grok 4.20 found a new Bellman function". This is a credible report that Grok 4.20 is capable of automated theorem discovery. Mathematicians usually have to guess the formula based on intuition... it's fuzzy (not a technical term). The solution Grok 4.20 gave was "sharp" (sharp *is* a technical term, meaning it's precise and accurate).

Devin White @DevinWhiteAI · Jan 13

@googledevs Made a simple RLHF pipeline for Gemma 3 fine-tuning. Includes preprocessing, reward model training and RLOO fine-tuning! Code: github.com/Dev1nW/Simplif… #RLHF #Gemma #AI #GoogleDev (Image generated by @NanoBanana)
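For context on the RLOO step: RLOO (REINFORCE Leave-One-Out) samples several completions per prompt and scores each one against the mean reward of the other completions for that prompt, which serves as a baseline without a learned value model. The sketch below shows only that advantage calculation in plain Python. It is an illustration under those assumptions, not the repository's or TRL's implementation, and the function name and reward values are hypothetical.

```python
from typing import List


def rloo_advantages(rewards: List[float]) -> List[float]:
    """Leave-one-out advantages for k completions of the same prompt.

    Hypothetical helper for illustration: the baseline for completion i
    is the mean reward of the other k - 1 completions.
    """
    k = len(rewards)
    if k < 2:
        raise ValueError("RLOO needs at least two completions per prompt")
    total = sum(rewards)
    # (total - r) / (k - 1) is the mean of every reward except r itself.
    return [r - (total - r) / (k - 1) for r in rewards]


# Made-up reward-model scores for four completions of one prompt.
advantages = rloo_advantages([1.0, 2.0, 3.0, 6.0])
print(advantages)
```

Completions that beat their peers get a positive advantage and the rest get a negative one, so the policy gradient pushes probability toward the better samples.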

Google for Developers @googledevs · Jan 12

First commit of 2026. What are we building?

Devin White @DevinWhiteAI · Jan 13

@NewsFromGoogle This is great! With how good Gemini 3 Pro is, I'm excited to see what updates Siri gets!

News from Google @NewsFromGoogle · Jan 12

Joint Statement: Apple and Google have entered into a multi-year collaboration under which the next generation of Apple Foundation Models will be based on Google's Gemini models and cloud technology. These models will help power future Apple Intelligence features, including a more personalized Siri coming this year. After careful evaluation, Apple determined that Google's AI technology provides the most capable foundation for Apple Foundation Models and is excited about the innovative new experiences it will unlock for Apple users. Apple Intelligence will continue to run on Apple devices and Private Cloud Compute, while maintaining Apple's industry-leading privacy standards.

Devin White @DevinWhiteAI · Jan 10

Full Code: github.com/Dev1nW/Simplif…

Devin White @DevinWhiteAI · Jan 9

@DanKornas Thanks for sharing! Just checked out the repo and it has tons of great resources! The finetuning guide in particular looks super useful! Great resource for LLMs, RL and AI/ML in general!

Dan Kornas @DanKornas · Jan 7

Algorithms for AI & ML This 69-page book from Stanford University is now absolutely FREE. github.com/AniruddhaChatt…

Devin White @DevinWhiteAI · Jan 8

Try it out and let me know what you think! More updates coming soon! Code: github.com/Dev1nW/Simplif…

Devin White @DevinWhiteAI · Jan 8

⚙️Why This Project? The goal is to provide an easy-to-use template for RLHF experiments, research, or just fun that also fits on a consumer GPU!

Devin White @DevinWhiteAI · Jan 8

🚀Excited to share my latest project: A simple implementation of RLHF for Language Models using @GoogleDeepMind Gemma 3! It demos the complete training loop from reward modeling to policy training, all powered by @huggingface TRL! Ideal for learning RLHF basics without the complexity. Thread 👇 #AI #ML #RLHF #HuggingFace #Gemma

Devin White @DevinWhiteAI · Jan 1

I’m especially thankful to all my collaborators, mentors, friends and the researchers I had the chance to meet, learn from, and exchange ideas with.

Devin White @DevinWhiteAI · Jan 1

1 paper at a #NeurIPS workshop (@arlet_workshop) Human-Inspired Multi-Level Reinforcement Learning: arxiv.org/pdf/2501.07502 And reached 50 citations this year!📈

Devin White @DevinWhiteAI · Jan 1

As 2025 comes to a close, I wanted to say thank you to everyone for their support throughout the year! Some research highlights are in the thread 🧵 Looking forward to what 2026 has in store and excited for some updates coming soon! #AI #ML #ReinforcementLearning #RLHF

Devin White @DevinWhiteAI · Dec 6

Tomorrow at #NeurIPS2025 I’ll be presenting “Human-Inspired Multi-Level Reinforcement Learning” at the ARLET workshop. 🕤 Poster Session 2 - 3:30 PM 📍 Upper Level Room 31ABC Working on RL/RLHF? Come say hi 👋 #ReinforcementLearning #RLHF

Devin White @DevinWhiteAI · Nov 25

Some more information on our paper at the ARLET workshop at #neurips2025! 📅 Date: Saturday, December 6, 2025 🕤 Poster Session 1: 11:15 AM 🕤 Poster Session 2: 3:30 PM 📍 Location: Upper Level Room 31ABC 📝 Paper: arxiv.org/pdf/2501.07502 #ReinforcementLearning #RLHF #AIResearch @arlet_workshop

Devin White retweeted

Manling Li @ManlingLi_ · Nov 24

We are looking for PhDs and Postdocs! So proud of my students for achieving so many amazing things during their "very first year". I have been asked many times how I like being faculty, especially with funding cuts. My answer is always "it is the perfect job for me"! Still deep in the honeymoon phase. The only reason is the students are so amazing, making my transition so much easier. One year in, they already collected paper awards, orals, spotlights, etc. What makes me proudest is they are vividly alive: curious, playful, confident in their own weird way, light up when talking about ideas, and never afraid to explore "the thing might fail". Everyone is just… themselves. And somehow, that version of themselves keeps shipping amazing work. In today's anxious academic world, this kind of aliveness is what I will try my best to protect. Maybe the best part of being an advisor is that every student is so different and unique lol. Interestingly, coming into their second year, they've got their own passions; I can't just plug my ideas into their heads. So when I get excited about something new, my first thought is: "Okay, time to find some fresh first-years who will be thrilled about this!" MLL lab is 1 year old; we started right in Oct 2024. We are growing and looking for more PhDs to join us! 1. Why our lab? (1/2) 2. Why @northwesterncs? (2/2) In 2025 alone: NU has 7 faculty as Sloan Fellows, plus a Nobel winner! Check more below

Devin White @DevinWhiteAI · Nov 22

🚨Exciting news! Our paper “Human-Inspired Multi-Level Reinforcement Learning” was accepted to the ARLET Workshop @ NeurIPS 2025! In this paper we not only do Rating-based Reinforcement Learning but use the ratings as demonstrations for learning as well! Excited to share more at the poster session! 🚀 #NeurIPS2025 #ReinforcementLearning #AIResearch @arlet_workshop
