
Richard Sutton
405 posts

@RichardSSutton
Student of mind and nature, libertarian, chess player, cancer survivor. @ Keen, UAlberta, Amii, https://t.co/u8za2Kod54, The Royal Society, Turing Award


Hi RL Enthusiasts! RLC is coming to Montreal, Quebec: Aug 16–19, 2026. CFP is out now: rl-conference.cc/callforpapers.…
Abstract: Mar 1
Submission: Mar 5 AOE
Submit your best work and please share widely!
#RLC #MachineLearning #AI #LLMs #RLHF #ReinforcementLearning #Research


Elon Musk just explained why the best engineers on Earth will never take your call. Three reasons. Most companies fail all three.

Elon Musk: “State what’s the mission, what’s the problem we’re trying to solve? And just be clearly willing to pour a lot of blood, sweat, and tears into it.”

The top one percent of talent does not care about your office. Your perks. Your free lunches. Your branded hoodie. They care about one thing. Does this matter. If the answer takes more than one sentence to explain, they are already gone.

Musk broke motivation into three layers.

The first is the work itself. Musk: “Somebody’s got to look forward to coming to work in the morning. Are they enjoying the work itself intrinsically?” Not the paycheck. Not the title. The work. Solving problems most people cannot even frame correctly. Alongside people who make you sharper just by being in the room. If Monday morning feels like a sentence, no salary commutes it. The best people leave. Not eventually. Within months.

The second is money. Musk: “They also feel like they will receive fair financial compensation. Like that the financial rewards are good and fair.” Not charity. Not below-market equity with a four-year cliff. Fair. One word. Most companies still get it wrong. The engineer who knows exactly what they generate does not negotiate. They compare. When the gap gets wide enough, they vanish. No conversation. No counteroffer window. Two-week notice on a Friday afternoon. You do not cap the ceiling of someone producing at that level. You match it. Or you fill the desk again in six months.

The third is the one that separates real companies from forgettable ones. Musk: “For the best people in the world, they’ll want to know: is what they’re doing going to matter? If they spend 10 years doing this, will it make a difference to the world?” Ten years. The best engineers on the planet are running a calculation no recruiter has a spreadsheet for. If I give this company a decade of my life, does the world look different because I did? If the answer is no, they are not coming. No signing bonus changes that. No recruiter pitch rewrites it. No equity package papers over it.

The mission has to be real. And the person at the top has to be visibly bleeding for it. Musk: “Be clearly willing to pour blood, sweat, and tears into it.” Talent watches the founder before they read the offer letter. If the person running the company is coasting, optimizing for exits, playing it safe, the best people sense it before the first interview ends. They do not want a manager. They want someone who has bet everything and would do it again tomorrow.

Most companies post a job. The ones that land the best people alive offer something no job listing can contain. The work has to be the reward. The money has to be fair. The mission has to be worth a decade of someone’s only life. Miss one and the person you needed most never even opened your email.




We developed an RL method for fine-tuning our models for precise tasks in just a few hours or even minutes. Instead of training the whole model, we add an “RL token” output to π-0.6, our latest model, which is used by a tiny actor and critic to learn quickly with RL.
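
The post gives no implementation details, so the following is only a toy sketch of the general idea it describes: a large model stays frozen and exposes an extra "RL token" output, and a very small actor and critic are trained on that output alone. Everything here is an assumption made for illustration: the frozen model is replaced by a fixed random projection (`W_BACKBONE`), the task and reward (`toy_reward`) are made up, the heads are linear, and the update is a plain one-step actor-critic rule, which may differ from what the authors actually use.

```python
import numpy as np

OBS_DIM, TOKEN_DIM, ACT_DIM = 8, 16, 2  # toy sizes, chosen only for illustration
SIGMA = 0.3  # fixed exploration noise of the Gaussian actor (assumed)

# Hypothetical stand-in for the large pretrained model: a fixed random
# projection whose output plays the role of the "RL token".
W_BACKBONE = np.random.default_rng(0).normal(size=(TOKEN_DIM, OBS_DIM)) / np.sqrt(OBS_DIM)
W_BACKBONE.setflags(write=False)  # frozen: never updated during RL


def rl_token(obs):
    """Map an observation to the RL token using the frozen stand-in model."""
    return np.tanh(W_BACKBONE @ obs)


def toy_reward(obs, action):
    """Made-up task: the best action equals the first ACT_DIM observation entries."""
    return -float(np.sum((action - obs[:ACT_DIM]) ** 2))


def train(steps=2000, lr_actor=0.002, lr_critic=0.05, seed=1):
    """Train only the tiny linear actor and critic heads on top of the RL token."""
    rng = np.random.default_rng(seed)
    actor_W = np.zeros((ACT_DIM, TOKEN_DIM))  # actor head: token -> action mean
    critic_w = np.zeros(TOKEN_DIM)            # critic head: token -> value estimate
    for _ in range(steps):
        obs = rng.normal(size=OBS_DIM)
        z = rl_token(obs)
        mean = actor_W @ z
        action = mean + SIGMA * rng.normal(size=ACT_DIM)
        # One-step (bandit-style) advantage: reward minus the critic's baseline.
        advantage = toy_reward(obs, action) - critic_w @ z
        # Critic: move the value estimate toward the observed reward.
        critic_w += lr_critic * advantage * z
        # Actor: policy-gradient step for a Gaussian policy with fixed SIGMA.
        actor_W += lr_actor * advantage * np.outer((action - mean) / SIGMA**2, z)
    return actor_W, critic_w
```

Calling `train()` returns the two small heads; the stand-in backbone is marked read-only, so only `actor_W` and `critic_w` change. In this toy setup the heads hold 48 parameters in total, which mirrors the post's point that the part being trained with RL is tiny compared with the model it sits on.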












UTAR proudly marks the successful conclusion of the 1st Openmind Winter School. 📄 Read the full feature in The Star: thestar.com.my/metro/metro-ne… #UTAR #OpenmindWinterSchool #OpenmindResearchInstitute #AIMalaysia



