Angelo Poerio retweeted
Angelo Poerio
2.3K posts


One theorem every ML engineer should know:
The Bellman Optimality Principle.
It states that an optimal solution to a sequential decision problem can be constructed recursively from optimal solutions to its subproblems.
In reinforcement learning, this becomes:
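The equation itself appears to have been in the attached image. A standard statement, written here for the action-value function, is the Bellman optimality equation. The notation is assumed rather than taken from the post: reward r, discount factor γ, next state s′ and next action a′.

```latex
Q^*(s, a) = \mathbb{E}\left[\, r + \gamma \max_{a'} Q^*(s', a') \;\middle|\; s, a \,\right]
```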
Why it matters:
• Foundation of Q-learning and dynamic programming
• Enables sequential decision-making under uncertainty
• Central to robotics, game AI, and autonomous systems
• Connects optimization with learning
The profound idea:
Intelligence can emerge from recursively improving future decisions.
Almost every modern RL algorithm, from DQN to AlphaGo, builds on Bellman’s insight.
Reinforcement learning is ultimately the mathematics of long-term consequences.
Image: share.google/AIBaxXi8u61KVl…
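
As a rough illustration of the recursive idea, here is a minimal sketch of Q-value iteration, a dynamic-programming method, on a tiny made-up problem. The four-state chain, the rewards, and the names `step`, `value_iteration`, and `GAMMA` are all hypothetical and do not come from the post.

```python
# Toy MDP (hypothetical): states 0..3 on a line, state 3 is terminal.
# Actions: 0 = move left, 1 = move right. Entering state 3 pays reward 1.
GAMMA = 0.9
N_STATES = 4
TERMINAL = 3
ACTIONS = (0, 1)


def step(state, action):
    """Deterministic transition: return (next_state, reward)."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(TERMINAL, state + 1)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward


def value_iteration(tol=1e-10):
    """Apply the Bellman optimality backup until the Q-table stops changing."""
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    while True:
        delta = 0.0
        for s in range(N_STATES):
            if s == TERMINAL:
                continue  # no decisions are made in the terminal state
            for a in ACTIONS:
                s_next, r = step(s, a)
                # Backup: immediate reward plus discounted best future value.
                target = r + GAMMA * max(q[s_next])
                delta = max(delta, abs(target - q[s][a]))
                q[s][a] = target
        if delta < tol:
            return q


q_star = value_iteration()
# Greedy policy: in each non-terminal state, pick the action with the highest Q.
greedy_policy = [max(ACTIONS, key=lambda a: q_star[s][a]) for s in range(TERMINAL)]
print(q_star)
print(greedy_policy)
```

Each entry of the table is defined in terms of the best entry of the next state, which is the "optimal subproblems" structure the post describes. Q-learning uses the same backup, but estimates it from sampled transitions rather than a known model.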


The most comprehensive RL overview I've ever seen.
Kevin Murphy from Google DeepMind, who has over 128k citations, wrote this.
What makes this different from other RL resources:
→ It bridges classical RL with the modern LLM era:
There's an entire chapter dedicated to "LLMs and RL" covering:
- RLHF, RLAIF, and reward modeling
- PPO, GRPO, DPO, RLOO, REINFORCE++
- Training reasoning models
- Multi-turn RL for agents
- Test-time compute scaling
→ The fundamentals are crystal clear
Every major family of algorithms, including value-based methods, policy gradients, and actor-critic, is explained with mathematical rigor.
→ Model-based RL and world models get proper coverage
Covers Dreamer, MuZero, MCTS, and beyond, which is exactly where the field is heading.
→ Multi-agent RL section
Game theory, Nash equilibrium, and MARL for LLM agents.
I have shared the arXiv paper in the replies!

The New Linux Kernel AI Bot Uncovering Bugs Is A Local LLM On Framework Desktop + AMD Ryzen AI Max - Phoronix share.google/USwdBdDp6WPygq…

A single TPU can now generate hundreds of global weather scenarios in under 60 seconds. With conventional physics-based models, the same task takes hours on a supercomputer.
Google DeepMind recently released WeatherNext 2. The model beats the previous state-of-the-art system on 99.9% of weather variables across a 15-day forecast window. It achieves this massive jump in accuracy using a new modelling approach called a Functional Generative Network.
Meteorologists categorise weather data into two buckets:
1. Marginals are isolated data points, like the precise temperature at a specific location or the wind speed at a certain altitude.
2. Joints are the massive, interconnected systems that form when all those individual elements interact.
The researchers hid the joint systems from the model during training. They only taught it the isolated marginals. When they turned it on, the model skillfully predicted the massive, complex systems anyway.
The architecture forces an 87-million-dimensional output distribution through a 32-dimensional mathematical bottleneck. To survive this severe constraint and still produce accurate individual data points, the neural network has no choice but to learn the underlying physics linking everything together. It figures out the weather because that’s the most efficient way to solve the maths.
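To make the bottleneck intuition concrete, here is a toy sketch. It is not the WeatherNext 2 architecture: it is an untrained, linear stand-in with made-up sizes (`LATENT_DIM`, `OUTPUT_DIM`) and hypothetical names throughout. It only shows that when many outputs are all driven by one small shared noise vector, every sample is a joint draw, so the outputs come out correlated in a way fixed by the mapping.

```python
import math
import random

random.seed(0)

LATENT_DIM = 4    # hypothetical stand-in for the 32-dimensional bottleneck
OUTPUT_DIM = 50   # hypothetical stand-in for the 87-million-dimensional output
N_SAMPLES = 2000

# Fixed random linear "decoder": each output is a weighted sum of the shared latent.
weights = [
    [random.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    for _ in range(OUTPUT_DIM)
]


def sample_outputs():
    """Draw one latent vector and map it to all outputs at once (a joint sample)."""
    z = [random.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    return [sum(w * zk for w, zk in zip(row, z)) for row in weights]


def pearson(xs, ys):
    """Sample Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cosine(u, v):
    """Cosine similarity, which equals the correlation implied by two weight rows."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


samples = [sample_outputs() for _ in range(N_SAMPLES)]
out0 = [s[0] for s in samples]
out1 = [s[1] for s in samples]

empirical = pearson(out0, out1)            # measured across joint samples
implied = cosine(weights[0], weights[1])   # fixed by the shared-latent mapping
print(empirical, implied)
```

The measured correlation between two outputs tracks the value implied by the weights, even though nothing in the sampling code specifies a relationship between them. In a trained model the mapping would be learned and nonlinear; this sketch makes no claim about how that learning works.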
The practical results are immediate. The model gives forecasters a full 24-hour advantage in tropical cyclone tracking compared to the previous leading system. It maps extreme wind speeds and heatwaves with unprecedented precision.
We’re watching a pretty big shift in predictive capabilities. The machine is deducing the structural reality of planetary weather from isolated fragments of data.


netwatch is an all-in-one network diagnostics tool that monitors connections in real time.
It has a live traffic timeline, an ASCII network map, latency heat maps, and more.
Matt Hartley (matthart1983 on GitHub) made netwatch using @ratatui_rs, and it is Terminal Tool of the Week! ⭐️