Ryan Keisler

@RyanKeisler

AI weather @brightbandtech. Previously @KoBold_Metals, @DescartesLabs, cosmology @UChicago & @Stanford. Sensors, simulation, & ML.

New Mexico, USA · Joined June 2013
778 Following · 1.8K Followers

Pinned Tweet
Ryan Keisler @RyanKeisler:
📢 Time to share a project I’ve been working on: Forecasting Global Weather with Graph Neural Networks 🧵 (1/N)
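
For readers new to this class of model: a minimal toy sketch of one graph message-passing step, in the encode-process-decode spirit of the paper. All sizes, weights, and names below are illustrative stand-ins, not the actual model or the repo's API.

```python
# Toy single message-passing step on a mesh graph (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_edges, d = 100, 300, 64           # toy mesh sizes, hidden width

h = rng.normal(size=(n_nodes, d))            # node features (embedded weather state)
e = rng.normal(size=(n_edges, d))            # edge features (e.g. displacement encodings)
src = rng.integers(0, n_nodes, n_edges)      # edge source indices
dst = rng.integers(0, n_nodes, n_edges)      # edge destination indices

def mlp(x, w1, w2):
    return np.maximum(x @ w1, 0.0) @ w2      # tiny 2-layer MLP with ReLU

w = [rng.normal(size=(3 * d, d)) * 0.02, rng.normal(size=(d, d)) * 0.02]
v = [rng.normal(size=(2 * d, d)) * 0.02, rng.normal(size=(d, d)) * 0.02]

# 1) edge update: message from each (sender, receiver, edge) triple
msg = mlp(np.concatenate([h[src], h[dst], e], axis=1), *w)
# 2) node update: sum incoming messages, then MLP, with a residual connection
agg = np.zeros_like(h)
np.add.at(agg, dst, msg)                     # scatter-add messages to receivers
h = h + mlp(np.concatenate([h, agg], axis=1), *v)
```
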
jcamdr @jcamdr70:
@RyanKeisler @benedictk__ Very interesting, congratulations! This raises a fascinating question: how do we identify what the AI model possesses that the physical model lacks?
Ryan Keisler @RyanKeisler:
I'm excited to finally open-source the model from my 2022 paper, “Forecasting Global Weather with Graph Neural Networks”. Highlights: • 10-day forecast in <1 min • Initialize forecasts from ERA5 or IFS analysis • Scripts for eval, sensitivities, & Hurricane Sandy
Ryan Keisler @RyanKeisler:
@benedictk__ It has skill, i.e. it beats a forecast of average weather (climatology), out to 9 or 10 days. When I released the paper in 2022, this was the SOTA AI model. Fast forward to 2026, and the top (open) AI models are from Google DeepMind and ECMWF, and are generally better than the best physics-based model.
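
For context, "skill" here is measured against a climatology ("average weather") baseline. A minimal sketch of the usual RMSE-based skill score, with synthetic arrays standing in for real fields:

```python
# Hedged sketch: a forecast has skill at a lead time if its RMSE beats
# climatology's, i.e. skill score = 1 - RMSE(forecast) / RMSE(climatology) > 0.
import numpy as np

def rmse(pred, truth):
    return np.sqrt(np.mean((pred - truth) ** 2))

def skill_score(forecast, climatology, truth):
    # > 0: forecast beats climatology; <= 0: no skill at this lead time
    return 1.0 - rmse(forecast, truth) / rmse(climatology, truth)

rng = np.random.default_rng(0)
truth = rng.normal(size=(32, 64))                        # toy verifying analysis
climatology = np.zeros_like(truth)                       # "average weather" baseline
forecast = truth + 0.5 * rng.normal(size=truth.shape)    # an imperfect forecast
print(f"skill score: {skill_score(forecast, climatology, truth):.2f}")
```
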
Ryan Keisler @RyanKeisler:
@TeksEdge Good question! I haven't profiled it, but I bet it will work on most laptops. I would start with CPU only (the default). You'll need to first install git and uv. If you try it and run into any issues, please LMK via a GitHub issue: github.com/rkeisler/keisl…
David Hendrickson @TeksEdge:
@RyanKeisler Congratulations! What are the minimum requirements for model/GPU support? I'd love to predict my weather.
Ryan Keisler @RyanKeisler:
@dcxStep Yeah, you've got it. The only detail I would add: red = positive derivative, i.e. a perturbation to U at some point x reinforces a same-signed perturbation to U at the cursor point 24h later; blue = negative derivative, i.e. the perturbation at x drives a reverse-signed perturbation at the cursor point.
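
A minimal sketch of how such a sensitivity map can be computed with automatic differentiation. The `step` function below is a toy stand-in for one 24h model step, not the actual GNN:

```python
# Hedged sketch of a forecast sensitivity map via JAX autodiff.
import jax
import jax.numpy as jnp

def step(u):
    # toy stand-in for one 24h forecast step (the real model is a GNN)
    return jnp.roll(u, shift=1, axis=-1) * 0.9 + 0.1 * jnp.tanh(u)

def u_at_cursor(u0, cursor=(16, 32)):
    # scalar output: the forecast field at one "cursor" grid point after 24h
    return step(u0)[cursor]

u0 = jnp.zeros((32, 64))            # toy u500 field on a lat-lon grid
sens = jax.grad(u_at_cursor)(u0)    # d(u500 at cursor, +24h) / d(u500 today, everywhere)
# red/blue in the video correspond to sens > 0 (same-signed reinforcement)
# and sens < 0 (sign reversal), respectively
```
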
Stephan Hoyer @shoyer:
@RyanKeisler This paper was such a breakthrough! Reading it was the first time I believed that SOTA pure-AI weather prediction was possible. Thanks for sharing, Ryan.
Ryan Keisler @RyanKeisler:
@paul_skeie I knew that I wanted to highlight the forecast sensitivities because they're so fast to compute (I think this was 5 minutes?) and under-visualized. Then I happened to land on d(u500)/d(u500) as a nice one.
Paul Skeie @paul_skeie:
@RyanKeisler It is very cool what you did there. How did you come up with the idea to do that?
Ryan Keisler @RyanKeisler:
My hope is that, as a relatively lightweight model, this can serve as a starting point for anyone interested in ML weather prediction. github.com/rkeisler/keisl…
Ryan Keisler @RyanKeisler:
The video above illustrates an example of forecast sensitivity: how does tomorrow’s wind at a certain location depend on today’s winds everywhere else?
Stephan Hoyer @shoyer:
Genuine question for those who train LLMs -- are there any effective strategies for going beyond next-few-token prediction for pre-training? This is a serious issue for building AI climate models, which are currently trained on weather forecasting (predicting forward a few days).
Quoting Reflection @reflection_ai:

Most approaches to “agentic AI” focus on post-training fixes. In this conversation, @achowdhery, a member of our technical staff, argues the bottleneck is pre-training itself. Drawing on her work on PaLM and early Gemini, she explains why next-token prediction breaks down for long-horizon planning -- and how objectives, attention, and training data must evolve to support true agentic behavior.

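
One common answer in ML weather modeling is multi-step ("rollout") training: unroll the model through its own predictions and penalize error at every step, not just the first. A toy sketch, with `model` and all shapes as placeholders rather than any particular codebase:

```python
# Hedged sketch of a multi-step ("rollout") training objective.
import jax
import jax.numpy as jnp

def model(params, x):
    return jnp.tanh(x @ params)             # toy one-step predictor

def rollout_loss(params, x0, targets):
    # targets[t] is the true state t+1 steps ahead of x0
    loss, x = 0.0, x0
    for target in targets:
        x = model(params, x)                # feed the model its own output
        loss += jnp.mean((x - target) ** 2) # penalize error at every step
    return loss / len(targets)

params = jnp.eye(8) * 0.5
x0 = jnp.ones((4, 8))
targets = [jnp.zeros((4, 8))] * 3           # 3-step rollout
grads = jax.grad(rollout_loss)(params, x0, targets)
```
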
Ryan Keisler retweeted
Shirley Ho @cosmo_shirley:
Our internship program at Polymathic is open for applications from now through fall 2025! I believe our program provides an opportunity to work alongside some of the best researchers and engineering experts in the world — exploring the unknown of building foundation models for science, together. These are full-time and paid positions in the Big Apple! Interested? 👇 forms.gle/Jm9v4VxJrna7VR… Deadline Nov 5!
Ryan Keisler retweeted
Stephan Hoyer @shoyer:
I'm incredibly proud to share NeuralGCM, our new AI- and physics-based approach to weather and climate modeling with state-of-the-art accuracy, published today in @Nature: nature.com/articles/s4158…
Pierre Gentine @PierreGentine:
1/2 We hear a lot about "Earth twins" these days. There are still major challenges to overcome before we can get to Earth twins, whether using high-resolution Earth system models or AI-based models, but progress is happening fast, especially on the AI front. Some comparisons:
Ryan Keisler retweeted
Stephan Hoyer @shoyer:
New open source release from my team at Google: Dinosaur, a differentiable dynamical core for global atmospheric modeling, written in JAX: github.com/google-researc… Dinosaur is a core component of NeuralGCM and we hope it is useful for the weather/climate research community.
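
A toy illustration of why a differentiable core matters: gradients can flow through the physics steps into learned components. The advection step below is purely illustrative and is not Dinosaur's API:

```python
# Hedged sketch: differentiate a forecast loss through a toy dynamical core.
import jax
import jax.numpy as jnp

def dyn_step(u, dt=0.1):
    # toy periodic advection: du/dt = -du/dx, first-order upwind
    return u - dt * (u - jnp.roll(u, 1))

def hybrid_forecast(theta, u0, n_steps=10):
    u = u0
    for _ in range(n_steps):
        u = dyn_step(u) + theta * jnp.sin(u)   # physics + tiny learned correction
    return u

def loss(theta, u0, u_true):
    return jnp.mean((hybrid_forecast(theta, u0) - u_true) ** 2)

u0 = jnp.sin(jnp.linspace(0, 2 * jnp.pi, 64))
u_true = jnp.roll(u0, 1)                        # toy target state
g = jax.grad(loss)(0.01, u0, u_true)            # gradient through all 10 physics steps
```
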
Ryan Keisler @RyanKeisler:
Or a mix? E.g. the EDA perturbations are a little under-dispersed, some additional stochasticity (flow-dependent SVD, Gaussian process, etc.) helps, & the DM (diffusion model) stochasticity goes in this bucket? And then the DM sample variance was (lightly) tuned to optimize performance/calibration?
Ryan Keisler @RyanKeisler:
Or, alternatively, is this DM-only, 1-initial-condition ensemble *under*-dispersed, and the good ensemble calibration that is observed in Section 5.4 comes primarily from using initial conditions based on ERA5 EDA perturbations?
Ryan Keisler @RyanKeisler:
(Sounds nice qualitatively, but would the amplitude of the sample variance depend on the details of the DM training process, e.g. the noise schedule? I'm not sure.)
Ryan Keisler @RyanKeisler:
The idea would be, when the DM generates samples conditioned on the current state, there is higher sample variance in the nodes that are more difficult to predict (e.g. Fig 5 of arxiv.org/abs/2309.01745 @thuereyGroup). If you squint, this starts to look like a well-calibrated ensemble.
Ryan Keisler @RyanKeisler:
The DM is trained on (deterministic) ERA5. This raises an interesting question: if you generate an ensemble from the *same initial conditions*, such that all stochasticity is sourced from the DM rather than the spread in the initial conditions, is the ensemble well calibrated?
Ryan Keisler @RyanKeisler:
From Section A.4.1 of the paper, the diffusion-model-only stochasticity "works relatively well". Interesting. If I put on my optimist hat, I can imagine a scenario where the DM learns to generate well calibrated samples "for free", even when learning from deterministic ERA5.
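
The calibration question running through this thread is often checked with a spread-skill ratio: for a well-calibrated ensemble, the ensemble spread should roughly match the RMSE of the ensemble mean. A toy sketch on synthetic data:

```python
# Hedged sketch of a spread-skill calibration check (synthetic data only).
import numpy as np

def spread_skill(ensemble, truth):
    # ensemble: (n_members, ...) samples, e.g. drawn from a diffusion model
    mean = ensemble.mean(axis=0)
    skill = np.sqrt(np.mean((mean - truth) ** 2))             # RMSE of ensemble mean
    spread = np.sqrt(np.mean(ensemble.var(axis=0, ddof=1)))   # average member spread
    return spread / skill                                     # ~1 if well calibrated

rng = np.random.default_rng(0)
center = rng.normal(size=(32, 64))               # hypothetical true-distribution mean
truth = center + rng.normal(size=(32, 64))       # truth drawn like one more member
members = center + rng.normal(size=(20, 32, 64)) # 20-member calibrated toy ensemble
print(f"spread/skill: {spread_skill(members, truth):.2f}")
```

A ratio well below 1 would indicate an under-dispersed ensemble, which is exactly the failure mode the thread asks about for a DM-only, single-initial-condition ensemble.
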