Herbie He

24 posts

@HeHerbie

Zizhan (Herbie) He. CS master's student @McGill

Joined June 2025
20 Following · 25 Followers
Pinned Tweet
Herbie He @HeHerbie
New paper 🚨 #ICLR26 Most world models predict the future from a past trajectory. But neuroscience suggests that such inference can instead be made from temporally independent experiences. We built the Episodic Spatial World Model (ESWM), a model that does exactly this:
2 replies · 8 reposts · 18 likes · 1.5K views
Herbie He @HeHerbie
#ICLR2026 I’m presenting this paper today, Apr 24, at pavillon 3, poster 1612, 10am-1pm. Come by if you’re interested! Links to poster and video: iclr.cc/virtual/2026/p…
[Quote of the pinned announcement above]
0 replies · 2 reposts · 9 likes · 570 views
Herbie He @HeHerbie
[7/8] Beyond Grid World, ESWM scales to the more complex MiniGrid (high-dimensional observations) and to ProcThor 3D indoor scenes (realistic pixel observations).
[image]
1 reply · 0 reposts · 2 likes · 67 views
Herbie He retweeted
Pablo Samuel Castro @pcastr
New paper 🚨 "Stable Deep Reinforcement Learning via Isotropic Gaussian Representations" Deep RL suffers from unstable training, representation collapse, and neuron dormancy. We show that a simple geometric insight, isotropic Gaussian representations, can fix this. Here's how 👇
[two images]
4 replies · 35 reposts · 231 likes · 19.8K views
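The tweet points to a thread for the details, so the following is only a rough intuition pump, not the paper's method: one common way to encourage "isotropic Gaussian representations" is an auxiliary regularizer pushing a batch of features toward zero mean and identity covariance. A minimal PyTorch sketch; the 0.1 weight and where the features come from are my assumptions:

```python
import torch

def isotropy_loss(feats: torch.Tensor) -> torch.Tensor:
    # feats: (batch, dim) activations from some representation layer.
    mean = feats.mean(dim=0)                       # should be ~0 for a standard Gaussian
    centered = feats - mean
    cov = centered.T @ centered / (feats.shape[0] - 1)
    eye = torch.eye(feats.shape[1], device=feats.device)
    # Distance of the mean from zero plus Frobenius distance of cov from identity.
    return mean.pow(2).sum() + (cov - eye).pow(2).sum()

feats = torch.randn(256, 64, requires_grad=True)   # stand-in for encoder outputs
loss = 0.1 * isotropy_loss(feats)                  # 0.1 is an arbitrary weight here
loss.backward()
```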
Herbie He retweeted
Ching Fang (chingfang.bsky.social)
Humans and animals can rapidly learn in new environments. What computations support this? We study the mechanisms of in-context reinforcement learning in transformers, and propose how episodic memory can support rapid learning. Work w/ @KanakaRajanPhD: arxiv.org/abs/2506.19686
8 replies · 59 reposts · 249 likes · 25.5K views
Herbie He @HeHerbie
[11/12] 🔧 When environments change—say a new wall appears—ESWM adapts instantly. No retraining is needed. Just update the memory bank and the model replans. This separation of memory and reasoning makes ESWM highly flexible.
[image]
1 reply · 0 reposts · 3 likes · 140 views
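As an intuition pump for the memory/reasoning split described here: if the bank is literally a collection of one-step transitions, adapting to a new wall is a set edit, and everything downstream simply reads the new bank. A toy Python sketch with made-up state names (ESWM itself works on latents, not symbols):

```python
def reachable(memory, start):
    # States reachable by chaining one-step transitions in the bank.
    seen, frontier = {start}, [start]
    while frontier:
        s = frontier.pop()
        for s1, _action, s2 in memory:
            if s1 == s and s2 not in seen:
                seen.add(s2)
                frontier.append(s2)
    return seen

memory = {("A", "right", "B"), ("B", "right", "C"), ("C", "up", "D")}
print(reachable(memory, "A"))   # {'A', 'B', 'C', 'D'}

# A new wall appears on B -> C: edit the bank, retrain nothing.
memory.discard(("B", "right", "C"))
memory.add(("B", "up", "D"))    # a newly observed detour
print(reachable(memory, "A"))   # still reaches D, via the detour
```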
Herbie He @HeHerbie
[10/12] 🧭 It gets even better! ESWM can navigate between arbitrary points using only its memory bank—planning efficiently in latent space with near-optimal paths. No access to global maps or coordinates required.
[image]
1 reply · 0 reposts · 3 likes · 104 views
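ESWM plans in latent space; the discrete analogue below only shows why a bank of one-step memories suffices for near-optimal paths: chain transitions with breadth-first search. A toy sketch, with symbolic states standing in for the model's latents:

```python
from collections import deque

def plan(memory, start, goal):
    """Shortest action sequence stitched from one-step memories."""
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for s, a, s2 in memory:
            if s == state and s2 not in seen:
                seen.add(s2)
                frontier.append((s2, actions + [a]))
    return None  # goal not reachable from the given memories

memory = {("A", "right", "B"), ("B", "down", "C"), ("C", "right", "D")}
print(plan(memory, "A", "D"))   # ['right', 'down', 'right']
```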
Herbie He @HeHerbie
🧠 Can a neural network build a spatial map from scattered episodic experiences like humans do? We introduce the Episodic Spatial World Model (ESWM)—a model that constructs flexible internal world models from sparse, disjoint memories. 🧵👇 [1/12]
[image]
1 reply · 6 reposts · 25 likes · 2.5K views
Herbie He @HeHerbie
[9/12] 🚶 With no additional training, ESWM can explore novel environments efficiently by acting on uncertainty.
[image]
1 reply · 0 reposts · 3 likes · 100 views
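The tweet doesn't say which uncertainty signal is used; a common choice is the entropy of the model's predictive distribution, acting where the memory bank pins the outcome down least. A hedged PyTorch sketch with a stub model standing in for ESWM's predictive head (the real one conditions on the memory bank):

```python
import torch

def entropy(logits):
    # Shannon entropy of the predicted next-state distribution.
    p = torch.softmax(logits, dim=-1)
    return -(p * torch.log(p + 1e-9)).sum()

def toy_model(memories, state, action):
    # Stub: random logits over 16 hypothetical next states.
    return torch.randn(16)

def exploratory_action(model, memories, state, actions):
    # Greedy max-entropy rule: try the action whose outcome is least certain.
    scores = [entropy(model(memories, state, a)) for a in actions]
    return actions[max(range(len(actions)), key=lambda i: float(scores[i]))]

print(exploratory_action(toy_model, set(), "A", ["up", "down", "left", "right"]))
```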
Herbie He @HeHerbie
[8/12] ⚙️ How are these maps built? We find that ESWM stitches together memories via overlapping states—merging local transitions into global structure. Obstacles and boundaries serve as spatial anchors, guiding how memories are organized in latent space.
[image]
1 reply · 0 reposts · 3 likes · 99 views
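A discrete caricature of the stitching claim: transitions that share a state get merged, and flood-filling across the shared states recovers global structure from purely local pieces. Toy sketch (the paper's version happens in latent space, and the anchor role of obstacles is not modeled here):

```python
from collections import defaultdict

# Disjoint local transitions; "B" and "C" are the overlapping stitch points.
memories = [("A", "B"), ("C", "D"), ("B", "C")]

def stitch(memories):
    # Merge local transitions into global components via shared states.
    adj = defaultdict(set)
    for s, s2 in memories:
        adj[s].add(s2)
        adj[s2].add(s)          # undirected, for connectivity only
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            s = stack.pop()
            if s not in comp:
                comp.add(s)
                stack.extend(adj[s] - comp)
        seen |= comp
        components.append(comp)
    return components

print(stitch(memories))          # one component spanning all four states
```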
Herbie He @HeHerbie
[7/12] 🏞️ How does ESWM solve the task? Using ISOMAP, we visualize its latent representations: beautifully organized spatial layouts emerge from its internal states, even when the model sees only a small part of the environment or faces out-of-distribution environments.
[image]
1 reply · 0 reposts · 3 likes · 106 views
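For readers who want to reproduce the visualization style: scikit-learn ships Isomap, so projecting latents to 2-D takes a few lines. Below, synthetic "latents" stand in for ESWM's internal states (the real ones come from the trained model):

```python
import numpy as np
from sklearn.manifold import Isomap

# Fake latents: one vector per cell of an 8x8 layout, i.e. a spatial
# signal plus noise dimensions. Real ESWM latents replace this array.
rng = np.random.default_rng(0)
grid = np.array([(x, y) for x in range(8) for y in range(8)], dtype=float)
latents = np.hstack([grid, rng.normal(scale=0.1, size=(64, 30))])

# 2-D Isomap embedding; if the latents encode space, the grid reappears.
embedding = Isomap(n_neighbors=6, n_components=2).fit_transform(latents)
print(embedding.shape)   # (64, 2)
```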
Herbie He @HeHerbie
[6/12] ⚡️ Transformer-based ESWM models outperform LSTMs and Mamba, especially in settings where observations are compositional. Attention allows the model to flexibly bind relevant memories and generalize across structures.
[image]
1 reply · 0 reposts · 4 likes · 104 views
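A shape-level sketch of why attention suits this setup: encode each memory as a token, append a query token for the masked transition, and let self-attention bind the query to the relevant memories. All sizes below are hypothetical; this is not the paper's architecture:

```python
import torch
import torch.nn as nn

d_model, n_memories, batch = 64, 10, 4   # hypothetical sizes

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
    num_layers=2,
)

memories = torch.randn(batch, n_memories, d_model)  # embedded (s, a, s') tokens
query = torch.randn(batch, 1, d_model)              # masked query transition

# Self-attention lets the query token attend to whichever memories overlap it.
out = encoder(torch.cat([memories, query], dim=1))
prediction = out[:, -1]                             # read out the query token
print(prediction.shape)                             # torch.Size([4, 64])
```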
Herbie He @HeHerbie
[5/12] To train ESWM, we use meta-learning across diverse environments. At test time, the model gets a minimal set of disjoint episodic memories (single transitions) and must predict a missing element in a new transition—without ever seeing the full map.
[two images]
1 reply · 0 reposts · 4 likes · 113 views
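A toy rendering of this objective, for concreteness: sample an "environment" of disjoint transitions, hide the next state of one query transition, and train a network to fill it in from the rest. Every size, module, and the crude action encoding below are my simplifications, not the paper's setup:

```python
import random
import torch
import torch.nn as nn

n_states, d = 25, 32
embed = nn.Embedding(n_states, d)
head = nn.Sequential(nn.Linear(3 * d, 128), nn.ReLU(), nn.Linear(128, n_states))
opt = torch.optim.Adam(list(embed.parameters()) + list(head.parameters()), lr=1e-3)

def sample_env():
    # Stand-in for "diverse environments": random one-step transitions.
    return [(random.randrange(n_states), random.randrange(4),
             random.randrange(n_states)) for _ in range(8)]

for step in range(100):
    transitions = sample_env()
    s, a, s2 = transitions[0]      # query transition; the rest are context
    ctx = embed(torch.tensor([t[0] for t in transitions[1:]])).mean(0)
    act = torch.full((d,), float(a))                # crude action encoding
    logits = head(torch.cat([ctx, embed(torch.tensor(s)), act]))
    loss = nn.functional.cross_entropy(logits.unsqueeze(0), torch.tensor([s2]))
    opt.zero_grad(); loss.backward(); opt.step()
```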
Herbie He @HeHerbie
[4/12] 🧠 Inspired by the MTL’s architecture and function, we built ESWM: a neural network that infers the structure of its environment from isolated, one-step transitions—just like the brain integrates episodes into a cognitive map.
[image]
1 reply · 0 reposts · 5 likes · 117 views
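Concretely, the "isolated, one-step transition" here is just a triple with no trajectory around it; a bank of such triples is the model's entire knowledge of the environment. A minimal sketch (the field names are mine, not the paper's):

```python
from typing import Any, NamedTuple

class Transition(NamedTuple):
    """One episodic memory: a single step, with no surrounding trajectory."""
    obs: Any          # observation before the step
    action: int
    next_obs: Any     # observation one step later

# The memory bank is an unordered collection of such steps, possibly
# collected at very different times; that is the temporal independence
# the thread emphasizes.
memory_bank = [
    Transition(obs="tile_3_4", action=0, next_obs="tile_3_5"),
    Transition(obs="tile_7_1", action=2, next_obs="tile_6_1"),
]
print(len(memory_bank), "episodic memories")
```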