
Search results: "#Learning4Control"
4 results

Lots of confusion about what a world model is. Here is my definition:

Given:
- an observation x(t)
- a previous estimate of the state of the world s(t)
- an action proposal a(t)
- a latent variable proposal z(t)

A world model computes:
- representation: h(t) = Enc(x(t))
- prediction: s(t+1) = Pred( h(t), s(t), z(t), a(t) )

Where:
- Enc() is an encoder (a trainable deterministic function, e.g. a neural net).
- Pred() is a hidden-state predictor (also a trainable deterministic function).
- The latent variable z(t) represents the unknown information that would allow us to predict exactly what happens. It must be sampled from a distribution or varied over a set. It parameterizes the set (or distribution) of plausible predictions.

The trick is to train the entire thing from observation triplets (x(t), a(t), x(t+1)) while preventing the Encoder from collapsing to a trivial solution in which it ignores the input.

Auto-regressive generative models (such as LLMs) are a simplified special case in which:
1. the Encoder is the identity function: h(t) = x(t),
2. the state is a window of past inputs,
3. there is no action variable a(t),
4. x(t) is discrete,
5. the Predictor computes a distribution over outcomes for x(t+1) and uses the latent z(t) to select one value from that distribution.

The equations reduce to:
s(t) = [x(t), x(t-1), ..., x(t-k)]
x(t+1) = Pred( s(t), z(t) )

There is no collapse issue in that case.
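
A minimal PyTorch sketch of the interface above. This illustrates the definition, not any particular model: the layer shapes, the dimensions, and the choice to give the state s(t) the same width as the representation h(t) are all assumptions made here for brevity.

```python
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, LATENT_DIM, DIM = 32, 8, 16, 64  # illustrative sizes

class WorldModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Enc(): trainable deterministic encoder, h(t) = Enc(x(t))
        self.enc = nn.Sequential(
            nn.Linear(OBS_DIM, 128), nn.ReLU(), nn.Linear(128, DIM))
        # Pred(): trainable deterministic hidden-state predictor,
        # s(t+1) = Pred(h(t), s(t), z(t), a(t))
        self.pred = nn.Sequential(
            nn.Linear(DIM + DIM + LATENT_DIM + ACT_DIM, 128),
            nn.ReLU(), nn.Linear(128, DIM))

    def forward(self, x_t, s_t, z_t, a_t):
        h_t = self.enc(x_t)  # representation
        return self.pred(torch.cat([h_t, s_t, z_t, a_t], dim=-1))  # s(t+1)

model = WorldModel()
x_t, s_t = torch.randn(1, OBS_DIM), torch.zeros(1, DIM)
a_t, z_t = torch.randn(1, ACT_DIM), torch.randn(1, LATENT_DIM)
s_next = model(x_t, s_t, z_t, a_t)  # vary z_t to sweep the plausible futures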

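The tweet leaves the training objective and the anti-collapse mechanism unspecified. The hypothetical step below (reusing the `WorldModel` sketch) trains on a triplet (x(t), a(t), x(t+1)) by predicting the encoder's representation of x(t+1), and adds a VICReg-style variance hinge as one known way to keep the encoder from collapsing to a constant output; both choices are assumptions, not part of the definition.

```python
import torch
import torch.nn.functional as F

def training_step(model, opt, x_t, s_t, a_t, x_next):
    z_t = torch.randn(x_t.shape[0], LATENT_DIM)  # sample the latent proposal
    s_pred = model(x_t, s_t, z_t, a_t)           # predicted s(t+1)
    target = model.enc(x_next)                   # representation of x(t+1)
    pred_loss = F.mse_loss(s_pred, target)
    # Anti-collapse: hinge the per-dimension std of the batch of target
    # embeddings toward >= 1 so Enc cannot ignore its input (needs batch > 1).
    var_loss = F.relu(1.0 - target.std(dim=0)).mean()
    loss = pred_loss + var_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

B = 32  # batch of random triplets, just to exercise the step
opt = torch.optim.Adam(model.parameters())
training_step(model, opt, torch.randn(B, OBS_DIM), torch.zeros(B, DIM),
              torch.randn(B, ACT_DIM), torch.randn(B, OBS_DIM))
```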

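And a matching sketch of the auto-regressive special case, with a toy vocabulary and window size (both assumed here): the state is the window of past tokens, the Predictor outputs a distribution over x(t+1), and the sampling noise plays the role of z(t).

```python
import torch
import torch.nn as nn

VOCAB, K = 100, 8  # toy vocabulary size and context-window length

class ARPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 64)
        self.head = nn.Linear(K * 64, VOCAB)

    def forward(self, s_t):
        # s(t) = [x(t), x(t-1), ..., x(t-k)]: window of past discrete inputs
        return self.head(self.embed(s_t).flatten(1))  # logits over x(t+1)

pred = ARPredictor()
s_t = torch.randint(0, VOCAB, (1, K))     # state = window of past tokens
probs = torch.softmax(pred(s_t), dim=-1)  # distribution over outcomes
x_next = torch.multinomial(probs, 1)      # z(t) in effect selects one value
```

Since the Encoder here is the identity function (point 1), there is no trainable encoder that could collapse, consistent with the last line of the definition.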



