Jaival Patel

146 posts

@patjaival

robotics @MDA_Space, engsci @UofT

Toronto, ON · Joined February 2022
111 Following · 91 Followers
Pinned Tweet
Jaival Patel @patjaival
after 3 months of continuous crashing, i finally got rl to land a rocket by itself! yes, the complete 6dof dynamics: translation + rotation, variable mass, tvc, disturbances, all of it done by the rl itself. the core issue is that landing is a constrained braking problem, not open-ended control. rl fails because the feasible solution manifold is extremely narrow. once the search space was shaped properly, rl converged. i tried various rl policies and architectures to figure all this out. full technical analysis here check it out!: jaivalpatel.com/research-work.…
2 replies · 1 repost · 19 likes · 1.3K views
Jaival Patel retweeted
X Freeze @XFreeze
NASA just officially unveiled their master plan for a permanent Moon Base at the lunar South Pole.
This is not just about flags and footprints. NASA is moving to establish an enduring, sustained human presence, and they are heavily relying on commercial innovators to build it.
The roadmap is highly aggressive:
• Phase 1: Heavy robotic missions and commercial payload deliveries
• Phase 2: Semi-permanent infrastructure, including fission surface power and lunar drones
• Phase 3: A sustained, permanent human outpost
The most important takeaway is that NASA explicitly stated this base is the ultimate proving ground to prepare humanity for missions to Mars.
While legacy aerospace companies are still struggling to reliably get a small capsule to the ISS, NASA is setting the stage for massive lunar infrastructure, which is exactly the kind of heavy-lift planetary deployment SpaceX’s Starship was designed for.
The multi-planetary economy is officially kicking off.
789 replies · 1.7K reposts · 10.4K likes · 14.9M views
Jaival Patel @patjaival
in aero, control problems rarely give you clean dynamics and unlimited compute. reentry is a great example: nonlinear dynamics, changing aero effects, tight constraints, and little margin for error (close to none, actually). that’s why learning-augmented MPC sparked my interest. MPC gives structure and constraint handling. learning can help when the model is incomplete. so, for the coming weeks, i'll be looking into combining these ideas: learning-augmented MPC for reentry attitude control. very excited to see what i come up with!
0 replies · 0 reposts · 0 likes · 31 views
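The split described in the tweet above (MPC for structure and constraints, learning for the part of the model that is missing) can be illustrated on a toy problem. This is only a minimal sketch under loud assumptions: a scalar plant instead of reentry attitude dynamics, a made-up drag-like term that the nominal model ignores, a one-parameter least-squares residual, and a crude constant-input grid search standing in for a real MPC solver. Every name and constant here is hypothetical and not taken from the author's work.

```python
import math

# Toy scalar plant, NOT reentry dynamics. All names and numbers are illustrative.
DT = 0.1        # time step
U_MAX = 3.0     # input constraint |u| <= U_MAX, enforced by the candidate grid
HORIZON = 5     # prediction horizon in steps
TARGET = 1.0    # setpoint to track
CANDIDATES = [-U_MAX + 0.15 * i for i in range(41)]  # constant inputs to try


def true_step(x, u):
    """Real plant: includes a drag-like term the nominal model does not know."""
    return x + DT * (u - 1.5 * x)


def nominal_step(x, u):
    """Incomplete first-principles model used by the controller."""
    return x + DT * u


def fit_residual(n=50):
    """Least-squares fit of (true - nominal) ~= theta * x from logged steps."""
    x, sxx, sxr = 1.0, 0.0, 0.0
    for k in range(n):
        u = math.sin(0.3 * k)                 # simple excitation signal
        x_next = true_step(x, u)
        r = x_next - nominal_step(x, u)       # one-step model error
        sxx += x * x
        sxr += x * r
        x = x_next
    return sxr / sxx


def mpc_action(x, theta):
    """Pick the constant input whose predicted rollout tracks TARGET best.

    The prediction model is nominal + learned residual (theta * x).
    theta = 0.0 gives plain nominal MPC.
    """
    best_u, best_cost = 0.0, float("inf")
    for u in CANDIDATES:
        xp, cost = x, 0.0
        for _ in range(HORIZON):
            xp = nominal_step(xp, u) + theta * xp
            cost += (xp - TARGET) ** 2
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u


def run(theta, steps=100):
    """Closed loop on the true plant; returns mean |error| over the last 20 steps."""
    x, errs = 0.0, []
    for _ in range(steps):
        x = true_step(x, mpc_action(x, theta))  # apply first input, then replan
        errs.append(abs(x - TARGET))
    return sum(errs[-20:]) / 20


if __name__ == "__main__":
    theta = fit_residual()
    print("nominal MPC error:  ", run(0.0))
    print("augmented MPC error:", run(theta))
```

In this toy setup the nominal controller settles with a steady offset because its model never expects the state to decay, while the augmented one removes most of that offset. A real reentry problem would need a far richer residual model and a proper constrained optimizer.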
Jaival Patel retweeted
Zelda @zeldapoem
Pinch me, I can't believe someone wrote about lab notebooks. Unbelievably cool
26 replies · 752 reposts · 8.4K likes · 213.6K views
Jaival Patel @patjaival
@lawrencefeng17 curious if this extends to physical RL. if a model is pretrained early on rich dynamics priors like contact, friction, actuator limits, etc, does that improve retention and robustness of the physical environment and its variables? i would assume so
1 reply · 0 reposts · 1 like · 193 views
Lawrence Feng @lawrencefeng17
1/ To retain post-training capabilities after further fine-tuning, mix that data into pretraining. The effect can be invisible until fine-tuning begins; early exposure may not help post-training performance, but it changes what persists. How a model learns a task matters.
6 replies · 24 reposts · 86 likes · 26.5K views
Arjun Virk @virkvarjun
Life Update: I've moved to SF to build the future of robotics learning @bracketbot with @sincethestudy. My research focuses on unlocking continual learning for robotics policies. More soon.
17 replies · 3 reposts · 131 likes · 8K views
kache @yacineMTB
Am I the only person in the world working on robotics instead of large language models
267 replies · 16 reposts · 1.3K likes · 59.1K views
Jaival Patel retweeted
altan tutar @altantutar
This is insane! Actor Labs fine-tuned π0.5 (physical intelligence's flagship VLA model) and deployed it on a real excavator. They just raised $4M led by Eniac Ventures, with Hyperion, Hummingbird, 2048 Ventures, and Nova Global. From founder @laneburgett: "We've collected a massive corpus of real-world data with natural language labels from operators in the industry and are using it to create some really cool policies. Here's our first demo of it successfully completing a task with just 200 trajectories."
7 replies · 24 reposts · 186 likes · 15.7K views
Jaival Patel @patjaival
building a 7dof robot that gets sent to space in two months was not on my summer bucket-list but oh well
0 replies · 0 reposts · 3 likes · 133 views
Jaival Patel @patjaival
@dhruvr_43 LOL good luck bro my prayers are w u for that course. it fried me 😭
0 replies · 0 reposts · 1 like · 12 views
dhruvr_43 @dhruvr_43
@patjaival Taking analog control systems rn to be able to understand this big brain shit
1 reply · 0 reposts · 2 likes · 52 views
Jaival Patel @patjaival
hot take: aerospace could grow quicker if we trusted reinforcement learning more in the domain. there are many ways to make RL safety-critical:
- control barrier functions: keep the policy inside a mathematically safe region (surrogate model?). the RL agent optimizes performance, but a barrier layer blocks actions that violate constraints.
- run-time safety filters
- shielded RL: a policy proposes actions, but a classical controller or rule-based shield overrides unsafe ones.
- hybrid RL + classical control (what i worked on last month)
- uncertainty-aware policies: use ensembles, Bayesian models, or confidence estimates so the system knows when it is uncertain. (i'm looking into this right now for commercial robotic systems and it is giving me good results so far)
in the end, RL doesn't have to replace classical GNC. it just needs to sit inside the constraints: RL inside a safety-critical control architecture.
Jaival Patel@patjaival

after 3 months of continuous crashing, i finally got rl to land a rocket by itself! yes, the complete 6dof dynamics: translation + rotation, variable mass, tvc, disturbances, all of it done by the rl itself. the core issue is that landing is a constrained braking problem, not open-ended control. rl fails because the feasible solution manifold is extremely narrow. once the search space was shaped properly, rl converged. i tried various rl policies and architectures to figure all this out. full technical analysis here check it out!: jaivalpatel.com/research-work.…

1 reply · 0 reposts · 4 likes · 366 views
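The shielded RL / run-time safety filter idea from the tweet above can be sketched on a toy braking problem. This is a minimal sketch, assuming a 1D cart that must never cross a wall at `X_MAX`. The dynamics, the constants, the stopping-distance check, and `aggressive_policy` (a stand-in for a trained RL policy) are all hypothetical and chosen only for illustration. The check is a simple hand-written shield, not a formally derived control barrier function.

```python
# Toy shield for a 1D cart: position x, velocity v, wall at X_MAX.
# All constants and function names are illustrative assumptions.
DT = 0.05      # integration step
A_MAX = 2.0    # actuator limit: |a| <= A_MAX
X_MAX = 10.0   # hard constraint: x must stay <= X_MAX
MARGIN = 0.5   # buffer that absorbs discretization error


def step(x, v, a):
    """Explicit Euler step with the actuator limit applied."""
    a = max(-A_MAX, min(A_MAX, a))
    return x + v * DT, v + a * DT


def is_safe(x, v):
    """Safe if full braking from (x, v) would still stop before the wall."""
    if v <= 0.0:
        return x <= X_MAX
    stopping_distance = v * v / (2.0 * A_MAX)
    return x + stopping_distance + MARGIN <= X_MAX


def shield(x, v, proposed):
    """Pass the proposed action through if its next state is safe.

    Otherwise override it with full braking. Returns (action, overridden).
    """
    nx, nv = step(x, v, proposed)
    if is_safe(nx, nv):
        return proposed, False
    return -A_MAX, True


def aggressive_policy(x, v):
    """Stand-in for a learned policy: always pushes toward the wall."""
    return A_MAX


def rollout(use_shield, steps=2000):
    """Run the loop and report the largest x reached and the override count."""
    x, v, max_x, overrides = 0.0, 0.0, 0.0, 0
    for _ in range(steps):
        a = aggressive_policy(x, v)
        if use_shield:
            a, overridden = shield(x, v, a)
            overrides += overridden
        x, v = step(x, v, a)
        max_x = max(max_x, x)
    return max_x, overrides


if __name__ == "__main__":
    print("shielded:  ", rollout(use_shield=True))
    print("unshielded:", rollout(use_shield=False))
```

The policy is free to optimize whatever it likes. The shield only steps in when the next state would leave the set from which braking still works, which is the "RL inside the constraints" arrangement the tweet describes.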
Jaival Patel @patjaival
being the only intern on ur team has to be exciting, motivating, and scary all at the same time
0 replies · 0 reposts · 6 likes · 138 views
saksham @sakshambatraa
reinventing Groq's LPU with @michael_trbo we got instruction driven data movement working between SRAM memory blocks and MXM compute!!
11 replies · 17 reposts · 75 likes · 6.3K views
Jaival Patel @patjaival
@rory_mg nordspace truly redefining the canadian aerospace sector. we love it
1 reply · 0 reposts · 1 like · 131 views
rory 🍁 @rory_mg
why launch one rocket from Canadian soil when you can launch three? images of the Hadfield engine coming shortly...
6 replies · 20 reposts · 175 likes · 8.4K views
Adithya S K @adithya_s_k
Excited to release the Ultimate guide to RL environments! Definitions of RL environments differ wildly in the LLM era, so we spent the last month building several RL environments across 6 different frameworks, domains and complexities to map out which are easiest to build with and which can be scaled to 1000s.
51 replies · 158 reposts · 1.2K likes · 222.9K views