Siddhant Agarwal

54 posts

@agsidd10

PhD CS @UTAustin | Reinforcement Learning & Robotics, prev intern @Amazon

Austin, USA · Joined November 2013
568 Following · 388 Followers
Pinned Tweet
Siddhant Agarwal @agsidd10
Do RL solutions share a common structure? We show that all solutions of Reinforcement Learning lie on a hyperplane. Our work, Proto Successor Measure, learns this abstraction of the MDP to do zero-shot RL for any reward function. (1/n)
[image]
3 replies · 16 reposts · 103 likes · 20.9K views
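The hyperplane claim can be checked numerically in a toy tabular MDP. A minimal sketch (my own construction, not the paper's code): enumerate every deterministic policy, compute its discounted state-action occupancy, and verify that all of them satisfy the same affine Bellman-flow constraints, i.e. they lie on one shared hyperplane.

```python
import itertools
import numpy as np

# Toy check: every policy's occupancy measure in a random tabular MDP
# satisfies the same affine (Bellman flow) constraints.
rng = np.random.default_rng(0)
S, A, gamma = 4, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] = next-state distribution
mu0 = np.ones(S) / S                         # uniform initial state distribution

def occupancy(policy):
    """Discounted state-action occupancy d(s, a) of a deterministic policy."""
    P_pi = P[np.arange(S), policy]           # P_pi[s, s'] = P[s, policy[s], s']
    d_s = np.linalg.solve(np.eye(S) - gamma * P_pi.T, (1 - gamma) * mu0)
    d = np.zeros((S, A))
    d[np.arange(S), policy] = d_s
    return d

ds = np.array([occupancy(np.array(p)) for p in itertools.product(range(A), repeat=S)])
# Bellman flow, shared by every policy:
#   sum_a d(s', a) - gamma * sum_{s, a} P[s, a, s'] d(s, a) = (1 - gamma) * mu0(s')
flow = ds.sum(axis=2) - gamma * np.einsum('nsa,sap->np', ds, P)
ok = np.allclose(flow, (1 - gamma) * mu0)
print(ok)
```

The constraints depend only on the MDP (P, mu0, gamma), not on the policy, which is what makes a reward-free abstraction of the solution set possible.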
Siddhant Agarwal @agsidd10
Can regularized latent dynamics prediction scale Behavior Foundation Models? Check out our work at ICLR 2026.
Quoted: Pranaya Jajoo @ICLR (@JajooPranaya)

Introducing RLDP – a simple, scalable approach for building strong Behavioral Foundation Models. 🚀 #ICLR2026
✅ Robust objective: avoids brittle unsupervised RL objectives while staying simple and scalable.
🗂️ No data-coverage limitations: works across a wide variety of datasets.
🧩 One latent space for all tasks: a unified representation space.
🎯➡️🤖 Zero-shot control: specify a reward and directly obtain a policy/behavior – no additional training required.

0 replies · 0 reposts · 8 likes · 747 views
Siddhant Agarwal @agsidd10
I’ll be attending #NeurIPS2025 and presenting our work, “RLZero: Direct Policy Inference from Language Without In-Domain Supervision.” Excited to chat about unsupervised RL, reasoning, and RL more broadly. I’m also exploring industry opportunities — feel free to reach out!
0 replies · 0 reposts · 3 likes · 269 views
sushant sareen @sushantsareen
If you Punjabi Muslims had any courage and commitment to democracy, two bit generals wouldn’t be ordering your chaprasi PM and the entire parliament around. Don’t blame your failures on India. Also if you had any integrity and intellectual honesty you would ask your military about the losses to your airbases, radars, AD systems and aircraft. You love quoting Trump but did he say anywhere which country lost planes? Or you assumed it yourself because that’s what your generals told you. If trump is so correct then how come you disagree with him on Nuke tests? Finally, you claim you can’t take TTP attacks anymore and will strike Afghanistan. How is that any different from what we say about terrorism which is spawned and sponsored and supported by your country against India? If your generals push terrorists into India, we will come for you and next time there will be no restraint and no trump to intervene because the Americans have lost whatever goodwill they enjoyed. Don’t want a war? Don’t send in your jihadi terrorists. And if you love democracy then show some testicular fortitude to fight for it. BTW we do love your army. It is doing to your country and people what we could only dream of doing. Your duffer generals most of them high school pass are our strategic assets.
Quoted: Rauf Klasra @KlasraRauf

India should take credit for strengthening hands of Pakistan army. You guys are responsible. Stupid attack on Pakistan in May unleashed by war monger and populist PM Modi to gain votes in Bihar elections followed by applauses has further strengthened its grip on power. Any war, conflict or tension b/w Pakistan and India suit both BJP & Pak Establishment. So keep raising heat and level of threats it ll benefit our army and weaken civilians in Pakistan. If this is purpose of Indian political elite, then let me congratulate as you guys have succeeded in achieving your mission, my friends. Recent Indian attack is another example that has further strengthen Pak Establishment grip on power and weaken Pakistani civilian leaders. Why you lament now to see Pakistan army gaining control and clout in Pakistan.. Perhaps Indians love to deal with Pakistan army generals from Gen Ayub (Lal Bhadur Shastri sb), Gen Zia ( Rajiv sb) to Gen Musharraf ( Vajpayee sb&Manmohan sb) . Indians trust Pakistan Generals more than its prime ministers.

78 replies · 612 reposts · 3.1K likes · 176.9K views
Siddhant Agarwal reposted
Rutav @rutavms
Intelligent humanoids should have the ability to quickly adapt to new tasks by observing humans Why is such adaptability important? 🌍 Real-world diversity is hard to fully capture in advance 🧠 Adaptability is central to natural intelligence We present MimicDroid 👇 🌐 ut-austin-rpl.github.io/MimicDroid
7 replies · 37 reposts · 122 likes · 39.6K views
Siddhant Agarwal @agsidd10
I’ll be at #ICML2025 presenting our paper, “Proto Successor Measure: Representing the Behavior Space of an RL Agent”. Excited to connect with others working on unsupervised RL and RL more broadly. I’m also on the lookout for research collaborations and industry opportunities.
0 replies · 4 reposts · 55 likes · 4.9K views
Rajat Kumar Jenamani @rkjenamani
Really excited to share that FEAST won the Best Paper Award at #RSS2025! Huge thanks to everyone who’s shaped this work, from roboticists to care recipients, caregivers, and occupational therapists. ❤️
[image]
Quoted: Rajat Kumar Jenamani @rkjenamani

Most assistive robots live in labs. We want to change that. FEAST enables care recipients to personalize mealtime assistance in-the-wild, with minimal researcher intervention across diverse in-home scenarios. 🏆 Outstanding Paper & Systems Paper Finalist @RoboticsSciSys 🧵1/8

17 replies · 8 reposts · 128 likes · 9.5K views
Siddhant Agarwal reposted
Rajat Kumar Jenamani @rkjenamani
Most assistive robots live in labs. We want to change that. FEAST enables care recipients to personalize mealtime assistance in-the-wild, with minimal researcher intervention across diverse in-home scenarios. 🏆 Outstanding Paper & Systems Paper Finalist @RoboticsSciSys 🧵1/8
5 replies · 67 reposts · 326 likes · 68.5K views
Siddhant Agarwal reposted
Gokul Swamy @g_k_swamy
Say ahoy to SAILOR ⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! SAILOR ⛵ out-performs Diffusion Policies trained via behavioral cloning on 5-10x the data!
[GIF]
10 replies · 71 reposts · 266 likes · 89.8K views
Siddhant Agarwal @agsidd10
RLZero imagines a trajectory for the language prompt, which is then used to produce a policy through zero-shot imitation learning. This opens interesting avenues for applying our recent work on zero-shot RL, PSM (arxiv.org/abs/2411.19418). (2/n)
1 reply · 0 reposts · 0 likes · 116 views
Siddhant Agarwal @agsidd10
PSM provides an efficient algorithm to learn the basis of the solution space. Any point on the plane is a solution of the Bellman Flow for the MDP. Obtaining the optimal policy for any reward reduces to a simple constrained linear optimization problem. (3/n)
1 reply · 1 repost · 3 likes · 1.1K views
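A minimal sketch of that reduction in a toy tabular MDP (the setup and names are mine, not the paper's code): once the affine Bellman-flow constraints are written down, the optimal occupancy for an arbitrary reward is the solution of a constrained linear program, and a policy can be read off from it.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical sketch: maximize <r, d> over occupancies d subject to the
# affine Bellman-flow equalities and d >= 0.
rng = np.random.default_rng(1)
S, A, gamma = 4, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # transition kernel P[s, a, s']
mu0 = np.ones(S) / S                         # initial state distribution
r = rng.standard_normal(S * A)               # an arbitrary reward vector

# One equality row per next state s':
#   sum_a d(s', a) - gamma * sum_{s, a} P[s, a, s'] d(s, a) = (1 - gamma) mu0(s')
A_eq = np.zeros((S, S * A))
for sp in range(S):
    for s in range(S):
        for a in range(A):
            A_eq[sp, s * A + a] = float(s == sp) - gamma * P[s, a, sp]

res = linprog(-r, A_eq=A_eq, b_eq=(1 - gamma) * mu0, bounds=(0, None))
policy = res.x.reshape(S, A).argmax(axis=1)  # greedy policy from the occupancy
print(res.success, policy)
```

Because the feasible set is the occupancy polytope itself, the same constraint matrix is reused for every reward; only the objective vector r changes between tasks.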
Siddhant Agarwal @agsidd10
@lnalegre Thanks. Yes, learning a CCS over successor features seems related. We have discussed connections to SFs and how they can also be represented using a basis. PSM expands on SFs as it does not assume rewards to be linear in state features.
0 replies · 0 reposts · 1 like · 113 views
Lucas Alegre @lnalegre
@agsidd10 Really interesting work! I believe it is very connected with the idea of learning a convex coverage set (CCS) over successor features/successor representations we showed in our ICML'22 paper: tinyurl.com/yc6aekxa
2 replies · 0 reposts · 7 likes · 309 views
Siddhant Agarwal @agsidd10
@lnalegre @ben_eysenbach State marginal occupancy measures can be obtained using a linear transformation on successor measures. So they also form a similar polytope, but in a different space.
0 replies · 0 reposts · 1 like · 35 views
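That linear transformation is easy to verify in the tabular case. A small sketch (my own construction, not the paper's code): build the successor measure of a random policy, marginalize it against the initial distribution and the policy, and check that this reproduces the state-marginal occupancy computed directly. Since the map is linear, the affine structure carries over to occupancy space.

```python
import numpy as np

# Tabular sanity check: state-marginal occupancy = a fixed linear map
# applied to the successor measure.
rng = np.random.default_rng(2)
S, A, gamma = 4, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # transition kernel P[s, a, s']
pi = rng.dirichlet(np.ones(A), size=S)       # a random stochastic policy
mu0 = np.ones(S) / S                         # initial state distribution

T = np.einsum('sa,sap->sp', pi, P)           # state-to-state chain under pi
SR = np.linalg.inv(np.eye(S) - gamma * T)    # sum_t gamma^t T^t
# Successor measure M[s, a, s'] = sum_t gamma^t Pr(s_t = s' | s_0=s, a_0=a, pi)
M = np.eye(S)[:, None, :] + gamma * np.einsum('sap,pq->saq', P, SR)

# Linear map: average M over the initial distribution and the policy.
rho = (1 - gamma) * np.einsum('s,sa,sap->p', mu0, pi, M)
rho_direct = (1 - gamma) * np.linalg.solve((np.eye(S) - gamma * T).T, mu0)
ok = np.allclose(rho, rho_direct)
print(ok)
```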
Lucas Alegre @lnalegre
@agsidd10 Do you also have any insights on how the polytope defined in your paper relates to the polytope over marginal state occupancy measures defined by @ben_eysenbach in the paper "The information geometry of unsupervised reinforcement learning"?
1 reply · 0 reposts · 0 likes · 100 views
Siddhant Agarwal @agsidd10
Successor Measures represent the probability of reaching a particular state following a policy from a given state and action. We show that the successor measures for any policy in an MDP lie on an affine space represented by a basis. (2/n)
1 reply · 0 reposts · 5 likes · 1K views
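The definition can be sanity-checked numerically. A toy sketch (not the paper's code): construct the successor measure of a random policy in closed form and confirm it satisfies the Bellman recursion implied by the definition, M(s, a, ·) = e_s + γ · E over s'~P(·|s,a), a'~π(·|s') of M(s', a', ·).

```python
import numpy as np

# Numeric check of the successor-measure definition:
#   M[s, a, s'] = sum_t gamma^t Pr(s_t = s' | s_0 = s, a_0 = a, pi)
rng = np.random.default_rng(3)
S, A, gamma = 5, 3, 0.95
P = rng.dirichlet(np.ones(S), size=(S, A))   # transition kernel P[s, a, s']
pi = rng.dirichlet(np.ones(A), size=S)       # a random stochastic policy

T = np.einsum('sa,sap->sp', pi, P)           # state-to-state chain under pi
SR = np.linalg.inv(np.eye(S) - gamma * T)    # sum_t gamma^t T^t
M = np.eye(S)[:, None, :] + gamma * np.einsum('sap,pq->saq', P, SR)

# Bellman recursion: M(s,a,.) = e_s + gamma * E_{s'~P, a'~pi}[M(s',a',.)]
M_pi = np.einsum('pa,paq->pq', pi, M)        # expected M over next actions
rhs = np.eye(S)[:, None, :] + gamma * np.einsum('sap,pq->saq', P, M_pi)
ok = np.allclose(M, rhs)
print(ok)
```

The recursion is linear in M for a fixed policy, which is the property the affine-basis construction builds on.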