Siddhant Agarwal

54 posts

@agsidd10

PhD CS @UTAustin | Reinforcement Learning & Robotics, prev intern @Amazon

Austin, USA · Joined November 2013
568 Following · 388 Followers
Pinned Tweet
Siddhant Agarwal @agsidd10
Do RL solutions share a common structure? We show that all solutions of Reinforcement Learning lie on a hyperplane. Our work, Proto Successor Measure, learns this abstraction of the MDP to do zero-shot RL for any reward function. (1/n)
[image]
3 replies · 16 reposts · 103 likes · 20.9K views
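The hyperplane claim can be checked numerically in a toy tabular MDP. A minimal sketch (my own construction, not the paper's code): enumerate every deterministic policy, compute its discounted state-action occupancy, and verify that all of them satisfy the same affine Bellman-flow constraints, i.e. they lie on one shared hyperplane.

```python
import itertools
import numpy as np

# Toy check: every policy's occupancy measure in a random tabular MDP
# satisfies the same affine (Bellman flow) constraints.
rng = np.random.default_rng(0)
S, A, gamma = 4, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] = next-state distribution
mu0 = np.ones(S) / S                         # uniform initial state distribution

def occupancy(policy):
    """Discounted state-action occupancy d(s, a) of a deterministic policy."""
    P_pi = P[np.arange(S), policy]           # P_pi[s, s'] = P[s, policy[s], s']
    d_s = np.linalg.solve(np.eye(S) - gamma * P_pi.T, (1 - gamma) * mu0)
    d = np.zeros((S, A))
    d[np.arange(S), policy] = d_s
    return d

ds = np.array([occupancy(np.array(p)) for p in itertools.product(range(A), repeat=S)])
# Bellman flow, shared by every policy:
#   sum_a d(s', a) - gamma * sum_{s, a} P[s, a, s'] d(s, a) = (1 - gamma) * mu0(s')
flow = ds.sum(axis=2) - gamma * np.einsum('nsa,sap->np', ds, P)
ok = np.allclose(flow, (1 - gamma) * mu0)
print(ok)
```

The constraints depend only on the MDP (P, mu0, gamma), not on the policy, which is what makes a reward-free abstraction of the solution set possible.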
Siddhant Agarwal @agsidd10
Can regularized latent dynamics prediction scale Behavior Foundation Models? Check out our work at ICLR 2026.
Quoted: Pranaya Jajoo @ICLR (@JajooPranaya)

Introducing RLDP – a simple, scalable approach for building strong Behavioral Foundation Models. 🚀 #ICLR2026
✅ Robust objective: avoids brittle unsupervised RL objectives while staying simple and scalable.
🗂️ No data-coverage limitations: works across a wide variety of datasets.
🧩 One latent space for all tasks: a unified representation space.
🎯➡️🤖 Zero-shot control: specify a reward and directly obtain a policy/behavior – no additional training required.

0 replies · 0 reposts · 8 likes · 747 views
Siddhant Agarwal @agsidd10
I’ll be attending #NeurIPS2025 and presenting our work, “RLZero: Direct Policy Inference from Language Without In-Domain Supervision.” Excited to chat about unsupervised RL, reasoning, and RL more broadly. I’m also exploring industry opportunities — feel free to reach out!
0 replies · 0 reposts · 3 likes · 269 views
sushant sareen @sushantsareen
If you Punjabi Muslims had any courage and commitment to democracy, two bit generals wouldn’t be ordering your chaprasi PM and the entire parliament around. Don’t blame your failures on India. Also if you had any integrity and intellectual honesty you would ask your military about the losses to your airbases, radars, AD systems and aircraft. You love quoting Trump but did he say anywhere which country lost planes? Or you assumed it yourself because that’s what your generals told you. If trump is so correct then how come you disagree with him on Nuke tests? Finally, you claim you can’t take TTP attacks anymore and will strike Afghanistan. How is that any different from what we say about terrorism which is spawned and sponsored and supported by your country against India? If your generals push terrorists into India, we will come for you and next time there will be no restraint and no trump to intervene because the Americans have lost whatever goodwill they enjoyed. Don’t want a war? Don’t send in your jihadi terrorists. And if you love democracy then show some testicular fortitude to fight for it. BTW we do love your army. It is doing to your country and people what we could only dream of doing. Your duffer generals most of them high school pass are our strategic assets.
Quoted: Rauf Klasra @KlasraRauf

India should take credit for strengthening hands of Pakistan army. You guys are responsible. Stupid attack on Pakistan in May unleashed by war monger and populist PM Modi to gain votes in Bihar elections followed by applauses has further strengthened its grip on power. Any war, conflict or tension b/w Pakistan and India suit both BJP & Pak Establishment. So keep raising heat and level of threats it ll benefit our army and weaken civilians in Pakistan. If this is purpose of Indian political elite, then let me congratulate as you guys have succeeded in achieving your mission, my friends. Recent Indian attack is another example that has further strengthen Pak Establishment grip on power and weaken Pakistani civilian leaders. Why you lament now to see Pakistan army gaining control and clout in Pakistan.. Perhaps Indians love to deal with Pakistan army generals from Gen Ayub (Lal Bhadur Shastri sb), Gen Zia ( Rajiv sb) to Gen Musharraf ( Vajpayee sb&Manmohan sb) . Indians trust Pakistan Generals more than its prime ministers.

78 replies · 612 reposts · 3.1K likes · 176.9K views
Siddhant Agarwal reposted
Rutav @rutavms
Intelligent humanoids should have the ability to quickly adapt to new tasks by observing humans Why is such adaptability important? 🌍 Real-world diversity is hard to fully capture in advance 🧠 Adaptability is central to natural intelligence We present MimicDroid 👇 🌐 ut-austin-rpl.github.io/MimicDroid
7 replies · 37 reposts · 122 likes · 39.6K views
Siddhant Agarwal @agsidd10
I’ll be at #ICML2025 presenting our paper, “Proto Successor Measure: Representing the Behavior Space of an RL Agent”. Excited to connect with others working on unsupervised RL and RL more broadly. I’m also on the lookout for research collaborations and industry opportunities.
0 replies · 4 reposts · 55 likes · 4.9K views
Rajat Kumar Jenamani @rkjenamani
Really excited to share that FEAST won the Best Paper Award at #RSS2025! Huge thanks to everyone who’s shaped this work, from roboticists to care recipients, caregivers, and occupational therapists. ❤️
[image]
Quoted: Rajat Kumar Jenamani @rkjenamani

Most assistive robots live in labs. We want to change that. FEAST enables care recipients to personalize mealtime assistance in-the-wild, with minimal researcher intervention across diverse in-home scenarios. 🏆 Outstanding Paper & Systems Paper Finalist @RoboticsSciSys 🧵1/8

17 replies · 8 reposts · 128 likes · 9.5K views
Siddhant Agarwal reposted
Rajat Kumar Jenamani @rkjenamani
Most assistive robots live in labs. We want to change that. FEAST enables care recipients to personalize mealtime assistance in-the-wild, with minimal researcher intervention across diverse in-home scenarios. 🏆 Outstanding Paper & Systems Paper Finalist @RoboticsSciSys 🧵1/8
5 replies · 67 reposts · 326 likes · 68.5K views
Siddhant Agarwal reposted
Gokul Swamy @g_k_swamy
Say ahoy to SAILOR ⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! SAILOR ⛵ out-performs Diffusion Policies trained via behavioral cloning on 5-10x the data!
[GIF]
10 replies · 71 reposts · 266 likes · 89.8K views
Siddhant Agarwal @agsidd10
RLZero imagines a trajectory for the language prompt, which is then used to produce a policy through zero-shot imitation learning. This opens interesting avenues for applying our recent work on zero-shot RL, PSM (arxiv.org/abs/2411.19418). (2/n)
1 reply · 0 reposts · 0 likes · 116 views
Siddhant Agarwal @agsidd10
PSM provides an efficient algorithm to learn the basis of the solution space. Any point on the plane is a solution of the Bellman Flow for the MDP. Obtaining the optimal policy for any reward reduces to a simple constrained linear optimization problem. (3/n)
1 reply · 1 repost · 3 likes · 1.1K views
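A minimal sketch of that reduction in a toy tabular MDP (the setup and names are mine, not the paper's code): once the affine Bellman-flow constraints are written down, the optimal occupancy for an arbitrary reward is the solution of a constrained linear program, and a policy can be read off from it.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical sketch: maximize <r, d> over occupancies d subject to the
# affine Bellman-flow equalities and d >= 0.
rng = np.random.default_rng(1)
S, A, gamma = 4, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # transition kernel P[s, a, s']
mu0 = np.ones(S) / S                         # initial state distribution
r = rng.standard_normal(S * A)               # an arbitrary reward vector

# One equality row per next state s':
#   sum_a d(s', a) - gamma * sum_{s, a} P[s, a, s'] d(s, a) = (1 - gamma) mu0(s')
A_eq = np.zeros((S, S * A))
for sp in range(S):
    for s in range(S):
        for a in range(A):
            A_eq[sp, s * A + a] = float(s == sp) - gamma * P[s, a, sp]

res = linprog(-r, A_eq=A_eq, b_eq=(1 - gamma) * mu0, bounds=(0, None))
policy = res.x.reshape(S, A).argmax(axis=1)  # greedy policy from the occupancy
print(res.success, policy)
```

Because the feasible set is the occupancy polytope itself, the same constraint matrix is reused for every reward; only the objective vector r changes between tasks.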
Siddhant Agarwal @agsidd10
@lnalegre Thanks. Yes, learning a CCS over successor features seems related. We have discussed connections to SFs and how they can also be represented using a basis. PSM expands on SFs as it does not assume rewards to be linear in state features.
0 replies · 0 reposts · 1 like · 113 views
Lucas Alegre @lnalegre
@agsidd10 Really interesting work! I believe it is very connected with the idea of learning a convex coverage set (CCS) over successor features/successor representations we showed in our ICML'22 paper: tinyurl.com/yc6aekxa
2 replies · 0 reposts · 7 likes · 309 views
Siddhant Agarwal @agsidd10
@lnalegre @ben_eysenbach State marginal occupancy measures can be obtained using a linear transformation on successor measures. So they also form a similar polytope, but in a different space.
0 replies · 0 reposts · 1 like · 35 views
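That linear transformation is easy to verify in the tabular case. A small sketch (my own construction, not the paper's code): build the successor measure of a random policy, marginalize it against the initial distribution and the policy, and check that this reproduces the state-marginal occupancy computed directly. Since the map is linear, the affine structure carries over to occupancy space.

```python
import numpy as np

# Tabular sanity check: state-marginal occupancy = a fixed linear map
# applied to the successor measure.
rng = np.random.default_rng(2)
S, A, gamma = 4, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # transition kernel P[s, a, s']
pi = rng.dirichlet(np.ones(A), size=S)       # a random stochastic policy
mu0 = np.ones(S) / S                         # initial state distribution

T = np.einsum('sa,sap->sp', pi, P)           # state-to-state chain under pi
SR = np.linalg.inv(np.eye(S) - gamma * T)    # sum_t gamma^t T^t
# Successor measure M[s, a, s'] = sum_t gamma^t Pr(s_t = s' | s_0=s, a_0=a, pi)
M = np.eye(S)[:, None, :] + gamma * np.einsum('sap,pq->saq', P, SR)

# Linear map: average M over the initial distribution and the policy.
rho = (1 - gamma) * np.einsum('s,sa,sap->p', mu0, pi, M)
rho_direct = (1 - gamma) * np.linalg.solve((np.eye(S) - gamma * T).T, mu0)
ok = np.allclose(rho, rho_direct)
print(ok)
```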
Lucas Alegre @lnalegre
@agsidd10 Do you also have any insights on how the polytope defined in your paper relates to the polytope over marginal state occupancy measures defined by @ben_eysenbach in the paper "The information geometry of unsupervised reinforcement learning"?
1 reply · 0 reposts · 0 likes · 100 views
Siddhant Agarwal @agsidd10
Successor Measures represent the probability of reaching a particular state following a policy from a given state and action. We show that the successor measures for any policy in an MDP lie on an affine space represented by a basis. (2/n)
1 reply · 0 reposts · 5 likes · 1K views
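The definition can be sanity-checked numerically. A toy sketch (not the paper's code): construct the successor measure of a random policy in closed form and confirm it satisfies the Bellman recursion implied by the definition, M(s, a, ·) = e_s + γ · E over s'~P(·|s,a), a'~π(·|s') of M(s', a', ·).

```python
import numpy as np

# Numeric check of the successor-measure definition:
#   M[s, a, s'] = sum_t gamma^t Pr(s_t = s' | s_0 = s, a_0 = a, pi)
rng = np.random.default_rng(3)
S, A, gamma = 5, 3, 0.95
P = rng.dirichlet(np.ones(S), size=(S, A))   # transition kernel P[s, a, s']
pi = rng.dirichlet(np.ones(A), size=S)       # a random stochastic policy

T = np.einsum('sa,sap->sp', pi, P)           # state-to-state chain under pi
SR = np.linalg.inv(np.eye(S) - gamma * T)    # sum_t gamma^t T^t
M = np.eye(S)[:, None, :] + gamma * np.einsum('sap,pq->saq', P, SR)

# Bellman recursion: M(s,a,.) = e_s + gamma * E_{s'~P, a'~pi}[M(s',a',.)]
M_pi = np.einsum('pa,paq->pq', pi, M)        # expected M over next actions
rhs = np.eye(S)[:, None, :] + gamma * np.einsum('sap,pq->saq', P, M_pi)
ok = np.allclose(M, rhs)
print(ok)
```

The recursion is linear in M for a fixed policy, which is the property the affine-basis construction builds on.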