Thomas Lips

91 posts

@_tlips_

PhD student in robotic manipulation @UGent 🇧🇪 | Looking for 🤖 that don't require cages or meticulously orchestrated environments

Joined March 2020
297 Following · 67 Followers
Thomas Lips
Thomas Lips@_tlips_·
@wimmer_th very interesting! What is the inference time? And how does that compare against bilinear upsampling, for example?
English
0
0
0
272
Thomas Wimmer
Thomas Wimmer@wimmer_th·
Super excited to introduce ✨ AnyUp: Universal Feature Upsampling 🔎 Upsample any feature - really any feature - with the same upsampler, no need for cumbersome retraining. SOTA feature upsampling results while being feature-agnostic at inference time.
English
7
135
871
92.2K
Thomas Lips
Thomas Lips@_tlips_·
@bousmalis @tdavchev Impressive! How did you teleop this? And does the active vision generalize well to novel tasks?
English
0
0
0
83
Konstantinos Bousmalis
Konstantinos Bousmalis@bousmalis·
One of the coolest features in our newest release for Gemini Robotics 1.5 was using the wrist camera of one of the arms of our bi-arm Franka robot for active vision! Without this capability the model would not have enough visibility to solve some of our most dexterous tasks.
English
3
16
89
5.2K
Thomas Lips
Thomas Lips@_tlips_·
@asoare159 @neurosp1ke But I think we actually agree on the technical side, and it is a matter of semantics? Does the data collection strategy merit a separate "adjective" or not? I think it does, but I get why others would say that this adds too much weight to it, as there is no RL-algo difference
English
0
0
0
64
Thomas Lips
Thomas Lips@_tlips_·
@asoare159 @neurosp1ke So when you say offline, you say: I have another means of gathering all my data. With online you say: I need to use the policy for (additional) data collection? And then off-policy algorithms allow you to combine that data with any other data, on-policy algo's don't
English
1
0
0
58
Alexander Soare
Alexander Soare@asoare159·
Can someone enlighten me? I'm confused about off/on policy and off/on line RL. Are they supposed to be orthogonal axes?
✅ off policy + offline: Makes sense
✅ on policy + online: Makes sense
❌ on policy + offline: Nonsense
❔ off policy + online: Either (a) If it's truly off-policy, might as well be offline then. This is a question of workflow, not algorithm. Or (b) you can say something like ε-greedy DQN fits here, because the ε-greedy policy is not the same as the argmax policy. But there's a lot of overlap between the ε-greedy policy and argmax policy, which is why you want to do it online at all. So you might as well say it's (almost) on-policy then.
I'm seeing a lot of mutual information between these axes. Feels like it would be less confusing to just keep off/on policy (because that dictates whether you have the freedom to do offline).
English
3
0
3
574
Thomas Lips
Thomas Lips@_tlips_·
@asoare159 @neurosp1ke On-policy implies online afaik. But off policy does not necessarily imply offline right? You can use off-policy in online and offline settings I think, and both can make sense?
English
1
0
0
45
Alexander Soare
Alexander Soare@asoare159·
@_tlips_ @neurosp1ke Oh yeah I mean it is, but my point is that it's already prescribed by the other distinction: on/off policy. On policy means online is the only sensible implementation. Off policy means offline is the more sensible implementation.
English
1
0
0
40
Thomas Lips
Thomas Lips@_tlips_·
@asoare159 @neurosp1ke Fair point! But it still feels like using / allowing the agent to do data collection is an important distinction
English
1
0
1
38
Alexander Soare
Alexander Soare@asoare159·
@_tlips_ @neurosp1ke I think one can't make a general comment about how much exploration is involved in the offline dataset. It's up to the details of the experiment. There isn't necessarily less exploration involved. You can cover the whole state-action space with a random policy (asymptotically).
English
1
0
1
62
Thomas Lips
Thomas Lips@_tlips_·
@asoare159 @neurosp1ke Not an expert in offline RL, but my intuition is that w/ offline RL there is less exploration involved? In online settings the algo needs to collect its own data to cover the relevant parts of the state space. The dataset in offline RL supposedly already covers the state space?
English
1
0
2
56
Alexander Soare
Alexander Soare@asoare159·
Hm, like yeah I agree that off-policy methods work for online training. But if they are truly off policy, algorithmically it is equivalent to offline training, perhaps with some weird schedule on how you decide to introduce the data into the training loop. That's why I don't think it's appropriate to call it "online RL" or "offline RL" because that indicates something about the nature of the RL algorithm. Really it's "off policy RL, oh and I decided for my workflow to gather the data online". If something breaks during the training run and you have to start again, you wouldn't dump the online data. You'd keep it and go from there.
English
1
0
2
70
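The thread's point that an off-policy update does not care where its data comes from can be made concrete with a small sketch. Everything here is an illustrative assumption, not something from the thread: the 5-state chain environment, the hyperparameters, and the helper names (`step`, `q_update`, `train_offline`, `train_online`) are hypothetical. It uses tabular Q-learning, and the same `q_update` runs on a fixed dataset gathered by a random policy (offline) and on transitions gathered by the learner's own ε-greedy policy (online).

```python
import random

N_STATES = 5            # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)        # 0 = move left, 1 = move right
GAMMA, ALPHA = 0.9, 0.5


def step(s, a):
    """Toy deterministic dynamics: reward 1 only when the last state is reached."""
    s2 = max(0, s - 1) if a == 0 else s + 1
    done = s2 == N_STATES - 1
    return s2, (1.0 if done else 0.0), done


def q_update(Q, s, a, r, s2, done):
    """Q-learning update. It never asks which policy produced the transition."""
    target = r if done else r + GAMMA * max(Q[s2])
    Q[s][a] += ALPHA * (target - Q[s][a])


def new_q():
    return [[0.0, 0.0] for _ in range(N_STATES)]


def train_offline(rng, n_transitions=2000, sweeps=20):
    """Offline: data is gathered up front by a uniform random policy."""
    data, s = [], 0
    for _ in range(n_transitions):
        a = rng.choice(ACTIONS)
        s2, r, done = step(s, a)
        data.append((s, a, r, s2, done))
        s = 0 if done else s2
    Q = new_q()
    for _ in range(sweeps):          # learning only ever sees the fixed dataset
        for transition in data:
            q_update(Q, *transition)
    return Q


def train_online(rng, n_steps=2000, eps=0.2):
    """Online: the learner's own epsilon-greedy policy gathers the data."""
    Q, s = new_q(), 0
    for _ in range(n_steps):
        if rng.random() < eps:
            a = rng.choice(ACTIONS)
        else:
            best = max(Q[s])         # greedy action, ties broken at random
            a = rng.choice([b for b in ACTIONS if Q[s][b] == best])
        s2, r, done = step(s, a)
        q_update(Q, s, a, r, s2, done)
        s = 0 if done else s2
    return Q


def greedy(Q):
    """Greedy action for every non-terminal state."""
    return [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print("offline:", greedy(train_offline(random.Random(0))))
    print("online: ", greedy(train_online(random.Random(1))))
```

On this toy problem both runs should settle on "move right" in every state. The only difference between the two functions is who collects the transitions and when they are fed to `q_update`, which is the workflow-versus-algorithm distinction being debated above.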
Thomas Lips retweeted
François Chollet
François Chollet@fchollet·
Because AI is an engineering discipline and not a scientific field, it's never possible to fully separate the properties of a given approach from those of its specific implementations. The artifact is the method.
English
53
127
970
69.2K
Thomas Lips retweeted
Russ Tedrake
Russ Tedrake@RussTedrake·
TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/ One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the technology, and to share a lot of details for how we're achieving it. youtube.com/watch?v=BEXFnr…
English
8
105
487
87.7K
Chris Paxton
Chris Paxton@chris_j_paxton·
There's been really interesting work lately on one-shot learning from demonstration, like this. Would love to know more about how this works; it makes robots much more usable as personal tools when you can do stuff like this
ARX@ARXrobotics

learning with just one video🦾

English
4
6
111
8.7K
Konstantin Mishchenko
Konstantin Mishchenko@konstmish·
@_tlips_ Typically takes about a month to get reviews, sometimes a couple of weeks extra if one of the reviewers didn't do anything. Then there is a discussion period, and once it's over the reviewers submit their decisions and the action editor submits theirs. In total, it's ~2-3 months.
English
1
0
2
279
Konstantin Mishchenko
Konstantin Mishchenko@konstmish·
A gentle reminder that TMLR is a great journal that allows you to submit your papers when they are ready rather than rushing to meet conference deadlines. The review process is fast, there are no artificial acceptance rates, and you have more space to present your ideas in the main body. And by the way, if you have experience reviewing for ML conferences and want to become a TMLR reviewer, please get in touch and share your email address. TMLR doesn't assign more than 1 submission at a time, and you can limit how many papers per year you get to review.
English
15
30
325
30.2K
Thomas Lips
Thomas Lips@_tlips_·
@DJiafei @JitendraMalikCV Agreed! Why don't we have it? Seems like most ingredients are available? And still, whenever you want to do something in sim, you are limited to the same ~50 tasks with little to no diversity.
English
0
0
1
77
Jiafei Duan
Jiafei Duan@DJiafei·
How do we accelerate progress in robotics? As @JitendraMalikCV has noted, computer vision leapt ahead once robust, openly-shared benchmarks appeared. Robotics still lacks that foundation. Real-world benchmarking is logistically expensive and hard to standardize; simulation is our most practical path. Yet many simulated suites plateau quickly, and raw success/failure tells us almost nothing—what does a 1% difference between two VLA models on LIBERO really mean?
What the field needs is a rigorous simulation benchmark that is:
- Challenging but unsolved – rich enough to reveal meaningful differences between approaches.
- Dynamic and evolving – updated regularly so it doesn’t become stale or “solved” in a single leaderboard cycle.
- Beyond binary success – equipped with fine-grained, diagnostic metrics that expose how and why policies succeed or fail, giving us actionable insight before real-world deployment.
Example of a really good benchmark that plateau.
Yi Ru (Helen) Wang@YiruHelenWang

🚨Tired of binary pass/fail metrics that miss the bigger picture? 🤖Introducing #RoboEval — an open benchmark that shows *how* robot manipulation policies behave and *why* they fail, not just *if* they succeed. 🧵1/n 🔗 robo-eval.github.io 📄 robo-eval.github.io/media/RoboEval…

English
1
5
30
4.3K
Thomas Lips
Thomas Lips@_tlips_·
@phillip_isola I really, really like this book. Only read a couple of chapters so far, but it seems to combine intuition with rigour in a way I haven't found before. Thanks for open-sourcing it!
English
0
0
0
43
Phillip Isola
Phillip Isola@phillip_isola·
Our computer vision textbook is now available for free online here: visionbook.mit.edu We are working on adding some interactive components like search and (beta) integration with LLMs. Hope this is useful and feel free to submit Github issues to help us improve the text!
English
34
604
2.9K
181.9K
Thomas Larsen
Thomas Larsen@thlarsen·
AI 2027 skeptics can't give a coherent scenario for how they think AI progress will go. I would love it if @GaryMarcus, the "AI is a normal technology" folks, @MechanizeWork people, etc., would actually write out what they think will happen
English
18
7
148
19.9K
Thomas Lips
Thomas Lips@_tlips_·
@helper2424 @VilleKuosmanen @mimicrobotics This is where 'common sense' comes in, which helps in dealing with failures and can hopefully be obtained with other (less expensive) data sources such as sim, third person demonstrations, non-action data,...
English
0
0
1
38
Thomas Lips
Thomas Lips@_tlips_·
@helper2424 @VilleKuosmanen @mimicrobotics I think many researchers are actively adding 'envisioned failures' explicitly to the train set during data collection. It seems to work to some extent, but I have the same feeling that covering all potential failures is rather hard.
English
1
0
1
26
Ville🤖
Ville🤖@VilleKuosmanen·
An interesting note from @mimicrobotics's paper - learning from corrections is the most underrated trick in today's robot learning that worked really well for me yet I rarely see it mentioned. You don't even need RL, just add the corrections as new episodes to IL data mix
English
3
0
31
2.5K