jpizarrom (@jpizarrom) - Twitter Profile

Pinned Tweet

jpizarrom@jpizarrom·Dec 3

🤖 Early-stage experiment: finetuning SmolVLA + RECAP-style advantage signals (inspired by the π*0.6 paper) as BC actor + one-step flow actor with AWR on SO100 @LeRobotHF linkedin.com/feed/update/ur… Previous experiments: ACFQL + HIL-SERL x.com/jpizarrom/stat… x.com/jpizarrom/stat…

jpizarrom@jpizarrom

@qiyang_li @zhiyuan_zhou_ @svlevine This policy was trained with ACFQL/QC-FQL on @LeRobotHF @huggingface on top of HIL-SERL github.com/huggingface/le… linkedin.com/posts/jpizarro…

jpizarrom@jpizarrom·Dec 10

@q17224 @LeRobotHF My current experiments focus more on the feasibility of learning. The next steps are reliability.

Jiabin@q17224·Dec 10

@jpizarrom @LeRobotHF Great work! Does the robot work continuously for hours without failing?

jpizarrom@jpizarrom·Dec 9

@RohanSeelan @LeRobotHF I don't know whether it is feasible or not, but I am learning a lot while trying to do it.

jpizarrom@jpizarrom·Dec 9

@RohanSeelan @LeRobotHF I recently found this paper, Refined Policy Distillation: From VLA Generalists to RL Experts arxiv.org/abs/2503.05833 Essentially, I am trying to check whether it is feasible to do something similar, but within ACFQL + VLA-BC finetuning with RECAP signals + one-step distillation.

jpizarrom@jpizarrom·Dec 8

@DrorIfrah @FRANKAROBOTICS @LeRobotHF The next steps focus on improving reliability and evaluation, as I was primarily focused on the feasibility of learning. Additionally, I plan to try updating ACFQL to support multi-task, longer-horizon scenarios. x.com/jpizarrom/stat…

Dror David Ifrah@DrorDavidIfrah·Dec 8

@jpizarrom @FRANKAROBOTICS @LeRobotHF That looks great, what's the next step?

jpizarrom@jpizarrom·Dec 7

In September I integrated the @FRANKAROBOTICS arm into ACFQL + HIL-SERL on the @LeRobotHF framework as my participation in the largest Humanoid Manipulation & AI Hackathon at the Deutsches Museum linkedin.com/posts/jpizarro… #LeRobot #Robotics #AI #DeepLearning #ReinforcementLearning

jpizarrom@jpizarrom·Dec 8

@dAmineKharrat @FRANKAROBOTICS @LeRobotHF It was the best hackathon experience I've ever had! 🚀 I am really thankful to @Nicolas_Keller and all the organizers who gave me the opportunity to participate. He wrote about the event in linkedin.com/feed/update/ur… I wrote about it in linkedin.com/posts/jpizarro…

Amine Kharrat@dAmineKharrat·Dec 8

@jpizarrom @FRANKAROBOTICS @LeRobotHF Cool, how was it?

jpizarrom@jpizarrom·Dec 6

@svlevine Thanks a lot for sharing such amazing work. I am implementing SmolVLA with a RECAP-style advantage signal as the BC actor in ACFQL + HIL-SERL on @LeRobotHF. It seems to stabilize training, allowing the one-step actor to start learning. x.com/jpizarrom/stat…

Sergey Levine@svlevine·Nov 18

To read about π*0.6 and Recap, check out the official blog post: pi.website/blog/pistar06 The blog post also links to a full research paper about the method, as well as a model card for the π0.6 model.

Sergey Levine@svlevine·Nov 18

We just released results for our newest VLA from Physical Intelligence: π*0.6. This one is trained with RL, and it makes it quite a bit better: often doubles throughput, enables real-world tasks like folding real laundry and making espresso drinks at the office.

jpizarrom retweeted

LeRobot@LeRobotHF·Dec 5

Great community project 💛, we are very excited to bring more real world RL to LeRobot. Follow the progress on our Discord!

jpizarrom retweeted

lil’km@_lilkm_·Dec 3

Thrilled to have worked on integrating the Earth Rover from @frodobots into @LeRobotHF! Everything's fully open source: dataset, software, hardware. Super excited to start training models on this massive 7K-hour dataset from 40+ cities. Democratizing urban nav research for all!

FrodoBots@frodobots

We're open-sourcing our Earth Rover platform with @huggingface & @sigrobotics! 🤖 Integrated hardware (electronics, software, 3D files) with @LeRobotHF 🌎 7,000 hours of driving data from 40+ cities, curated by UC Berkeley researchers Thread ↓

jpizarrom@jpizarrom·Dec 3

[4] arxiv.org/abs/2511.14759 π*0.6: a VLA That Learns From Experience
[5] huggingface.co/datasets/jpiza… Offline RL dataset (381 episodes)
[6] github.com/jpizarrom/lero… Current WIP branch with ACFQL + Flow Matching + VLA RECAP-style advantages
[7] github.com/huggingface/gy… gym-hil

jpizarrom@jpizarrom·Dec 3

🔗 References:
[1] github.com/huggingface/le… LeRobot: Making AI for Robotics more accessible with end-to-end learning
[2] github.com/huggingface/le… PR: Add Flow Q-learning (FQL) agent with action chunking
[3] arxiv.org/abs/2507.07969 Reinforcement Learning with Action Chunking

jpizarrom retweeted

Remi Cadene@RemiCadene·Dec 2

Grab your ticket to the Hardware Meetup at our event place ~Neon Noire~ A chance to meet the team at UMA! Speakers: Steven Palma (LeRobot Hugging Face), Aurélien Cord (CTO Stanley Robotics), Roch Molléro (CTO Gobano Robotics), & myself luma.com/y48iwply

Remi Cadene@RemiCadene

Humanity is at a turning point. I am launching UMA to build general-purpose mobile and humanoid robots from Europe. Proud to start with people I admired for years, and grateful for all your support! Reach out to us @UMA_Robots ❤️

jpizarrom@jpizarrom·Dec 2

@zhiyuan_zhou_ @qiyang_li @svlevine @LeRobotHF @huggingface Thank you very much for sharing such amazing work.

Paul Zhou@zhiyuan_zhou_·Dec 2

@jpizarrom @qiyang_li @svlevine @LeRobotHF @huggingface Another task 👀

Qiyang (Colin) Li@qiyang_li·Dec 1

Excited to be in San Diego attending NeurIPS this week! Come check out our paper "Reinforcement Learning with Action Chunking" (w/ @zhiyuan_zhou_, @svlevine) in Exhibit Hall C,D,E (#415) on Thursday 12/04, 11a - 2p. x.com/qiyang_li/stat…