lalo hernandez

1.1K posts

@saucesaft

Hi! I'm an undergraduate robotics student @ ITESM. Currently doing a research internship at JRL CNRS-AIST

Tsukuba-shi, Ibaraki · Joined January 2021

1K Following · 55 Followers

lalo hernandez retweeted

Suvaditya Mukherjee@halcyonrayes·26 Apr

based off of the original blog i wrote to show ray tracing in 15 minutes in pytorch, this new iteration is a blog that describes the same but in jax for tpus. the goal is to show that a ton of problems can be expressed in tpu-compliant ways. link below! x.com/halcyonrayes/s…

Suvaditya Mukherjee@halcyonrayes

ray tracing ☀️ is among the cleanest algorithms out there that had been fairly out of reach for consumer machines until gpus happened. over the weekend, i wrote a tiny 15-minute intro to ray-tracing in simple @PyTorch to introduce the algorithm and make a ray-traced sphere! (1/n)
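The core step behind a ray-traced sphere can be sketched in a few lines. This is a minimal, stdlib-only Python illustration, not the code from either blog post; the function name `ray_sphere_hit` and its arguments are hypothetical. It solves the quadratic you get by substituting the ray `origin + t * direction` into the sphere equation.

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Return the nearest positive distance t where the ray hits the sphere, or None."""
    # Vector from the sphere center to the ray origin
    oc = [o - c for o, c in zip(origin, center)]
    # Coefficients of a*t^2 + b*t + c = 0 from |origin + t*direction - center|^2 = radius^2
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # the ray misses the sphere
    sqrt_disc = math.sqrt(disc)
    # Try the nearer root first; skip hits behind the ray origin
    for t in ((-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)):
        if t > 0.0:
            return t
    return None

# A ray shot down the -z axis at a unit sphere centered 3 units away
hit = ray_sphere_hit((0.0, 0.0, 0.0), (0.0, 0.0, -1.0), (0.0, 0.0, -3.0), 1.0)
miss = ray_sphere_hit((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, -3.0), 1.0)
```

A full renderer would run this test once per pixel, which is the part that batches well on a GPU or TPU.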

lalo hernandez retweeted

Vector Wang@VectorWang2·20 Apr

No world model is accurate, especially the intuitive one in your head. Most video models collapse the future into one deterministic rollout, slowly. ManiDreams keeps the uncertainty, and plans over it in a modular framework: 😴 Dream, 🤔 Predict, 📦 Constrain. rice-robotpi-lab.github.io/ManiDreams/

lalo hernandez retweeted

Ilir Aliu@IlirAliu_·14 Apr

Latent Encoder-Decoder code base. Fully open sourced! You can train and visualize the latent space. [📍 Save it, to find it later when you need it] Thanks for sharing, Xueyan Zou (@xyz2maureen). Code: lnkd.in/dF_GnJAt Paper: lnkd.in/diya2BCs ——- Weekly robotics and AI insights. Subscribe free: 22astronauts.com

lalo hernandez retweeted

Yu Xiang@YuXiang_IRVL·2 Apr

Training a policy across four different hands in NVIDIA Isaac Lab: Leap, Allegro, Shadow, and MANO (human hand). Very cool work from @lfcasas7

lalo hernandez retweeted

JulianSaks@JulianSaks·29 Mar

let’s break this list down so it’s actually useful
• JEPA / H-JEPA: avoids predicting every single pixel (too expensive) and rather predicts in latent space. H-JEPA adds hierarchy - short term details vs long term planning, i.e. how humans actually learn
• I-JEPA: built for very efficient vision models. Masks image patches and predicts the semantics, and in doing so bypasses the heavy compute of traditional autoencoders
• MC-JEPA & V-JEPA: both of these are built for videos. MC-JEPA separates content (what an object is) vs motion (how it moves). V-JEPA masks video features with no text labels, making it perfect for action tracking at scale
• Audio-JEPA: filters out background noise by treating sounds like visuals
• Point-JEPA & 3D-JEPA: used primarily in AVs. Uses LiDAR point clouds & volumetric grids
• ACT-JEPA: filters out real world noise to learn manipulation tasks efficiently via imitation learning
• V-JEPA 2: predicts future physical states of the world caused by an action before it happens
• LeJEPA: replaces techniques like masking with an Energy-Based Model (EBM) which mathematically prevents "feature collapse" & ensures the model scales reliably as data increases
• Causal-JEPA: for learning true cause-and-effect physics by applying object level masking
• V-JEPA 2.1: great for spatial grounding since it combines a dense predictive loss across image & video
• LeWorldModel: built directly on LeJEPA's math but super compact - 15M params
• ThinkJEPA: uses dense physical prediction with VLM reasoning. Best used when long-term strategy is needed

Turing Post@TheTuringPost

14 most important and influential types of JEPA ▪️ JEPA / H-JEPA ▪️ I-JEPA ▪️ MC-JEPA ▪️ V-JEPA ▪️ Audio-JEPA ▪️ Point-JEPA ▪️ 3D-JEPA ▪️ ACT-JEPA ▪️ V-JEPA 2 ▪️ LeJEPA ▪️ Causal-JEPA ▪️ V-JEPA 2.1 ▪️ LeWorldModel ▪️ ThinkJEPA Save the list and check this out to explore these JEPA milestones as a map of AI progress: turingpost.com/p/jepamap

lalo hernandez retweeted

C. Zhang@ChongZzZhang·30 Mar

Because I do not want to work, I wrote a new tech blog on sim2real for legged locomotion: zita-ch.github.io/tech-blog/?pos… It is a brief summary of how I do sim2real, mostly focused on 4 aspects: asset, contact, actuation, and perception.

lalo hernandez@saucesaft·17 Mar

Take a look! Code: github.com/saucesaft/diff… Reference paper: arxiv.org/abs/2404.02887 #robotics #locomotion #differentiablesimulation #mujoco #JAX

lalo hernandez@saucesaft·17 Mar

I replicated their result in MuJoCo MJX, a pure JAX physics engine, so the entire pipeline runs as a single differentiable JAX program. ANYmal learns to walk from scratch in about 2 hours on a single laptop GPU, no reference trajectories, no motion capture data.
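To make "a single differentiable program" concrete, here is a toy sketch in plain Python. It is not the MJX pipeline and not the linked repository's code: it is a hypothetical point mass under gravity and linear drag, with the derivative of the final height with respect to the launch speed carried through every Euler step by hand. In JAX, `jax.grad` would produce that derivative automatically from the rollout alone. All names and constants below are assumptions chosen for illustration.

```python
def rollout(v0, steps=100, dt=0.01, g=9.81, drag=0.1):
    """Euler-integrate a point mass thrown upward; return (height, d_height/d_v0)."""
    x, v = 0.0, v0
    dx, dv = 0.0, 1.0  # derivatives of x and v with respect to v0
    for _ in range(steps):
        # Position update uses the current velocity (explicit Euler)
        x, dx = x + v * dt, dx + dv * dt
        # Velocity update: gravity plus linear drag, differentiated alongside
        v, dv = v + (-g - drag * v) * dt, dv * (1.0 - drag * dt)
    return x, dx

def fit_launch_speed(target, v0=0.0, lr=0.5, iters=50):
    """Gradient descent on the launch speed so the final height matches target."""
    for _ in range(iters):
        x, dx = rollout(v0)
        grad = 2.0 * (x - target) * dx  # chain rule on the loss (x - target)^2
        v0 -= lr * grad
    return v0

v0 = fit_launch_speed(target=2.0)
height, _ = rollout(v0)
```

The optimizer never sees a reference trajectory, only the gradient of the outcome with respect to the input, which is the same idea at a much smaller scale.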

lalo hernandez@saucesaft·17 Mar

What if physics had a gradient? (1/5)

lalo hernandez retweeted

Tianye Ding@TianyeJerryDing·10 Mar

Excited to share that our paper LASER has been accepted to #CVPR2026! We bridge the gap between high-quality offline reconstruction and real-time streaming. We can now turn SOTA models like VGGT and π³ into streaming systems—training-free.⚡️ Kilometer-scale reconstruction at 14 FPS. (1/n)

lalo hernandez@saucesaft·4 Mar

really early version, need to iron some things out github.com/saucesaft/recu…

lalo hernandez@saucesaft·4 Mar

releasing recurrl-jax, a highly customizable recurrent RL library built on JAX/MJX
• PPO & A2C with LSTM, GRU, GTrXL
• asymmetric actor-critic
• batched MJX environments with domain randomization
going to be posting more about the development here

lalo hernandez retweeted

Ben Clavié@bclavie·23 Feb

a propos of nothing

lalo hernandez retweeted

Siddharth Ancha@siddancha·19 Feb

Really interesting work! Pretrained representations like DINOv2 trained with contrastive losses seem to lie on a high-dimensional sphere. So instead of standard flow matching with straight line paths, we should do Riemannian flow matching constrained on this manifold!

Amandeep Kumar@Amandeep__kumar

🚀 Unlocking Standard Diffusion Transformers on Representation Encoders Why do standard DiTs fail to converge on high-dimensional features like DINOv2? 📉 We found the answer isn't just "more parameters"—it's Geometry. Introducing Riemannian Flow Matching with Jacobi Regularization (RJF) 📄 Paper: arxiv.org/abs/2602.10099
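The geometric point in the two posts above can be seen with a small stdlib-only Python sketch. It is not the RJF method from the paper, only an illustration of the difference between a straight-line path and a geodesic (spherical interpolation) between two unit vectors; the helper names `lerp`, `slerp`, and `norm` are hypothetical.

```python
import math

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def lerp(x0, x1, t):
    """Straight-line interpolation, which cuts through the inside of the sphere."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def slerp(x0, x1, t):
    """Geodesic interpolation between unit vectors, which stays on the sphere.

    Assumes x0 and x1 are unit length and not antipodal (the geodesic is
    not unique in that case).
    """
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(x0, x1))))
    omega = math.acos(dot)  # angle between the two points
    if omega < 1e-8:
        return list(x0)  # the points coincide
    s = math.sin(omega)
    w0 = math.sin((1.0 - t) * omega) / s
    w1 = math.sin(t * omega) / s
    return [w0 * a + w1 * b for a, b in zip(x0, x1)]

x0, x1 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
straight_mid = lerp(x0, x1, 0.5)   # leaves the unit sphere
geodesic_mid = slerp(x0, x1, 0.5)  # remains on the unit sphere
```

At the midpoint the straight-line path has norm of about 0.707, while the geodesic path keeps norm 1, so intermediate samples stay on the manifold the features live on.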

lalo hernandez retweeted

Ayush Tewari@_atewari·14 Feb

Is pixel prediction the best way to build a world model? Check out VDAWorld, an alternative path to building interpretable, editable, and physically grounded world models. We use a VLM to build a simulation of the scene with the help of a computer vision toolbox.

lalo hernandez retweeted

Ai2@allen_ai·11 Feb

Introducing MolmoSpaces, a large-scale, fully open platform + benchmark for embodied AI research. 🤖 230k+ indoor scenes, 130k+ object models, & 42M annotated robotic grasps—all in one ecosystem.

lalo hernandez retweeted

Interesting Engineering@IntEngineering·9 Feb

The SoftFoot Pro by the Istituto Italiano di Tecnologia mimics human biomechanics to achieve a more natural gait for its user, without any motor needed.

lalo hernandez retweeted

机器之心 JIQIZHIXIN@jiqizhixin·5 Feb

New paradigm from Kaiming He's team: Drifting Models! With this approach, you can generate a perfect image in a single step. The team trains a "drifting field" that smoothly moves samples toward equilibrium with the real data distribution. The result? A one-step generator that sets a new SOTA on ImageNet 256x256, beating complex multi-step models.
