lalo hernandez

1.1K posts

@saucesaft

Hi! I'm an undergraduate robotics student @ ITESM. Currently doing a research internship at JRL CNRS-AIST

Tsukuba-shi, Ibaraki · Joined January 2021
1K Following · 55 Followers

lalo hernandez reposted
Suvaditya Mukherjee @halcyonrayes
based off of the original blog i wrote to show ray tracing in 15 minutes in pytorch, this new iteration is a blog that describes the same but in jax for tpus. the goal is to show that a ton of problems can be expressed in tpu-compliant ways. link below! x.com/halcyonrayes/s…
[image]
Suvaditya Mukherjee @halcyonrayes

ray tracing ☀️ is among the cleanest algorithms out there that had been fairly out of reach for consumer machines until gpus happened. over the weekend, i wrote a tiny 15-minute intro to ray-tracing in simple @PyTorch to introduce the algorithm and make a ray-traced sphere! (1/n)
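The core of a ray-traced sphere is a single quadratic solve per pixel. The sketch below is not the code from the linked blog posts (which use PyTorch and JAX); it is a plain-Python illustration, and the function names, camera placement, and image size are all assumptions made for the example.

```python
import math


def ray_sphere_hit(origin, direction, center, radius):
    """Return the nearest positive distance t where the ray hits the sphere, or None."""
    # Vector from the sphere center to the ray origin.
    oc = [o - c for o, c in zip(origin, center)]
    # Coefficients of |origin + t * direction - center|^2 = radius^2.
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # The ray misses the sphere entirely.
    sqrt_disc = math.sqrt(disc)
    # Try the nearer root first, then the farther one.
    for t in ((-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)):
        if t > 1e-9:
            return t
    return None  # Both intersections lie behind the ray origin.


def render(width=24, height=12):
    """ASCII-render a unit sphere at z = -3 seen from the origin (hypothetical setup)."""
    rows = []
    for j in range(height):
        row = ""
        for i in range(width):
            # Map the pixel center onto a [-1, 1] image plane at z = -1.
            x = 2.0 * (i + 0.5) / width - 1.0
            y = 1.0 - 2.0 * (j + 0.5) / height
            hit = ray_sphere_hit((0.0, 0.0, 0.0), (x, y, -1.0), (0.0, 0.0, -3.0), 1.0)
            row += "#" if hit is not None else "."
        rows.append(row)
    return rows
```

Calling `print("\n".join(render()))` draws the sphere as a blob of `#` characters. A tensor version would evaluate the same quadratic for every pixel at once instead of looping.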

1 reply · 4 reposts · 51 likes · 9.4K views

lalo hernandez reposted
Vector Wang @VectorWang2
No world model is accurate, especially the intuitive one in your head.
Most video models collapse the future into one deterministic rollout, slowly.
ManiDreams keeps the uncertainty and plans over it in a modular framework.
😴 Dream, 🤔 Predict, 📦 Constrain
rice-robotpi-lab.github.io/ManiDreams/
8 replies · 37 reposts · 314 likes · 158.4K views

lalo hernandez reposted
Ilir Aliu @IlirAliu_
Latent Encoder-Decoder code base. Fully open sourced! You can train and visualize the latent space.
[📍 Save it to find it later when you need it]
Thanks for sharing, Xueyan Zou (@xyz2maureen).
Code: lnkd.in/dF_GnJAt
Paper: lnkd.in/diya2BCs
Weekly robotics and AI insights. Subscribe free: 22astronauts.com
0 replies · 12 reposts · 68 likes · 5.7K views

lalo hernandez reposted
Yu Xiang @YuXiang_IRVL
Training a policy across four different hands in NVIDIA Isaac Lab: Leap, Allegro, Shadow, and MANO (human hand). Very cool work from @lfcasas7
1 reply · 10 reposts · 100 likes · 7.2K views

lalo hernandez reposted
JulianSaks @JulianSaks
let’s break this list down so it’s actually useful
• JEPA / H-JEPA: avoids predicting every single pixel (too expensive) and instead predicts in latent space. H-JEPA adds hierarchy: short-term details vs. long-term planning, i.e. how humans actually learn
• I-JEPA: built for very efficient vision models. Masks image patches and predicts their semantics, which bypasses the heavy compute of traditional autoencoders
• MC-JEPA & V-JEPA: both are built for video. MC-JEPA separates content (what an object is) from motion (how it moves). V-JEPA masks video features with no text labels, making it a good fit for action tracking at scale
• Audio-JEPA: filters out background noise by treating sounds like visuals
• Point-JEPA & 3D-JEPA: used primarily in AVs. They use LiDAR point clouds & volumetric grids
• ACT-JEPA: filters out real-world noise to learn manipulation tasks efficiently via imitation learning
• V-JEPA 2: predicts the future physical states of the world caused by an action before it happens
• LeJEPA: replaces techniques like masking with an Energy-Based Model (EBM), which mathematically prevents "feature collapse" & ensures the model scales reliably as data increases
• Causal-JEPA: learns true cause-and-effect physics by applying object-level masking
• V-JEPA 2.1: great for spatial grounding since it combines a dense predictive loss across image & video
• LeWorldModel: built directly on LeJEPA's math but super compact, at 15M params
• ThinkJEPA: combines dense physical prediction with VLM reasoning. Best used when long-term strategy is needed
Turing Post @TheTuringPost

14 most important and influential types of JEPA ▪️ JEPA / H-JEPA ▪️ I-JEPA ▪️ MC-JEPA ▪️ V-JEPA ▪️ Audio-JEPA ▪️ Point-JEPA ▪️ 3D-JEPA ▪️ ACT-JEPA ▪️ V-JEPA 2 ▪️ LeJEPA ▪️ Causal-JEPA ▪️ V-JEPA 2.1 ▪️ LeWorldModel ▪️ ThinkJEPA Save the list and check this out to explore these JEPA milestones as a map of AI progress: turingpost.com/p/jepamap

10 replies · 95 reposts · 659 likes · 82K views

lalo hernandez reposted
C. Zhang @ChongZzZhang
Because I do not want to work, I wrote a new tech blog on sim2real for legged locomotion: zita-ch.github.io/tech-blog/?pos…
It is a brief summary of how I do sim2real, mostly focused on 4 aspects: asset, contact, actuation, and perception.
6 replies · 33 reposts · 310 likes · 26.2K views

lalo hernandez @saucesaft
I replicated their result in MuJoCo MJX, a pure JAX physics engine, so the entire pipeline runs as a single differentiable JAX program. ANYmal learns to walk from scratch in about 2 hours on a single laptop GPU, no reference trajectories, no motion capture data.
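To make "a single differentiable program" concrete: if every simulation step is built from differentiable operations, the derivative of the final state with respect to an input can be carried through the whole rollout. The sketch below is not MJX or the pipeline described in the post. It is a toy point mass under gravity, differentiated with hand-rolled forward-mode dual numbers in plain Python, and every name in it (`Dual`, `rollout`, `fit_initial_velocity`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one chosen input."""
    val: float
    grad: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.grad + other.grad)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u v' + u' v.
        return Dual(self.val * other.val, self.val * other.grad + self.grad * other.val)

    __rmul__ = __mul__


def _lift(x):
    """Wrap plain numbers as constants (zero derivative)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def rollout(v0, steps=100, dt=0.01, g=-9.81):
    """Semi-implicit Euler rollout of a point mass thrown upward; returns final height."""
    pos, vel = _lift(0.0), _lift(v0)
    for _ in range(steps):
        vel = vel + g * dt    # update velocity from gravity
        pos = pos + vel * dt  # update position from the new velocity
    return pos


def fit_initial_velocity(target, v0=0.0, lr=0.25, iters=60):
    """Gradient descent on (final height - target)^2 using the rollout's own gradient."""
    for _ in range(iters):
        err = rollout(Dual(v0, 1.0)) + (-target)  # seed d(v0)/d(v0) = 1
        loss = err * err
        v0 -= lr * loss.grad
    return v0
```

`fit_initial_velocity(1.0)` searches for the launch velocity that leaves the mass at height 1.0 after one simulated second, using only gradients that flowed through the simulator. A JAX-based engine offers the same idea through reverse-mode autodiff over far larger state.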
1 reply · 0 reposts · 0 likes · 34 views

lalo hernandez @saucesaft
What if physics had a gradient? (1/5)
[GIF]
1 reply · 0 reposts · 0 likes · 20 views

lalo hernandez reposted
Tianye Ding @TianyeJerryDing
Excited to share that our paper LASER has been accepted to #CVPR2026! We bridge the gap between high-quality offline reconstruction and real-time streaming. We can now turn SOTA models like VGGT and π³ into streaming systems—training-free.⚡️ Kilometer-scale reconstruction at 14 FPS. (1/n)
7 replies · 44 reposts · 313 likes · 15.6K views

lalo hernandez @saucesaft
releasing recurrl-jax, a highly customizable recurrent RL library built on JAX/MJX
• PPO & A2C with LSTM, GRU, GTrXL
• asymmetric actor-critic
• batched MJX environments with domain randomization
going to be posting more about the development here
1 reply · 1 repost · 1 like · 31 views

lalo hernandez reposted
Ben Clavié @bclavie
a propos of nothing
[image]
0 replies · 9 reposts · 173 likes · 8.1K views

lalo hernandez reposted
Siddharth Ancha @siddancha
Really interesting work! Pretrained representations like DINOv2 trained with contrastive losses seem to lie on a high-dimensional sphere. So instead of standard flow matching with straight line paths, we should do Riemannian flow matching constrained on this manifold!
Amandeep Kumar @Amandeep__kumar

🚀 Unlocking Standard Diffusion Transformers on Representation Encoders Why do standard DiTs fail to converge on high-dimensional features like DINOv2? 📉 We found the answer isn't just "more parameters"—it's Geometry. Introducing Riemannian Flow Matching with Jacobi Regularization (RJF) 📄 Paper: arxiv.org/abs/2602.10099
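The geometric point in this exchange can be checked in a few lines: a straight-line path between two unit vectors cuts through the inside of the sphere, while a great-circle (geodesic) path stays on it. The sketch below illustrates only that geometry, not the RJF method from the paper, and the helper names are assumptions made for the example.

```python
import math


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def lerp(a, b, t):
    """Straight-line interpolation, as in standard flow matching paths."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def slerp(a, b, t):
    """Great-circle interpolation between unit vectors a and b.

    Assumes a and b are unit length and not antipodal.
    """
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)  # angle between the two points
    if omega < 1e-8:
        return list(a)      # points coincide, nothing to interpolate
    s = math.sin(omega)
    wa = math.sin((1.0 - t) * omega) / s
    wb = math.sin(t * omega) / s
    return [wa * x + wb * y for x, y in zip(a, b)]


a = [1.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0]
mid_line = lerp(a, b, 0.5)      # leaves the sphere: norm is about 0.707
mid_geodesic = slerp(a, b, 0.5)  # stays on the sphere: norm is 1 up to rounding
```

Comparing `norm(mid_line)` with `norm(mid_geodesic)` shows the gap: the straight-line midpoint has shrunk toward the origin, which is the off-manifold drift the posts are describing.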

0 replies · 22 reposts · 178 likes · 20K views

lalo hernandez reposted
Ayush Tewari @_atewari
Is pixel prediction the best way to build a world model? Check out VDAWorld, an alternative path to building interpretable, editable, and physically grounded world models. We use a VLM to build a simulation of the scene with the help of a computer vision toolbox.
7 replies · 15 reposts · 165 likes · 9K views

lalo hernandez reposted
Ai2 @allen_ai
Introducing MolmoSpaces, a large-scale, fully open platform + benchmark for embodied AI research. 🤖 230k+ indoor scenes, 130k+ object models, & 42M annotated robotic grasps—all in one ecosystem.
10 replies · 103 reposts · 723 likes · 96.6K views

lalo hernandez reposted
Interesting Engineering @IntEngineering
The SoftFoot Pro by the Istituto Italiano di Tecnologia mimics human biomechanics to give its user a more natural gait, with no motor needed.
4 replies · 67 reposts · 346 likes · 26.1K views

lalo hernandez reposted
机器之心 JIQIZHIXIN @jiqizhixin
New paradigm from Kaiming He's team: Drifting Models! With this approach, you can generate a perfect image in a single step. The team trains a "drifting field" that smoothly moves samples toward equilibrium with the real data distribution. The result? A one-step generator that sets a new SOTA on ImageNet 256x256, beating complex multi-step models.
[image]
15 replies · 162 reposts · 1.3K likes · 319.9K views