Sentientx

8 posts

@Sentientx_io

Embodied AI Lab. We collect real world robot data for humanoid manipulation models.

Joined January 2026
12 Following · 59 Followers

Sentientx reposted
satyam
satyam@satiyum·
More data leads to better models. And more data is not as hard to get as people claim. Don’t solve problems that don’t really exist. Scale real world data.
elvis@omarsar0

First empirical evidence that VLA models scale with massive real-world robot data.

VLA foundation models promise robots that can follow natural language instructions and adapt to new tasks quickly. However, the field has lacked comprehensive studies on how performance actually scales with real-world data.

This new research introduces LingBot-VLA, a Vision-Language-Action foundation model trained on approximately 20,000 hours of real-world manipulation data from 9 dual-arm robot configurations. Scaling pre-training data from 3,000 hours to 20,000 hours improves downstream success rates consistently, with no signs of saturation. More data still helps.

The architecture uses a Mixture-of-Transformers design that couples a pre-trained VLM (Qwen2.5-VL) with an action expert through shared self-attention. This allows high-dimensional semantic priors to guide action generation while avoiding cross-modal interference.

On the GM-100 benchmark, spanning 100 tasks across 3 robotic platforms with 22,500 evaluation trials, LingBot-VLA achieves a 17.30% success rate and a 35.41% progress score, outperforming π0.5 (13.02% SR, 27.65% PS), GR00T N1.6 (7.59% SR, 15.99% PS), and WALL-OSS (4.05% SR, 10.35% PS). In simulation on RoboTwin 2.0, the model reaches an 88.56% success rate in clean scenes and 86.68% in randomized environments, beating π0.5 by 5.82% and 9.92% respectively.

Training efficiency matters for scaling. Their optimized codebase achieves 261 samples per second per GPU on an 8-GPU setup, a 1.5-2.8× speedup over existing VLA codebases such as StarVLA, OpenPI, and DexBotic. Data efficiency is equally impressive: with only 80 demonstrations per task, LingBot-VLA outperforms π0.5 using the full 130-demonstration set.

This is the first empirical demonstration that VLA performance continues to scale with more real-world robot data without saturation, providing a clear roadmap for building more capable robotic foundation models.

Paper: arxiv.org/abs/2601.18692
Learn to build effective AI agents in our academy: dair-ai.thinkific.com
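The shared self-attention coupling described above can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the widths, token counts, and random features are toy placeholders, and the single attention layer stands in for the full Mixture-of-Transformers stack. The key idea it shows is modality-specific projection weights combined with one attention operation over the concatenated VLM + action token sequence.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d = 16                     # toy model width
n_vlm, n_act = 6, 4        # VLM tokens, action-expert tokens

# Mixture-of-Transformers idea: each modality keeps its own Q/K/V
# projection parameters, avoiding cross-modal parameter interference.
W = {m: {k: rng.normal(scale=d ** -0.5, size=(d, d)) for k in "qkv"}
     for m in ("vlm", "act")}

vlm_tokens = rng.normal(size=(n_vlm, d))  # stands in for Qwen2.5-VL features
act_tokens = rng.normal(size=(n_act, d))  # action-expert latents

def project(tokens, params):
    return tuple(tokens @ params[k] for k in "qkv")

q_v, k_v, v_v = project(vlm_tokens, W["vlm"])
q_a, k_a, v_a = project(act_tokens, W["act"])

# ...but the attention itself is shared: queries from both modalities
# attend over the concatenated key/value sequence, so semantic priors
# from the VLM tokens flow directly into the action tokens.
Q = np.concatenate([q_v, q_a])
K = np.concatenate([k_v, k_a])
V = np.concatenate([v_v, v_a])
attn = softmax(Q @ K.T / np.sqrt(d))
out = attn @ V

act_out = out[n_vlm:]      # updated action tokens, informed by VLM context
print(act_out.shape)       # (4, 16)
```

In a real model this block would be stacked, multi-headed, and initialized from the pre-trained VLM on the vision-language side; the sketch only captures the routing pattern.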

Sentientx reposted
Sentientx
Sentientx@Sentientx_io·
CES was a hit
Sentientx
Sentientx@Sentientx_io·
Core idea: let an LLM actively steer the human during recording to induce structure that passive egocentric video systematically misses.

Most egocentric datasets are heavily biased toward a world where:
• tasks rarely fail
• failures rarely compound
• abandoning a task midway is always acceptable (low stakes)

This creates deceptively “clean” trajectories.
Sentientx
Sentientx@Sentientx_io·
We’re experimenting with LLM-controlled egocentric data collection for world models and robotics learning. The LLM assigns long-horizon objectives, interrupts with immediate sub-tasks, lets the episode drift into side tasks, and forces failure + recovery. This produces intent → action → outcome → correction loops in real time, not post-hoc labels.

Claim: diversity-aware prompting can fill data gaps passive video never will, without brute-force scale. Curious whether this actually helps world models learn controllable, general dynamics. 🧵
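The director loop described above can be sketched as a tiny state machine. Everything here is hypothetical scaffolding: `llm_policy` is a stub standing in for a real LLM call, and the directive names are invented labels for the four behaviors named in the post (long-horizon objective, sub-task interruption, side-task drift, forced failure + recovery).

```python
import random

# Directive types mirroring the thread's four intervention modes.
DIRECTIVES = ["long_horizon", "interrupt_subtask", "side_task", "force_failure"]

def llm_policy(history, rng):
    """Stub for an LLM director: a diversity-aware policy that biases
    toward directive types not yet covered in this episode."""
    seen = {d for d, _ in history}
    missing = [d for d in DIRECTIVES if d not in seen]
    return rng.choice(missing or DIRECTIVES)

def run_episode(steps=8, seed=0):
    rng = random.Random(seed)
    log = []  # (directive, outcome) pairs: the intent → action → outcome loop
    for _ in range(steps):
        directive = llm_policy(log, rng)
        # Simulated outcome. A forced failure is always followed by a
        # recovery step, producing the failure → correction structure
        # that passively recorded egocentric video rarely contains.
        outcome = "failed" if directive == "force_failure" else "done"
        log.append((directive, outcome))
        if outcome == "failed":
            log.append(("recover", "done"))
    return log

episode = run_episode()
print(episode[:3])
```

In a real recording session the "outcome" would come from the human wearer (or a perception model) rather than being simulated, and the LLM would see the full interaction transcript, but the control flow is the same.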
Sentientx
Sentientx@Sentientx_io·
The hoodie was more impactful than a booth