Pinned Tweet

Anurag Bagchi
@Miccooper9
CMU, ex-TikTok AI https://t.co/8BWbkmDWJJ
Joined August 2021
1.3K Following · 197 Followers
Anurag Bagchi retweeted

Excited to share our project - Sim2Reason!
Key Insight: Simulators are an untapped source of cheap supervision for scientific reasoning. LLMs can learn physical reasoning from simulation to improve on real-world benchmarks such as the International Physics Olympiad! (A toy sketch of simulator-minted supervision follows below.)
Mihir Prabhudesai @mihirp98
What if AI learned physics the way Newton did – by experiencing it? We built Sim2Reason: train LLMs inside virtual worlds governed by real physics laws, zero human annotation. Result: +5–10% improvement on International Physics Olympiad, zero-shot. 🧵
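Not from the paper, just a toy sketch of the general idea that simulators can mint supervision for free: a closed-form projectile "simulator" generates question/answer pairs with zero human annotation. The function names and the QA format here are my own illustration, not Sim2Reason's actual environments or training pipeline.

import math
import random

def projectile_range(v0, angle_deg, g=9.81):
    # Ideal flat-ground range; the "simulator" here is just closed-form physics.
    theta = math.radians(angle_deg)
    return v0 * v0 * math.sin(2.0 * theta) / g

def make_example(rng):
    # Sample a scenario, then read the label off the simulator: no human in the loop.
    v0 = rng.uniform(5.0, 30.0)     # launch speed, m/s
    ang = rng.uniform(10.0, 80.0)   # launch angle, degrees
    question = (f"A projectile is launched at {v0:.1f} m/s and {ang:.0f} degrees "
                f"on flat ground with no drag. How far does it travel?")
    answer = f"{projectile_range(v0, ang):.2f} m"
    return {"question": question, "answer": answer}

rng = random.Random(0)
for example in (make_example(rng) for _ in range(3)):
    print(example["question"], "->", example["answer"])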
Anurag Bagchi retweeted

In my recent blog post, I argue that "vision" is only well-defined as part of perception-action loops, and that the conventional view of computer vision - mapping imagery to intermediate representations (3D, flow, segmentation...) - is about to go away.
vincentsitzmann.com/blog/bitter_le…

@zhihelu1 Thanks! For world models, or more precisely forward dynamics models (current state + action -> future state), this is the standard formulation. There are many model-based control approaches that can plan and predict actions using such a world model.
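To make "plan with a forward dynamics model" concrete, here is a minimal random-shooting MPC sketch. The dynamics and cost functions are placeholder assumptions standing in for a learned world model and a task objective; this illustrates one standard model-based control recipe, not any specific paper's method.

import numpy as np

def dynamics(state, action):
    # Stand-in for a learned world model f(s, a) -> s'; trivially linear here.
    return state + 0.1 * action

def cost(state, goal):
    # Task objective: squared distance to the goal state.
    return float(np.sum((state - goal) ** 2))

def plan(state, goal, horizon=10, n_samples=256, action_dim=2, seed=0):
    # Random-shooting MPC: sample action sequences, roll each out through the
    # model, and return the first action of the cheapest imagined trajectory.
    rng = np.random.default_rng(seed)
    sequences = rng.uniform(-1.0, 1.0, size=(n_samples, horizon, action_dim))
    best_cost, best_first_action = float("inf"), None
    for seq in sequences:
        s = state
        for a in seq:
            s = dynamics(s, a)
        c = cost(s, goal)
        if c < best_cost:
            best_cost, best_first_action = c, seq[0]
    return best_first_action

print(plan(np.zeros(2), np.ones(2)))  # execute this action, then replan from the new state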

@Miccooper9 Great work! One concern: the action trajectory is the input for future-frame generation, but in practice we often do not have the ground-truth trajectory. How do we handle this case?

[6/6] Fine-grained humanoid manipulation
EgoWM enables precise 25-DoF joint-angle manipulation with the EVE-1X humanoid, even at 4× temporal compression (Cosmos-2B).
Learn more:
Project page: egowm.github.io
Paper: arxiv.org/pdf/2601.15284
Anurag Bagchi retweeted

My role on Meta's SAM team (MSL, previously at FAIR Perception) has been impacted within 3 months of joining after my PhD.
If you work with multimodal LLMs for grounding or complex reasoning, or have a long-term vision of unified understanding and generation, let's talk.
I am on the job market starting immediately.
#metalayoffs #FAIR #MSL #SAM
Jiaxun Cui 🐿️ @cuijiaxun
Meta has gone crazy on the squid game! Many new PhD NGs (new grads) were deactivated today (I am also impacted 🥲, happy to chat)

Happening now! ICCV 25 poster #323
Drop by to chat and see some cool results!
Anurag Bagchi @Miccooper9
[ICCV 25] Refer Everything Model (REM) (1/6) We leverage text-to-video generation models to zero-shot segment any concept in a video using text. REM generalises to dynamic concepts like smoke, light beams, and more, without ever having seen segmentation masks for these entities.
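Purely a speculative sketch of the kind of interface such a method exposes: video plus text prompt in, per-frame masks out. VideoBackboneStub and its cross_attention readout are hypothetical stand-ins for illustration only, not REM's actual architecture or API.

import numpy as np

class VideoBackboneStub:
    # Hypothetical stand-in for a text-to-video generation backbone that
    # exposes text-conditioned cross-attention maps over video frames.
    def cross_attention(self, video, prompt):
        t, h, w, _ = video.shape
        rng = np.random.default_rng(sum(map(ord, prompt)))
        return rng.random((t, h, w))  # per-pixel relevance of the prompt

def segment(video, prompt, model, threshold=0.5):
    # Zero-shot segmentation: read out the prompt's relevance map for each
    # frame and threshold it into a binary mask (no segmentation labels used).
    relevance = model.cross_attention(video, prompt)  # (T, H, W)
    return relevance > threshold

video = np.zeros((8, 32, 32, 3), dtype=np.float32)  # T x H x W x C
masks = segment(video, "smoke", VideoBackboneStub())
print(masks.shape, masks.dtype)  # (8, 32, 32) bool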

@andrew_n_carr Thanks Andrew! We were also really surprised to see how well this worked. Exciting times ahead.

(6/6) We’re at the start of the internet-scale "video" era, and the possibilities are exciting. Learn more at refereverything.github.io — our code & model weights are available. Visiting ICCV? Come see our poster on Oct 23 to chat and see results in action!
