Jinyu Hou

25 posts

@jinyuhou0

PhDing @LTIatCMU || MS @MLDCMU || HBSc @UofT Interested in agent, world model, RL

Pittsburgh, PA · Joined December 2017
354 Following · 187 Followers
Pinned Tweet
Jinyu Hou retweeted
Benhao Huang (@huskydogewoof)
🌀 Introducing 𝐄𝐪𝐮𝐢𝐥𝐢𝐛𝐫𝐢𝐮𝐦 𝐑𝐞𝐚𝐬𝐨𝐧𝐞𝐫𝐬 (𝐄𝐪𝐑)! Feedforward models and weight-tied models behave very differently on hard reasoning generalization. EqR pushes this difference to the extreme by learning 𝐭𝐚𝐬𝐤-𝐜𝐨𝐧𝐝𝐢𝐭𝐢𝐨𝐧𝐞𝐝 𝐧𝐞𝐮𝐫𝐚𝐥 𝐚𝐭𝐭𝐫𝐚𝐜𝐭𝐨𝐫𝐬.
• Sudoku-Extreme: 99.8%
• Maze: 93%
#ICML2026
11 replies · 61 retweets · 278 likes · 64.9K views
Jinyu Hou retweeted
Mingkai Deng (@mdeng34)
Really interesting post -- agreed that our goals should be physical AGI, and goal-driven beats idea-driven. Though we see it differently on a couple of things:
1. If you pick 99% + 1 hour of demonstrated task data as your success criteria, a world model will surely look unnecessary. But physical AGI is about dealing with situations you **cannot** demonstrate ahead of time. This is not a methods debate, but a goal debate. A world model solves this problem by simulating possible outcomes and generating synthetic experience for unseen tasks.
2. One useful analogy: LLMs aren't strong just because of post-training RL. Self-supervised pretraining is arguably the source of their intelligence. World models play the same role for physical AI -- they're not a training trick you can skip with more data, but an indispensable component for understanding and reasoning.
3. Language is not just a "crutch while we don't have enough robotics data" -- it encodes institutions, social norms, and mental states that physical interaction data can't capture efficiently, regardless of scale.
This is what led us to the GLP (Generative Latent Prediction) world model architecture. It includes an enhanced LLM dynamics backbone and mixed continuous/discrete latent states. Language and physical commonsense aren't A or B, but complementary abstractions the world model should unify. PAN, a world model built on GLP, is trained on internet data but already enables open-domain action simulation that transfers to robotic policies.
More on GLP: arxiv.org/abs/2507.05169
More on PAN: arxiv.org/abs/2511.09057
0 replies · 1 retweet · 2 likes · 179 views
Jinyu Hou (@jinyuhou0)
Really thought-provoking post; the goal-driven vs. idea-driven distinction resonates a lot. It got me wondering, though: perhaps goal-driven research doesn't have to be agnostic about world models? In our recent work (arxiv.org/abs/2507.05169), we argue that the field has gotten too focused on world models as video generators, when their real value should be as reasoning engines: specifically, simulating counterfactual action outcomes to enable planning, which seems closely aligned with the zero-shot physical AGI goal outlined here. This isn't mutually exclusive with data scaling; if anything, a good world model should amplify the value of the data you already have by enabling generalization beyond its empirical coverage. Would love to hear your thoughts.
0 replies · 1 retweet · 2 likes · 165 views
Jinyu Hou retweeted
LLM360 (@llm360)
To mark the 2nd anniversary of LLM360, we are proud to release K2-V2: a 70B reasoning-centric foundation model that delivers frontier capabilities. As a push for "360-open" transparency, we are releasing not only weights, but the full recipe: data composition, training code, logs, and intermediate checkpoints.
About K2-V2:
🧠 70B params, reasoning-optimized
🧊 512K context window
🔓 "360-Open" (Data, Logs, Checkpoints)
📈 SOTA on olympiad math and complex logic puzzles
2 replies · 25 retweets · 55 likes · 21.8K views
Jinyu Hou retweeted
Eric Xing (@ericxing)
Now you have an alternative to the super popular but unfortunately not so transparent base LLMs such as Qwen 2.5 or 3 (you have no idea how they were trained, what data was used, whether they are safe …) for building your own reasoning or general-purpose LLMs through post-training, SFT, RL, etc. It is 360-open and reproducible.
Quoting MBZUAI (@mbzuai):

Today, we are releasing a new version of K2 (K2-V2), a 360-open LLM built from scratch as a superior base for reasoning adaptation, while still excelling at core LLM capabilities like conversation, knowledge retrieval, and long-context understanding. K2 fills a major gap: highly capable models with no transparency. Instead of releasing only weights, we’re sharing the full training story: dataset recipes, mid-training checkpoints, logs, code, and evaluation tools. That’s 360-open.
What’s inside:
• 70B dense transformer engineered as a reasoning-enhanced base model
• Native 512K context (extendable via RoPE scaling)
• Mid-training reasoning phase
• Strong tool-use scaffolding
What we’re open-sourcing:
• 250M+ reasoning traces (math, planning, multi-step logic)
• Full pre- & mid-training data compositions
• All mid-training checkpoints
• Training logs, code, Eval360
Performance:
• GPQA-Diamond: 55.1% mid-training → 69.3% after SFT (strongest fully open 70B model)
• KK-8 Logic Puzzles: 83%, competitive with DeepSeek-R1 & OpenAI o3-mini-high
• ArenaHard V2: 62.1%, close to Qwen3 235B
• Outperforms Qwen2.5-72B and approaches Qwen3-235B despite being smaller and fully transparent.
🔗 The Model: bit.ly/3KIYwuo
🔗 Technical Report: bit.ly/49V8h2U
🔗 Blog: bit.ly/49V7gb6

1 reply · 11 retweets · 44 likes · 9.4K views
Jinyu Hou retweeted
Eric Xing (@ericxing)
In this paper we present the first full implementation of the Generative Latent Prediction (GLP) architecture for world modeling, which brings perception, state, action, and causality into a single, coherent world model that can plan, imagine, and reason through language, interaction, and thought experiments. arxiv.org/abs/2511.09057 @szxiangjn, @YiGu025, @guangyi_l, @waterluffy, @ZhitingHu
4 replies · 23 retweets · 93 likes · 15.1K views
Jinyu Hou retweeted
Zhiting Hu (@ZhitingHu)
🔥 Really excited to see the release of the PAN world model, a project I have been working on over the past years. PAN is a general world model capable of simulating physical, agentic, and nested worlds, synthesizing infinite interactive experiences for training AI agents. Building on top of pretrained LLMs and video diffusion models, PAN connects language, perception, action, and latent thoughts for long-horizon simulation and reasoning. PAN shows overwhelming performance gains over JEPA-2, Cosmos-2, and other prior models. More in the thread👇 ... 1/
8 replies · 53 retweets · 240 likes · 31.1K views
Jinyu Hou retweeted
Mingkai Deng (@mdeng34)
Honored to co-lead this paper with @ericxing & team
- Formally showed WM as part of an optimal, general agent
- Reviewed several schools of WM towards this goal
- Outlined a new PAN architecture for general WM
Excited for the upcoming release of 27B PAN v1! arxiv.org/abs/2507.05169
Quoting Eric Xing (@ericxing):

I have been long arguing that a world model is NOT about generating videos, but IS about simulating all possibilities of the world to serve as a sandbox for general-purpose reasoning via thought-experiments. This paper proposes an architecture toward that arxiv.org/abs/2507.05169

1 reply · 5 retweets · 26 likes · 4.1K views
Jinyu Hou retweeted
Eric Xing (@ericxing)
I have been long arguing that a world model is NOT about generating videos, but IS about simulating all possibilities of the world to serve as a sandbox for general-purpose reasoning via thought-experiments. This paper proposes an architecture toward that arxiv.org/abs/2507.05169
7 replies · 84 retweets · 514 likes · 46.6K views
Jinyu Hou (@jinyuhou0)
@savvyRL Hi Rosanne, I applied for the MS position and sent you a DM.
0 replies · 0 retweets · 0 likes · 153 views
Jinyu Hou retweeted
Caleb Ellington (@probablybots)
The Contextualized Machine Learning White Paper arxiv.org/abs/2310.11340 w/ @ben_lengerich Intuition, applications, algorithms, and extensions for contextualized models: models that understand heterogeneity in real data, adapt to new environments, and are explainable by design.
0 replies · 8 retweets · 22 likes · 4.6K views
Jinyu Hou retweeted
Sang Choe (@sangkeun_choe)
High-quality data is key to successful pretraining/finetuning in the GPT era, but manual data curation is expensive💸 We tackle data quality challenges involving large models and datasets with ScAlable Meta leArning (SAMA) #NeurIPS2023💫 arXiv: arxiv.org/abs/2310.05674 🧵 (1/n)
2 replies · 22 retweets · 80 likes · 13.7K views
Jinyu Hou retweeted
Vahid Balazadeh (@vahidbalazadeh)
There's been a lot of success in causal effect estimation using machine learning. But what if point identification is impossible? Our NeurIPS 2022 paper, "Partial Identification of Treatment Effects with Implicit Generative Models," estimates bounds on causal effects instead. 🧵
1 reply · 7 retweets · 15 likes · 0 views
Jinyu Hou (@jinyuhou0)
Working on this project was a great experience, and I learned a lot from it. Many thanks to @kieranrcampbell for all the guidance, and thanks to everyone else on the project for the great collaboration!
Quoting Kieran Campbell (@kieranrcampbell):

My group's first research paper, on automated cell type assignment for highly multiplexed imaging data, is now published in @CellSystemsCP
Paper: authors.elsevier.com/a/1dmB38YyDffJ…
Tool: github.com/camlab-bioml/a…
Some thoughts and updates:

0 replies · 0 retweets · 4 likes · 0 views
Jinyu Hou retweeted
Kieran Campbell (@kieranrcampbell)
Our first research paper as a group was preprinted today: automated cell assignment for highly multiplexed imaging and proteomic data Paper: biorxiv.org/content/10.110…
6 replies · 35 retweets · 189 likes · 0 views