Seungju Back

@SeungjuBack

Undergrad@KAIST EE/CS

Joined August 2025

24 Following · 17 Followers

Seungju Back retweeted

Sungjin Ahn @SungjinAhn_ · 7h

🧠 We introduce "Generative Recursive Reasoning"!

Recursive Reasoning Models like HRM, TRM, and Looped Transformers are deterministic: same input, same reasoning, every time. They collapse the entire space of plausible reasoning paths into a single attractor.

Our model GRAM (Generative Recursive reAsoning Models) turns recursion itself into a stochastic latent trajectory. Multiple hypotheses, alternative solution strategies, and inference-time scaling not just by depth, but by width: parallel trajectory sampling.

And here's the kicker: the same formulation that gives us conditional reasoning p(y|x) also makes GRAM a general generative model p(x).

With only 10M params:
• Sudoku-Extreme: 97.0% (TRM 87.4%)
• ARC-AGI-1: 52.0%
• ARC-AGI-2: 11.1%
• N-Queens coverage: 90%+

📄 Paper: arxiv.org/abs/2605.19376
🌐 Project page: ahn-ml.github.io/gram-website

w/ Junyeob Baek @JunyeobB (KAIST), Mingyu Jo @pyross0000 (KAIST), Minsu Kim @minsuuukim (KAIST & Mila), Mengye Ren @mengyer (NYU), Yoshua Bengio @Yoshua_Bengio (Mila), Sungjin Ahn @SungjinAhn_ (KAIST)


Seungju Back retweeted

alphaXiv @askalphaxiv · 4 Mar

“Understanding LoRA as Knowledge Memory” Right now, most research treats LoRA like a cheap fine-tune toggle, but if you want to use it as swappable knowledge memory, the rule of thumb has been mostly vibes. This paper fixes that with a systematic audit of LoRA-as-memory, where it maps how storage scales and saturates with rank. It shows you get way more memory per token by training on QA/summaries instead of raw passages.


Seungju Back retweeted

Sungjin Ahn @SungjinAhn_ · 3 Mar

Understanding LoRA as Knowledge Memory 🚀

Can we save new LLM facts directly into LoRA weights? While recent works are hastily treating LoRA as a plug-and-play knowledge memory, the fundamental mechanics governing its capacity and composability have remained largely unexplored.

🤯 We asked the hard question: Can an adapter meant for task adaptation actually serve as a reliable store for precise, declarative knowledge? To find out, we ran the first systematic empirical study mapping the design space of LoRA-based memory. The shocking reality is that treating LoRA as a memory unit can catastrophically fail in certain settings if you blindly trust it.

✅ Rather than proposing a single architecture, our paper provides practical guidance on its hidden operational boundaries, from characterizing finite storage capacity limits to the harsh realities of multi-module scaling and merging interference. Check out our systematic map of when LoRA memory succeeds, and exactly when it breaks!

🧑🏻‍💻 Led by my fantastic students @SeungjuBack (KAIST) and @DongwooLee00 (KAIST), in collaboration with Samsung SDS.

arxiv.org/abs/2603.01097

