Mingyu Jo (@pyross0000) - Twitter Profile

Mingyu Jo retweeted

🧠 We introduce "Generative Recursive Reasoning"!
Recursive Reasoning Models like HRM, TRM, and Looped Transformers are deterministic: same input, same reasoning, every time. They collapse the entire space of plausible reasoning paths into a single attractor.
Our model GRAM (Generative Recursive reAsoning Models) turns recursion itself into a stochastic latent trajectory. Multiple hypotheses, alternative solution strategies, and inference-time scaling not just by depth, but by width, via parallel trajectory sampling.
And here's the kicker: the same formulation that gives us conditional reasoning p(y|x) also makes GRAM a general generative model p(x).
With only 10M params:
• Sudoku-Extreme: 97.0% (TRM 87.4%)
• ARC-AGI-1: 52.0%
• ARC-AGI-2: 11.1%
• N-Queens coverage: 90%+
📄 Paper: arxiv.org/abs/2605.19376
🌐 Project page: ahn-ml.github.io/gram-website
w/ Junyeob Baek @JunyeobB (KAIST), Mingyu Jo @pyross0000 (KAIST), Minsu Kim @minsuuukim (KAIST & Mila), Mengye Ren @mengyer (NYU), Yoshua Bengio @Yoshua_Bengio (Mila), Sungjin Ahn @SungjinAhn_ (KAIST)

Mingyu Jo retweeted

Justin Deschenaux @jdeschena · 6d

🔥 New paper: Language Modeling with Hyperspherical Flows
Recent flow language models (FLMs) all use Gaussian noise. Makes sense for images, but not necessarily for text 🫠
We propose to add noise by rotating embeddings on 𝕊^{d−1} instead 🌐
w/ @caglarml (1/9)

Mingyu Jo retweeted

Sungjin Ahn @SungjinAhn_ · 31 Mar

We are seeking a highly motivated postdoctoral researcher to work on fundamental challenges toward AGI, particularly in reasoning, abstraction, and world modeling. The position also offers potential opportunities for co-advising with Yoshua Bengio (Mila) and/or Mengye Ren (NYU).
Research areas include:
• World Model Learning & Planning
• Compositional Generalization & Neuro-Symbolic World Learning
• Causal Discovery, Reasoning, and Abstraction
This position is supported by the InnoCORE Fellowship Program 2026, with:
• Competitive salary of KRW 90M+ (~USD 60K+)
• Renewable yearly contract
For more information and recent publications: mlml.kaist.ac.kr
If you are interested, please send me your CV by email.

Mingyu Jo retweeted

Sungjin Ahn @SungjinAhn_ · 3 Mar

Understanding LoRA as Knowledge Memory
🚀 Can we save new LLM facts directly into LoRA weights? While recent works are hastily treating LoRA as a plug-and-play knowledge memory, the fundamental mechanics governing its capacity and composability have remained largely unexplored.
🤯 We asked the hard question: Can an adapter meant for task adaptation actually serve as a reliable store for precise, declarative knowledge? To find out, we ran the first systematic empirical study mapping the design space of LoRA-based memory. The shocking reality is that treating LoRA as a memory unit can catastrophically fail in certain settings if you blindly trust it.
✅ Rather than proposing a single architecture, our paper provides practical guidance on its hidden operational boundaries, from characterizing finite storage capacity limits to the harsh realities of multi-module scaling and merging interference. Check out our systematic map of when LoRA memory succeeds, and exactly when it breaks!
🧑🏻‍💻 Led by my fantastic students @SeungjuBack (KAIST) and @DongwooLee00 (KAIST), in collaboration with Samsung SDS.
arxiv.org/abs/2603.01097

Mingyu Jo retweeted

Jaesik Yoon @jaesikyoon_ · 4 Nov

🧠 Our core question: "How can we extend MCTD to longer, more complex compositional planning tasks, beyond its trained trajectory lengths?"
💡 Our solution (C-MCTD): We solve this problem with plan-level tree search, and boost its efficiency via parallelization and amortization.
It has been accepted as a Spotlight at the upcoming #neurips2025.
📄 ArXiv: arxiv.org/abs/2510.21361
🌐 Project Page: jaesikyoon.com/c-mctd-page/
This work was advised by @SungjinAhn_ and done together with my great colleague @hyeonscho. Huge thanks to them and MLML members!

Mingyu Jo retweeted

Caglar Gulcehre @caglarml · 25 Oct

This was an incredibly fun project to work on, and it has some of my favorite components in a research idea:
- Simple.
- Intuitive and works really well.
In this work, we introduced the loophole technique, which lets discrete diffusion models bypass the "sampling wall" by preserving rich token distributions across steps, unlocking faster, more coherent, non-autoregressive text generation. You just provide it with context at each step in the diffusion, like an RNN does, and it helps a lot.
Read the paper and the project page to learn more about it: sites.google.com/view/lddms/home
BTW, if you are looking for a PhD or Postdoc position, @SungjinAhn_'s group is doing a lot of interesting work similar to this one. I would definitely recommend considering his lab!

Sungjin Ahn @SungjinAhn_

🚨 Check out our new paper on next generation language modeling via "loopholing" discrete diffusion!
🤯 Surprisingly, our loopholing diffusion achieved a huge performance improvement, finally making it match (or even surpass) autoregressive models!
✅ How? We introduce the "loopholing" mechanism: a discrete diffusion that introduces a deterministic bypass alongside the stochastic path to break the sampling wall.
👨🏻‍💻 Led by my fantastic student Mingyu (@pyross0000, KAIST) and @jaesikyoon_ (KAIST), in collaboration with Justin Deschenaux (EPFL) and Caglar Gulcehre (EPFL, Microsoft).
📄 arXiv: arxiv.org/abs/2510.19304
🌐 Project: sites.google.com/view/lddms/home

Mingyu Jo retweeted

DailyPapers @HuggingPapers · 24 Oct

Loopholing Discrete Diffusion was just released by researchers from Microsoft, KAIST, EPFL, SAP & NYU
A novel method that reduces generative perplexity in discrete diffusion models by deterministically carrying latent information across steps, bypassing the "sampling wall."

Mingyu Jo retweeted

Jaesik Yoon @jaesikyoon_ · 24 Oct

Why should diffusion language models be confined to a discrete token space? We studied how to overcome this limitation by applying a 'loophole' for updating continuous latents during the denoising process. Curious about our findings? Check out our paper, "Loopholing Discrete Diffusion"! Huge thanks to my advisor @SungjinAhn_ and @pyross0000 (amazing achievement as an undergraduate!), and our wonderful collaborators @jdeschena and @caglarml !

Mingyu Jo @pyross0000 · 23 Oct

RT @jdeschena: 🚨 What if diffusion language models could pass context between sampling steps? 👀 Turns out it leads to major performance gai…

Mingyu Jo retweeted

Sungjin Ahn @SungjinAhn_ · 23 Oct

🚨 Check out our new paper on next generation language modeling via "loopholing" discrete diffusion!
🤯 Surprisingly, our loopholing diffusion achieved a huge performance improvement, finally making it match (or even surpass) autoregressive models!
✅ How? We introduce the "loopholing" mechanism: a discrete diffusion that introduces a deterministic bypass alongside the stochastic path to break the sampling wall.
👨🏻‍💻 Led by my fantastic student Mingyu (@pyross0000, KAIST) and @jaesikyoon_ (KAIST), in collaboration with Justin Deschenaux (EPFL) and Caglar Gulcehre (EPFL, Microsoft).
📄 arXiv: arxiv.org/abs/2510.19304
🌐 Project: sites.google.com/view/lddms/home

Mingyu Jo retweeted

Sungjin Ahn @SungjinAhn_ · 28 Aug

🚀 Introducing CrafterDojo!
Crafter has been a popular testbed for open-ended agent learning, but progress has been limited without foundation models like VPT, CLIP, and STEVE. With CrafterDojo, we provide these models + toolkits so the community can easily prototype LLM-augmented agents in Crafter.
Led by amazing students: Junyeong Park & Hyeonseo Cho (KAIST) ✨
📄 arXiv: arxiv.org/abs/2508.13530
🌐 Webpage: sites.google.com/view/crafterdo…
