Xin Yan

104 posts


@cakeyan9

Research Scientist @ ByteDance Seed | Prev. @UWaterloo @MITIBMLab @01AI_Yi @WHU_1893

Joined February 2022
599 Following · 73 Followers
Xin Yan retweeted
Jiatao Gu (@thoma_gu)
If you are still at @iclr_conf today, please come check out our poster this afternoon at 2:30pm at the DeLTa Workshop, Poster #110.
Paper: arxiv.org/abs/2604.17673

Excited to share our work on grokking in diffusion models! We show that flow-matching diffusion models can grok modular addition and learn interpretable periodic/Fourier-like representations.

One of my favorite findings: during sampling, the model undergoes a phase transition. Early timesteps perform algorithmic reasoning, while later timesteps become visual denoising. Interestingly, this transition is clearly visible in the model's internal Fourier structure.

This suggests diffusion models are not merely denoisers: under the right setting, they can implement structured computation along a continuous generation trajectory!

Great work by my students at @PennEngineers @hozy5333 @hagsaeng_bag and Mattis Dalsætra Østby!
Joon Hyeok Kim (@hozy5333)

📌 Catch our poster presentation at the ICLR 2026 DeLTa Workshop Afternoon Session Poster #110! 📄 Arxiv: arxiv.org/abs/2604.17673 Grokking of Diffusion Models: Case Study on Modular Addition
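The "periodic/Fourier-like representations" finding has a clean mathematical basis: modular addition is diagonalized by the discrete Fourier basis, so a network that learns Fourier features of the residues can implement addition as simple phase multiplication. A minimal numpy sketch of that identity (illustrative only, not the paper's code):

```python
import numpy as np

p = 7  # modulus

def fourier_feat(r, k):
    """Fourier feature of residue r at frequency k: exp(2*pi*i*k*r/p)."""
    return np.exp(2j * np.pi * k * r / p)

# Modular addition turns into multiplication of phases:
#   feat((a + b) mod p, k) == feat(a, k) * feat(b, k)
a, b = 3, 5
for k in range(p):
    assert np.allclose(fourier_feat((a + b) % p, k),
                       fourier_feat(a, k) * fourier_feat(b, k))
```

This is why periodic representations are a natural "grokked" solution for modular arithmetic: in the Fourier basis the task is linear in the phases.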

Xin Yan (@cakeyan9)
Tested GPT-Image-2's torn paper effect. Fun observations: 1. The three images are from different sessions and prompts, yet the generated text (Abstract/Intro) is almost identical. 2. It only grabs half of the real Abstract. 3. The text rendering across the torn-paper gaps is incredibly seamless. Pixel DiT?
Xin Yan retweeted
Yuntian Deng @ ICLR (@yuntiandeng)
The advisor review is a good idea, but that doc keeps getting edited/deleted. So I built append.page/p/advisors: reads like a doc, but the data underneath is a hash chain. Once you post, nobody can silently edit/delete it. Anonymously review your advisor for the next cohort.
Rob Tang (@XiangruTang)
The red/black list of North American professors from Xiaohongshu: docs.google.com/document/d/1-A… Honestly, there is no absolute red and no absolute black. Doing a PhD is hard, and being a professor is hard too. Everyone should try to understand each other. Finding balance, and finding a group that suits you, is what matters most.
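The append-only property Yuntian describes comes from hash chaining: each entry stores the hash of the previous entry, so editing or deleting any earlier post invalidates every hash that follows. A minimal sketch of the idea (hypothetical code, not the actual append.page implementation):

```python
import hashlib

def entry_hash(prev_hash: str, text: str) -> str:
    """Hash of an entry, bound to its predecessor's hash."""
    return hashlib.sha256((prev_hash + text).encode()).hexdigest()

def append(chain: list, text: str) -> None:
    prev = chain[-1]["hash"] if chain else "genesis"
    chain.append({"text": text, "hash": entry_hash(prev, text)})

def verify(chain: list) -> bool:
    prev = "genesis"
    for e in chain:
        if e["hash"] != entry_hash(prev, e["text"]):
            return False  # some earlier entry was edited or removed
        prev = e["hash"]
    return True

chain = []
append(chain, "review 1")
append(chain, "review 2")
assert verify(chain)
chain[0]["text"] = "silently edited"  # tampering...
assert not verify(chain)              # ...breaks the chain downstream
```

A tamperer would have to recompute every subsequent hash, which readers who saved any later hash can detect.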

Xin Yan retweeted
Natalie Khalil (@natalienkhalil)
Basically
Xin Yan retweeted
Peter Lin (@peter9863)
Continuous Adversarial Flow Models (CAFMs)
Paper: arxiv.org/abs/2604.11521

Flow matching generates poor samples without guidance because the MSE loss induces incorrect generalization. Instead of an isotropic Euclidean distance, we need a manifold-aware criterion, but how can we obtain it?

CAFMs bring adversarial training to continuous time. Learning velocity with a discriminator induces better generalization because the discriminator, as a criterion, can learn the manifold! Also, unlike flow matching's forward KL objective, adversarial training allows optimizing different divergences.

CAFMs can generate sharper and higher-quality samples. Adversarial training in continuous time also avoids the vanishing gradient problem, leading to stable training.

CAFMs can be trained from scratch or used to post-train existing flow models. Post-training SiT/JiT for just 10 epochs yields large FID improvements. We also observe significant GenEval and DPG improvements when post-training text-to-image models.

More details in this thread!
Xin Yan retweeted
Yuntian Deng @ ICLR (@yuntiandeng)
🚀 Launching ProgramAsWeights (PAW)! Define functions in English → PAW compiles them into tiny neural programs → Run locally like normal Python functions. A neural program combines discrete text + continuous LoRA to adapt a fixed small interpreter. 🔗 programasweights.com
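The "continuous LoRA" component presumably follows the standard low-rank adaptation recipe: a frozen weight matrix is augmented with a trainable low-rank update B @ A. A generic numpy sketch of that mechanism (an assumption about the setup, not PAW's actual code; the names W, A, B are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                      # model width, adapter rank (r << d)

W = rng.normal(size=(d, d))      # frozen interpreter weight (never updated)
A = rng.normal(size=(r, d))      # trainable low-rank factor
B = np.zeros((d, r))             # zero-init, so the adapter starts as a no-op

def forward(x, scale=1.0):
    # Effective weight is W + scale * (B @ A); only A and B would be trained.
    return x @ (W + scale * (B @ A)).T

x = rng.normal(size=d)
assert np.allclose(forward(x), x @ W.T)  # zero-init: identical to base model
```

The appeal for "neural programs" is that each compiled function only needs the tiny (A, B) pair plus its text, while the small interpreter's weights W stay shared and fixed.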
Xin Yan (@cakeyan9)
@yuntiandeng I can't recall the exact timing, but Yuntian mentioned the idea of Neural OS to me as early as two years ago. He always has such a forward-looking vision. Everyone should definitely pay more attention to his research group.
Xin Yan retweeted
Yuntian Deng @ ICLR (@yuntiandeng)
@omarsar0 This direction was already explored in our earlier work NeuralOS (neural-os.com, ICLR 2026). We've invested nearly two years and over 5K commits to reach the current system, so I hope appropriate credit can be given.
Panda (@Jiaxi_Cui)
Many people intuitively believe that multimodal training should use as many annotations as possible. In practice that's not true. When working on LanguageBind (arxiv.org/pdf/2310.01852) at Peking University, I strongly resisted annotating images before training or retrieval, because annotations produced by humans or models merely map vector-space features onto the human natural-language space according to human preconceptions, which already causes feature loss. With AutoResearch, in our experiments at cerul.ai, I verified this idea: the more annotations the images have, the more embedding-retrieval performance actually degrades.
Xin Yan retweeted
Owen Tian Ye (@tiny85114767)
Realtime editing as a systems problem: about two months of our full-stack optimization, spanning cache/kernel/VAE serving paths, causal editing distillation, and reward-based DMD for few-step editing. Tech blog preview: owen718.github.io/blogs/realtime…
Junyang Lin (@JustinLin610)
working on distillation
Xin Yan retweeted
Cursor (@cursor_ai)
We trained Composer to self-summarize through RL instead of a prompt. This reduces the error from compaction by 50% and allows Composer to succeed on challenging coding tasks requiring hundreds of actions.