Fujun Luan

16 posts


@fujun_luan

AI researcher @Apple | ex-@Adobe | Ph.D. @Cornell

Sunnyvale, CA · Joined May 2023
365 Following · 324 Followers
Fujun Luan reposted
Anthropic
Anthropic@AnthropicAI·
You can read a detailed technical report on the software vulnerabilities and exploits discovered by Claude Mythos Preview here: red.anthropic.com/2026/mythos-pr…
52 replies · 180 reposts · 1.8K likes · 638.9K views
Fujun Luan reposted
Jason Wei
Jason Wei@_jasonwei·
Becoming an RL diehard in the past year and thinking about RL for most of my waking hours inadvertently taught me an important lesson about how to live my own life.

One of the big concepts in RL is that you always want to be “on-policy”: instead of mimicking other people’s successful trajectories, you should take your own actions and learn from the reward given by the environment. Obviously imitation learning is useful to bootstrap to a nonzero pass rate initially, but once the model can take reasonable trajectories, we generally avoid imitation learning, because the best way to leverage the model’s own strengths (which are different from humans’) is to learn only from its own trajectories. A well-accepted instantiation of this is that RL is a better way to train language models to solve math word problems than simple supervised finetuning on human-written chains of thought.

Similarly in life, we first bootstrap ourselves via imitation learning (school), which is very reasonable. But even after I graduated, I had a habit of studying how other people found success and trying to imitate them. Sometimes it worked, but eventually I realized that I would never surpass the full ability of someone else, because they were playing to strengths I didn’t have. It could be anything from a researcher doing yolo runs more successfully than me because they built the codebase themselves and I didn’t, to a non-AI example: a soccer player keeping ball possession by leveraging strength that I didn’t have.

The lesson of doing RL on-policy is that beating the teacher requires walking your own path and taking risks and rewards from the environment. For example, two things I enjoy more than the average researcher are (1) reading a lot of data, and (2) doing ablations to understand the effect of individual components in a system. Once, when collecting a dataset, I spent a few days reading data and giving each human annotator personalized feedback; after that, the data turned out great and I gained valuable insight into the task I was trying to solve. Earlier this year I spent a month going back and ablating each of the decisions I had previously yolo’ed while working on deep research. It was a sizable amount of time, but through those experiments I learned unique lessons about what type of RL works well. Not only was leaning into my own passions more fulfilling, but I now feel like I’m on a path to carving a stronger niche for myself and my research.

In short, imitation is good and you have to do it initially. But once you’re bootstrapped enough, if you want to beat the teacher you must do on-policy RL and play to your own strengths and weaknesses :)
127 replies · 342 reposts · 3.4K likes · 345.2K views
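The on-policy idea in the thread above can be made concrete with a toy example — a hedged sketch in C, not from the thread itself; the function name and setup are made up for illustration. An agent facing a two-armed bandit learns action values from its own action/reward trajectories (epsilon-greedy exploration plus incremental value updates) rather than from a teacher's demonstrations:

```c
#include <stdlib.h>

/* Toy on-policy learner for a 2-armed bandit with deterministic rewards.
   The agent mostly exploits its own value estimates and occasionally
   explores, learning purely from rewards it collected itself.
   Returns the index of the arm it believes is best. */
int learn_best_arm(const double reward[2], int steps, unsigned seed) {
    double q[2] = {0.0, 0.0};   /* running value estimates */
    int    n[2] = {0, 0};       /* visit counts */
    srand(seed);
    for (int t = 0; t < steps; t++) {
        int a;
        /* epsilon-greedy: explore ~10% of the time, or until both arms
           have been tried at least once */
        if (rand() % 10 == 0 || n[0] == 0 || n[1] == 0)
            a = rand() % 2;
        else
            a = q[1] > q[0];
        n[a]++;
        /* incremental mean update from the agent's own reward signal */
        q[a] += (reward[a] - q[a]) / n[a];
    }
    return q[1] > q[0];
}
```

With enough steps, the agent's own estimates identify the better arm regardless of what any "teacher" would have demonstrated — the point of the on-policy framing above.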
Fujun Luan reposted
Kimi.ai
Kimi.ai@Kimi_Moonshot·
Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: rethinking depth-wise aggregation.

Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.

🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.

🔗 Full report: github.com/MoonshotAI/Att…
Kimi.ai tweet media
333 replies · 2K reposts · 13.5K likes · 4.9M views
Fujun Luan reposted
Figure
Figure@Figure_robot·
Today we're showing Helix 02, which can tidy a living room fully autonomously. Figure is designed so that when you leave the house, your home resets exactly how you like it.
717 replies · 1.2K reposts · 9.4K likes · 2.1M views
Fujun Luan reposted
vixhaℓ
vixhaℓ@TheVixhal·
Inspired by @karpathy's microgpt, I built microgpt.c with fully manual forward and backward propagation. It is about 600 lines of pure C with no external libraries or dependencies, just raw computational power.

x.com/i/article/2022…

33 replies · 163 reposts · 1.7K likes · 157.5K views
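The "fully manual forward and backward propagation" approach above can be shown at toy scale. This is a hypothetical single-parameter example in C, not code from the actual repo: one training step for a linear model, with gradients written out by hand via the chain rule instead of an autodiff library:

```c
/* One step of manual forward + backward for y_hat = w*x + b with
   squared-error loss L = (y_hat - y)^2. No autodiff: each gradient
   is derived by hand and applied with plain SGD. */
typedef struct { double w, b; } Model;

double train_step(Model *m, double x, double y, double lr) {
    /* forward pass */
    double y_hat = m->w * x + m->b;
    double diff  = y_hat - y;
    double loss  = diff * diff;
    /* backward pass: dL/dy_hat = 2*diff, then chain rule to w and b */
    double g_w = 2.0 * diff * x;   /* dL/dw = dL/dy_hat * dy_hat/dw */
    double g_b = 2.0 * diff;       /* dL/db = dL/dy_hat * dy_hat/db */
    /* SGD parameter update */
    m->w -= lr * g_w;
    m->b -= lr * g_b;
    return loss;
}
```

A full GPT does the same thing for every matmul, softmax, and layer norm — which is why a manual-backprop implementation in pure C runs to hundreds of lines.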
Fujun Luan reposted
Qwen
Qwen@Alibaba_Qwen·
🚀 Qwen3.5-397B-A17B is here: the first open-weight model in the Qwen3.5 series.

🖼️ Native multimodal. Trained for real-world agents.
✨ Powered by hybrid linear attention + sparse MoE and large-scale RL environment scaling.
⚡ 8.6x–19.0x decoding throughput vs Qwen3-Max
🌍 201 languages & dialects
📜 Apache 2.0 licensed

🔗 Dive in:
GitHub: github.com/QwenLM/Qwen3.5
Chat: chat.qwen.ai
API: modelstudio.console.alibabacloud.com/ap-southeast-1…
Qwen Code: github.com/QwenLM/qwen-co…
Hugging Face: huggingface.co/collections/Qw…
ModelScope: modelscope.cn/collections/Qw…
Blog: qwen.ai/blog?id=qwen3.5
Qwen tweet media
271 replies · 867 reposts · 5.3K likes · 1.3M views
Fujun Luan reposted
Boris Cherny
Boris Cherny@bcherny·
I'm Boris and I created Claude Code. I wanted to quickly share a few tips for using Claude Code, sourced directly from the Claude Code team. The way the team uses Claude is different from how I use it. Remember: there is no one right way to use Claude Code -- everyone's setup is different. You should experiment to see what works for you!
926 replies · 5.8K reposts · 50.9K likes · 9.1M views
Fujun Luan reposted
Boris Cherny
Boris Cherny@bcherny·
I'm Boris and I created Claude Code. Lots of people have asked how I use Claude Code, so I wanted to show off my setup a bit. My setup might be surprisingly vanilla! Claude Code works great out of the box, so I personally don't customize it much. There is no one correct way to use Claude Code: we intentionally build it in a way that you can use it, customize it, and hack it however you like. Each person on the Claude Code team uses it very differently. So, here goes.
1.3K replies · 7K reposts · 54.3K likes · 8.1M views
Shuang Zhao
Shuang Zhao@shuangz·
After ten rewarding years at UC Irvine, I will be joining the Siebel School of Computing and Data Science at the University of Illinois at Urbana-Champaign this fall. I am deeply grateful for the support and friendships I have experienced at UCI, which I will miss a great deal!
3 replies · 0 reposts · 15 likes · 591 views
Ruben Villegas
Ruben Villegas@RubenEVillegas·
A broccoli wearing a leather jacket and a carrot wearing a tank top having a steak dinner #veo2
13 replies · 15 reposts · 328 likes · 39.8K views
Fujun Luan
Fujun Luan@fujun_luan·
@drjingjing2026 This sucks. I cannot imagine an MIT professor being so ignorant and racist. Even worse, she did this publicly in front of world-class researchers at NeurIPS. She should be fired.
1 reply · 0 reposts · 9 likes · 654 views
Jing-Jing Li
Jing-Jing Li@drjingjing2026·
1/3 Today, an anecdote shared by an invited speaker at #NeurIPS2024 left many Chinese scholars, myself included, feeling uncomfortable. As a community, I believe we should take a moment to reflect on why such remarks in public discourse can be offensive and harmful.
Jing-Jing Li tweet media
177 replies · 552 reposts · 3.5K likes · 1M views
J.Nathan Yan
J.Nathan Yan@NathanYan2012·
Just when you think you've achieved state-of-the-art results and you're feeling unstoppable, you hop on Twitter only to discover someone just dropped a paper with even better numbers 📉#ResearchLife
2 replies · 0 reposts · 8 likes · 1.3K views
J.Nathan Yan
J.Nathan Yan@NathanYan2012·
super cool work by @Haian_Jin
AK@_akhaliq

Neural Gaffer: Relighting Any Object via Diffusion

Single-image relighting is a challenging task that involves reasoning about the complex interplay between geometry, materials, and lighting. Many prior methods either support only specific categories of images, such as portraits, or require special capture conditions, like using a flashlight. Alternatively, some methods explicitly decompose a scene into intrinsic components, such as normals and BRDFs, which can be inaccurate or under-expressive.

In this work, we propose a novel end-to-end 2D relighting diffusion model, called Neural Gaffer, that takes a single image of any object and can synthesize an accurate, high-quality relit image under any novel environmental lighting condition, simply by conditioning an image generator on a target environment map, without an explicit scene decomposition. Our method builds on a pre-trained diffusion model and fine-tunes it on a synthetic relighting dataset, revealing and harnessing the inherent understanding of lighting present in the diffusion model.

We evaluate our model on both synthetic and in-the-wild Internet imagery and demonstrate its advantages in terms of generalization and accuracy. Moreover, by combining with other generative methods, our model enables many downstream 2D tasks, such as text-based relighting and object insertion. Our model can also operate as a strong relighting prior for 3D tasks, such as relighting a radiance field.

1 reply · 0 reposts · 2 likes · 623 views