Fujun Luan

16 posts


@fujun_luan

AI researcher @Apple | ex-@Adobe | Ph.D. @Cornell

Sunnyvale, CA · Joined May 2023
365 Following · 324 Followers
Fujun Luan reposted
Anthropic
Anthropic@AnthropicAI·
You can read a detailed technical report on the software vulnerabilities and exploits discovered by Claude Mythos Preview here: red.anthropic.com/2026/mythos-pr…
52 replies · 180 reposts · 1.8K likes · 638.9K views
Fujun Luan reposted
Jason Wei
Jason Wei@_jasonwei·
Becoming an RL diehard in the past year and thinking about RL for most of my waking hours inadvertently taught me an important lesson about how to live my own life.

One of the big concepts in RL is that you always want to be “on-policy”: instead of mimicking other people’s successful trajectories, you should take your own actions and learn from the reward given by the environment. Obviously imitation learning is useful to bootstrap to a nonzero pass rate initially, but once the model can take reasonable trajectories, we generally avoid imitation learning, because the best way to leverage the model’s own strengths (which are different from humans’) is to learn only from its own trajectories. A well-accepted instantiation of this is that RL is a better way to train language models to solve math word problems than simple supervised finetuning on human-written chains of thought.

Similarly in life, we first bootstrap ourselves via imitation learning (school), which is very reasonable. But even after I graduated, I had a habit of studying how other people found success and trying to imitate them. Sometimes it worked, but eventually I realized that I would never surpass the full ability of someone else, because they were playing to strengths I didn’t have. It could be anything from a researcher doing yolo runs more successfully than me because they built the codebase themselves and I didn’t, to a non-AI example: a soccer player keeping ball possession by leveraging strength that I didn’t have.

The lesson of doing RL on-policy is that beating the teacher requires walking your own path and taking risks and rewards from the environment. For example, two things I enjoy more than the average researcher are (1) reading a lot of data, and (2) doing ablations to understand the effect of individual components in a system. Once, when collecting a dataset, I spent a few days reading data and giving each human annotator personalized feedback; after that, the data turned out great and I gained valuable insight into the task I was trying to solve. Earlier this year I spent a month going back and ablating each of the decisions I had previously yolo’ed while working on deep research. It was a sizable amount of time, but through those experiments I learned unique lessons about what type of RL works well. Not only was leaning into my own passions more fulfilling, but I now feel like I’m on a path to carving a stronger niche for myself and my research.

In short, imitation is good and you have to do it initially. But once you’re bootstrapped enough, if you want to beat the teacher you must do on-policy RL and play to your own strengths and weaknesses :)
127 replies · 342 reposts · 3.4K likes · 345.2K views
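The on-policy idea in the thread above can be made concrete with a toy example — a hedged sketch in C, not from the thread itself; the function name and setup are made up for illustration. An agent facing a two-armed bandit learns action values from its own action/reward trajectories (epsilon-greedy exploration plus incremental value updates) rather than from a teacher's demonstrations:

```c
#include <stdlib.h>

/* Toy on-policy learner for a 2-armed bandit with deterministic rewards.
   The agent mostly exploits its own value estimates and occasionally
   explores, learning purely from rewards it collected itself.
   Returns the index of the arm it believes is best. */
int learn_best_arm(const double reward[2], int steps, unsigned seed) {
    double q[2] = {0.0, 0.0};   /* running value estimates */
    int    n[2] = {0, 0};       /* visit counts */
    srand(seed);
    for (int t = 0; t < steps; t++) {
        int a;
        /* epsilon-greedy: explore ~10% of the time, or until both arms
           have been tried at least once */
        if (rand() % 10 == 0 || n[0] == 0 || n[1] == 0)
            a = rand() % 2;
        else
            a = q[1] > q[0];
        n[a]++;
        /* incremental mean update from the agent's own reward signal */
        q[a] += (reward[a] - q[a]) / n[a];
    }
    return q[1] > q[0];
}
```

With enough steps, the agent's own estimates identify the better arm regardless of what any "teacher" would have demonstrated — the point of the on-policy framing above.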
Fujun Luan reposted
Kimi.ai
Kimi.ai@Kimi_Moonshot·
Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: rethinking depth-wise aggregation.

Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with learned, input-dependent attention over preceding layers.

🔹 Enables networks to selectively retrieve past representations, naturally mitigating dilution and hidden-state growth.
🔹 Introduces Block AttnRes, partitioning layers into compressed blocks to make cross-layer attention practical at scale.
🔹 Serves as an efficient drop-in replacement, demonstrating a 1.25x compute advantage with negligible (<2%) inference latency overhead.
🔹 Validated on the Kimi Linear architecture (48B total, 3B activated parameters), delivering consistent downstream performance gains.

🔗 Full report: github.com/MoonshotAI/Att…
Kimi.ai tweet media
333 replies · 2K reposts · 13.5K likes · 4.9M views
Fujun Luan reposted
Figure
Figure@Figure_robot·
Today we're showing Helix 02, which can tidy a living room fully autonomously. Figure is designed so that when you leave the house, your home resets exactly how you like it.
717 replies · 1.2K reposts · 9.4K likes · 2.1M views
Fujun Luan reposted
vixhaℓ
vixhaℓ@TheVixhal·
Inspired by @karpathy's microgpt, I built microgpt.c with fully manual forward and backward propagation. It is about 600 lines of pure C with no external libraries or dependencies, just raw computational power.

x.com/i/article/2022…

33 replies · 163 reposts · 1.7K likes · 157.5K views
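The "fully manual forward and backward propagation" approach above can be shown at toy scale. This is a hypothetical single-parameter example in C, not code from the actual repo: one training step for a linear model, with gradients written out by hand via the chain rule instead of an autodiff library:

```c
/* One step of manual forward + backward for y_hat = w*x + b with
   squared-error loss L = (y_hat - y)^2. No autodiff: each gradient
   is derived by hand and applied with plain SGD. */
typedef struct { double w, b; } Model;

double train_step(Model *m, double x, double y, double lr) {
    /* forward pass */
    double y_hat = m->w * x + m->b;
    double diff  = y_hat - y;
    double loss  = diff * diff;
    /* backward pass: dL/dy_hat = 2*diff, then chain rule to w and b */
    double g_w = 2.0 * diff * x;   /* dL/dw = dL/dy_hat * dy_hat/dw */
    double g_b = 2.0 * diff;       /* dL/db = dL/dy_hat * dy_hat/db */
    /* SGD parameter update */
    m->w -= lr * g_w;
    m->b -= lr * g_b;
    return loss;
}
```

A full GPT does the same thing for every matmul, softmax, and layer norm — which is why a manual-backprop implementation in pure C runs to hundreds of lines.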
Fujun Luan reposted
Qwen
Qwen@Alibaba_Qwen·
🚀 Qwen3.5-397B-A17B is here: the first open-weight model in the Qwen3.5 series.

🖼️ Native multimodal. Trained for real-world agents.
✨ Powered by hybrid linear attention + sparse MoE and large-scale RL environment scaling.
⚡ 8.6x–19.0x decoding throughput vs Qwen3-Max
🌍 201 languages & dialects
📜 Apache 2.0 licensed

🔗 Dive in:
GitHub: github.com/QwenLM/Qwen3.5
Chat: chat.qwen.ai
API: modelstudio.console.alibabacloud.com/ap-southeast-1…
Qwen Code: github.com/QwenLM/qwen-co…
Hugging Face: huggingface.co/collections/Qw…
ModelScope: modelscope.cn/collections/Qw…
Blog: qwen.ai/blog?id=qwen3.5
Qwen tweet media
271 replies · 867 reposts · 5.3K likes · 1.3M views
Fujun Luan reposted
Boris Cherny
Boris Cherny@bcherny·
I'm Boris and I created Claude Code. I wanted to quickly share a few tips for using Claude Code, sourced directly from the Claude Code team. The way the team uses Claude is different from how I use it. Remember: there is no one right way to use Claude Code -- everyone's setup is different. You should experiment to see what works for you!
926 replies · 5.8K reposts · 50.9K likes · 9.1M views
Fujun Luan reposted
Boris Cherny
Boris Cherny@bcherny·
I'm Boris and I created Claude Code. Lots of people have asked how I use Claude Code, so I wanted to show off my setup a bit. My setup might be surprisingly vanilla! Claude Code works great out of the box, so I personally don't customize it much. There is no one correct way to use Claude Code: we intentionally build it in a way that you can use it, customize it, and hack it however you like. Each person on the Claude Code team uses it very differently. So, here goes.
1.3K replies · 7K reposts · 54.3K likes · 8.1M views
Shuang Zhao
Shuang Zhao@shuangz·
After ten rewarding years at UC Irvine, I will be joining the Siebel School of Computing and Data Science at the University of Illinois at Urbana-Champaign this fall. I am deeply grateful for the support and friendships I have experienced at UCI, which I will miss a great deal!
3 replies · 0 reposts · 15 likes · 591 views
Ruben Villegas
Ruben Villegas@RubenEVillegas·
A broccoli wearing a leather jacket and a carrot wearing a tank top having a steak dinner #veo2
13 replies · 15 reposts · 328 likes · 39.8K views
Fujun Luan
Fujun Luan@fujun_luan·
@drjingjing2026 This sucks. I cannot imagine an MIT professor being so ignorant and racist. Even worse, she did this publicly in front of world-class researchers at NeurIPS. She should be fired.
1 reply · 0 reposts · 9 likes · 654 views
Jing-Jing Li
Jing-Jing Li@drjingjing2026·
1/3 Today, an anecdote shared by an invited speaker at #NeurIPS2024 left many Chinese scholars, myself included, feeling uncomfortable. As a community, I believe we should take a moment to reflect on why such remarks in public discourse can be offensive and harmful.
Jing-Jing Li tweet media
177 replies · 552 reposts · 3.5K likes · 1M views
J.Nathan Yan
J.Nathan Yan@NathanYan2012·
Just when you think you've achieved state-of-the-art results and you're feeling unstoppable, you hop on Twitter only to discover someone just dropped a paper with even better numbers 📉#ResearchLife
2 replies · 0 reposts · 8 likes · 1.3K views
J.Nathan Yan
J.Nathan Yan@NathanYan2012·
super cool work by @Haian_Jin
AK@_akhaliq

Neural Gaffer: Relighting Any Object via Diffusion

Single-image relighting is a challenging task that involves reasoning about the complex interplay between geometry, materials, and lighting. Many prior methods either support only specific categories of images, such as portraits, or require special capture conditions, like using a flashlight. Alternatively, some methods explicitly decompose a scene into intrinsic components, such as normals and BRDFs, which can be inaccurate or under-expressive.

In this work, we propose a novel end-to-end 2D relighting diffusion model, called Neural Gaffer, that takes a single image of any object and can synthesize an accurate, high-quality relit image under any novel environmental lighting condition, simply by conditioning an image generator on a target environment map, without an explicit scene decomposition. Our method builds on a pre-trained diffusion model and fine-tunes it on a synthetic relighting dataset, revealing and harnessing the inherent understanding of lighting present in the diffusion model.

We evaluate our model on both synthetic and in-the-wild Internet imagery and demonstrate its advantages in terms of generalization and accuracy. Moreover, by combining with other generative methods, our model enables many downstream 2D tasks, such as text-based relighting and object insertion. Our model can also operate as a strong relighting prior for 3D tasks, such as relighting a radiance field.

1 reply · 0 reposts · 2 likes · 623 views