Shuaichen Chang

566 posts

Shuaichen Chang

@ShuaichenChang

Researcher at AWS AI (@AmazonScience) Ex: PhD @OhioState Opinions are my own #NLProc #LLMs #AI

NYC Katılım Ağustos 2016

1.2K Takip Edilen1.8K Takipçiler

Shuaichen Chang@ShuaichenChang·3d

@natolambert Agree. I once heard @srush_nlp described cursor as a RL company then I knew they were aiming for big things.

English

Nathan Lambert@natolambert·3d

I’ve been saying it for a while, cursor’s research team is insanely high on talent density. So many people I respected from my PhD / early career ended up there. Seems like that’s bearing fruit.

English

702

37.9K

Shuaichen Chang@ShuaichenChang·4d

Gemini sometimes triggers random things lol

English

299

Shuaichen Chang@ShuaichenChang·4d

Super

Tianzhu Ye@ytz2024

(1/n) Introduce Online Experiential Learning toward the era of experience. Beyond offline pre-constructed training data, models can learn online from their own deployment experience across infinite, unsimulable real-world environments. Accumulate, consolidate, self-improve 🔄

English

624

Shuaichen Chang retweetledi

Nathan Lambert@natolambert·4d

GPT 5.4 didn't get enough praise for how big of a step it was in OpenAI's agent arc. At the same time, with better context management, speed, rate limits, instruction following, code -- it's revealing that I still turn to the "warmth" of Claude. interconnects.ai/p/gpt-54-is-a-…

English

302

27.1K

Shuaichen Chang retweetledi

Sasha Rush@srush_nlp·5d

@eliebakouch It’s 100% learned in RL. We thought we might have to start with a complex prompt to kickstart it, but even the initial summaries are good enough for it to get some signal.

English

14.3K

Shuaichen Chang@ShuaichenChang·5d

It gives me a 2D search or dynamic programming vibe. For each cell, it feels like we search over the submatrix in its lower-left region to find the cells or spans that maximize utility. We can either combine multiple cells (to the left and below) or simply inherit the value from the cell below (a residual connection).

Rosinality@rosinality

ByteDance also implemented attention over depth. They literally combined it with sequence attention.

English

398

Shuaichen Chang@ShuaichenChang·6d

@sarahookr @adaption_ai @XBusiness The hacker also uses a fake website url. Hope nobody gets fooled.

English

332

Sara Hooker@sarahookr·16 Mar

I’m in the middle of high stakes negotiation with who hacked our @adaption_ai account. I would prefer @XBusiness handled it. But it is the wild Wild West, no response from support at X. Support doesn’t exist. Ignore @adaption_ai for the next 24h while we sort this out.

English

185

43K

Shuaichen Chang@ShuaichenChang·15 Mar

This is fantastic! Is this a state-to-action model, or is there an intermediate intent? The big question is whether the robot learns to have a goal/plan in mind or is just executing learned reflexes.

Zhikai Zhang@Zhikai273

🎾Introducing LATENT: Learning Athletic Humanoid Tennis Skills from Imperfect Human Motion Data Dynamic movements, agile whole-body coordination, and rapid reactions. A step toward athletic humanoid sports skills. Project: zzk273.github.io/LATENT/ Code: github.com/GalaxyGeneralR…

English

586

Shuaichen Chang retweetledi

Lorenzo Xiao@lrzneedresearch·14 Mar

I dont know who needs this but this is like an overview of notes I wrote to prepare for my "Agentic system design" Let me know if anyone feel like this is helpful and I can write more in details for each particular section algoroxyolo.github.io/blog/2026/llm-…

English

106

Shuaichen Chang@ShuaichenChang·14 Mar

@eigenron It’s because the experiments were only conducted with Qwen2.5 models which are known to improve under RL with any reward. While the finding may generalize to other models (at least I think it is intuitively convincing), people care less about the RL results based on Qwen2.5 only.

English

1.2K

eigenron@eigenron·14 Mar

i don't understand why this paper did not get much traction. they GRPO'd a small base model on its own confidence scores (internal rewards) instead of external rewards and it shows comparable results on math and coding benchmarks compared to models trained with GRPO with external rewards.

English

819

47.6K

Shuaichen Chang@ShuaichenChang·12 Mar

It made my day. The Chinese phrase was probably learned from the gambling ads on some webpages. Now we have codex ads 🤣

Aaron Francis@aarondfrancis

Codex is randomly hitting me with some ancient wisdom... for some reason.

English

1.3K

Shuaichen Chang retweetledi

Ravid Shwartz Ziv@ziv_ravid·11 Mar

x.com/i/article/2031…

ZXX

232

45.5K

Shuaichen Chang retweetledi

Xingyi Zhou@zhouxy2017·10 Mar

Excited to share that I've joined @amilabs as a founding member. I’ll be working on world models. It is super fun building things from scratch with such a talented team. Looking forward to the journey ahead!

AMI Labs@amilabs

Advanced Machine Intelligence (AMI) is building a new breed of AI systems that understand the world, have persistent memory, can reason and plan, and are controllable and safe. We’ve raised a $1.03B (~€890M) round from global investors who believe in our vision of universally intelligent systems centered on world models. This round is co-led by Cathay Innovation, Greycroft, Hiro Capital, HV Capital, and Bezos Expeditions, along with other investors and angels across the world. We are a growing team of researchers and builders, operating in Paris, New York, Montreal and Singapore from day one. Read more: amilabs.xyz AMI - Real world. Real intelligence.

English

580

37.7K

Shuaichen Chang@ShuaichenChang·9 Mar

@shi_weiyan We need a human purpose coach even before AGI! Post-AGI philosopher 👀

English

Weiyan Shi@shi_weiyan·9 Mar

What could be some post-AGI jobs that don’t exist today? Asked Claude, Gemini, and GPT, and they all seem to think we’ll need a “human purpose coach” 😂 what do you think?

English

1.5K

Shuaichen Chang retweetledi

Anthropic@AnthropicAI·6 Mar

New on the Anthropic Engineering Blog: In evaluating Claude Opus 4.6 on BrowseComp, we found cases where the model recognized the test, then found and decrypted answers to it—raising questions about eval integrity in web-enabled environments. Read more: anthropic.com/engineering/ev…

English

255

367

3.2K

1.1M

Shuaichen Chang@ShuaichenChang·8 Mar

After reading the tech report, I feel like I should repost the tweet again to highlight how transparent the team has been. It’s one of the best tech reports I’ve read. I especially loved the synthetic state-based recall experiments and the scaling law analysis.

Ai2@allen_ai

Introducing Olmo Hybrid, a 7B fully open model combining transformer and linear RNN layers. It decisively outperforms Olmo 3 7B across evals, w/ new theory & scaling experiments explaining why. 🧵

English

4.7K

Shuaichen Chang retweetledi

Grigory Sapunov@che_shr_cat·7 Mar

1/ RNNs compress history into fixed states. Perfect for O(L) scaling, fatal for recall. What if we stop overwriting history and checkpoint the states instead? You get Transformer-level Needle-in-a-Haystack recall with RNN efficiency. 🧵

English

255

17.2K

Shuaichen Chang@ShuaichenChang·7 Mar

I often think about two possible paths toward AGI. The first scenario is a single extremely powerful model. In this world, one super-intelligent system can perform almost any task as long as we provide clear instructions and sufficient context. It can continuously improve itself, becoming smarter over time. The model becomes a universal problem solver, capable of operating across nearly every domain. The second scenario is very different. Instead of one universal model, we have a pipeline that continually adapts models for new tasks and environments. We start with reasonably strong base models that have good alignment properties (e.g., safe, cooperative, and generally benign). When a new task appears, an existing model is adapted specifically for that task. It may not improve on other tasks, but it becomes very good at the one it was trained for. To achieve this, models and their specific environments need to continuously co-evolve with each other. Over time, we end up with many specialized models. These models can communicate and collaborate. In other words, the first AGI is a "winner-takes-all" monolithic model with simple maintenance and tremendous commercial value, while the second AGI is an ecosystem that lowers the barrier to entry but comes with higher ongoing maintenance cost. I don’t know which future will actually happen. But either way, we will need our models to be able to continually evolve. Personally, I think the second scenario is technically more plausible. And it’s closer to the world I want to live in. P.s. the image was generated by Nano Banana 2.

English

231

Shuaichen Chang@ShuaichenChang·7 Mar

@yanndubs @prenochar @DimitrisPapail Now it makes sense after seeing Yann’s explanation. I didn’t read the context

English

Yann Dubois@yanndubs·6 Mar

@prenochar @DimitrisPapail Ohh I see that’s because it’s no / low / medium / high / xhigh reasonin

English

143

Yann Dubois@yanndubs·6 Mar

🔥Two things I'm esp excited about 5.4: 1. Unification: we merged our codex & mainline models 2. Efficiency: we brought the efficiency of 5.3-codex to CUA & knowledge work. We only showed 3 such plots in the blog but many of our evals required less time (tokens/tools) than 5.2. What should we fix for the next model?