Ruby Scanlon

280 posts


@rubyscanlon

US-China Tech @CNASdc | Formerly Antitrust @TheJusticeDept

Washington, D.C. Joined January 2014

392 Following, 1.2K Followers

Pinned Tweet

Ruby Scanlon@rubyscanlon·16 Jul

Last year, I wrote a 100+ page thesis on China's AI industrial policy. It covers Beijing's spending across: (1) basic R&D, (2) chips, (3) B2G partnerships, and (4) traditional subsidies. Over the next few days, I'll break down the paper's findings, starting today with R&D.🧵


187

802

96.1K

Ruby Scanlon retweeted

Caleb Withers@CalebWithersDC·6d

✍️ NEW PAPER ✍️ The Pentagon’s AI Acceleration Strategy, released in January, targets an “AI-first” warfighting force, accepting that “the risks of not moving fast enough outweigh the risks of imperfect alignment.” The urgency is right. But I worry this elides how quickly alignment could become a central bottleneck on realizing AI’s potential in the national security enterprise. New paper from me (w/ Jay Kim and Ethan Chiu) on this challenge and what to do about it 🧵👇


3.4K

Ruby Scanlon retweeted

Andy Masley@AndyMasley·13 Mar

Each frontier AI model seems to use a little under a year's worth of a square mile of farmland's water to train. I think about this as the country having 4 square miles of farmland sectioned off to grow some of the most popular consumer products in history.


214

477

8.2K

598.8K

Ruby Scanlon@rubyscanlon·12 Mar

This isn't the first time Claude has gone to war. TIL one of Japan’s most coveted weapons in WWII was a Mitsubishi-made fighter also named Claude.


Ruby Scanlon retweeted

Janet Egan@janet_e_egan·5 Mar

I thought policy was supposed to mitigate the risks of AI, not its benefits...?


222

8.9K

Ruby Scanlon@rubyscanlon·3 Mar

Enjoyed chatting with @Jonathan1Gibson about the US' new Tech Corps. Forward-deployed engineering capacity is a major gap for American AI exports. While I question its placement under the Peace Corps, this is an important step for broadening global access to transformative AI.


285

Ruby Scanlon@rubyscanlon·25 Feb

Enjoyed @Gracemzshao's piece on why China's top models keep "catching up": DeepSeek supplies a frontier-ish base model via open release, and every other lab then closes the remaining gap through post-training. Recent OpenRouter data on CN models makes the argument even more compelling.


485

Ruby Scanlon retweeted

Isaac Stone Fish@isaacstonefish·24 Feb

Fascinatingly visceral detail about Trump, Xi, and Taiwan, in this blockbuster @trippmickle semiconductors story.


112

94.5K

Ruby Scanlon@rubyscanlon·23 Feb

"Asked if he would choose to have an AGI company or a profitable one, he says definitely the AGI one"

Kyle Chan@kyleichan

Zhipu's CEO Zhang Peng did a fascinating 2.5-hour podcast interview in Chinese before going IPO. He is hyperfocused on AGI, which he brings up repeatedly as the company's raison d’être. Here are my rough notes in case anyone finds them useful: youtu.be/toy8RLeFZ08
• Chinese AI researchers have been dreaming of AGI for a long time, but it was previously so distant a goal.
• Back then, people were focused on cognitive AI or 认知人工智能
• Tsinghua has a special focus on industry collaboration, which he said helped him understand AI needs from a practical standpoint.
• Tsinghua has an emphasis on P2P: paper to product.
• As early as 2016, Zhang Peng was thinking about launching his own startup. But at Tsinghua at the time, there wasn’t an approved process for doing this, even though some people did it. Later there was an official process created for China.
• Launching a startup out of Tsinghua was quite a process, involving the university president, figuring out stakes and shareholding structure. It took 1.5 years, and they were basically the first
• Early AI systems “don’t know what they don’t know.” This kind of self-awareness is still not there.
• He talks about a field called intelligence studies 情报学 that came out of libraries. How to generate new ideas and knowledge from consuming large sets of texts.
• Some of the earliest work his lab did for clients was to forecast which technologies would be hot in 3-5 years
• While still a lab at Tsinghua, their AMiner AI system performed quite well and got some international recognition
• When GPT-3 came out, he was talking to another researcher who had high praise, said it was a milestone for the field. But he said that it still hadn’t solved the fundamental problem of “not knowing what it doesn’t know”. It would just give you an answer regardless.
• They quickly set about trying to understand the new technology behind GPT-3, what was different from BERT which they had known well.
• Within a year, they had done the research for GLM
• GLM tried to combine the best of both BERT and GPT
• He really likes Ilya, thinks his thought process is really good
• Asked if he ever thought of making Zhipu like OpenAI, he said actually in the early days Zhipu was quite similar in structure
• They had to decide whether they wanted to pursue a similar approach to OpenAI, weighed whether the massive investment required was worth it, estimated at the level of tens of millions RMB or more
• They decided to do it themselves rather than wait to see if someone else would
• They had to explain to investors for their series B fundraising round that they were building something like GPT-3, going for massive training, and they would focus on open source
• Why open source? They wanted recognition, including globally, like OpenAI had gotten
• Also US-China relations were not so complicated back then
• A major Stanford report came out comparing AI models and GLM was the only one from China near the top
• Other Chinese labs working on this at the time included Baidu, Alibaba, Zhiyuan Institute
• Did investors understand? Absolutely not. Asked how they were going to make money.
• Then ChatGPT came out and suddenly investors understood. Then investors were looking for them, asking whether they could build something like a ChatGPT
• From early on they wanted to create a small version of the model that researchers could use, like a 6B model that a single GPU can run
• ChatGPT made him very excited because it validated his approach, made the right bet
• Until then, everyone in China (investors, users) had such a short-term view of AI. Come back to us when you have something useful.
• 2023 was the battle of a hundred models. A lot of Chinese startups were founded to chase this new trend
• He was excited but also worried the whole market would swing from low to high and then crash and not recover
• There was over-optimism. Despite the excitement, there were so many issues that were not fixed yet. A chatbot alone that can talk back is not so impressive. Would you trust its advice on what medicine to take? Still a long road ahead.
• A lot of the work on vertical AI went away. Don’t hear so much about that anymore. He thinks general capabilities are important.
• A lot of people didn’t understand. They said medicine or other specialized areas can make money, but not a general foundation model.
• But then he started MaaS, model as a service
• Why not do consumer AI like ChatGPT? Because in the US people are willing to pay for a subscription, but that's not true in China. You’ll have to keep giving out discounts and freebies to keep them.
• Going after the enterprise market is not as sexy as the consumer market, but it’s more stable, not just a race to the bottom on price
• Zhipu understands the technology. They can do the same job, better, for a lower price.
• Their headcount now is 800. It doubled basically every year.
• He says they’re absolutely determined to pursue AGI. But the path is long with many challenges
• He talks about the challenges of training for self-driving cars. Too many corner cases. And the model can’t try and learn from mistakes and self-correct like a human.
• So now it’s not just pre-training but also mid- and post-training
• There are levels of AI: L1 to L5. L4 has theory of mind and knows what it doesn’t know. L5 is human-like consciousness. Right now we’re at L3. L2 was about supervised fine tuning and test time scaling. Now L3 is RL. Each phase, the scaling laws change.
• The scaling laws are not so set. But when you have a pattern like that in science, then it’s useful because you then ask what’s driving it and can we harness that
• On compute scaling, there’s no way to recoup the cost
• OpenAI is going for more compute. But Zhipu is going more for optimization
• They’re using 1/4 of what OpenAI used to train GPT-3
• This is a Chinese strength. Drive down cost.
• How is GLM-4.7 so good? Engineering, architecture. Not by doubling the parameters.
• Speaking of scaling laws, he talks about the need to understand the real essence of the relationship between intelligence and compute
• Transformers may not be the ultimate answer. Zhipu is already trying to figure out what the next architecture might be
• He uses the phrase 柳暗花明 which is kind of like out of confusion comes clarity
• But he believes that still within the transformer paradigm, there’s a lot more space to explore
• DeepSeek in 2025 was huge for them on every front: engineering, research, market. They discussed intensively.
• Zhipu felt like they were hitting a wall on their GLM-4 plus model. But DeepSeek helped show that you could go a level deeper on the engineering, helped validate some things about leaping over supervised fine tuning and strengthening the base that they had been testing
• After DeepSeek, Zhipu boosted RL
• But DeepSeek didn’t change their approach to open source which they’d already been doing
• US labs were moving away from open source. Closed source has greater commercialization advantages
• DeepSeek’s comprehensive open source approach got people in China confused, they associated open source with free, asked Zhipu why should we pay you when we can just use DeepSeek’s totally open source model
• He explained that you can do this. But many people tried and came back because DeepSeek doesn’t offer enterprise services
• He thinks DeepSeek is doing full open source because they don’t care about the enterprise market or even commercialization, just focused on the technology
• DeepSeek also benefited from other open source innovations. Don’t underestimate or overestimate anyone.
• China open source will spur more adoption. He quotes Jensen Huang as saying the technology has no country borders but the applications and people do, benefits too
• Cannot have the technology concentrated in just a few companies or individuals. Open source is a way to open this up to people around the world and give more options
• Without China open source, the rest of the world just has US closed source options. So China open source is helping other people benefit from AI
• Agents: doing more research here, need data, need to figure out how to break a problem down into smaller parts
• Next paradigm after scaling will be self-learning, no clear line between training and inference
• Need to figure out how to close the loop on learning and training. Not sure if this will be a tech breakthrough or engineering breakthrough
• Another major area to research is how to really combine all the data across modalities and integrate into a single model. Right now the needs for coding agents planning vs other tasks might be quite distinct. Also VLA which they’re working on
• He says he can see the first light 曙光 of AGI, especially if you get online learning 在线学习
• When will you let an AI go out into the world and explore the world and learn? Maybe starts in 2027? Then maybe 5-8 years later, after more adjustment and improvement, dealing with safety, etc
• AGI is not a short term thing. It’s running a marathon.
• IPO is a very natural route. Zhipu has been planning on IPO for years
• Looking at OpenAI’s plans for IPO, he says they follow a logic of 3 “highs”: high risk, high investment, high return
• A big misunderstanding people have of Zhipu is they think it’s a for-government company targeting just government contracts. But that’s not true, lots of private companies as customers, government only 20% of Zhipu’s business
• Xiaojun the interviewer said she was talking to an AI 1.0 person who said the reason why Chinese AI companies are going IPO now is they’re fleeing from a potential bubble in 2026
• In China, a lot of comparisons to the internet bubble these days. But he argues that even after the internet bubble burst, there was still a lot of useful stuff left from that. (An argument often made by AI leaders in the US)
• US investment in AI is orders of magnitude greater than in China. So China AI investment is not very big. He says the US might be a bubble, but in China there’s actually still not enough investment.
• Zhipu is definitely thinking about more consumer-facing applications but something useful, not just entertainment.
• Consumer AI battle is driven by the logic of the internet platforms, which need to get traffic.
• Xiaojun asks if some people think Zhipu is kind of boring given its focus on tech rather than popularity.
• But he says it’s like how they say the Tsinghua engineers are boring. But they’re very smart and capable.
• Asked if Kimi is more cool, they’re also Tsinghua
• He talks about the way Tsinghua teaches you to learn, so you can face new challenges never seen before.
• Different stages of a company’s growth have their own challenges.
• At 100 people, he basically knew every employee by name. But larger and you start to see people you don’t know. You can still control but through a different level.
• They have people from many backgrounds. Very open culture. Not just Tsinghua. Also Shanghai Jiaotong, Peking University, ByteDance, Alibaba.
• Really focused on trying to fully integrate all modalities into a single model
• Also focused on internationalization
• Xiaojun asks who in China is really going for AGI. But he says he doesn’t know. Many definitions of AGI.
• Asked when Zhipu will break even. He says they have a forecast in their IPO prospectus. But he feels like it’s going in a good direction. Revenue growing quickly, optimizing costs.
• Asked if he would choose to have an AGI company or a profitable one, he says definitely the AGI one
• If in the next 5 years they just make money but don’t make technical progress, then he’ll be unsatisfied.
• They opened a branch in Shenzhen because they have a large customer there.
• How do you want Zhipu to be remembered by history? As the company that pioneered AGI. He uses the Chinese expression 吃螃蟹, first to eat crab


267

Ruby Scanlon retweeted

Joey Politano 🏳️‍🌈@JosephPolitano·19 Feb

Crazy milestone in the US trade data released today: America now imports more directly from Taiwan than from China for the first time since 1992. The trade war has cut US direct trade with China, and the AI boom has caused a surge in spending on Taiwanese-made semiconductors.


327

1.7K

121.7K

Ruby Scanlon retweeted

Victor Shih@vshih2·18 Feb

Great paper, and results we had suspected all along, but will this have a meaningful impact on the competitiveness of the Chinese models?

Xu Xu@xuxupolitics

China’s chatbots are censored by the state. In our @PNASNexus paper with @jenjpan, we find substantially higher levels of political censorship in large language models (LLMs) originating from China than those developed outside China. doi.org/10.1093/pnasne…🧵


Ruby Scanlon retweeted

James Sanders@james_s48·20 Feb

Happy to see more coverage on tradeoffs of selling AI chips to China. Thanks @Jonathan1Gibson & @thedispatch for including some of my thoughts.


788

Ruby Scanlon retweeted

Callum Williams@econcallum·12 Feb

Datacentres now account for 7% of US electricity demand


106

617

2.8K

746K

Ruby Scanlon retweeted

Janet Egan@janet_e_egan·16 Jan

Trump is implementing his promise to sell H200 AI chips to China in exchange for 25% of revenue. Two policies make this work: a BIS export rule + a 25% import tariff. 🧵1/10


10.6K

Ruby Scanlon@rubyscanlon·15 Dec

Read here! open.substack.com/pub/rubyscanlo…


153

Ruby Scanlon@rubyscanlon·15 Dec

My first Substack essay is about China's response to H200 sales. Drawing on Chinese media, I look at whether floated restrictions are due to industrial policy, kill switch concerns, confidence in the Ascend 910C, or a secret fourth thing. Link below 🔗


382

Ruby Scanlon@rubyscanlon·5 Dec

While noble, the NSS' goal of 'maintaining a genuinely mutually advantageous economic relationship' with China keeps getting harder. China's overcapacity means surging exports at the expense of trading partners' domestic industries. This should top Trump's April agenda in Beijing.


580

Ruby Scanlon retweeted

Kyle Chan@kyleichan·16 Oct

From a huge new report on China’s Digital Silk Road: cnas.org/publications/r…

Ruby Scanlon@rubyscanlon

Chinese smart city tech (IoT, cameras, analytics for traffic, public health) operates in >100 countries. China's edge: bundled packages + leveraging existing telecom relationships. US firms have no comprehensive alternative, even as the market hits ~$4B by 2030.


168

22.5K

Ruby Scanlon@rubyscanlon·15 Oct

@colemcfaul @vivekchil Appreciate the much too kind words Cole!


Cole McFaul@colemcfaul·15 Oct

Highly recommend reading @rubyscanlon and @vivekchil here! Fantastic work on China's DSR (per usual). Their team consistently produces some of the most insightful analysis available on China's global tech engagement efforts.

Ruby Scanlon@rubyscanlon

After 18 months of research, I'm excited to release my and @vivekchil's new report, Countering the Digital Silk Road. It examines where the U.S. and China each hold advantages in key tech exports, how Chinese firms compete abroad, and how U.S. tech promotion efforts compare 🧵 1/


520

Ruby Scanlon@rubyscanlon·15 Oct

Read the full behemoth of a report here! cnas.org/publications/r…


578

Ruby Scanlon@rubyscanlon·15 Oct

The US has powerful export promotion tools (EXIM, DFC, USTDA), but they operate in silos with no shared strategy. The result is fragmented efforts and missed opportunities. Our report lays out how to fix it with concrete steps to better promote the US tech stack in key markets.


676

