CSET
@CSETGeorgetown
5.2K posts

The Center for Security and Emerging Technology within Georgetown University’s Walsh School of Foreign Service. Visit https://t.co/0HMynaF0ZI to sign up for updates.

Washington, DC · Joined May 2019
450 Following · 12.9K Followers
CSET@CSETGeorgetown·
📢CSET is hiring!📢 Don't miss your chance to join our fantastic team and work at the intersection of tech and national security. We are looking for a UX Engineer, a Data Scientist, and a Fellow to lead our work on Frontier AI. Details below: cset.georgetown.edu/careers/
CSET@CSETGeorgetown·
Lawrence has spearheaded leadership of CSET’s Inclusion Alliance, which is committed to fostering a workplace where all CSETers feel valued, respected, and supported. Congrats again, Lawrence, on this well-deserved award!
CSET@CSETGeorgetown·
Congrats to CSET's own Lawrence Hailes for winning this year's @georgetownsfs Community in Diversity Individual Leader Award, which recognizes outstanding individuals who have created a more inclusive School of Foreign Service community.
CSET@CSETGeorgetown·
In their piece in @barronsonline, @steph_batalis, Katherine Quinn, and Rebecca Gelles paint a picture of the key role federal research investments play in spurring U.S. innovation and competitiveness. Be sure to check out their latest analysis below! barrons.com/articles/gover…
CSET@CSETGeorgetown·
⭐️New Report⭐️ As AI introduces new risks, some potentially catastrophic or even existential, there is little data or detailed theory to assess them. Our report helps to shed light on this challenge and offers helpful solutions for policymakers. cset.georgetown.edu/publication/be…
CSET@CSETGeorgetown·
🎉New job alert!🎉 Come join our team as a Data Scientist and build tools for identifying emerging areas in S&T. If you want to work on cutting-edge data science at the intersection of national security and tech be sure to apply below! cset.georgetown.edu/job/data-scien…
CSET@CSETGeorgetown·
“It is not clear yet who benefits between attackers and defenders," @AndrewJLohn testified to @USCC_GOV in discussing the potential for AI to boost Chinese cyber theft. Read more on this and other emerging cyber threats in his full testimony below. uscc.gov/sites/default/…
CSET@CSETGeorgetown·
What do the PLA’s latest challenges and competitions reveal about emerging technologies such as AI, drones, and cyber? @jules__george’s new CSET blog explores multi-domain integration, unmanned systems, and military-civil fusion. cset.georgetown.edu/article/chinas…
CSET@CSETGeorgetown·
✨New CSET blog✨ on the PLA’s latest challenges and competitions involving emerging technologies: @jules__george highlights key takeaways on multi-domain integration, unmanned technological innovation and countermeasures, and further evidence of military-civil fusion. cset.georgetown.edu/article/chinas…
CSET@CSETGeorgetown·
What counts as an “AI job”? The answer shapes how we measure talent shortages, design workforce policy, and understand AI’s labor market impact. CSET’s new blog compares prevailing approaches and explains CSET’s new definition of AI development jobs: cset.georgetown.edu/article/defini…
CSET retweeted
Kyle Miller@KyauMill21·
Recently, there has been a lot of interest in distillation, particularly regarding “adversarial” distillation from Chinese labs. At the core of this issue is the question: how much of a performance gain (i.e., uplift) do student models achieve when they are distilled from closed-source teacher models?

Months ago, we started reviewing the literature to try to answer this question. We’ve so far reviewed ~40 papers on black-box distillation, focusing on papers that distill the reasoning, math, and code capabilities of closed-source teacher models. All white-box distillation research is omitted. But there's a problem: the data is not straightforward. Here are some of the challenges we ran into, and why the distillation literature doesn't paint a clear picture (long 🧵):

1/ Old research. To start, much of the literature on this topic – that we reviewed – was published between 2022 and 2024. The student and teacher models are old, and the benchmarks used to evaluate performance are saturated. Even newer papers often used older models. This means we can’t simply take the data from the literature and assume it applies to Chinese "adversarial" distillation today.

Some student models saw extreme performance gains, while others, at times, saw degraded performance on out-of-distribution benchmarks. In the most extreme case we identified, a distilled model improved by 25,600% relative to its baseline score; the student went from 0.2 to 51.2 on the ASDIV benchmark (this is from the Program-Aided Distillation paper). But that large percentage is due to the poor baseline performance it started with: the ASDIV benchmark is saturated, the student was a 0.22B-parameter CodeT5 model, and the teacher was GPT-3.5-Turbo. In my view, we cannot infer much from this example because it's not applicable to frontier models and challenging unsaturated benchmarks. We saw this same issue across most of the papers we reviewed.

Most of the student models were years old and relatively small (<13B params). Llama-1/2 and CodeLlama were the most common students, and GPT-3.5/4 the most common teachers (likely because OpenAI permits distillation in their ToS for research purposes). On average across the papers, student models reached 74% of the teacher’s performance (primarily on older code, math, and reasoning benchmarks), with a high standard deviation. But again, we can’t infer too much from this, and we cannot assume this is the uplift China gets on frontier benchmarks. It's safe to say they see some degree of uplift, but we already knew that (they probably wouldn't distill in the first place if they didn't see any gains).

2/ The next issue is that most of the literature is small-scale, experimental, and focused on compression. None of it is “adversarial”; rather, the intent is to compress knowledge into smaller-parameter models. The difference in intent between the literature and Chinese distillation matters. To highlight my earlier point, most of the literature is trying to compress knowledge into small <13B-parameter students. That doesn't translate to distilling into, say, the 1-trillion-parameter Kimi-K2.5 model.

Moreover, the quantity (and quality) of the synthetic teacher-generated data is likely different from what Chinese labs are using. Anthropic identified distillation campaigns involving over 16 million exchanges with Claude, likely generating billions of tokens. The literature, on the other hand, is much smaller scale. Back-of-the-envelope calculations show that, on average, researchers were likely distilling on ~185 million tokens (these are rough calculations). Only a few papers distilled with over a billion tokens.
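The thread's two headline numbers (the 25,600% ASDIV jump and the 74%-of-teacher average) follow from simple ratio arithmetic. A minimal sketch, using hypothetical helper names and only the scores quoted in the thread; note that "improvement" can be quoted either as gain over baseline or as a ratio to baseline, and the two conventions differ by 100 percentage points:

```python
def pct_uplift(baseline: float, distilled: float) -> float:
    """Gain of a distilled student over its own baseline, in percent."""
    return (distilled - baseline) / baseline * 100

def teacher_share(student: float, teacher: float) -> float:
    """Student score as a fraction of the teacher's score on the same benchmark."""
    return student / teacher

# The CodeT5 student's ASDIV scores from the thread: baseline 0.2 -> distilled 51.2
gain = pct_uplift(0.2, 51.2)      # ~25,500% gain over baseline
ratio = 51.2 / 0.2 * 100          # ~25,600% of baseline (the thread's convention)

# A student scoring 74 where its teacher scores 100 has reached
# 74% of teacher performance (the thread's cross-paper average).
share = teacher_share(74.0, 100.0)
```

Tiny baselines are exactly why such percentages mislead: dividing by 0.2 inflates any absolute gain by a factor of 500, which is the thread's point about saturated benchmarks and weak starting models.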
CSET@CSETGeorgetown·
✨New blog✨ Who counts as part of the AI workforce? Existing definitions often blur together workers who build AI systems, deploy them, or simply use AI tools. CSET’s new blog explains why that matters, and introduces CSET’s new definition of AI development jobs: cset.georgetown.edu/article/defini…
CSET retweeted
Sam Bresnick@SamBresnick·
.@colemcfaul and I have documented the Chinese military's interest in @nvidia chips; it's clear that H200s will contribute to the PLA’s modernization, either through direct purchases or through the use of LLMs trained on them. Link below. @CSETGeorgetown @emergingtechobs
Kristina Partsinevelos@KristinaParts

Worth noting - at GTC, Jensen Huang told me that China did purchase H200s, that he had the green light from both sides. He also told a room full of journalists that $NVDA is "in the process of restarting our manufacturing. And so, so that's new news for all of you".

CSET retweeted
McCourt School@McCourtSchool·
New research from McCourt Associate Research Professor Renée DiResta and @CSETGeorgetown's Josh Goldstein breaks down the rise of full-spectrum propaganda. 🧵
CSET retweeted
Cole McFaul@colemcfaul·
NEW @CSETGeorgetown + @emergingtechobs piece! Does China's access to US semiconductor technology help the PLA develop and deploy military AI? After 3 years reading thousands of PLA procurement docs, @sambresnick and I say yes. Here’s how, and why it matters: 🧵/13
CSET@CSETGeorgetown·
"Well-designed governance does not suppress innovation. Instead, it shapes the direction of innovation in socially beneficial ways..." @oschinski and @MinaNrn reframe the discussion around AI innovation and governance. Read more in @Newsweek below: newsweek.com/stop-asking-wh…