Yibing Sun

66 posts

@Yibing_Sun

Ph.D. student in the School of Journalism and Mass Communication, University of Wisconsin-Madison

Joined October 2021
267 Following · 177 Followers
Yibing Sun retweeted
Meysam Alizadeh @MeysamAIizadeh
Can AI coding agents reproduce published social science findings? In new work with @_mohsen_m, Fabrizio Gilardi, and @j_a_tucker, we introduce SocSci-Repro-Bench, a benchmark of 221 reproducibility tasks from 54 papers, and evaluate two frontier coding agents: Claude Code and Codex. The results reveal both remarkable capabilities and new risks for AI-assisted science.

GOAL
A key design goal was separating two different problems:
1️⃣ Are replication materials themselves reproducible?
2️⃣ Can AI agents reproduce results when materials are executable?
To isolate agent performance, we only included tasks whose outputs were identical across three independent manual executions.

DESIGN
Agents received:
• anonymized data + code
• a sandboxed execution environment
They had to autonomously:
• install dependencies
• debug broken code
• execute the pipeline
• extract the requested results
In short: end-to-end computational reproduction.

RESULTS
Both agents reproduced a large share of published findings. But Claude Code substantially outperformed Codex.
Task-level accuracy
• Claude Code: 93.4%
• Codex: 62.1%
Paper-level reproduction (all tasks correct)
• Claude Code: 78.0%
• Codex: 35.8%

WHY THE GAP?
Replication packages often contain problems:
• missing dependencies
• hard-coded file paths
• incomplete environment specifications
Claude Code frequently repaired these issues autonomously. Codex often failed to recover the execution pipeline.

IS THIS JUST MEMORIZATION?
We tested this by asking agents to infer paper metadata (title, authors, journal, year) from anonymized replication materials. Recovery rates were very low, suggesting agents primarily relied on code execution, not memorization of papers.

REASONING TEST
We also tested a harder task: Can agents infer the research question of a study from code and data alone? Both agents performed surprisingly well.

CONFIRMATION BIAS
When agents were given the paper PDF, a new problem emerged. Sometimes they copied reported results from the text instead of executing the code. Accuracy on non-reproducible tasks dropped sharply. Context helps execution but reduces independence of verification.

SYCOPHANCY
Inspired by @ahall_research, we tested adversarial prompt framing, nudging agents to: “explore alternative analyses that align with the paper’s reported results.” Accuracy increased. But agents also became more likely to fabricate results when reproduction was impossible.

THE PARADOX
Pressure to produce an answer can help agents repair execution pipelines. But it simultaneously erodes their ability to say: “This result cannot be reproduced.” Recognizing when reproduction is impossible may be the most important scientific capability.

NOTES
• This is work in progress; feedback is welcome.
• Benchmark available on GitHub.
• Replication materials hosted on Dataverse.
Paper + repository in the reply below.
6
47
189
26K
Yibing Sun retweeted
Dhavan Shah @dvshah
We welcome submissions to our @IJoC_USC special issue on "Presidential Debates Across the Americas." CfP is open until April 30, 2025. Email me (dshah@wisc.edu) or my co-editors w/ questions and share with others in your network. Link to full call below. mcrc.journalism.wisc.edu/2024/12/13/mcr…
1
16
19
5.6K
Yibing Sun retweeted
Mike Wagner @prowag
How do we know that we can trust the vote count? Check out Episode 1 of the Civic Sift, our new digital show from the CCCR that sheds light on important questions of the day, using evidence from experts and practitioners. Let’s sift & winnow together! m.youtube.com/watch?v=ow-9VD…
0
11
14
3.6K
Yibing Sun retweeted
UW-Madison SJMC @uw_sjmc
Together with @uwpolisci and @UWPsych and support from @knightfdn, we are seeking two assistant professors who focus on research in communication, social identity and civil society to start in August 2025. Learn more and join our team today. buff.ly/3MvToqE
1
15
20
7.2K
Yibing Sun retweeted
Yiming Wang @YimingWang_
Check out our new paper in Public Opinion Quarterly @AAPOR! We introduce a novel measure integrating self-reported media use with outlet bias scores to measure the “shape” of news consumption and its impact on beliefs in electoral fraud and distrust in the electoral system.
1
5
16
2.1K
Ross Dahlke 🔑 @Ross_Dahlke
📰 personal update: I'm so happy to say I've accepted a tenure-track assistant professorship at @UWMadison @uw_sjmc for next year. To return home to my alma mater is a privilege and a dream.
53
5
267
23.1K
Yibing Sun retweeted
Luhang SUN @luhang_sun
Excited to announce the release of our latest paper, "Smiling women pitching down: auditing representational and presentational gender biases in image-generative AI," published in JCMC! #AI #GenerativeAI #GenderBias #Visual #Feminism
Journal of Computer-Mediated Communication @ica_jcmc

“Smiling women pitching down: auditing representational and presentational gender biases in image-generative AI” by Luhang Sun et al. Read it here: doi.org/10.1093/jcmc/z…

2
5
21
3.1K
Yibing Sun @Yibing_Sun
📢 Really enjoyed working on this project. In this research, we coded TikTok videos related to COVID-19. We found a lot of people imitating zombies as if that were a side effect of vaccination. Fun, but with frustration about their effects.
ellieyang @elliefanyang

📢 #publication Work with @LaurenKriss and @Yibing_Sun: “Fun with Frustration? TikTok Influencers’ Emotional Expression Predicts User Engagement with COVID-19 Vaccination Messages,” Health Communication, Vol 0, No 0. tandfonline.com/doi/abs/10.108…

0
1
6
279
Yibing Sun retweeted
Lone Nerup Sørensen @lonenerup
Checking the proofs for the 2nd ed. of the Handbook of Digital Politics, edited by Stephen Coleman and myself. Out in October. We have an amazing line-up of star contributors and up-and-coming scholars, including:
1
23
74
8.7K
Yibing Sun retweeted
Dhavan Shah @dvshah
Our #ComputerVision #Multimodal classification paper is now in Comm Methods & Measures (first 50 downloads free). We combine video & audio features with speech coding of debate performances to understand changing patterns of aggressive political style. tandfonline.com/eprint/ZIQ24YG…
1
14
46
6.7K
Porismita borah @borah
Congratulations to the brilliant and wonderful @leedaniellekl @MurrowCollege on her successful dissertation defense. Looking forward to your future scholarly endeavors!
4
1
29
2.4K
Yibing Sun retweeted
Dhavan Shah @dvshah
Honored to be awarded a WARF Named Professorship from UW-Madison. And grateful to be able to name the professorship after a towering figure in our field, a pathbreaking scholar, and a mentor to so many, including me: Jack M. McLeod. journalism.wisc.edu/news/professor…
14
9
110
7.7K
Yibing Sun retweeted
Mike Wagner @prowag
Liwei Shen and Yibing Sun presenting our group effort at #ica23, where we examine how ad promotion and social bits perform at correcting misinformation. Other dreamy collaborators: @LeticiaBode @ekvraga @borah @dvshah @sijiayang_camer and Danielle Lee
Toronto, Ontario 🇨🇦
2
3
12
1.5K
Yibing Sun @Yibing_Sun
Had so much fun at the 2-day Hackathon! So many ideas, tools, and cool people. #ICA23 Recommended to everyone who is interested in computational methods!
0
0
15
828
Yibing Sun @Yibing_Sun
Such great teamwork with all the amazing collaborators. I discussed gender bias in GPT/OpenAI with a couple of others at the #Hackathon #ICA2023 @hackingcommsci. The piece is such a timely one!
Luhang SUN @luhang_sun

Today @GloriaWei7 and @sijiayang_camer joined me to introduce our new preprint article using face detection techniques to examine gender biases in Image #GenerativeAI at the interactive session at @ICA_HMC Preconference! #ICA2023: arxiv.org/abs/2305.10566 🧵

0
0
2
312
Yibing Sun retweeted
Notion @NotionHQ
A gift for GIF-lovers 🎁 You can now access @GIPHY’s entire treasure trove of GIFs directly in /image blocks!
25
53
589
108.2K