Neil Gong

161 posts

Neil Gong

Neil Gong

@NeilGong

Security, trustworthy AI. Associate Professor, Duke University

Katılım Haziran 2011
244 Takip Edilen1.4K Takipçiler
Neil Gong
Neil Gong@NeilGong·
I’d like to take this opportunity to highlight their work and share a brief summary of our research on federated learning security over the past several years (2018–2025). Many thanks to my amazing former and current students and collaborators—all credit goes to them!
Neil Gong tweet mediaNeil Gong tweet mediaNeil Gong tweet media
English
0
0
1
148
Neil Gong
Neil Gong@NeilGong·
[Late Advertisement] My student Yuqi Jia presented two posters on federated learning security at NeurIPS last week (which is why I attended NeurIPS for the first time in over a decade!).
Neil Gong tweet mediaNeil Gong tweet mediaNeil Gong tweet media
English
1
0
4
291
Neil Gong
Neil Gong@NeilGong·
@shi_weiyan Very interesting work! I was at your poster yesterday.
English
1
0
1
397
Weiyan Shi
Weiyan Shi@shi_weiyan·
Left: My first poster @ #ACL 2018, Melbourne Right: My first poster as faculty @ #NeurIPS 2025, San Diego 7 years later, still working to make chatbots better -- now with amazing students by my side 🥹♥️🤩
Weiyan Shi tweet mediaWeiyan Shi tweet media
English
10
5
271
15.7K
Neil Gong
Neil Gong@NeilGong·
@AISecHub Yes, we’ll release both the code and data publicly. Thanks for sharing our work!
English
0
0
1
21
AISecHub
AISecHub@AISecHub·
@NeilGong - Please share the link to LLMPrint when ready.
English
1
0
4
285
AISecHub
AISecHub@AISecHub·
Fingerprinting LLMs via Prompt Injection - arxiv.org/pdf/2509.25448 As models proliferate across organizations, questions of provenance–specifically, verifying whether a given model has been derived from a particular released model–become critical. Establishing provenance is important both for safeguarding intellectual property since training a competitive LLM requires substantial compute, data, and engineering effort, and for ensuring accountability by detecting unauthorized redistribution. However, reliably establishing provenance is far from trivial, especially once models have been altered through post-processing such as post-training or quantization. Authors: Yuepeng Hu, Zhengyuan Jiang, Mengyuan Li, Osama Ahmed, Zhicong Huang, Cheng Hong, @NeilGong - @DukeU, @AntGroup #AISecurity #LLMFingerprinting #PromptInjection #Provenance #ModelOwnership #GenAI #AdversarialML #TrustworthyAI #ModelSafety #CyberSecurity #AIResearch #DataIntegrity #DukeU #AntGroup
AISecHub tweet media
English
1
0
11
768
Neil Gong
Neil Gong@NeilGong·
Our paper "DataSentinel: A Game-Theoretic Detection of Prompt Injection Attacks" (arxiv.org/abs/2504.11358) received a Distinguished Paper Award at @IEEESSP! Huge thanks and congratulations to my amazing co-authors Yupei Liu, Yuqi Jia, Jinyuan Jia, and @dawnsongtweets!
English
10
3
57
2.7K
Somesh Jha
Somesh Jha@jhasomesh·
Such an excellent line-up of speakers! Please attend SAGAI @IEEESSP
earlence@EarlenceF

Our @IEEESSP SAGAI workshop on systems-oriented security for AI agents has speaker details (abs/bio) on the website now: sites.google.com/ucsd.edu/sagai… We look forward to seeing you in San Francisco on May 15! As a reminder, we are running this "Dagstuhl" style - real discussions.

English
2
1
4
1.7K
Neil Gong
Neil Gong@NeilGong·
Our study (arxiv.org/abs/2408.07291) demonstrates that LLMs excel at such information extraction. This highlights the potential for LLMs to automate cyberattacks at scale, posing significant security challenges.
English
0
0
4
291
Neil Gong
Neil Gong@NeilGong·
Many cyberattacks begin with spear phishing or social engineering, which often involve collecting personal information about potential victims.
English
1
0
1
321
Neil Gong
Neil Gong@NeilGong·
Still using symbol replacement, image conversion, and similar strategies (shown below) to protect your email addresses from automated scraping? Our research shows they offer limited effectiveness against LLM-based extraction while making it harder for regular users to email you.
Neil Gong tweet media
English
1
0
3
542
Neil Gong retweetledi
Kexin Pei
Kexin Pei@Kexin_Pei·
The 8th Deep Learning Security and Privacy workshop co-located with IEEE S&P @IEEESSP May 15, 2025, San Francisco (dlsp2025.ieee-security.org) is calling for papers, posters and talks! The workshop seeks your awesome contributions on all aspects of deep learning and security, aiming to bring complementary views together by (a) investigating the security and privacy of deep learning, such as the recent generative models, and (b) exploring the application of deep learning for security and privacy. We are calling for both proceeding papers (up to 6 pages) and non-archival extended abstracts (up to 3 pages). We will have one best paper award for the accepted papers and one best extended abstract for the accepted non-archival extended abstracts. For the first time, in addition to the talks, we will encourage the authors of the accepted papers to also present the posters for more in-depth discussions!
English
0
8
44
3.1K
Neil Gong retweetledi
Huan Sun
Huan Sun@hhsun1·
We @OSUbigdata and @osunlp are very excited to host Neil Gong @NeilGong tmr (10:30AM-11:30AM ET, Dec 6th) to give an invited talk on Safe and Robust Generative AI. He will cover several critical safety and robustness issues in generative AI, including preventing the generation of harmful content, detecting AI-generated content through watermarks, and addressing prompt injection in large language models. The talk is open to people outside OSU. DM me for a zoom link, if interested!
English
1
3
9
1.5K
Neil Gong retweetledi
Lun Wang
Lun Wang@lunwang1996·
Excited that our paper on audio watermark benchmarking is accepted to @NeurIPSConf! Congrats to all my amazing collaborators @hbliuustc, Mo Yang, Zheng Yuan, and @NeilGong. Audio authenticity has become a real issue now and we will keep working on this topic. Stay tuned :)
Lun Wang@lunwang1996

AudioMarkBench: Benchmarking Robustness of Audio Watermarking [arxiv.org/pdf/2406.06979] Despite rapid progress in #audiodeepfake, I feel the related safety risks are still underestimated. Imagine getting a call from somebody you trust who's actually a scammer-controlled bot – this is already happening when a scammer used voice-cloning tech to impersonate President Biden in a series of illegal robocalls during a New Hampshire primary election🚨. Audio watermarking is a powerful tool against misuse of synthetic audio, but our research with @NeilGong's group reveals - Vulnerabilities to even unintentional perturbations. For example, compression/decompression can remove watermarks without impacting audio quality too much. - Uneven robustness across attributes, raising fairness concerns. We still need more robust ways to watermark synthetic audios.

English
0
3
13
4.8K
Elaine Shi
Elaine Shi@ElaineRShi·
i'd been low key about my promotions, but whoops, i let the cat out of the bag! many thx to those that supported me along the way! i'd promised my closest collaborators that i'd try my best to be supportive of others just like how the community supported me. (3/3)
English
20
0
67
3.8K
Elaine Shi
Elaine Shi@ElaineRShi·
after putting out a fire, i walk into the theory lunch for new phd students. host was introducing our theory faculty, and i arrived exactly when it was my turn. host said, "elaine, wanna introduce urself?" i looked at the slide absent-mindedly and tried reading thru it (1/3)
English
16
1
119
16.9K
Neil Gong
Neil Gong@NeilGong·
This "contrastive" membership inference combined with hypothesis testing enables us to derive formal guarantees for the FPR.
English
0
0
0
442
Neil Gong
Neil Gong@NeilGong·
The key idea is to create two versions of each data sample and then publish one of them, selected uniformly at random. If a model was trained on the published version, it is more likely to be recognized as a member than the unpublished version.
English
1
0
1
535
Neil Gong
Neil Gong@NeilGong·
Was my data used to train an AI model? In our CCS'24 paper (with Zonghao Huang and Michael Reiter), we propose a framework to audit data use in model training, with a formal guarantee on false positive rate (probability of falsely detecting data use) arxiv.org/abs/2407.15100
English
6
8
59
7K