Shenglai Zeng@NeurIPS-2025

28 posts

@snowzeng2

PhD student in the MSU DSE lab @dse_msu. Intern at @AmazonScience. Research interests: RAG, Agentic AI, LLM privacy/safety

East Lansing, MI · Joined August 2022

318 Following · 165 Followers

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Dec 1

📍 "Uncovering Graph Reasoning in Decoder-only Transformers with Circuit Tracing" (arxiv.org/pdf/2509.20336) Workshop: The First Workshop on Efficient Reasoning (Dec 6, 8:00AM-5:00PM)

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Dec 1

My co-authors and I will be presenting two papers: 📍 "Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy" (arxiv.org/abs/2506.00359) Poster #14022025 | Exhibit Hall C, D, E (Dec 4, 11:00AM-2:00PM)

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Dec 1

🎉 Heading to NeurIPS 2025 in San Diego (Dec 2-7)! Looking forward to great discussions and exploring collaboration & internship opportunities related to LLMs, RAG (agentic) systems, and trustworthy AI! See you in San Diego! 🌊

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Nov 4

🎯 Stage 1: Attribute-based extraction to preserve key contextual information 🤖 Stage 2: Agent-based iterative refinement to enhance privacy protection ✅ Results: Comparable performance to original data while substantially reducing privacy risks arxiv.org/abs/2406.14773

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Nov 4

Paper 2: "Mitigating Privacy Issues in RAG via Pure Synthetic Data" 🛡️ Privacy-preserving solution: SAGE - a two-stage synthetic data generation framework for RAG

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Apr 28

Heading to #NAACL2025 in Albuquerque! 🎉 Presenting our paper on knowledge checking in RAG systems on May 2, 9:00-10:30 AM in Mesilla. Stop by to chat about how representation can help LLMs better integrate external knowledge! Coffee chats welcome! ☕ arxiv.org/abs/2411.14572

Shenglai Zeng@NeurIPS-2025 @snowzeng2

🎯 Detect & Filter RAG Contexts with LLM Representations. Excited to share our work on representation-based knowledge checking in #RAG! arxiv.org/abs/2411.14572 We show how LLM representations detect & filter misleading/unhelpful knowledge and improve performance.

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Dec 5

3. Enhanced performance: Filtering based on knowledge-checking results significantly improves RAG performance, even in noisy environments. #AI #LLM #RAG #MachineLearning #Representation

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Dec 5

2. Representation vs Traditional Methods: Traditional methods (e.g., answer-based or probability-based) struggle with these tasks, while representation-based approaches (e.g., rep-PCA and rep-Con) achieve superior performance by leveraging distinct patterns in representations.

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Dec 5

Shenglai Zeng@NeurIPS-2025 reposted

Yuping Lin @yuplin2333 · Jul 8

✨ Excited to share our new preprint "Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis"! arxiv.org/abs/2406.10794 🔍 We delve into why some jailbreak attacks succeed by exploring harmful and harmless prompts in the LLM's representation space.

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Jul 8

Our results show that SAGE achieves comparable performance to using original data while significantly reducing privacy risks! 📊✨

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Jul 8

SAGE works in two steps: 1️⃣ Attribute-based extraction and generation: Identifies key attributes and generates synthetic data based on them. 2️⃣ Agent-based refinement: Ensures privacy through iterative assessment and refinement by privacy and rewriting agents.

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Jul 8

🚀 Excited to share our latest research on enhancing privacy in RAG systems! arxiv.org/pdf/2406.14773 Our paper introduces SAGE, a novel approach using synthetic data to protect sensitive information while maintaining high utility. #AI #Privacy #MachineLearning #RAG #DataSecurity

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Mar 24

Interesting work!

Jie Ren @rjthuer

Exciting News! Our new paper on memorization in text-to-image diffusion is now available. We study memorization through the lens of attention and shed light on the model's internal behavior when memorization happens. Please find our paper at arxiv.org/abs/2403.11052

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Feb 29

Thanks for sharing! Feel free to explore the dual-edged sword of RAG technology in our paper: arxiv.org/pdf/2402.16893

Sumit @_reachsumit

The Good and The Bad: Exploring Privacy Issues in Retrieval-Augmented Generation (RAG). Targeted attacks show that RAG risks leaking retrieval data but mitigates training data exposure. 📝arxiv.org/abs/2402.16893 👨🏽‍💻github.com/phycholosogy/R…

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Feb 29

3️⃣ Training Data Safeguard: RAG shows promise in protecting training data, offering a strategy to bolster privacy in AI systems. Our code is also available at github.com/phycholosogy/R…

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Feb 29

2️⃣ Mitigation Efforts: We've explored naive defenses such as summarization and retrieval thresholds. These methods help mitigate risks but don't completely resolve the issue, indicating the gravity of privacy risks in RAG.
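
The retrieval threshold mentioned in this post can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the toy two-dimensional vectors, and the 0.8 cutoff are all assumptions made for the example. The idea is that passages whose similarity to the query falls below a cutoff are never passed to the LLM, which limits how much loosely related private text a crafted query can pull into the context.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_with_threshold(query_vec, corpus, min_similarity=0.8, top_k=2):
    """Return up to top_k (doc_id, score) pairs scoring at least min_similarity.

    `corpus` maps a hypothetical document id to its embedding vector.
    The default cutoff is illustrative only; a real system would tune it.
    """
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in corpus.items()]
    kept = [(doc_id, score) for doc_id, score in scored if score >= min_similarity]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Toy corpus with hand-written 2-D "embeddings" (assumed for illustration).
corpus = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.0, 1.0],
    "doc_c": [1.0, 1.0],
}
results = retrieve_with_threshold([1.0, 0.0], corpus, min_similarity=0.8)
```

With the 0.8 cutoff only `doc_a` survives; lowering the cutoff to 0.5 also admits `doc_c`. As the post notes, this kind of filter reduces exposure but does not remove it, since a query close enough to a private passage still retrieves it.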

Shenglai Zeng@NeurIPS-2025 @snowzeng2 · Feb 29

🔒💡 Excited to share our latest #RAG #Privacy research! We've uncovered two pivotal aspects: 1️⃣ Privacy challenges within RAG's own data 2️⃣ RAG's potential to safeguard training data 🔍 Discover the dual-edged sword of RAG technology in our paper arxiv.org/pdf/2402.16893
