EuroSafeAI

10 posts

@EuroSafeAI

Research non-profit for AI safety and democracy defense. Cofounded by @ZhijingJin, @x_angelohuang and Pepijn Cobben

Zürich · Joined February 2026
2 Following · 19 Followers
EuroSafeAI reposted
Zhijing Jin @ZhijingJin
Excited for our #ICML2026 papers at @JinesisLab @MPI_IS @UofTCompSci @TorontoSRI @VectorInst! We present papers that advance the research frontiers of (1) Causal LLMs, (2) AI for Science (physics), (3) Multi-Agent LLMs via mechanism design, and (4) Adversarial Defense by honeypot. Congrats to all our student authors and collaborators, esp. @TerryJCZhang @SimkoSamuel @EmanuelTewolde @ivakshi_s @andrewkihyun @PepijnCobben @yahang_qi @FurkanDanismann @bschoelkopf and many others!🎉
Replies: 0 · Reposts: 11 · Likes: 75 · Views: 3.7K
EuroSafeAI reposted
Zhijing Jin @ZhijingJin
⚠️ Can we trust #LLM agents to keep their promises? We tested 9 frontier LLMs in game-theoretic settings where the agents (1) publicly commit to an action and then (2) privately choose what to do. They break their promises ~57% of the time, and most do so without even realizing they lied.
📖 Paper: "Cheap Talk, Empty Promise: Frontier LLMs easily break public promises for self-interest"
🔗 Link: arxiv.org/abs/2604.04782
🤝 Authors: @Jerick1380 @TerryJCZhang @ZhijingJin @conitzer 🎉
#AIAgents #AISafety #MultiAgentAI
@MPI_IS @ELLISforEurope @UofTCompSci @VectorInst @TorontoSRI @CIFAR_News @JinesisLab @EuroSafeAI @ELLISInst_Tue @CarnegieMellon @SCSatCMU
Replies: 12 · Reposts: 31 · Likes: 116 · Views: 9.3K
EuroSafeAI reposted
Zhijing Jin @ZhijingJin
10 days left to submit to the 1st Trustworthy AI for Good (AI4GOOD) workshop at #ICML2026! @icmlconf
We're giving out multiple awards and travel funds sponsored by @schmidtsciences and @coop_ai:
🏆 Best Paper Awards (including targeted prizes for the cooperative AI theme)
🏆 Top Reviewer Awards
✈️ Travel Funds
Submit here → openreview.net/group?id=ICML.…
⏰ Deadline: May 3, 2026 (AoE)
📌 Notification: May 18, 2026
🔗 (We extended our deadline to accommodate more submissions!)
Join us in Seoul for discussions bridging AI safety, social good, and governance with keynote speakers @Yoshua_Bengio, @OanaIgnatRo, @jzl86, @maksym_andr, and more!
Replies: 3 · Reposts: 17 · Likes: 73 · Views: 13.3K
EuroSafeAI reposted
Zhijing Jin @ZhijingJin
Excited for our "Trustworthy AI for Good" (AI4GOOD) Workshop at #ICML2026! As AI agents increasingly affect our lives, it is key to bridge #ResponsibleAI, social good, and governance. Let’s build solutions together!
⏰ Submission deadline: April 30, 2026 (AoE)
🎙️ Confirmed speakers: @Yoshua_Bengio, Joel Z. Leibo (@jzl86), Maksym Andriushchenko (@maksym_andr), @OanaIgnatRo [More to come!]
📍 July 10-11, 2026 · Seoul 🇰🇷
🔗 trustworthy-ai-for-good.github.io
📝 Submit: openreview.net/group?id=ICML.…
📣 Be a reviewer: forms.gle/7cXvUJCW1FdEgh…
Replies: 3 · Reposts: 30 · Likes: 163 · Views: 12.3K
EuroSafeAI reposted
Zhijing Jin @ZhijingJin
We are hosting a Dagstuhl seminar on Causality & LLMs this week (Apr 7–10), bringing together world experts to explore:
1️⃣ Integrating LLMs 🤖 into causal workflows
2️⃣ Evaluating & improving LLMs’ causal reasoning 🧠
Co-organized w/ @amt_shrma @DominikJanzing @kunkzhang @ZhijingJin
📍 Schloss Dagstuhl, Wadern, Germany
🔗 dagstuhl.de/26152
📖 cr-llm.github.io
📅 Apr 7–10
#CausalNLP #LLM #Dagstuhl
@CausalNLP @MPI_IS @ELLISforEurope @UofTCompSci @VectorInst @TorontoSRI @CIFAR_News @JinesisLab @EuroSafeAI @ELLISInst_Tue
I am also joined by my student @rahulbshrestha to present our CauSciBench and Causal AI Scientist work :)!
Replies: 2 · Reposts: 5 · Likes: 38 · Views: 3K
EuroSafeAI reposted
Zhijing Jin @ZhijingJin
📢 We will present 5 papers at #ICLR2026, #CLeaR2026, and #ACL2026:
- SocialHarmBench by @psyonp et al.
- Causal LLMs on Instrumental Variable Method by @ivakshi_s et al.
- LLM Data Contamination study by @TerryJCZhang et al.
- Mech Interp for VLM by @francescortu et al.
- DPO data selection method by Xuan & @rongwu_xu
Thanks to all our collaborators and for the institutional support from @MPI_IS @ELLISforEurope @UofTCompSci @VectorInst @TorontoSRI @CIFAR_News @JinesisLab @EuroSafeAI @ELLISInst_Tue @ETH_en @ETH_AI_Center @michigan_AI @UMichiganAI @UMichCSE!
Feel free to access the papers at:
arxiv.org/abs/2510.04891
arxiv.org/abs/2602.07943
arxiv.org/abs/2509.00072
arxiv.org/abs/2507.13868
arxiv.org/abs/2508.04149 🎉
Replies: 1 · Reposts: 10 · Likes: 77 · Views: 5.8K
EuroSafeAI reposted
Zhijing Jin @ZhijingJin
AI is threatening our democratic society by concentrating power, narrowing how we think, and flooding institutions faster than they can keep up. These risks emerge at the system level, and technical work alone won't fix them.
👉 Check out our whitepaper with 25+ researchers: zhijing-jin.com/d/2026-ai-risk…
💡 We introduce 7 threat models and ways forward.
✍️ Led by @davidguzman1120 with @DaveRBanerjee, @blin_kevin, @PepijnCobben, @gcorsi_, @x_angelohuang, @ChanglingXavier, Suvajit Majumder, @psyonp, @SimkoSamuel, @strauss_irene, and @TerryJCZhang
Advised by senior co-authors: @ashton1anderson, @Yoshua_Bengio, @MatthiasBethge, @RogerGrosse, Karoline Helbig, @david_lie, Richard Mallah, @radamihalcea, Susan Nesbitt, Susan Perry, @presnick, Stuart Russell, @mrinmayasachan, @bschoelkopf, @audreyt, and @ZhijingJin
Thank you for the institutional support from @JinesisLab @EuroSafeAI @MPI_IS @CIFAR_News @iapsAI @CARMA_411 @Cambridge_Uni @UofTCompSci @VectorInst @TorontoSRI @Mila_Quebec @LawZero_ @uni_tue @michigan_AI @UMichCSE @AUParis @UNESCO @UCBerkeley @ETH_en @ETH_AI_Center @ELLISInst_Tue @ELLISforEurope @EthicsInAI
#CivicAI #AISafety #AIGovernance #Democracy #ResponsibleAI
Replies: 13 · Reposts: 153 · Likes: 367 · Views: 30.5K
EuroSafeAI reposted
Jinesis Lab (UToronto) @JinesisLab
🎉 Our lab has 7 papers at #EACL2026 in Rabat this week 🇲🇦 Topics span democracy defense, multi-agent safety, causal reasoning, hallucinations, and NLP for social good. Grateful to everyone who contributed to this work 🙌 Come find us! #NLProc #LLMs #ResponsibleAI
Replies: 0 · Reposts: 1 · Likes: 6 · Views: 253
EuroSafeAI reposted
Zhijing Jin @ZhijingJin
Difficult times, but we keep pushing forward.
✅ Our Trustworthy AI for Good Workshop → accepted at @icmlconf Seoul (18% acceptance)
✅ @NLP4PosImpact Workshop → coming to @emnlpmeeting in Budapest 🇭🇺
AI research can be a force for good, and we’re committed to contributing. More soon.
Replies: 5 · Reposts: 4 · Likes: 40 · Views: 2.8K