Samuel Simko

44 posts

@SimkoSamuel

Research Assistant @ ETH Zürich. Interested in AI Safety and AI for Science

Zürich · Joined August 2015
148 Following · 143 Followers

Samuel Simko @SimkoSamuel
I’ll be speaking at the AIxBio event in Zurich on May 13th at 18:00! Join for a discussion of what current AI systems can (or cannot) do and how their risks can be reduced. Registration link: luma.com/a3nyvkkp

Samuel Simko retweeted
Zhijing Jin @ZhijingJin
Excited for our #ICML2026 papers at @JinesisLab @MPI_IS @UofTCompSci @TorontoSRI @VectorInst! We present papers that advance the research frontiers of (1) Causal LLMs, (2) AI for Science (physics), (3) Multi-Agent LLMs via mechanism design, and (4) Adversarial Defense by honeypot. Congrats to all our student authors and collaborators, esp. @TerryJCZhang @SimkoSamuel @EmanuelTewolde @ivakshi_s @andrewkihyun @PepijnCobben @yahang_qi @FurkanDanismann @bschoelkopf and many others!🎉

Samuel Simko @SimkoSamuel
Stage I for MARS V closes Sunday, 3 May at 23:59 AoE!
Quoting Samuel Simko @SimkoSamuel:

[Call for applicants] My supervisor @ZhijingJin (UofT, CIFAR AI Chair) and I will be mentoring a project for MARS V, a part-time research programme for AI safety research. MARS provides a one-week in-person kick-off in the UK, compute, and research management support! 🚀 The projects are: 🛡️ Adversarial defenses for LLMs using causal methods 🌐 Evaluating risks from AI-assisted authoritarianism 👉 Apply by May 3rd. Applications are reviewed on a rolling basis: caish.org/mars @CambridgeAISafetyHub


Samuel Simko @SimkoSamuel
Paper accepted ✅ See you in Seoul! 👋🇰🇷 #ICML

Samuel Simko @SimkoSamuel
[Call for applicants] My supervisor @ZhijingJin (UofT, CIFAR AI Chair) and I will be mentoring a project for MARS V, a part-time research programme for AI safety research. MARS provides a one-week in-person kick-off in the UK, compute, and research management support! 🚀 The projects are: 🛡️ Adversarial defenses for LLMs using causal methods 🌐 Evaluating risks from AI-assisted authoritarianism 👉 Apply by May 3rd. Applications are reviewed on a rolling basis: caish.org/mars @CambridgeAISafetyHub

Samuel Simko retweeted
Lancelot Da Costa @lancelotdacosta
We'll be organizing the Machine Learning Summer School in Tübingen to be held Aug 31st-Sept 11th, featuring top speakers across academia and industry. If you are a student or ML researcher, save those dates and stay tuned for updates! 🚀

Samuel Simko retweeted
Zhijing Jin @ZhijingJin
Excited for our "Trustworthy AI for Good" (AI4GOOD) Workshop at #ICML2026! As AI agents increasingly affect our lives, it is key to bridge #ResponsibleAI, social good, and governance. Let’s build solutions together!
⏰ Submission deadline: April 30, 2026 (AoE)
🎙️ Confirmed speakers: @Yoshua_Bengio, Joel Z. Leibo (@jzl86), Maksym Andriushchenko (@maksym_andr), @OanaIgnatRo [More to come!]
📍 July 10-11, 2026 · Seoul🇰🇷
🔗 trustworthy-ai-for-good.github.io
📝 Submit: openreview.net/group?id=ICML.…
📣 Be a reviewer: forms.gle/7cXvUJCW1FdEgh…

Samuel Simko retweeted
ELLIS @ELLISforEurope
What if the most dangerous AI isn’t rogue - but works as intended? A new ELLIS-affiliated paper shows aligned, policy-compliant AI can still undermine democracy at scale. Bottom line: alignment ≠ safety. Democratic resilience must keep pace. 📄 Paper: bit.ly/4snKdLN

Samuel Simko retweeted
Zhijing Jin @ZhijingJin
📢 We will present 5 papers at #ICLR2026, #CLeaR2026, and #ACL2026:
- SocialHarmBench by @psyonp et al.
- Causal LLMs on Instrumental Variable Method by @ivakshi_s et al.
- LLM Data Contamination study by @TerryJCZhang et al.
- Mech Interp for VLM by @francescortu et al.
- DPO data selection method by Xuan & @rongwu_xu
Thanks to all our collaborators and institutional support from @MPI_IS @ELLISforEurope @UofTCompSci @VectorInst @TorontoSRI @CIFAR_News @JinesisLab @EuroSafeAI @ELLISInst_Tue @ETH_en @ETH_AI_Center @michigan_AI @UMichiganAI @UMichCSE!
Feel free to access the papers at:
arxiv.org/abs/2510.04891
arxiv.org/abs/2602.07943
arxiv.org/abs/2509.00072
arxiv.org/abs/2507.13868
arxiv.org/abs/2508.04149 🎉

Samuel Simko retweeted
Samuel Simko @SimkoSamuel
🚨 New paper: AI Poses Risks to Democratic and Social Systems. We discuss 7 failure modes showing how AI can degrade democracy and society through power concentration, narrowing how we think, or flooding institutions faster than they can keep up. We also propose 7 research & governance recommendations, from simulation-based stress-testing to deliberative governance infrastructure. Honored to work with Yoshua Bengio, Stuart Russell, Roger Grosse, Bernhard Schölkopf, Rada Mihalcea, Ashton Anderson, Audrey Tang and many others. Full whitepaper here: zhijing-jin.com/d/2026-ai-risk…
Quoting Zhijing Jin @ZhijingJin:

AI is threatening our democratic society by concentrating power, narrowing how we think, and flooding institutions faster than they can keep up. These risks emerge at the system level, and technical work alone won't fix them.
👉 Check out our whitepaper with 25+ researchers: zhijing-jin.com/d/2026-ai-risk…
💡 We introduce 7 threat models and ways forward.
✍️ Led by @davidguzman1120 with @DaveRBanerjee, @blin_kevin, @PepijnCobben, @gcorsi_, @x_angelohuang, @ChanglingXavier, Suvajit Majumder, @psyonp, @SimkoSamuel, @strauss_irene, and @TerryJCZhang
Advised by senior co-authors: @ashton1anderson, @Yoshua_Bengio, @MatthiasBethge, @RogerGrosse, Karoline Helbig, @david_lie, Richard Mallah, @radamihalcea, Susan Nesbitt, Susan Perry, @presnick, Stuart Russell, @mrinmayasachan, @bschoelkopf, @audreyt, and @ZhijingJin
Thank you to all the institutional support from @JinesisLab @EuroSafeAI @MPI_IS @CIFAR_News @iapsAI @CARMA_411 @Cambridge_Uni @UofTCompSci @VectorInst @TorontoSRI @Mila_Quebec @LawZero_ @uni_tue @michigan_AI @UMichCSE @AUParis @UNESCO @UCBerkeley @ETH_en @ETH_AI_Center @ELLISInst_Tue @ELLISforEurope @EthicsInAI
#CivicAI #AISafety #AIGovernance #Democracy #ResponsibleAI


Samuel Simko retweeted
Zhijing Jin @ZhijingJin
Mech interp or representation interp? We need to decode the causal computational graph of #LLMs—not just cataloguing representations (steering vectors etc). Analogy: we can’t understand biology by just blood composition. We need to understand how the body works. Same for LLMs.

Samuel Simko @SimkoSamuel
🚀 We're launching EuroSafeAI, a nonprofit focused on multi-agent AI safety research, here in Zurich. Our launch event is on 6 March at 6:30 PM at the ETH Student Project House. Expect lightning talks and drinks! Info & Sign-up: luma.com/hwo46ach See you there! 🔥

Samuel Simko retweeted
Hanna Yukhymenko @a_yukh
❓Can we actually trust the quality of the existing multilingual benchmarks translated from English? Turns out many of them have some simple bugs, which hurts the evaluations - we try to fix that! Introducing Recovered in Translation 🌍 ritranslation.insait.ai 🧵below

Samuel Simko retweeted
Zhijing Jin @ZhijingJin
First day at UNESCO: We presented our Detecting LLM Historical Revisionism paper by @FrancescOrtu @JoeunYk05 @psyonp @KeenanSamway @BSchoelkopf @AlbeCazzaniga @RadaMihalcea @ZhijingJin and will present Accidental Vulnerability by @psyonp @SimkoSamuel @KellinPelrine @ZhijingJin!
Quoting Zhijing Jin @ZhijingJin:

Excited to have 3 accepted papers & 9 members of our @JinesisLab at #IASEAI2026, held at UNESCO, Paris🇫🇷! We reveal hidden authoritarian biases in #LLMs, and that fine-tuning can quietly erode model safety, exploring the risks we don't always see in AI 🔍🛡️ 🧵👇
