Ahmed Elgohary (@aagohary) - Twitter Profile

Ahmed Elgohary retweeted

📢 Call for papers: Workshop on Methods and Reinforcement Learning Environments for Evaluating AI Agents @ ACM CAIS 2026 (inaugural edition!)
Topics include:
- Design principles for effective RL Environments
- Methods to evaluate Agents, esp. causal/interventional techniques

Ahmed Elgohary retweeted

JHU Computer Science @JHUCompSci · Nov 13

and “Jailbreak Distillation: Renewable Safety Benchmarking” by @jackjingyuzhang, @ben_vandurme, @DanielKhashabi, @aagohary, @ASMIftekhar1, & more proposed a novel framework that “distills” jailbreak attacks into high-quality safety 🦺 benchmarks: aclanthology.org/2025.findings-… (7/7)

Ahmed Elgohary retweeted

Jack Jingyu Zhang @ ICLR 🇧🇷 @jackjingyuzhang · Nov 5

Jailbreak Distillation will be presented at #EMNLP2025 in Suzhou in a couple of hours! Catch our poster to learn about ✨renewable safety benchmarks✨ that scale with rapidly evolving model capabilities. ⏰Nov 6, 12:30-1:30pm, Hall C 🎥Teaser video: youtu.be/amymwVUH6b0?si…

Jack Jingyu Zhang @ ICLR 🇧🇷 @jackjingyuzhang

Introducing 𝐉𝐚𝐢𝐥𝐛𝐫𝐞𝐚𝐤 𝐃𝐢𝐬𝐭𝐢𝐥𝐥𝐚𝐭𝐢𝐨𝐧 🧨 (EMNLP '25 Findings) We propose a generate-then-select pipeline to "distill" effective jailbreak attacks into safety benchmarks, ensuring eval results are reproducible and robust to benchmark saturation & contamination🧵

Ahmed Elgohary retweeted

Jack Jingyu Zhang @ ICLR 🇧🇷 @jackjingyuzhang · Aug 25

Introducing 𝐉𝐚𝐢𝐥𝐛𝐫𝐞𝐚𝐤 𝐃𝐢𝐬𝐭𝐢𝐥𝐥𝐚𝐭𝐢𝐨𝐧 🧨 (EMNLP '25 Findings) We propose a generate-then-select pipeline to "distill" effective jailbreak attacks into safety benchmarks, ensuring eval results are reproducible and robust to benchmark saturation & contamination🧵

Ahmed Elgohary retweeted

Jack Jingyu Zhang @ ICLR 🇧🇷 @jackjingyuzhang · Apr 21

Our Controllable Safety Alignment paper will be presented at #ICLR2025 this week in Singapore 🇸🇬! We've released our code and the human-authored CoSApien👥 dataset: 👉 aka.ms/controllable-s… Watch the short video summary here: 🎬 youtube.com/watch?v=kDioFn…

Jack Jingyu Zhang @ ICLR 🇧🇷 @jackjingyuzhang

🤖 LLMs are powerful, but their "one-size-fits-all" safety alignment limits flexibility. Safety standards vary across cultures and users—what’s safe in one context might not be in another. 🌍 We propose ✨Controllable Safety Alignment✨ for inference-time safety adaptation! 🧵👇

Ahmed Elgohary retweeted

Jack Jingyu Zhang @ ICLR 🇧🇷 @jackjingyuzhang · Jan 23

Thrilled to share that 𝗖𝗼𝗻𝘁𝗿𝗼𝗹𝗹𝗮𝗯𝗹𝗲 𝗦𝗮𝗳𝗲𝘁𝘆 𝗔𝗹𝗶𝗴𝗻𝗺𝗲𝗻𝘁 has been accepted to #ICLR2025! 🚀 We propose a framework that adapts LLMs to varying social norms and diverse safety requirements without re-training. Link to our paper: arxiv.org/abs/2410.08968

Jack Jingyu Zhang @ ICLR 🇧🇷 @jackjingyuzhang

🤖 LLMs are powerful, but their "one-size-fits-all" safety alignment limits flexibility. Safety standards vary across cultures and users—what’s safe in one context might not be in another. 🌍 We propose ✨Controllable Safety Alignment✨ for inference-time safety adaptation! 🧵👇

Ahmed Elgohary retweeted

Jack Jingyu Zhang @ ICLR 🇧🇷 @jackjingyuzhang · Oct 17

🤖 LLMs are powerful, but their "one-size-fits-all" safety alignment limits flexibility. Safety standards vary across cultures and users—what’s safe in one context might not be in another. 🌍 We propose ✨Controllable Safety Alignment✨ for inference-time safety adaptation! 🧵👇

Ahmed Elgohary retweeted

Huan Sun @hhsun1 · Feb 22

Second call for papers! Please consider submitting your work to our workshop on NLP for Programming at ACL 2021 (@Nlp4Prog @aclmeeting)! w/ Royi Lachmy, @ZiyuYao @gregd_nlp, Milos Gligoric, @jessyjli @BoredRayMooney @gneubig @yusuOSU @hhsun1 @rtsarfaty

NLP4Prog @Nlp4Prog

Interested in using natural language processing💬 to assist computer programming💻? Consider submitting to our workshop on NLP for Programming (NLP4Prog) at @aclmeeting 2021! 🤩🤩🤩 Call for papers at nlp4prog.github.io/2021/ Deadline: April 26, 2021🗓️

Ahmed Elgohary retweeted

Soheil Feizi @FeiziSoheil · Feb 8

While we are at it, can we grant international students #MultipleEntryVisa for the duration of their studies (instead of single-entry)? It may sound like a minor issue for many but it is actually a big deal for many international students. I explain it below. 👇

Ahmed Elgohary @aagohary · Jun 17

@ahmad_saeed @karimhabak congrats :)

Ahmed Saeed @ahmad_saeed · Jun 17

Today I defended my PhD thesis on Scalable Network Scheduling in Software. My work wouldn't have been possible without the continuous support of everyone in this picture (missing my co-advisor Mostafa Ammar and my buddy @karimhabak).

Ahmed Elgohary retweeted

Matt Gardner @nlpmattg · Oct 17

#nlphighlights 72: the anatomy of a question answering task, with @boydgraber. This is our first episode in a new format, giving a more general overview of an area instead of discussing a specific paper. There will be more of these in the future. soundcloud.com/nlp-highlights…

Ahmed Elgohary retweeted

UMD CLIP @umdclip · Aug 23

Check out "Assessing Composition in Sentence Vector Representations" by Allyson Ettinger, Ahmed Elgohary, Colin Phillips, Philip Resnik next in Session 3-1-b #COLING2018 ling.umd.edu/~aetting/CompE…

Ahmed Elgohary @aagohary · Nov 1

@omerlevy_ relation extraction aclweb.org/anthology/P16-…
