Christian Schroeder de Witt

245 posts

@casdewitt

EPSRC Open Fellow (incoming) + RAEng RF + Schmidt AI2050 ECF, University of Oxford. Agentic Safety & Security / Multi-Agent Security.

Oxford, England · Joined July 2017
1K Following · 1.6K Followers
Christian Schroeder de Witt
While we cannot always detect steganography directly, sometimes the effects of sharing information secretly can be observed relative to the subsequent behaviour of the agents - an important decision-theoretic approach to steganography detection in CoT settings pioneered by @usmananwar391 @j_piskorz_
Usman Anwar @usmananwar391

✨New AI Safety work on Steganography and LLM monitoring✨ We propose ‘steganographic gap’: the first principled metric for detecting and quantifying encoded reasoning in LLMs, which can reveal hard-to-detect forms of steganography, e.g., paraphrasing-resistant steganography.

Christian Schroeder de Witt retweeted
Xander Davies @alxndrdavies
The Red Team at @AISecurityInst is hiring! We work with frontier AI companies to red team their misuse safeguards, control measures, and alignment techniques. As the stakes rise, we need much stronger red teaming and many more talented researchers working within gov 🧵
Christian Schroeder de Witt @casdewitt
🚀 I am recruiting MSc, undergraduate, and CDT/PhD students to join wittlab.ai at Oxford. Projects span autonomous agents, multi-agent security, interpretability, and evaluation science - ambitious, publication-oriented research at the frontier of AI capability & safety. Details: wittlab.ai/student_projec… 📩 christian.schroeder@eng.ox.ac.uk
Christian Schroeder de Witt retweeted
alex @ObadiaAlex
1. Introduction to ARIA by jenny read
2. Why are we here? by yours truly
3. Security Primitives: New Advances & State of the Art by @iamnotnicola
4. Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents by @casdewitt
5. Embodied AI: What’s happening and how fast are things progressing? by @rowstron
6. Hardness in Silicon by @0xquintus
7. Challenges in Securing Ultra-Large-Scale Cyber Physical Infrastructures by Awais Rashid
8. Verification in Physical Systems Enable Autonomous Engineering by Eder Medina
9. Trust Robots, Everywhere by @engineerEdith
10. Consumable Quantum Data by Dar Gilboa
11. Cryptographic Sensing by Yuval Ishai
12. Mathematical Formalization of Cognition as an Attack Surface by @babagley
13. Cryptographically-Verifiable Sustainability x AI: A Powerful Future Tool for Our Planet? by Jessica Man
Christian Schroeder de Witt @casdewitt
Huge congrats, Tim @frtimlive - joining David Silver's RL team at DeepMind is epic. Looking back fondly at our ICLR spotlight on Illusory Attacks. Onward! 🚀🥳
Tim Franzmeyer @frtimlive

I recently joined @GoogleDeepMind in London. Excited to be part of David Silver's RL team to work on Gemini, Reinforcement Learning and Agents. It’s been amazing speaking with so many fascinating people in the first weeks and learning from them!

Christian Schroeder de Witt retweeted
Cooperative AI Foundation
Thank you to ❇️Christian Schroeder de Witt @casdewitt (Open challenges in multi-agent security) and ❇️Nora Ammann @AmmannNora (Gradual Disempowerment) for their fantastic talks and office hours at the Cooperative AI Summer School today.
Christian Schroeder de Witt @casdewitt
🔐 TL;DR: AI security must be anticipatory, not reactive. We can't just defend what's already been exploited - we must prepare for what is mathematically possible.
Christian Schroeder de Witt @casdewitt
Very excited to announce new work from @divgarg and the team at @agi_inc on REAL Bench - a benchmark designed to evaluate frontier web agents on realistic tasks!