🚨 At #ICCV2025 we introduce #DISTIL - led by amazing PhD student @hsirm96 - a trigger-inversion method for DNNs that reconstructs malicious backdoor triggers 1/4
By employing a diffusion-based generator guided by the target classifier, #DISTIL iteratively produces candidate triggers that align with the model's internal representations associated with malicious behavior. 2/4
This approach enhances the reliability of trigger reconstruction, making it capable of distinguishing between clean & trojaned models. 🚀 Congrats to all the authors, who did an amazing job! 3/4
My lab has been pushing into explainable, robust, & theoretically tractable AI models for science 💪
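For readers who want a feel for the mechanism described above: this is NOT the official DISTIL code, just a hedged sketch of classifier-guided iterative trigger search. The function name `invert_trigger`, the guidance weight, and the decaying noise schedule are all illustrative assumptions standing in for the paper's diffusion-based generator.

```python
# Hypothetical sketch: refine a random pattern with gradient guidance from
# the suspect classifier, plus diffusion-style noise that decays over steps,
# so the candidate trigger drifts toward the putative target label.
import torch

def invert_trigger(classifier, target_label, steps=50, size=(3, 32, 32),
                   guidance=0.5, noise_scale=0.1):
    """Return a candidate trigger pattern aligned with target_label."""
    trigger = torch.randn(1, *size, requires_grad=True)
    for t in range(steps):
        loss = -classifier(trigger)[0, target_label]  # want target logit high
        grad, = torch.autograd.grad(loss, trigger)
        with torch.no_grad():
            trigger -= guidance * grad                         # classifier guidance
            trigger += noise_scale * (1 - t / steps) * torch.randn_like(trigger)
            trigger.clamp_(-1.0, 1.0)                          # keep in image range
    return trigger.detach()
```

In a scanner, one would compare how "effective" the recovered trigger is on clean vs. suspect models; that downstream scoring step is not shown here.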
🎉 NeCo is accepted to @iclr_conf #ICLR25
TL;DR: Self-supervised patch-level neighbor consistency boosts dense features of foundation models like DINOv2—in just 19 GPU hours.
📍 Tomorrow, 10:00–12:30 @ Hall 3 + Hall 2B #133
📄 Paper: openreview.net/forum?id=Qro97…
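The "patch-level neighbor consistency" idea can be caricatured in a few lines. This is a hedged sketch, not the official NeCo objective: `neighbor_consistency_loss`, the reference patch bank, and the temperature `tau` are illustrative assumptions.

```python
# Hypothetical sketch: for two views of the same image, each patch's soft
# nearest-neighbor distribution over a bank of reference patches should agree,
# which encourages consistent dense features across views.
import torch
import torch.nn.functional as F

def neighbor_consistency_loss(patches_a, patches_b, bank, tau=0.1):
    """Cross-entropy between the two views' soft neighbor assignments."""
    def soft_neighbors(p):
        sims = F.normalize(p, dim=1) @ F.normalize(bank, dim=1).T  # cosine sims
        return F.softmax(sims / tau, dim=1)    # soft nearest-neighbor distribution
    qa = soft_neighbors(patches_a).detach()    # one view acts as the target
    qb = soft_neighbors(patches_b)
    return -(qa * torch.log(qb + 1e-8)).sum(dim=1).mean()
```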
🔥🙏🏼 #AROS 💍 is accepted to #ICLR2025 @iclr_conf! So proud of @hsirm96 - what an awesome way to kick off his first grad school project 👌👌 Check out the updated arXiv version of the paper and open code (including a Python package) below ⬇️
#AROS💍 leverages neural ODEs and Lyapunov stability theory to craft an embedding method that smartly detects OOD samples. Strikingly, we can improve performance on popular adversarial detection benchmarks such as CIFAR10 vs CIFAR100 by over 40% 👏🔥🚀 We are excited to keep pushing this line of work together 💪
TL;DR: need more robustness? #PutARobustnessRingOnIt 💍 #adversarialattack #machinelearning
📝 arxiv.org/abs/2410.10744
💻 github.com/AdaptiveMotorC…
Downloaded a model's weights and wondering whether it has been poisoned? Meet our method, TRODO (Trojan scanning by Detection of Adversarial shift in Out-of-Distribution samples), accepted at #NeurIPS 2024. 1/
Paper: neurips.cc/virtual/2024/p…
Code: github.com/rohban-lab/TRO…
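One way to read the TRODO idea in code. This is a hedged sketch, not the official implementation: `adversarial_shift_score`, the FGSM-style attack, and the use of max-softmax confidence are illustrative assumptions about "adversarial shift on OOD samples".

```python
# Hypothetical sketch: measure how much a small adversarial perturbation can
# raise a model's confidence on out-of-distribution inputs; an unusually large
# shift, relative to known-clean reference models, can flag a trojaned model.
import torch

def adversarial_shift_score(model, ood_batch, eps=0.03, steps=5):
    """Mean rise in max-softmax confidence on OOD inputs under FGSM-style steps."""
    base = torch.softmax(model(ood_batch), dim=1).max(dim=1).values
    x = ood_batch.clone().requires_grad_(True)
    for _ in range(steps):
        conf = torch.softmax(model(x), dim=1).max(dim=1).values
        grad, = torch.autograd.grad(conf.sum(), x)   # direction raising confidence
        with torch.no_grad():
            x += (eps / steps) * grad.sign()         # bounded perturbation step
    shifted = torch.softmax(model(x.detach()), dim=1).max(dim=1).values
    return (shifted - base).mean().item()
```

Turning this score into a clean/trojaned decision requires a threshold calibrated on reference models, which is outside this sketch.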
🚨 Adversarial robustness is becoming even more critical as AI systems are deployed in the real world, but how can we detect outliers (adversarial examples) without having trained on them 👀?
In our new preprint, we introduce AROS💍: It leverages neural ODEs and Lyapunov stability theory to craft an embedding method to smartly detect OOD samples. Strikingly, we can improve performance on popular adversarial detection benchmarks such as CIFAR10 vs CIFAR100 by over 40%.
✨Led by the super talented @EPFL_en @mwmathislab PhD student @hsirm96 ~> check it out! arxiv.org/abs/2410.10744
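To give a flavor of the neural-ODE + Lyapunov idea: this is a hedged toy sketch, not the AROS method. `ode_energy_score`, the Euler integrator, and the choice of V(z) = ||z||² as a Lyapunov candidate are all illustrative assumptions.

```python
# Hypothetical sketch: evolve an embedding under learned dynamics z' = f(z)
# (a neural ODE, here Euler-discretized) and score samples with a
# Lyapunov-style energy. If in-distribution embeddings sit near a stable
# equilibrium, their energy decays along the trajectory, while OOD or
# adversarial inputs fail to settle.
import torch

def ode_energy_score(dynamics, z0, dt=0.1, steps=20):
    """Lyapunov-style energy V(z) = ||z||^2 at the end of the trajectory."""
    z = z0
    for _ in range(steps):
        z = z + dt * dynamics(z)   # one Euler step of z' = f(z)
    return (z ** 2).sum(dim=1)
```

With contracting dynamics such as f(z) = -z the energy shrinks toward the equilibrium; with expanding dynamics it grows, which is the separation an OOD score can exploit.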
September means it’s the lab’s 7th anniversary 🤯🔥🥂🎉🍾…
And we welcome three new amazing PhD students: Hossein Mirzaei @hsirm96 , Xiaohang Yu, and Ti Wang @TiwangCS !
All brilliant computer scientists ready to work on the next generation of ML & CV for Science 🔥🧠🦄
I am thrilled that our work on robust anomaly detection has been accepted at #ICML2024. TL;DR: we found that synthetic outliers need two key properties to make this work: near-OODness and diversity. 1/
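A toy illustration of those two properties, not the paper's actual outlier-synthesis pipeline: `synth_near_ood`, the random unit directions, and the perturbation radius are hedged assumptions.

```python
# Hypothetical sketch: build synthetic outliers that are *near-OOD* (a small
# fixed distance from real inliers) and *diverse* (spread across many
# perturbation directions), to be used as negatives for an anomaly detector.
import torch

def synth_near_ood(inliers, n_dirs=8, radius=0.5):
    """Perturb each inlier a small distance along diverse unit directions."""
    d = inliers.shape[1]
    dirs = torch.randn(n_dirs, d)
    dirs = dirs / dirs.norm(dim=1, keepdim=True)   # diverse unit directions
    # Small radius keeps the outliers near the data manifold (near-OOD),
    # rather than trivially far from it.
    return (inliers.unsqueeze(1) + radius * dirs).reshape(-1, d)
```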
Our paper "Fake It Until You Make It: Towards Accurate Near-Distribution Novelty Detection" has been accepted for presentation at ICLR 2023. This is joint work with @hsirm96, @MrzSalehi, Sajjad Shahabi, @egavves, @cgmsnoek, @sabokrou.
A Unified Survey on Anomaly, Novelty, Open-Set, and Out-of-Distribution Detection: Solutions and ...
Mohammadreza Salehi, Hossein Mirzaei, Dan Hendrycks et al.
openreview.net/forum?id=aRtjV… #NewPaper #PaperPost