
francesco ortu
34 posts

@francescortu
NLP & Interpretability | PhD Student @UniTrieste & Data Engineering Lab @AreaSciencePark | Prev @MPI_IS intern https://t.co/RM03tgnbJP








Excited to have 3 accepted papers & 9 members of our @JinesisLab at #IASEAI2026, held at UNESCO, Paris🇫🇷! We reveal hidden authoritarian biases in #LLMs, and that fine-tuning can quietly erode model safety, exploring the risks we don't always see in AI 🔍🛡️ 🧵👇








🚨🚨 Excited to share our latest paper, now on @arxiv! 🖼️ We studied how unified VLMs, trained to generate both text and images (e.g., @MetaAI's Chameleon), exchange information between modalities, comparing them to standard VLMs. Deep dive:👇

Our "Competitions of Mechanisms" paper proposes an interesting way to interpret LLM behavior thru how the model handles multiple conflicting mechanisms, e.g., in-context knowledge vs. in-weights knowledge 🧐 This is an elegant philosophical way of thinking --



Honored to receive the Best Paper Award (the top prize!) at the #NeurIPS2024 Workshop on Pluralistic Alignment @pluralistic_ai! Many thx to my wonderful coauthors, who taught me so much about this interdisciplinary field of #LLMs and moral reasoning: @maxhkw @giorgiopiatti @sydneymlevine @Jiarui_Liu_ @fer_adauto @francescortu Andras Strausz @mrinmayasachan @radamihalcea @YejinChoinka @bschoelkopf. Also thank you so much to all the co-organizers and especially @Ruyuan_Wan for the wonderful photo capturing this memorable moment! 🎉 Check out our "Language Model Alignment in Multilingual Trolley Problems" at arxiv.org/pdf/2407.02273!



