EdinburghNLP

1.3K posts

@EdinburghNLP

The Natural Language Processing Group at the University of Edinburgh.

Edinburgh, Scotland · Joined May 2017
160 Following · 13.6K Followers
Pinned Tweet
EdinburghNLP @EdinburghNLP
Join our PhD programme in Designing Responsible Natural Language Processing at the UKRI AI Centre for Doctoral Training, University of Edinburgh. Applications are now re-opened for Home fee status candidates (past candidates need not re-apply). responsiblenlp.org
EdinburghNLP retweeted
Yuxiang Huang @yxyxyyy6
[1/n] Can a model learn *where* and *how much* information it should attend to, and do so efficiently? We introduce DashAttention: Differentiable and Adaptive Sparse Hierarchical Attention! This pushes the accuracy-efficiency frontier in LLMs.
EdinburghNLP retweeted
Pasquale Minervini @PMinervini
Check out this new project by @MattAttimonelli et al.! TLDR -- multi-modal retrieval benchmarks can be solved by only using one modality, and we tried fixing this by creating manually curated subsets where all modalities are required
Quoting Matteo Attimonelli @MattAttimonelli:

To address this, we introduce CIRCUS, a curated evaluation setup for testing genuine multimodal composition! 🌐 Website: matteoattimonelli.github.io/CIRCUS/ 📄 Paper: arxiv.org/pdf/2605.14787

EdinburghNLP retweeted
Matteo Attimonelli @MattAttimonelli
Do multimodal retrieval benchmarks actually require multimodal reasoning?? We analyse Composed Image Retrieval, which should require models to combine visual and textual information:
EdinburghNLP retweeted
Edoardo Ponti @PontiEdoardo
Critic-free RL (e.g. GRPO) is very effective in LLM post-training, but why? We propose the💥cancellation hypothesis💥: sequence-level rewards implicitly assign credits to individual tokens through the cancellation of gradients from pos/neg rollouts. x.com/crazycth0901/s…
Quoting Tianhao Cheng @tianhaoCheng_:

🚀 The Cancellation Hypothesis in Critic-Free RL Conventional view: GRPO boosts successful rollouts and suppresses failed ones. We find Token Flipping: positive and negative rollouts show remarkably similar boosted/suppressed token ratios.
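
A toy sketch of the cancellation idea described in the two tweets above, not the authors' analysis. It assumes GRPO-style advantages are rewards standardised within a group of rollouts, and it uses a summed per-token advantage as a stand-in for the gradient signal. That stand-in is only faithful for a shared prefix, where both rollouts see the same context and so produce the same log-probability gradient. The function names and the example rollouts are hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards):
    """Standardise rewards within one group of rollouts (GRPO-style assumption)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        # All rollouts scored the same: no learning signal in this group.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]


def token_credit(rollouts, rewards):
    """Sum, per token, the sequence-level advantage of every rollout containing it.

    Hypothetical proxy: each token inherits its rollout's advantage, so tokens
    shared by positive and negative rollouts receive opposing contributions.
    """
    advantages = group_relative_advantages(rewards)
    credit = {}
    for tokens, adv in zip(rollouts, advantages):
        for tok in tokens:
            credit[tok] = credit.get(tok, 0.0) + adv
    return credit


# Two rollouts that share a prefix and differ only in the final token.
rollouts = [
    ["let", "x", "=", "2", "so", "answer", "4"],  # rewarded rollout
    ["let", "x", "=", "2", "so", "answer", "5"],  # unrewarded rollout
]
rewards = [1.0, 0.0]

credit = token_credit(rollouts, rewards)
for tok, value in credit.items():
    print(f"{tok!r}: {value:+.1f}")
```

In this toy setting the shared tokens end up with zero net credit, and only the tokens that distinguish the two rollouts keep a non-zero signal, even though the reward was given per sequence.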

EdinburghNLP retweeted
Edoardo Ponti @PontiEdoardo
I am moving to @ICComputing at @imperialcollege as an associate professor, where I will be expanding my lab! I am looking for PhDs and postdocs to join me on my quest to build foundation models with adaptive tokenisation and memory (AToM FMs, funded by @ERC_Research)
EdinburghNLP retweeted
Mikołaj Piórczyński @AjPiorczynski
🇧🇷 Bom dia @iclr_conf ! This evening, together with @f_szatkowski, we’re presenting our paper “Universal Properties of Activation Sparsity in Modern Large Language Models”. Stop by Poster 912 in Pavilion 3 and let’s have a chat.
EdinburghNLP retweeted
Aryo Pradipta Gema @aryopg
Heading to Rio for #ICLR2026! 🇧🇷 Presenting 2 papers:
1. Inverse Scaling in Test-Time Compute (@TmlrOrg Featured Certification): Sat Apr 25, 10:30 AM, Pavilion 3 (#903)
2. The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity? Sat Apr 25, 3:15 PM, Pavilion 3 (#110)
Come find me if you're into LLM evaluation, Chain-of-Thought, and AI safety! 👋
EdinburghNLP retweeted
Pasquale Minervini @PMinervini
Aryo is amazing, catch up with him at ICLR'26! @iclrconf #ICLR2026
EdinburghNLP retweeted
Edoardo Ponti @PontiEdoardo
We have just released AdaSplash 2, a highly efficient implementation of adaptively sparse attention!
- Faster than FlashAttention 2 during training when block sparsity > 60%
- More accurate than softmax attention on long-context benchmarks (+16 on HELMET ICL at 32k length)!
Quoting Marcos Treviso @MarcosTreviso:

1/ We are excited to release AdaSplash-2 🚀 A big milestone from our lab on faster differentiable sparse attention. And honestly, one of my favorite examples of sparsity giving a real win-win: more efficiency + better downstream performance, especially for long-context tasks.
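
For readers unfamiliar with sparse attention, here is a minimal sketch of what "adaptively sparse" can mean. It assumes the softmax over attention scores is swapped for a sparse normaliser. The tweets do not name the transformation, so sparsemax is used purely as a well-known illustration. This is plain Python on a single row of scores and says nothing about how AdaSplash 2's kernels are implemented.

```python
import math


def softmax(scores):
    """Dense baseline: every position receives a strictly positive weight."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Euclidean projection of the scores onto the probability simplex.

    Low-scoring positions get exactly zero weight, and how many are zeroed
    depends on the scores themselves rather than on a fixed top-k.
    """
    z = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for j, zj in enumerate(z, start=1):
        cumsum += zj
        # Position j stays in the support while 1 + j * z_j > sum of the top j.
        if 1.0 + j * zj > cumsum:
            tau = (cumsum - 1.0) / j
    return [max(s - tau, 0.0) for s in scores]


row = [1.0, 0.8, -2.0]  # hypothetical attention scores for one query
dense = softmax(row)
sparse = sparsemax(row)
print("softmax  :", [round(w, 3) for w in dense])
print("sparsemax:", [round(w, 3) for w in sparse])
```

Both outputs sum to 1, but only the sparse one contains exact zeros. Exact zeros are what allow whole blocks of the attention matrix to be skipped, which fits the block-sparsity condition mentioned in the tweet above.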

EdinburghNLP retweeted
Vivek Iyer @remorax98
Super excited to share my internship project at FAIR @AIatMeta 🚀 We introduce Spectrum -- an encoder-decoder LM pretrained using omnilingual & cross-modal sentence embeddings. Trained on English datasets alone, it outperforms strong baselines like Llama and SpiritLM on multilingual (900+ languages) and speech understanding benchmarks — despite never being directly exposed to multilingual or speech data during training. Curious how? Read on -- and check out the OmniSONAR technical report for the full details: ai.meta.com/research/publi… 👀🧵
EdinburghNLP retweeted
Yifu Qiu@ICLR 2026 @yifuqiu98
Glad to see model steering in the spectral space works for attention and the long context as well! We also show that spectral editing of activations can steer model behavior to alleviate hallucination and bias! proceedings.neurips.cc/paper_files/pa…
Quoting Waylon Li @ ICLR2026 🇧🇷 @li_waylon:

🚀 Excited to share our paper "Spectral Attention Steering for Prompt Highlighting" has been accepted to ICLR 2026 and the camera-ready version is finally live! We’ve found a way to steer LLM attention that is actually effective, fast and compatible with modern hardware.

EdinburghNLP retweeted
Waylon Li @ ICLR2026 🇧🇷 @li_waylon
🚀 Excited to share our paper "Spectral Attention Steering for Prompt Highlighting" has been accepted to ICLR 2026 and the camera-ready version is finally live! We’ve found a way to steer LLM attention that is actually effective, fast and compatible with modern hardware.
EdinburghNLP retweeted
Farooq Wani @wanifarooq848
Your VLM gives the same answer before and after a tiny image change. So it's robust, right? Wrong. In our new paper, we show that VLMs can preserve their predictions while their internal representations drift to regions normally occupied by completely unrelated images. 🧵👇
EdinburghNLP retweeted
Filip Szatkowski @f_szatkowski
We are presenting "Universal Properties of Activation Sparsity in Modern Large Language Models" at ICLR 2026! We ask a simple question: how sparse are modern LLMs, really — and does it matter? 👇
EdinburghNLP retweeted
Zheng Zhao @zhengzhao97
🎉 Thrilled to announce our paper "Verifying Chain-of-Thought Reasoning via Its Computational Graph" has been accepted as an ICLR 2026 ORAL! 🚨 We look inside the "black box" to detect reasoning errors by analyzing the model's internal circuit. 🧠⚡️ Read more on CRV 👇
Quoting Zheng Zhao @zhengzhao97:

Thrilled to share our latest research on verifying CoT reasonings, completed during my recent internship at FAIR @metaai. In this work, we introduce Circuit-based Reasoning Verification (CRV), a new white-box method to analyse and verify how LLMs reason, step-by-step.
