Sebastian Riedel (@[email protected])

1.9K posts

@riedelcastro

Researcher in NLP/ML @deepmind, @ucl_nlp, @[email protected] on Mastodon

London, England · Joined September 2009
456 Following · 16.4K Followers
Sebastian Riedel (@[email protected]) retweeted
Sohee Yang (@soheeyang_)
🚨 New Paper 🧵 How effectively do reasoning models reevaluate their thoughts? We find that:
- Models excel at identifying unhelpful thoughts but struggle to recover from them
- Smaller models can be more robust
- Self-reevaluation ability is far from true meta-cognitive awareness
Sebastian Riedel (@[email protected]) retweeted
Shrestha Basu Mallick (@shresbm)
The Gemini 2.0 era begins with the 2.0 Flash Experimental release ⚡️
📈 2.0 Flash beats 1.5 Pro across factuality, reasoning, coding, and math
📳 More modalities: image and audio out (in EAP)
🔧 Native tool use for Google Search, code execution, and 3P functions
🆕 A new multimodal, realtime API experience
🎬 3 new cool starter apps: spatial understanding, video analyzer, and map explorer
Google AI Developers (@googleaidevs)

We just released Gemini 2.0 Flash Experimental ⚡ Available in the Gemini API and Google AI Studio for testing, it allows developers to build interactive experiences with better performance and multimodal capabilities. goo.gle/3BriaXd

Sebastian Riedel (@[email protected]) retweeted
Sohee Yang (@soheeyang_)
🚨 New Paper 🚨 Can LLMs perform latent multi-hop reasoning without exploiting shortcuts? We find the answer is yes – they can recall and compose facts not seen together in training rather than guessing the answer, but success greatly depends on the type of the bridge entity (80%+ for country, 6% for year)! 1/N
Pasquale Minervini (@PMinervini)
TBH, all papers in my batch were outstanding; it was really tricky to rank them
Pasquale Minervini (@PMinervini)
Is it me or "LLMs can't plan" sounds like "Python can't plan"? (what does that even mean?)
Pasquale Minervini (@PMinervini)
@Niel_Eu25 Ok let's replace "Prolog can't do planning" with "A Prolog AI can't do planning" -- does it magically make sense to you now? 🙂
Tim Dettmers (@Tim_Dettmers)
After 7 months on the job market, I am happy to announce:
- I joined @allen_ai
- Professor at @CarnegieMellon from Fall 2025
- New bitsandbytes maintainer: @Titus_vK
My main focus will be to strengthen open-source for real-world problems and bring the best AI to laptops 🧵
Sebastian Riedel (@[email protected]) retweeted
Nicola Cancedda (@nicola_cancedda)
I am looking for a Research Scientist intern for 2025. If you have already published work on understanding the behaviour of AI models by looking at their parameters and activations, I would like to hear from you. metacareers.com/jobs/556063310…
Sebastian Riedel (@[email protected]) retweeted
Ledell Wu (@LedellWu)
We are launching Design Your Own Avatar (DYOA)! With our latest innovations in multimodal generation at @Creatify_AI, you can now create ultra-realistic AI avatars from a text description and bring them to life! This unlocks a whole new level of possibilities. Check it out: buff.ly/4dsvt6x
Creatify AI (@Creatify_AI)

THE FUTURE IS HERE! For the first time ever, you can now create your own 100% AI avatar from scratch. Simply describe your ideal representative, from their appearance to their surroundings, and our technology will bring your vision to life! Available now at: buff.ly/4dsvt6x (examples in thread) [No real people were filmed in the making of this video.]

Sebastian Riedel (@[email protected]) retweeted
Eduardo Sánchez (@eduardosg_ai)
🚨NEW BENCHMARK🚨 Are LLMs good at linguistic reasoning if we minimize the chance of prior language memorization? We introduce Linguini🍝, a benchmark for linguistic reasoning in which SOTA models perform below 25%. w/ @b_alastruey, @artetxem, @costajussamarta et al. 🧵(1/n)