Marc-Alexandre Côté (@Cote_Marc) - Twitter profile

Marc-Alexandre Côté retweeted

🎮🤖 Can games teach AI to understand the physical world? Excited to announce a special session at the 2026 IEEE Conference on Games (@ieee_cog): Evaluating and Advancing Spatial Intelligence through Games. Submit your research and join us in Madrid this September! 🇪🇸 🧵👇 (1/5)

Marc-Alexandre Côté retweeted

Minseon Kim @kim__minseon · 7 Jan

🚀 Safer & less over-refusal LLMs without retraining? Yes, it’s possible. We need a good context. Here’s a simple but powerful idea for LLM safety without further training your models 👀🧠 (arxiv.org/abs/2512.11986)

Marc-Alexandre Côté retweeted

hyunji amy lee @hyunji_amy_lee · 4 Nov

🚨 Excited to announce Gistify!, where a coding agent must extract the gist of a repository: generate a single, executable, and self-contained file that faithfully reproduces the behavior of a given command (e.g., a test or entrypoint). ✅ It is a lightweight, broadly applicable evaluation that tests whether models can reason at the codebase level 😯 Even strong LLMs/frameworks struggle, especially on long, multi-file traces!

Marc-Alexandre Côté retweeted

Alessandro Sordoni @murefil · 23 Oct

This was a great group effort ❤️. Check the thread below! My 2c: we train a 32B coding agent by distilling a strong teacher model on a mix of real and synthetic bugs generated by our new approach BugPilot 🛩️! BugPilot creates bugs unintentionally, by asking the teacher to insert new features in a given repo, and by checking whether the synthetic feature breaks existing functionality... a bug is born 🐣! Claude is such a strong teacher model, paired with our bugs, our 32B FrogBoss 🐸 (cause it eats bugs) model achieves 54.6 pass@1 (avg of 3 seeds) and ~67 pass@3. Just selecting the shortest over 3 (minimal TTS) gets us to ~56.8. 🚨 Internships here: we have many ideas so we'd be excited if you want to work with us going forward, pls apply! jobs.careers.microsoft.com/global/en/job/…

Isadora White @isadorcw

Excited to introduce our SoTA coding models, FrogBoss (32B) and FrogMini (14B), on SWE-Bench-Verified! (FrogBoss eats bugs… like a boss) 🐸🪲 These models were trained with bugs from a mix of existing and our new synthetic bug generation approach, called BugPilot. (1/n)

Marc-Alexandre Côté retweeted

Isadora White @isadorcw · 23 Oct

Excited to introduce our SoTA coding models, FrogBoss (32B) and FrogMini (14B), on SWE-Bench-Verified! (FrogBoss eats bugs… like a boss) 🐸🪲 These models were trained with bugs from a mix of existing and our new synthetic bug generation approach, called BugPilot. (1/n)

Marc-Alexandre Côté retweeted

IVADO @IVADO_Qc · 19 Aug

The IVADO #Bootcamp marked the launch of the Thematic Semester on Autonomous #LLM Agents last week at @UMontreal. Over 4 days, researchers, experts, and #AI enthusiasts gathered for conferences, tutorials, and rich discussions, laying the groundwork for our next two workshops.

Marc-Alexandre Côté @Cote_Marc · 15 Aug

@clefourrier Cool work! Always happy to see more people involved in the space of language agents and text-games (textgames.org). You should consider submitting this work to our upcoming EMNLP workshop wordplay-workshop.github.io

Clémentine Fourrier 🍊 is off till Dec 2026 hiking @clefourrier · 12 Aug

One of the best ways to evaluate agents is games, because they are:
- understandable by most people
- interesting to analyze & test multiple capabilities

Check out TextQuests, the latest in this category, a text adventure benchmark where GPT5 is only at 40% 📚 huggingface.co/blog/textquests

Marc-Alexandre Côté @Cote_Marc · 15 Aug

@hendrycks Cool work! Always happy to see more people involved in the space of language agents and text-games (textgames.org). You should consider submitting this work to our upcoming EMNLP workshop wordplay-workshop.github.io

Dan Hendrycks @hendrycks · 12 Aug

Can AIs beat long video games? We made TextQuests to test GPT-5, Grok 4, Deepseek, etc. These games can often take people dozens of hours to beat.
- AIs can't beat any of the games (without clues)
- some AIs behave more viciously than others
- AIs are getting better rapidly

Marc-Alexandre Côté @Cote_Marc · 4 Aug

Excited to be presenting aka.ms/debug-gym next week at IVADO's Bootcamp: Focusing on the Current State of Agents ivado.ca/en/events/boot…

Marc-Alexandre Côté retweeted

Lucas Caccia @LucasPCaccia · 25 Jun

RAG and in-context learning are the go-to approaches for integrating new knowledge into LLMs, but they make inference very inefficient. We propose instead Knowledge Modules: lightweight LoRA modules trained offline that can match RAG performance without the drawbacks

Marc-Alexandre Côté retweeted

Eric Xingdi Yuan @ericxyuan · 16 Jun

CFP of the Wordplay 2025 (EMNLP) is live! wordplay-workshop.github.io

Eric Xingdi Yuan @ericxyuan

Announcing the 5th Wordplay Workshop at EMNLP 2025 (Suzhou, China). We are co-organizing the CPDC Challenge (total prize value USD 20K!!!), the warm-up round is starting now! wordplay-workshop.github.io

Marc-Alexandre Côté retweeted

LawZero - LoiZéro @LawZero_ · 3 Jun

Every frontier AI system should be grounded in a core commitment: to protect human joy and endeavour. Today, we launch @LawZero_, a nonprofit dedicated to advancing safe-by-design AI. lawzero.org

Marc-Alexandre Côté retweeted

AIcrowd @aicrowdHQ · 14 Apr

🎮 You're exploring your favourite RPG city. The blacksmith greets you, remembers you saved his life, and recommends a customised weapon upgrade. Build better NPCs that respond naturally, adapt dynamically, and recall your actions. 👇 aicrowd.com/cpdc2025

Marc-Alexandre Côté retweeted

Prithviraj (Raj) Ammanabrolu @rajammanabrolu · 22 Apr

Introducing TALES - Text Adventure Learning Environment Suite. A benchmark of a few hundred text envs, from science experiments and embodied cooking to solving murder mysteries. We test over 30 of the best LLM agents and pinpoint failure modes + how to improve 👨‍💻 pip install tale-suite

Marc-Alexandre Côté retweeted

Prithviraj (Raj) Ammanabrolu @rajammanabrolu · 15 Apr

The Wordplay Workshop is back! 5th edition with EMNLP in Suzhou this Dec. We're also hosting a competition this time on making more realistic LLM-powered NPCs in games! As always, come by and chat all things text agents!

AIcrowd @aicrowdHQ

🎮 You're exploring your favourite RPG city. The blacksmith greets you, remembers you saved his life, and recommends a customised weapon upgrade. Build better NPCs that respond naturally, adapt dynamically, and recall your actions. 👇 aicrowd.com/cpdc2025

Marc-Alexandre Côté @Cote_Marc · 15 Apr

Announcing the 5th Wordplay Workshop at EMNLP 2025 (Suzhou, China). We are co-organizing the CPDC Challenge (total prize value USD 20K!!!), the warm-up round is starting now! wordplay-workshop.github.io

AIcrowd @aicrowdHQ

🎮 You're exploring your favourite RPG city. The blacksmith greets you, remembers you saved his life, and recommends a customised weapon upgrade. Build better NPCs that respond naturally, adapt dynamically, and recall your actions. 👇 aicrowd.com/cpdc2025

Marc-Alexandre Côté retweeted

Microsoft Research @MSFTResearch · 10 Apr

Developers spend a lot of time debugging code. Learn how debug-gym can equip AI agents to help, enabling them to set breakpoints, navigate the codebase, and print runtime variable values on demand, so they better understand the code and its execution flow: msft.it/6017qF6RT

Marc-Alexandre Côté retweeted

Mila - Institut québécois d'IA @Mila_Quebec · 8 Apr

👏 A huge thank you to our co-organizers who made the 2025 Mila Techaide AI Conference possible and to @CentraideMtl and our generous sponsors for supporting AI research! The event is just days away. Buy your tickets now: ow.ly/41I750VwUJy

Marc-Alexandre Côté retweeted

Mila - Institut québécois d'IA @Mila_Quebec · 27 Mar

David Ifeoluwa Adelani (@davlanade) will be a keynote speaker at the 2025 Mila Techaide AI Conference on April 17! He is an Assistant Professor at McGill University, a Core Academic Member at Mila and a Canada CIFAR AI Chair holder. Get your tickets now! ow.ly/GZlJ50VfV6w

Marc-Alexandre Côté retweeted

Mila - Institut québécois d'IA @Mila_Quebec · 2 Apr

Why attend the Mila Techaide AI Conference on April 17?
🎙️ Talks by leading AI experts
💡 Networking opportunities
❤️ Support a great cause
Buy your ticket here: ow.ly/ypnn50VfXC7
