Eric Malmi

354 posts


@ericmalmi

Staff Research Scientist at Google DeepMind, building Gemini | Adj. Prof @AaltoUniversity

Zurich, Switzerland · Joined October 2012
836 Following · 1.1K Followers
Pinned Tweet
Eric Malmi
Eric Malmi@ericmalmi·
Language models can't play chess, right?♟️ Excited to share our latest experiment that lets you play chess against a Gemini model!
Eric Malmi tweet media
5
11
56
11.5K
Eric Malmi
Eric Malmi@ericmalmi·
come chat to Jakub Adamek, @anianruoss, and me at the poster
Eric Malmi tweet media
0
0
6
198
Eric Malmi
Eric Malmi@ericmalmi·
if you're at #icml2025, come check out our spotlight poster on "Mastering Board Games by External and Internal Planning with Language Models" ♟️
📜: arxiv.org/abs/2412.12119
⏲️: Wed 16 Jul 11 am - 1:30 pm PDT
📍: East Exhibition Hall A-B #E-2508
demo: goo.gle/ChessChamp
1
0
6
363
Eric Malmi retweeted
Petar Veličković
Petar Veličković@PetarV_93·
Poster Spotlight! 🔦 Mastering Board Games by External and Internal Planning with Language Models ♟️ arxiv.org/abs/2412.12119 On Wednesday (Poster Session 3 East) Presented by Jakub Adamek and @ericmalmi
Petar Veličković tweet media (2 images)
1
1
17
629
Eric Malmi retweeted
Arena.ai
Arena.ai@arena·
🚨Breaking: New Gemini-2.5-Pro (06-05) takes the #1 spot across all Arenas again!
🥇 #1 in Text, Vision, WebDev
🥇 #1 in Hard, Coding, Math, Creative, Multi-turn, Instruction Following, and Long Queries categories
Huge congrats @GoogleDeepMind!
Arena.ai tweet media
Google DeepMind@GoogleDeepMind

Gemini 2.5 Pro - our most intelligent model, is getting an update before general availability. ✨ It’s even better at: coding 🖥️, reasoning 💡, and creative writing ✍️ Learn more. 🧵

21
121
1K
310.3K
Eric Malmi
Eric Malmi@ericmalmi·
thank you for the recognition @GaryMarcus! there's room for improvement, but I find it quite remarkable that an LLM learns to play creative sacrifices like this (best move according to Stockfish)
Eric Malmi tweet media
Gary Marcus@GaryMarcus

@cfchabris @ericmalmi has kind of done that and it does pretty well except in weird positions - where it still sometimes makes illegal moves. Confirming your conjecture and mine, if I understand his results correctly. arxiv.org/pdf/2412.12119…

0
0
10
520
Eric Malmi
Eric Malmi@ericmalmi·
@GaryMarcus @RepresenterTh you're welcome to test the MAV model (w/o MCTS) at: goo.gle/ChessChamp
a few things to note:
* for now, comments come from a different model so they can be ungrounded
* MAV can play chess960, Hex, Connect4, but the Gem only supports chess
1
0
4
300
Gary Marcus
Gary Marcus@GaryMarcus·
Has anyone found an LLM that can reliably play chess, without making illegal moves?
31
8
104
18K
Eric Malmi retweeted
Arena.ai
Arena.ai@arena·
🚨Breaking: @GoogleDeepMind’s latest Gemini-2.5-Pro is now ranked #1 across all LMArena leaderboards 🏆
Highlights:
- #1 in all text arenas (Coding, Style Control, Creative Writing, etc)
- #1 on the Vision leaderboard with a ~70 pts lead!
- #1 on WebDev Arena, surpassing Claude for the first time
This is the first-ever sweep across text, vision, and WebDev by any model!🥇 Huge congrats to @GoogleDeepMind on this incredible breakthrough!
Arena.ai tweet media
Google DeepMind@GoogleDeepMind

We’re releasing an updated Gemini 2.5 Pro (I/O edition) to make it even better at coding. 🚀 You can build richer web apps, games, simulations and more - all with one prompt. In @GeminiApp, here's how it transformed images of nature into code to represent unique patterns 🌱

37
216
1.5K
530.2K
Eric Malmi
Eric Malmi@ericmalmi·
multiple long-time dreams coming true at once:
✅ give a talk at NeurIPS
♟️ play chess on a stage
🤡 make my international debut as a rapper
thanks to the audience for a lively discussion that went on for a good hour after the talk and to my amazing co-presenters @anianruoss @weballergy @MatejJusup!
Eric Malmi tweet media
2
7
35
3.6K
Eric Malmi retweeted
Google Gemini
Google Gemini@GeminiApp·
Think you can outsmart Gemini? We challenge you to a chess match! Play Gemini in a game of chess with our newest Gem: Chess champ. Explore different openings as you banter back and forth with Gemini. Available in the Gemini web app. ♟️Can you beat it? → goo.gle/ChessChamp
34
110
804
59.4K
Eric Malmi
Eric Malmi@ericmalmi·
our work establishes new test-time scaling results for chess-playing LLMs ♟️📈 honestly, I think it's quite mind blowing that an LLM can learn to perform minimax tree search within a single model call and smoothly improve its Elo the more output tokens you give it 🤯
Eric Malmi tweet media
1
3
13
480
Eric Malmi retweeted
Justin Zhao
Justin Zhao@justinxzhao·
LLMs can play chess! In-context minimax search bootstrapped with values from Stockfish, implemented in Gemini. Paper: storage.googleapis.com/deepmind-media… Breadth 4, depth 2, you start running out of context window. Chess Elo improves with more test-time compute. Really cool work from @ericmalmi @GoogleDeepMind
Justin Zhao tweet media (4 images)
1
5
17
4.2K
Eric Malmi
Eric Malmi@ericmalmi·
if you're at #NeurIPS2024, want to learn how to make LLMs really good at chess and see a live demo, come and visit the @GoogleDeepMind booth tomorrow at 9:30 am!
Eric Malmi tweet media
4
4
23
1.8K
Eric Malmi
Eric Malmi@ericmalmi·
@PreethiLahoti Haha, you know me :) This is actually a great example of Gemini's generalization capabilities (no, I did not produce training data for this use case 😁)!
0
0
2
118
Preethi Lahoti
Preethi Lahoti@PreethiLahoti·
@ericmalmi Effortlessly combining your two passions: AI chess and AI rappers :)
1
0
2
188
Eric Malmi
Eric Malmi@ericmalmi·
Language models can't play chess, right?♟️ Excited to share our latest experiment that lets you play chess against a Gemini model!
Eric Malmi tweet media
5
11
56
11.5K
Eric Malmi retweeted
Przemyslaw Grabowicz
Przemyslaw Grabowicz@przemyslslaw·
Our first talk at @IC2S2 today shows that publishing early on arXiv and tweeting leads to more citations in the 5 years from the initial publication. Excellent talk and work by @c_bagchi! Our (revised, soon to appear) paper is accepted to ICWSM'25: arxiv.org/abs/2401.11116
Przemyslaw Grabowicz tweet media
Philadelphia, PA 🇺🇸
0
1
15
901