Chenxi Liu

245 posts

@chenxi116

Research scientist @Meta Superintelligence Labs. Previously @GoogleDeepMind Gemini post-trainer. Opinions my own.

Joined August 2015

284 Following · 1.6K Followers

Chenxi Liu@chenxi116·23 May

or practice piano

Jesse Mu@jayelmnop

I recently moved to the Code RL team at Anthropic, and it’s been a wild and insanely fun ride. Join us! We are singularly focused on solving SWE. No 3000 elo leetcode, competition math, or smart devices. We want Claude n to build Claude n+1, so we can go home and knit sweaters.

Chenxi Liu@chenxi116·23 May

Way to go Tong!!

Google DeepMind@GoogleDeepMind

Watch Gemini 2.5 Pro Deep Think tackle the challenging "catch a mole" problem from @Codeforces. 🪤 This new mode is based on our research in parallel thinking and considers multiple hypotheses before responding. See it in action ↓

Chenxi Liu reposted

Sundar Pichai@sundarpichai·20 May

Having a deep think...

Chenxi Liu@chenxi116·7 May

GIF

Arena.ai@arena

🚨Breaking: @GoogleDeepMind’s latest Gemini-2.5-Pro is now ranked #1 across all LMArena leaderboards 🏆
Highlights:
- #1 in all text arenas (Coding, Style Control, Creative Writing, etc)
- #1 on the Vision leaderboard with a ~70 pts lead!
- #1 on WebDev Arena, surpassing Claude for the first time
This is the first-ever sweep across text, vision, and WebDev by any model!🥇 Huge congrats to @GoogleDeepMind on this incredible breakthrough!

Chenxi Liu@chenxi116·25 Apr

This retweet is savage! But the last sentence is true

Demis Hassabis@demishassabis

The Gemini team cooked hard with Gemini 2.5 Pro, it's an awesome model that continues to lead @lmarena_ai - huge congrats to the team! Try it for yourself in the @GeminiApp now. Can't wait for you all to see what else we've been cooking 👀

Chenxi Liu@chenxi116·18 Apr

2.5 Flash 🚀 This IS the norm 👇

Chenxi Liu@chenxi116·12 Apr

The twitterers and redditters are crazy good vibe checkers

ρ:ɡeσn@pigeon__s

The release version of Llama 4 has been added to LMArena after it was found out they cheated, but you probably didn't see it because you have to scroll down to 32nd place, which is where it ranks

Chenxi Liu@chenxi116·25 Mar

BOOM! You might have guessed pro thinking, but bet you didn't expect 2.5 :) Seriously, congratulations to everyone involved. Everything came together so beautifully (and so fast!)

Arena.ai@arena

BREAKING: Gemini 2.5 Pro is now #1 on the Arena leaderboard - the largest score jump ever (+40 pts vs Grok-3/GPT-4.5)! 🏆 Tested under codename "nebula"🌌, Gemini 2.5 Pro ranked #1🥇 across ALL categories and UNIQUELY #1 in Math, Creative Writing, Instruction Following, Longer Query, and Multi-Turn! Massive congrats to @GoogleDeepMind for this incredible Arena milestone! 🙌 More highlights in thread👇

Chenxi Liu@chenxi116·9 Mar

I’ll take it! Even for those who feel 20 is an overpay, some of the value lies in making this Ravens draft fully BPA. I can’t remember the last time the Ravens entered the draft without 1 or 2 clear needs, let alone free agency! Let alone 11 picks! No excuses now, EDC: go cook hard!

Adam Schefter@AdamSchefter

Ravens are re-signing LT Ronnie Stanley to a three-year, $60 million deal.

Chenxi Liu@chenxi116·24 Jan

TIL mendelssohn has a violin concerto *in d*??

Chenxi Liu@chenxi116·22 Jan

More to come, in several senses :)

Demis Hassabis@demishassabis

Our latest update to our Gemini 2.0 Flash Thinking model (available here: goo.gle/4jsCqZC) scores 73.3% on AIME (math) & 74.2% on GPQA Diamond (science) benchmarks. Thanks for all your feedback, this represents super fast progress from our first release just this past Dec! Latest version also includes code execution, a 1M token context window & a reduced likelihood of thought-answer contradictions. We’ve been pioneering these types of planning systems for over a decade, starting with programs like AlphaGo, and it is exciting to see the powerful combination of these ideas with the most capable foundation models.

Chenxi Liu@chenxi116·21 Dec

what is vacation

Chenxi Liu@chenxi116·20 Dec

Gemini thinks its way to #1 on LMSYS. Across the board. Again. Using Flash.

Arena.ai@arena

Gemini-2.0-Flash-Thinking #1 across all categories!

Chenxi Liu@chenxi116·19 Dec

🚨Gemini thinks! With thoughts visible! And multimodal!
2.0 flash wasn't an end; it was truly a beginning
Gemini ships! And rocks!

Logan Kilpatrick@OfficialLoganK

It’s still an early version, but check out how the model handles a challenging puzzle involving both visual and textual clues: (2/3)

Chenxi Liu reposted

Logan Kilpatrick@OfficialLoganK·19 Dec

🤔

Chenxi Liu@chenxi116·12 Dec

@Frances84030451 Congrats to us all Francesco! What a year!

Francesco Bertolini@Frances84030451·12 Dec

@chenxi116 You really rocked it Chenxi! Congrats team 🎉🎉🎉🎉 @chenxi116

Chenxi Liu@chenxi116·12 Dec

We've obviously trained this model for a little while, so today was just a normal day at work. But seeing the somewhat-familiar numbers, now not from raw internal docs but from nicely-formatted CEO's tweet, is an odd feeling unlike any other. Amazing stories. Insane team work.

Sundar Pichai@sundarpichai

We’re kicking off the start of our Gemini 2.0 era with Gemini 2.0 Flash, which outperforms 1.5 Pro on key benchmarks at 2X speed (see chart below). I’m especially excited to see the fast progress on coding, with more to come. Developers can try an experimental version in AI Studio and Vertex AI today. It is also available to try in @GeminiApp on the web today, mobile coming soon.

Chenxi Liu@chenxi116·7 Dec

we sipped a boba tea, then, you guessed it, carried on sprinting

Jeff Dean@JeffDean

What a way to celebrate one year of incredible Gemini progress -- #1🥇across the board on overall ranking, as well as on hard prompts, coding, math, instruction following, and more, including with style control on. Thanks to the hard work of everyone in the Gemini team and elsewhere at Google! 🎊

Chenxi Liu@chenxi116·22 Nov

we ate a pudding, then carried on sprinting Gemini-Exp-1121 and Gemini-Exp-1114 score much higher than Gemini-1.5-Pro-002

Arena.ai@arena

Woah, huge news again from Chatbot Arena🔥 @GoogleDeepMind’s just released Gemini (Exp 1121) is back stronger (+20 points), tied #1🏅Overall with the latest GPT-4o-1120 in Arena!
Ranking gains since Gemini-Exp-1114:
- Overall: #3 → #1
- Overall (StyleCtrl): #5 → #2
- Hard Prompts (StyleCtrl): #3 → #1
- Coding: #3 → #1
- Vision: #1
- Math: #2 → #1
- Creative Writing: #2 → #1
Congrats again @GoogleDeepMind! The LLM race is on fire — progress is now measured in days! See more analysis below👇

Chenxi Liu@chenxi116·15 Nov

For me weirdly, this was the more exciting news today 🤣

Google Gemini@GeminiApp

The Gemini app, now available on iPhone. Download it now in the App Store → goo.gle/4hN1SZe
