Kefan XIAO (@KevinKiao) - Twitter Profile

Pinned Tweet

Kefan XIAO@KevinKiao·Nov 18

Gemini 3 is out! Personally, I'm super proud of its coding capabilities in developers’ daily usage. I've been dedicated to it with @pengchengyin for the last several months, partnering with @melvinjohnsonp's team to launch it!

Sundar Pichai@sundarpichai

Introducing Gemini 3 ✨ It’s the best model in the world for multimodal understanding, and our most powerful agentic + vibe coding model yet. Gemini 3 can bring any idea to life, quickly grasping context and intent so you can get what you need with less prompting. Find Gemini 3 Pro rolling out today in the @Geminiapp and AI Mode in Search. For developers, build with it now in @GoogleAIStudio and Vertex AI. Excited for you to try it!

Kefan XIAO@KevinKiao·15h

@AndrewDai @yinfeiy @ElorianAI congrats Andrew! Best of luck and can't wait to see what you will build!

Andrew M. Dai@AndrewDai·1d

After almost 12 years in Brain/DeepMind, I’ve finally decided to take the leap. My cofounders @yinfeiy and Seth and I have kicked off @ElorianAI, the first multimodal reasoning lab founded and led by former LLM pretraining, data and multimodal leads. youtu.be/YlvfNpOMeOY?si… (1/n)

Kefan XIAO@KevinKiao·Jan 15

Sad if it’s true. Thinking Machines is nothing without its people…

Alex Heath@alexeheath

Sources: More Thinking Machines employees are in the process of joining OpenAI after three of the startup’s co-founders (all ex-OA) rejoined yesterday (More in the newsletter later today)

Kefan XIAO@KevinKiao·Jan 15

Pure weightlifting take: they need to add 45lbs ones. Also where can I buy some?

Ryan Petersen@typesfast

But does your startup have a branded squat rack?

Kefan XIAO retweeted

Shashwat Goel@ShashwatGoel7·Dec 20

New Blogpost: How to game the METR plot🚨 In 2025, a single graph changed AGI timelines, investments, research priorities, model quality assessments and much more. But if you squint harder, only 14 prompts shaped AI discourse over this year. That's all the data in the 1-4 hour horizon length regime that matters. 🕵️ What's more? A majority of these are about cybersecurity capture-the-flag contests and training a machine learning model. > Post-train your model on CTF and ML codebases > profit 📈! Its METR horizon length will increase. Exactly what OpenAI has been targeting in its Codex model releases... and is Anthropic underperforming in the 2-4hr range because it mostly consists of cybersecurity, which is dual-use for safety? To be clear, I think it's an excellent idea to track horizon lengths instead of benchmark accuracy. But under the current modelling assumption of success probability being a logistic function of task length, SWAA+HCAST accuracy improvements alone might explain the exponential progress in horizon length 🔎 In the blog, I show detailed evidence for why we need to stop overindexing on the METR plot. Share it with anyone you see making decisions based on where the latest model lands on the METR plot. shash42.substack.com/p/how-to-game-…
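Editor's sketch of the modelling assumption mentioned in the post above, namely that success probability is a logistic function of task length. The post does not give the exact parametrization, so the log2 scaling, the `slope` parameter, and both function names here are assumptions made for illustration only, not METR's actual fitting code.

```python
import math


def success_probability(task_minutes, horizon_minutes, slope=1.0):
    """Assumed model: logistic in log2(task length).

    Returns 0.5 when the task length equals the 50% horizon, and
    decreases as tasks get longer than the horizon.
    """
    x = math.log2(horizon_minutes) - math.log2(task_minutes)
    return 1.0 / (1.0 + math.exp(-slope * x))


def horizon_at(p, horizon_minutes, slope=1.0):
    """Task length (minutes) at which the modelled success probability is p.

    Obtained by inverting success_probability for task_minutes.
    """
    logit = math.log(p / (1.0 - p))
    return horizon_minutes * 2.0 ** (-logit / slope)


if __name__ == "__main__":
    # Hypothetical model with a 60-minute 50% horizon.
    for minutes in (15, 60, 240):
        print(minutes, round(success_probability(minutes, 60), 3))
```

Under this assumed form, the horizon is a single fitted number, so a change in accuracy on a handful of tasks near the horizon can move it noticeably, which is the sensitivity the post is pointing at.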

Kefan XIAO@KevinKiao·Dec 17

swebench is better than pro 🫡

Noam Shazeer@NoamShazeer

Gemini 3 Flash is live. ⚡️ We’ve packed Gemini 3’s Pro-grade reasoning into a leaner model with Flash-level latency, efficiency, and cost. It's my favorite model to use – the latency feels like a real conversation, with the deep intelligence intact. Available in the API, Gemini App, and Search. Give it a spin. bit.ly/4pTo5YU

Kefan XIAO retweeted

Logan Kilpatrick@OfficialLoganK·Nov 20

Introducing Nano Banana Pro 🍌 aka Gemini 3 Pro Image, our new SOTA image generation and editing model. It is all the things you loved about @NanoBanana, but with some wild new improvements. It is available right now for developers in the Gemini API and in the Gemini App!

Kefan XIAO@KevinKiao·Nov 20

♊️ 🚀

Vals AI@ValsAI

Gemini 3 is #1 on our independent SWE-Bench leaderboard

Kefan XIAO@KevinKiao·Nov 19

@siamaksha Thank you Siamak! Great teamwork with @pengchengyin

Siamak Shakeri@siamaksha·Nov 19

Kudos to @KevinKiao, he did lots of the heavy lifting for swebench and more.

Kefan XIAO@KevinKiao

New SOTA on the official swebench leaderboard with Gemini 3 Pro! swebench.com We carefully designed our RL so it works without overfitting. Please try the model out and give us feedback!

Kefan XIAO@KevinKiao·Nov 19

New SOTA on the official swebench leaderboard with Gemini 3 Pro! swebench.com We carefully designed our RL so it works without overfitting. Please try the model out and give us feedback!

Kilian Lieret@KLieret

Gemini 3 Pro sets new record on SWE-bench verified: 74%! (evaluated with minimal agent) Costs are 1.6x of GPT-5, but still cheaper than Sonnet 4.5. Gemini iterates longer than everyone; run your agent with a step limit of >100 for max performance. Details & full agent logs in 🧵

Kefan XIAO retweeted

Melvin Johnson@melvinjohnsonp·Nov 18

I’m especially proud of where we landed on coding and agentic use cases. Looking at the charts for Terminal-Bench 2.0, SWE-Bench and τ2-Bench compared to Gemini 2.5 shows the incredible jump, but using it to actually solve hard problems is the real win.

Kefan XIAO@KevinKiao·Nov 19

@melvinjohnsonp congrats Melvin! Has been a great push!

Melvin Johnson@melvinjohnsonp·Nov 18

Been waiting a long time to share this one. Meet Gemini 3 Pro. It’s our most intelligent multimodal model that’s deeply capable. x.com/sundarpichai/s…

Sundar Pichai@sundarpichai

Introducing Gemini 3 ✨ It’s the best model in the world for multimodal understanding, and our most powerful agentic + vibe coding model yet. Gemini 3 can bring any idea to life, quickly grasping context and intent so you can get what you need with less prompting. Find Gemini 3 Pro rolling out today in the @Geminiapp and AI Mode in Search. For developers, build with it now in @GoogleAIStudio and Vertex AI. Excited for you to try it!

Kefan XIAO@KevinKiao·Nov 18

@sunjiao123sun_ Great work!

Jiao Sun@sunjiao123sun_·Nov 18

In the past two months, our small Webapp Coding team has been cooking hard to make Gemini great at WebDev, and we are thrilled to claim the 👑! Yes, we saw your enthusiasm (pelican riding a bike, game controller). Please keep trying and sending your best WebDev prompts our way! We love them! Besides WebDev Arena, we also achieved #1 on Design Arena across categories: website gen, game gen, UI component gen, etc.! Website lovers, designers, we can’t wait to hear your feedback!

Google DeepMind@GoogleDeepMind

Our first release is Gemini 3 Pro, which is rolling out globally starting today. It significantly outperforms 2.5 Pro across the board: 🥇 Tops LMArena and WebDev @arena leaderboards 🧠 PhD-level reasoning on Humanity’s Last Exam 📋 Leads long-horizon planning on Vending-Bench 2

Kefan XIAO retweeted

Nicholas Moy@thenickmoy·Nov 18

And it’s great at software engineering too!

Sundar Pichai@sundarpichai

Introducing Gemini 3 ✨ It’s the best model in the world for multimodal understanding, and our most powerful agentic + vibe coding model yet. Gemini 3 can bring any idea to life, quickly grasping context and intent so you can get what you need with less prompting. Find Gemini 3 Pro rolling out today in the @Geminiapp and AI Mode in Search. For developers, build with it now in @GoogleAIStudio and Vertex AI. Excited for you to try it!

Kefan XIAO@KevinKiao·Nov 18

@Skiminok @pengchengyin @melvinjohnsonp Thanks Alex! You have been pushing hard on this as well!

🇺🇦 Alex Polozov@Skiminok·Nov 18

@KevinKiao @pengchengyin @melvinjohnsonp Such an impressive model, congrats!

Kefan XIAO@KevinKiao·Nov 18

@_arohan_ @pengchengyin @melvinjohnsonp Thanks Rohan!

rohan anil@_arohan_·Nov 18

@KevinKiao @pengchengyin @melvinjohnsonp Congrats!

Kefan XIAO@KevinKiao·Nov 18

@ankesh_anand Has been a great journey!🫡💪

Ankesh Anand@ankesh_anand·Nov 18

Gemini3 Pro is out, very exciting to be able to push the frontier with this one! There was never a dull day post-training this model, I hope the combination of a strong base model with sota reasoning is evident! This is obviously a big leap compared to 2.5 Pro, but I am excited about our research agenda more than ever. The models will continue to get smarter!

Kefan XIAO@KevinKiao·Nov 18

And this has been great teamwork with many friends!

Kefan XIAO@KevinKiao·Nov 18

One behavior I really like in Gemini 3 Pro is that it actively uses tools to explore and verify. It has been helping with my daily work! Please apply it in your workflow and tell us how you feel!

Kefan XIAO@KevinKiao·Nov 18

@_mohansolo Congrats Varun!
