GS Oh

10 posts

@GS_Oh_AI

Member of Technical Staff @ xAI RL Post-training | Trained Grok 4.1, 4.2 | Previously at DeepMind (Gemini ~2.5 + Deep Research) | PhD for Generative models + RL

Joined August 2022

65 Following, 75 Followers

GS Oh reposted

X Freeze@XFreeze·19 Mar

The new Grok 4.20 Beta benchmarks are wild
🥇 #1 lowest hallucinating AI (22%)
🥇 #1 at following instructions (83%)
🥈 #2 in agentic tool use (97%)
Grok 4.20 ranks #1 in the lowest hallucination rate ever recorded across all AI models tested globally
Most models race to sound smart. Grok 4.20 was built to never lie and still dominates on instruction following and agentic tasks
This is literally a 500B model performing top-notch in the things that matter most

219

178

4.4M

GS Oh@GS_Oh_AI·18 Mar

@Designarena @xai 🔥 Great work by the Imagine team!

495

Design Arena@Designarena·18 Mar

BREAKING: xAI and Kling have the strongest video and video editing models, as measured by 50+ video models on Design Arena
#1 Video Generation: Grok Imagine by @xai
#1 Video Editing: Grok Imagine by @xai
#1 Image to Video Generation: Grok Imagine by @xai
#1 Multi-Input to Video Generation: O1 Edit by @Kling_ai
Congrats to @xai and @Kling_ai for defining SOTA!

152

627.6K

GS Oh reposted

Arena.ai@arena·17 Mar

Grok 4.20 Beta Reasoning has landed #7 for Text Arena & #28 for Code Arena. The model is on par with DeepSeek-v3.2-thinking and Qwen3.5-122b-a10b in Code Arena's agentic webdev tasks.
More Highlights:
- #7 in Text Arena overall tied with GPT-5.4-high
- top 10 in Math, Multi-Turn, Creative Writing, Coding & Hard Prompts
- top 15 in Expert Arena
Congrats to @xAi and @elonmusk on this new milestone.

257

19.2K

GS Oh reposted

Artificial Analysis@ArtificialAnlys·12 Mar

The Grok 4.20 Beta shows three major improvements over Grok 4:
➤ Our lowest ever hallucination rate on the AA-Omniscience evaluation. When Grok did not know the answer, it hallucinated an incorrect answer 22% of the time - this is the lowest hallucination rate of any model we have tested, topping Claude Haiku 4.5 (25%)
➤ Top scores for instruction following and prompt adherence. On IFBench, Grok 4.20 takes the #1 spot with 82.9% - a +29.2 point increase on Grok 4
➤ Leading speed for its intelligence. At 265 tokens per second output speed on xAI’s API, Grok 4.20 is significantly faster than its peer and over 2x the output speed seen from Grok 4.1 Fast
Congratulations to @xai and @elonmusk on the 4.20 Beta 0309 launch!

224

298

2.3K

5.6M

GS Oh@GS_Oh_AI·10 Mar

@veggie_eric +1 to the blue guy

3.3K

Eric Jiang@veggie_eric·10 Mar

profile pic of the best engineer at your company

208

866

22.5K

2.9M

GS Oh@GS_Oh_AI·9 Mar

@SeongsikKi5837 @xai I'll miss you a lot! it was really fun working with you last few weeks

299

Seongsik Kim@SeongsikKi5837·9 Mar

Friday was my last day at @xAI. It truly was a wild ride—pushing the frontier on Grok 3, Grok 4, Grok 4.1 Fast and Macrohard. Grateful to have been on this rocketship, working with the most intense, brilliant people I’ve ever met. Ad astra 🚀

421

25.9K

GS Oh@GS_Oh_AI·26 Feb

@arena Grok 4.20 🚀

Arena.ai@arena·25 Feb

In the Text Arena, Grok-4.20-Beta1 ranks #4, scoring 1492, closing the gap to Gemini 3.1 Pro

207

31.1K

Arena.ai@arena·25 Feb

Grok 4.20 beta1 (single agent) debuts #1 on Search Arena, and #4 overall in Text Arena!
Highlights:
- #1 in Search, scoring 1226, leading GPT-5.2 and Gemini-3
- #4 in Text, scoring 1492 on par with Gemini 3.1 Pro
Congrats to the @xAI team and @elonmusk on this impressive milestone!

234

239

1.8K

10.1M

GS Oh@GS_Oh_AI·26 Feb

Grok 4.20 beta1 has been out for a few days and it is an exciting one! I am personally excited and honored to deliver RL training recipes and to train Grok 4.20 to achieve #4 overall on Arena and #1 overall on Search Arena!

Arena.ai@arena

11.2K

GS Oh@GS_Oh_AI·10 Feb

@Yuhu_ai_ best wishes to you and your next endeavor!

518

Yuhuai (Tony) Wu@Yuhu_ai_·10 Feb

I resigned from xAI today. This company - and the family we became - will stay with me forever. I will deeply miss the people, the warrooms, and all those battles we have fought together. It's time for my next chapter. It is an era with full possibilities: a small team armed with AIs can move mountains and redefine what's possible. Thank you to the entire xAI family. Onward. 🚀 And to Elon @elonmusk - thank you for believing in the mission and for the ride of a lifetime.

742

365

9.3K

3.6M

GS Oh reposted

SpaceX@SpaceX·3 Feb

SpaceX has acquired xAI, forming one of the most ambitious, vertically integrated innovation engines on (and off) Earth → spacex.com/updates#xai-jo…

3.9K

7.9K

45.2K

19.3M
