Chen Zhu

96 posts

@chenzhucs

@Meta. Past: xAI, Google Brain/Deepmind, Nvidia, UMD

Mountain View, CA · Joined February 2018
1.5K Following · 1K Followers
Chen Zhu retweeted
Rui Hou @magpie_rayhou
We are releasing our new model, Muse Spark, today - our first step towards personal superintelligence after a great 9-month team effort! Please try it out and tell us what you think!
Alexandr Wang @alexandr_wang

1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵

10 replies · 12 reposts · 93 likes · 12.1K views
Chen Zhu retweeted
Zhiqing Sun @EdwardSun0909
Excited to share Muse Spark, the first model from the whole team’s work at MSL! 🚀 It’s natively multimodal and agentic. I’ve been using it for my daily coding and research tasks. Still plenty of room to improve in agentic domains, but we’re moving with great velocity. It’s a seriously good model! Check out the full breakdown and try it out at meta.ai
Alexandr Wang @alexandr_wang

1/ today we're releasing muse spark, the first model from MSL. nine months ago we rebuilt our ai stack from scratch. new infrastructure, new architecture, new data pipelines. muse spark is the result of that work, and now it powers meta ai. 🧵

9 replies · 26 reposts · 196 likes · 19.1K views
Chen Zhu retweeted
Hongyu Ren @ren_hongyu
Check out Muse Spark, our first milestone in the quest for personal superintelligence! Scaling this with the team has been a total blast. Give it a spin and let us know what you think! 🥑
18 replies · 60 reposts · 317 likes · 67K views
Eric Jiang @veggie_eric
profile pic of the best engineer at your company
208 replies · 869 reposts · 22.4K likes · 2.9M views
Hieu Pham @hyhieu226
I have made the difficult decision to leave @OpenAI. Working here and at @xai before was a once-in-a-lifetime experience. I have met the best people. Not the best people in AI. Not the best people in tech. Simply the best people. At these companies, I have helped create extremely intelligent entities that will meaningfully improve our lives. The work makes me proud. But the intensive work came with a price. I cannot believe I would say this one day, but I am burnt out. The mental health deterioration that I used to scoff at is real, miserable, scary, and dangerous. I am going to take a break from frontier AI labs, and will take my family to my home country, Vietnam. There, I will try something new, and also search for a cure for my conditions. I hope I will heal. Until then.
1.1K replies · 414 reposts · 14K likes · 1.2M views
Mohit Reddy @MohitReddy13
This has been fun to work on! What began as a two-person effort months ago is now a small, talent-dense team striving to improve and ship new model versions. We've had great support from @Yuhu_ai_, @jimmybajimmyba, and @elonmusk, with exciting updates coming soon! I believe we can transform how software engineers and organizations work in the coming years. Want to join us? Apply at job-boards.greenhouse.io/xai/jobs/48000… or DM me directly.
xAI @xai

Introducing Grok Code Fast 1, a speedy and economical reasoning model that excels at agentic coding. Now available for free on GitHub Copilot, Cursor, Cline, Kilo Code, Roo Code, opencode, and Windsurf. x.ai/news/grok-code…

46 replies · 166 reposts · 540 likes · 58.9K views
Ziniu Hu @acbuller
Been training the Grok Code Fast 1 model with the incredible team. It's a blazing fast model 🚀 that can solve a broad range of real-world agentic coding tasks. Excited to share it with the world, hope it helps with your work!
xAI @xai

Introducing Grok Code Fast 1, a speedy and economical reasoning model that excels at agentic coding. Now available for free on GitHub Copilot, Cursor, Cline, Kilo Code, Roo Code, opencode, and Windsurf. x.ai/news/grok-code…

56 replies · 39 reposts · 305 likes · 28.5K views
Bill Yuchen Lin @billyuchenlin
Software development is logical reasoning at scale, and xAI offers unparalleled resources and expertise to train the best reasoning models. Join us to build fast, intelligent coding models and agents. Let’s shape the future of AI + software with @Grok MacroHard!
Elon Musk @elonmusk

Join @xAI and help build a purely AI software company called Macrohard. It’s a tongue-in-cheek name, but the project is very real! In principle, given that software companies like Microsoft do not themselves manufacture any physical hardware, it should be possible to simulate them entirely with AI.

12 replies · 23 reposts · 493 likes · 22.7K views
Chen Zhu retweeted
Karan Vaidya @KaranVaidya6
Grok CLI >>> Cursor and Claude Code
We wanted an IDE for @xai's @grok so we did something meta: we prompted the newly released Grok 4 to create... Grok CLI itself!
Grok CLI can:
1. Modify local files and use the shell
2. Go through huge codebases and fix them
3. Persist for longer and solve really complex math and physics problems
Instant setup on @Replit
Bonus: Can create and run AI agents with @langchain @composio
It's completely open source!!
link to the code: github.com/ComposioHQ/gro…
link to the repl: replit.com/@abishkpatil/G…
68 replies · 287 reposts · 2.1K likes · 345.6K views
Chen Zhu retweeted
Lichang Chen @LichangChen2
We implemented ODIN, our two-head RM (one head correlated with length and ignored in RL, the other uncorrelated with length and used as the final reward in RL), in RLHFlow, an easy-to-use repo. It could be a great baseline for reward hacking research! Try our code if you are doing RM/hacking research!🧐
Wei Xiong @weixiong_1

Thanks to @LichangChen2, we also added some more advanced reward modeling techniques, like ODIN and RRM, to our reward modeling repo.

1 reply · 1 repost · 12 likes · 2.8K views
Chen Zhu retweeted
Han Fang @Han_Fang_
A new RLHF paper from our team: The Perfect Blend: Redefining RLHF with Mixture of Judges arxiv.org/abs/2409.20370
9 replies · 50 reposts · 232 likes · 88.7K views
Chen Zhu retweeted
Arena.ai @arena
Moreover, we observe even stronger performance in the English category, where Llama 3's ranking jumps to ~1st place with GPT-4-Turbo! It consistently performs strongly against top models (see win-rate matrix) by human preference. It's been optimized for dialogue scenarios with a large amount of instruction data in post-training. More analysis is still ongoing, with topic distribution and agreement studies. We also look forward to details in Llama-3's technical report.
11 replies · 39 reposts · 371 likes · 346.8K views
Chen Zhu retweeted
Ahmad Al-Dahle @Ahmad_Al_Dahle
It’s here! Meet Llama 3, our latest generation of models that is setting a new standard for state-of-the-art performance and efficiency for openly available LLMs.
Key highlights:
• 8B and 70B parameter openly available pre-trained and fine-tuned models.
• Trained on more than 15T tokens, 7x+ larger than Llama 2's dataset!
• Improved tokenizer with vocabulary of 128K tokens for better performance.
• State-of-the-art performance across industry benchmarks.
• New capabilities, including enhanced reasoning and coding.
• 3x more efficient training than Llama 2.
• New trust and safety tools with Llama Guard 2, Code Shield, and CyberSec Eval 2.
• Integrated into Meta AI, and available in more countries across our apps.
• And, just the beginning with more models and new capabilities coming soon!
Visit the Llama 3 website to read more and download the models. llama.meta.com/llama3
62 replies · 194 reposts · 955 likes · 328.3K views
Chen Zhu retweeted
Jeff Dean @JeffDean
As part of this, we have an updated version of the Gemini 1.0 Technical Report. Significant updates are in Section 6, on Post-training of the models, and Section 7 on Responsible Deployment. (Arxiv version will be updated in a few days) storage.googleapis.com/deepmind-media…
2 replies · 14 reposts · 101 likes · 21.1K views