

Yanping Huang
66 posts

How many languages can we support with Machine Translation? We train a translation model on 1000+ languages, using it to launch 24 new languages on Google Translate without any parallel data for these languages. arxiv.org/abs/2205.03983 Technical 🧵 below: 1/18
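
As a rough sketch of how one model can pick up languages with no parallel data: the usual multilingual-NMT recipe conditions a single shared model on a target-language tag, and for zero-resource languages it adds a self-supervised objective (MASS-style span masking) over monolingual text, which as I understand it is the setup this paper builds on. The snippet below is my illustration of that data construction, not the paper's code; the tag format, language codes, and function names are assumptions.

```python
import random

def make_selfsup_example(sentence: str, lang: str, rng: random.Random):
    """Build a MASS-style training pair from monolingual text.

    The source has a contiguous span replaced by <mask> tokens and carries a
    target-language tag; the target is the masked span the model must
    reconstruct. Tag and mask token formats here are hypothetical.
    """
    tokens = sentence.split()
    span_len = max(1, len(tokens) // 2)
    start = rng.randrange(0, len(tokens) - span_len + 1)
    target = tokens[start:start + span_len]
    source = tokens[:start] + ["<mask>"] * span_len + tokens[start + span_len:]
    # One shared model trains on this objective (monolingual data only) plus
    # ordinary parallel examples for higher-resource language pairs.
    return f"<2{lang}> " + " ".join(source), " ".join(target)

rng = random.Random(0)
src, tgt = make_selfsup_example("mto huu ni mrefu sana", "sw", rng)  # Swahili, for illustration
print(src)  # e.g. '<2sw> mto huu <mask> <mask> sana'
print(tgt)  # e.g. 'ni mrefu'
```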

Read all about Task-level Mixture-of-Experts (TaskMoE), a promising step towards efficiently training and deploying large models, with no loss in quality and with significantly reduced inference latency ↓ goo.gle/3I5ulXj
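
A minimal sketch (mine, not from the post) of the task-level routing idea: the expert choice depends on the task rather than on each token, so at serving time only that task's expert subnetwork has to be loaded. Sizes and names below are illustrative, and the real per-task router is learned during training.

```python
import jax
import jax.numpy as jnp

NUM_EXPERTS, D_MODEL, D_FF = 4, 8, 16

def init_experts(key):
    """One feed-forward expert = (w_in, w_out); all experts stacked along axis 0."""
    k1, k2 = jax.random.split(key)
    w_in = jax.random.normal(k1, (NUM_EXPERTS, D_MODEL, D_FF)) * 0.02
    w_out = jax.random.normal(k2, (NUM_EXPERTS, D_FF, D_MODEL)) * 0.02
    return w_in, w_out

def task_moe_layer(params, x, task_id):
    """Route the whole batch of one task through a single expert.

    Token-level MoE computes a gate per token; here the decision depends only
    on task_id, so the unused experts can be dropped entirely when deploying
    the model for that task.
    """
    w_in, w_out = params
    h = jax.nn.relu(x @ w_in[task_id])   # only this task's expert is touched
    return h @ w_out[task_id]

params = init_experts(jax.random.PRNGKey(0))
x = jnp.ones((2, D_MODEL))                          # toy batch of 2 token vectors
print(task_moe_layer(params, x, task_id=0).shape)   # (2, 8)
print(task_moe_layer(params, x, task_id=3).shape)   # (2, 8)
```

Because every token of a task shares one routing decision, extracting a per-task subnetwork for serving is just an indexing step, which is where the reduced inference latency comes from.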

Not gonna amplify that other tweet but I’ll instead point to this one. Speaking as a relatively newish member, I’ll say that I’ve found @uwcse to be a supportive place for women, w/ a strong commitment to diversity and inclusion from our leadership.

Today is the start of UW’s Winter Quarter, and #UWAllen is excited to welcome our students back. Without the benefit of a long break given our quarter system, instructors spent the holiday weekend preparing for the challenge of conducting the 1st week remotely due to omicron. 1/5

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts. A 1.2T-parameter model with better average zero-shot and one-shot results than GPT-3, while using only ⅓ of the compute for training. arXiv: arxiv.org/abs/2112.06905 Blog: goo.gle/3dBNLWQ
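
To make "sparsely activated" concrete, here is a toy top-2 mixture-of-experts feed-forward layer in JAX, roughly the mechanism GLaM-style models use: each token touches only 2 of the experts, so per-token compute stays small while total parameters grow. This is my sketch, not GLaM code; the sizes and the simple per-token gather below are assumptions (real implementations use capacity-limited expert dispatch).

```python
import jax
import jax.numpy as jnp

NUM_EXPERTS, TOP_K, D_MODEL, D_FF = 8, 2, 16, 32

def init_params(key):
    kg, k1, k2 = jax.random.split(key, 3)
    return {
        "gate":  jax.random.normal(kg, (D_MODEL, NUM_EXPERTS)) * 0.02,
        "w_in":  jax.random.normal(k1, (NUM_EXPERTS, D_MODEL, D_FF)) * 0.02,
        "w_out": jax.random.normal(k2, (NUM_EXPERTS, D_FF, D_MODEL)) * 0.02,
    }

def moe_ffn(params, x):
    """x: (tokens, D_MODEL). Each token is processed by its top-2 experts only."""
    logits = x @ params["gate"]                    # (tokens, NUM_EXPERTS)
    top_w, top_idx = jax.lax.top_k(logits, TOP_K)  # (tokens, 2) each
    top_w = jax.nn.softmax(top_w, axis=-1)         # normalize over the chosen 2

    def per_token(xt, idx, w):
        # Gather just this token's 2 experts and mix their outputs.
        w_in = params["w_in"][idx]                       # (2, D_MODEL, D_FF)
        w_out = params["w_out"][idx]                     # (2, D_FF, D_MODEL)
        h = jax.nn.gelu(jnp.einsum("d,edf->ef", xt, w_in))
        y = jnp.einsum("ef,efd->ed", h, w_out)           # (2, D_MODEL)
        return jnp.einsum("e,ed->d", w, y)               # weighted sum of 2 experts

    return jax.vmap(per_token)(x, top_idx, top_w)

params = init_params(jax.random.PRNGKey(0))
tokens = jax.random.normal(jax.random.PRNGKey(1), (4, D_MODEL))
print(moe_ffn(params, tokens).shape)               # (4, 16)
```

The sparsity is what lets the total parameter count (1.2T here) grow much faster than the per-token compute, which is where the ⅓-of-GPT-3 training compute figure comes from.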

We wrote a blog post to unveil Google's model parallelism engine, which enables efficient training of large-scale models like GShard-M4, LaMDA, and BigSSL.

Today we introduce the Generalist Language Model (GLaM), a sparsely activated model that achieves better overall performance on 29 few-shot #NaturalLanguageProcessing benchmark tasks with just a third of the training energy cost of GPT-3. Learn more on the blog ↓ goo.gle/3dBNLWQ

Today we present an open-source system to scale neural networks — often critical for improving model performance — by automatically parallelizing the model across devices, which enables researchers to more efficiently build and train large-scale models. goo.gle/3EEwloj
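
For a feel of what "annotate, then let the system partition" looks like, here is a small JAX sketch in the spirit of the annotation-based approach the post describes: you define a device mesh and how a few arrays are sharded, and the compiler propagates the sharding through the rest of the computation, inserting any communication automatically. The mesh layout, axis names, and sizes are my assumptions for illustration.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Arrange whatever devices are available into a 1 x N mesh: a trivial "data"
# axis and a "model" axis spanning all devices.
n_dev = len(jax.devices())
mesh = Mesh(np.array(jax.devices()).reshape(1, n_dev), axis_names=("data", "model"))

# Toy layer: the weight is sharded along its output dimension across the
# "model" axis; activations are sharded along the batch ("data") axis.
batch, d_in, d_out = 8, 4, 4 * n_dev   # keep d_out divisible by the "model" axis size
x = jax.device_put(jnp.ones((batch, d_in)),
                   NamedSharding(mesh, P("data", None)))
w = jax.device_put(jnp.ones((d_in, d_out)) * 0.1,
                   NamedSharding(mesh, P(None, "model")))

@jax.jit
def layer(x, w):
    # No per-device code here: the partitioner propagates the input shardings
    # through the matmul and handles any needed communication.
    return jax.nn.relu(x @ w)

y = layer(x, w)
print(y.shape, y.sharding)   # (8, d_out) plus the sharding chosen by the compiler
```

On a single CPU this simply runs on a 1 x 1 mesh; on a multi-device accelerator slice the identical code shards the weight across devices.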
