Yanping Huang

66 posts


@bignamehyp

Joined September 2014
99 Following · 313 Followers
Yanping Huang retweeted
@alexgnewmedia·
Minimax's new txt2video tool is great. I got quality video results. I went to their website (hailuoai.com/video; you can log in with a phone number from your own country, unlike Kling, which at launch only accepted Chinese numbers), translated their document of usage recommendations and prompts, and made some videos with the provided prompts. Have fun!
Yanping Huang retweeted
Google AI
Google AI@GoogleAI·
Alpa is a framework that uses just one line of code to easily automate the complex model parallelism process for large #DeepLearning models. Learn more and check out the code. goo.gle/3y7xZ1f
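The "one line of code" is a decorator on an ordinary JAX training step. A minimal sketch of that usage pattern, assuming Alpa's alpa.parallelize entry point; the toy model and SGD update below are illustrative, not taken from the linked post:

```python
import alpa
import jax
import jax.numpy as jnp

# The single added line: decorating a plain JAX train step. Alpa traces the
# function, searches for an inter-/intra-operator parallelization plan, and
# executes it across the available devices.
@alpa.parallelize
def train_step(params, batch):
    def loss_fn(p):
        pred = batch["x"] @ p["w"] + p["b"]          # toy linear model
        return jnp.mean((pred - batch["y"]) ** 2)    # mean squared error
    grads = jax.grad(loss_fn)(params)
    # Plain SGD update; any JAX-transformable update rule works here.
    return jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)

params = {"w": jnp.zeros((128, 1)), "b": jnp.zeros((1,))}
batch = {"x": jnp.ones((64, 128)), "y": jnp.ones((64, 1))}
params = train_step(params, batch)  # runs with Alpa-managed parallelism
```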
Eric Jang
Eric Jang@ericjang11·
This is my last week at Google Brain, after nearly 6 years on the robotics team. Thank you Google, it's been really fun!✌️ evjang.com/2022/03/21/lea…
English
51
19
873
0
Yanping Huang retweeted
William Fedus
William Fedus@LiamFedus·
Proud to release our last year of work on sparse expert models! This started over a year ago when we found Switch Transformers pre-trained well, but some variants were unstable or fine-tuned poorly. The new SOTA ST-MoE-32B addresses this. arxiv.org/abs/2202.08906
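For context on the stability fix: the ST-MoE paper introduces a router z-loss that penalizes large routing logits. A minimal sketch of that idea, assuming a [tokens, experts] logit matrix; the coefficient and the top-2 routing details are illustrative, not the paper's exact implementation:

```python
import jax
import jax.numpy as jnp
from jax.scipy.special import logsumexp

def router_z_loss(router_logits, coef=1e-3):
    # Penalize large router logits: squared log-sum-exp per token, averaged.
    # coef is an illustrative default, not necessarily the paper's tuned value.
    z = logsumexp(router_logits, axis=-1)            # [tokens]
    return coef * jnp.mean(z ** 2)

def top2_route(x, router_w):
    # Minimal top-2 token routing; real systems add capacity limits,
    # load-balancing losses, and expert-parallel dispatch.
    logits = x @ router_w                            # [tokens, num_experts]
    probs = jax.nn.softmax(logits, axis=-1)
    gate_vals, expert_ids = jax.lax.top_k(probs, k=2)
    return expert_ids, gate_vals, router_z_loss(logits)
```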
Yanping Huang retweeted
Google AI
Google AI@GoogleAI·
Open-domain dialog, where a model converses about any topic, is a key challenge for language models. Learn about LaMDA, a project to build dialog models that are safer, more grounded, higher quality, and in line with our Responsible AI Principles ↓ goo.gle/3KJ2oXJ
Yanping Huang retweeted
Thang Luong
Thang Luong@lmthang·
MoE models could be the future of large language models, but having to distill them into small models can be painful. Our work on task-level MoE for extracting subnetworks is a simple idea to bypass distillation. And we're still at the very beginning :) Blog: ai.googleblog.com/2022/01/learni…
Google AI@GoogleAI

Read all about Task-level Mixture-of-Experts (TaskMoE), a promising step towards efficiently training and deploying large models, with no loss in quality and with significantly reduced inference latency ↓ goo.gle/3I5ulXj
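The core idea, as the blog describes it, is to route by task rather than by token, so the experts chosen for a task form a standalone sub-network that can be extracted and served on its own. A minimal sketch under that assumption; names, shapes, and the top-1 routing choice are illustrative:

```python
import jax
import jax.numpy as jnp

def task_moe_layer(x, task_id, expert_weights, router_w):
    # Task-level routing: every token belonging to the same task is sent to
    # the same expert, unlike token-level MoE where each token is routed
    # independently.
    # x: [tokens, d_model], expert_weights: [num_experts, d_model, d_ff],
    # router_w: [num_tasks, num_experts], task_id: scalar int.
    task_embedding = jax.nn.one_hot(task_id, router_w.shape[0])  # [num_tasks]
    logits = task_embedding @ router_w                           # [num_experts]
    expert = jnp.argmax(logits)                                  # top-1 expert for this task
    w = expert_weights[expert]                                   # [d_model, d_ff]
    return jax.nn.relu(x @ w)

# Serving a single task then only needs the router decision and that one
# expert's weights; the remaining experts can be dropped, which is what
# makes distillation unnecessary.
```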

Yanping Huang retweeted
Sneha Kudugunta
Sneha Kudugunta@snehaark·
We wrote a blog post about our work on Task-level Mixture-of-Experts (TaskMoE) and why it's a great way to efficiently serve large models (versus more common approaches like training followed by compression via distillation).
Google AI@GoogleAI

Read all about Task-level Mixture-of-Experts (TaskMoE), a promising step towards efficiently training and deploying large models, with no loss in quality and with significantly reduced inference latency ↓ goo.gle/3I5ulXj

Yanping Huang retweeted
Sherry Tongshuang Wu
Sherry Tongshuang Wu@tongshuangwu·
Adding my two cents as a grad student — I genuinely think people here are very supportive & diversity is an active topic here, & Pedro D’s behaviors have not gone unnoticed. Before he retired, most faculty members used to openly disagree and challenge him on multiple issues, 🧵
Amy Zhang@amyxzh

Not gonna amplify that other tweet but I’ll instead point to this one. Speaking as a relatively newish member, I’ll say that I’ve found @uwcse to be a supportive place for women, w/ a strong commitment to diversity and inclusion from our leadership.

Yanping Huang retweeted
University of Washington
This is not how @uwcse — or any of us — imagined kicking off winter quarter, but we want to amplify this statement calling out the misogyny of an emeritus (retired) faculty member.
Allen School@uwcse

Today is the start of UW’s Winter Quarter, and #UWAllen is excited to welcome our students back. Without the benefit of a long break given our quarter system, instructors spent the holiday weekend preparing for the challenge of conducting the 1st week remotely due to omicron. 1/5

Quoc Le
Quoc Le@quocleix·
GLaM: Efficient Scaling of Language Models with Mixture-of-Experts. A 1.2T-parameter model with better average zero-shot and one-shot results than GPT-3, while using only ⅓ of the compute for training. Arxiv: arxiv.org/abs/2112.06905 Blog: goo.gle/3dBNLWQ
Yanping Huang
Yanping Huang@bignamehyp·
Our work shows that even a basic version of MoE can work extremely well for generative tasks at scale, and hence should be used as a default for future scaling!
Google AI@GoogleAI

Today we introduce the Generalist Language Model (GLaM), a sparsely activated model that achieves better overall performance on 29 few-shot #NaturalLanguageProcessing benchmark tasks with just a third of the training energy cost. Learn more on the blog ↓ goo.gle/3dBNLWQ

Yanping Huang
Yanping Huang@bignamehyp·
Google demonstrates 63% computational efficiency, cutting edge in the industry, when training a 480B-parameter model on 2048 Cloud TPU-v4 chips. cloud.google.com/blog/topics/tp… Link to model code: github.com/tensorflow/lin…
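Computational efficiency figures like this are usually a FLOPs-utilization ratio: FLOPs actually spent on the model per second divided by the aggregate peak FLOPS of the chips. A hedged sketch of that bookkeeping, using the common ~6 FLOPs-per-parameter-per-token approximation for a forward+backward pass; all inputs are placeholders, not numbers from the linked post:

```python
def flops_utilization(num_params, tokens_per_second, num_chips, peak_flops_per_chip):
    # ~6 FLOPs per parameter per trained token (forward + backward),
    # ignoring attention-specific terms; a standard rough approximation.
    observed_flops_per_second = 6.0 * num_params * tokens_per_second
    peak_flops = num_chips * peak_flops_per_chip
    return observed_flops_per_second / peak_flops

# Example call shape (throughput and per-chip peak left as placeholders):
# flops_utilization(480e9, tokens_per_second=..., num_chips=2048, peak_flops_per_chip=...)
```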