Rowel Atienza 🇵🇭

102 posts

@jacobe

Creator of ViTSTR and EfficientSpeech (ICASSP2023) and co-creator of PARSeq. Professor & Scientist at the University of the Philippines.

University of the Philippines · Joined February 2007
1K Following · 338 Followers
Emilio @Em_alej
@jacobe Amazing work, it generates spectrograms, but what is it using as a vocoder? I just found this repo while looking for something more efficient than Tacotron.
Rowel Atienza 🇵🇭 @jacobe
Most ARM chips can't run decent AI models. Introducing EfficientSpeech, a 266k-param TTS model. Low cost ARM chips like in RPi4 can generate 104sec of speech mel spec in 1sec. Here's an AI-generated video w/ voice from EfficientSpeech. Info: github.com/roatienza/effi… #ICASSP2023
Rowel Atienza 🇵🇭 @jacobe
@arXiv_Daily Simple yet effective idea: Remove inefficient top-most layers & replace them with an efficient head. For VWW, param count reduced by 93% with only 0.65% accuracy decrease. Counterintuitively, the quantized pruned net increased its accuracy on ARM Cortex M0.
AK @_akhaliq
.@Gradio & @huggingface for Rapid Deep Learning App Development by @jacobe link: medium.com/@rowel/gradio-hugging-face-for-rapid-deep-learning-app-development-709a78e7ccc0
Rowel Atienza 🇵🇭 @jacobe
Idea: If data augmentation improves model generalization, why not use it to generate 2 new inputs and force the representations to agree? Result: Additional model performance improvement. Comparison: Unlike Label Smoothing, the performance of our method, AgMax, is consistent.
DeepAI @DeepAI

Improving Model Generalization by Agreement of Learned Representations from Data Augmentation deepai.org/publication/im… by @jacobe #ComputerVision #ImageNet
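
The idea in the post above (two augmented views of one input, with a term that pushes their representations to agree) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the AgMax objective from the paper: the toy linear classifier, the noise-based `augment`, the mean-squared `agreement_penalty`, and the weight `lam` are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import math
import random


def augment(x, rng, noise=0.1):
    """Hypothetical augmentation: add small uniform noise to each feature."""
    return [v + rng.uniform(-noise, noise) for v in x]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Toy linear classifier: one weight row per class."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in weights])


def cross_entropy(probs, label):
    """Negative log-probability of the true class."""
    return -math.log(probs[label])


def agreement_penalty(p1, p2):
    """Assumed stand-in for an agreement term: mean squared difference
    between the two output vectors (0 when they are identical)."""
    return sum((a - b) ** 2 for a, b in zip(p1, p2)) / len(p1)


def two_view_loss(weights, x, label, rng, lam=1.0):
    """Supervised loss averaged over two augmented views, plus a penalty
    (weighted by lam) when the two views disagree."""
    x1, x2 = augment(x, rng), augment(x, rng)
    p1, p2 = predict(weights, x1), predict(weights, x2)
    supervised = 0.5 * (cross_entropy(p1, label) + cross_entropy(p2, label))
    return supervised + lam * agreement_penalty(p1, p2)


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [[1.0, 0.0], [0.0, 1.0]]  # 2 classes, 2 features
    print(two_view_loss(weights, [1.0, 0.2], label=0, rng=rng))
```

Setting `lam=0.0` recovers plain training on augmented inputs, which makes it easy to compare runs with and without the agreement term.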

Rowel Atienza 🇵🇭 retweeted
AI at Meta @AIatMeta
We’re introducing GSLM, the first language model that breaks free completely of the dependence on text for training. This “textless NLP” approach learns to generate expressive speech using only raw audio recordings as input. Learn more and get the code: ai.facebook.com/blog/textless-…
Rowel Atienza 🇵🇭 @jacobe
Yesterday, my former grad student Daryl gave a talk at Sony CSL Paris about his thesis on Next View Policy for 3D Reconstruction. YouTube: youtu.be/KdyDj3bjU0I