Indico Data Labs

83 posts


@IndicoDataLabs

AI for Unstructured Data

Boston, MA · Joined May 2022
1 Following · 10 Followers
Indico Data Labs @IndicoDataLabs ·
There are some really compelling results in this paper (some intuitive, some not so much). The causality analysis shows some non-linearity worth further investigation, and further analysis of the effect of parameter count may be warranted, assuming the true dynamic is sigmoidal.
Indico Data Labs @IndicoDataLabs ·
Counterfactual analysis suggests a causal relationship between removing large numbers of relevant documents and QA performance.
Indico Data Labs @IndicoDataLabs ·
Large Language Models Struggle to Learn Long-Tail Knowledge, by Nikhil Kandpal, Haikang Deng, Adam Roberts, Eric Wallace, and Colin Raffel. arxiv.org/abs/2211.08411
Indico Data Labs @IndicoDataLabs ·
Overall, this work provides thorough and encouraging results for distilling pre-trained language models into recursive transformers. The idea of adding per-layer adaptors while re-using the MLP and attention weights is particularly interesting.
Indico Data Labs @IndicoDataLabs ·
The authors find that distilling the model with adaptors that differ for each iteration of the recursive block improves performance across all tasks. The adaptors seem to help the shared block better mimic the behaviour of the separate layers of the teacher.
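A minimal numpy sketch of the idea described above: one shared weight matrix re-used at every iteration, plus a cheap low-rank adaptor that is different for each iteration. All names, sizes, and the tanh non-linearity here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n_iters = 64, 8, 4  # hidden size, adaptor rank, recursion depth (assumed)

# One shared block re-used at every iteration (stands in for the shared
# MLP/attention weights of the recursive transformer).
W_shared = rng.standard_normal((d, d)) / np.sqrt(d)

# A separate low-rank adaptor (A_i @ B_i) for each iteration of the block:
# far fewer parameters than n_iters full (d, d) layers would need.
adaptors = [(rng.standard_normal((d, r)) / np.sqrt(d),
             rng.standard_normal((r, d)) / np.sqrt(r))
            for _ in range(n_iters)]

def recursive_block(h):
    """Apply the shared weights n_iters times, adding the per-iteration adaptor."""
    for A, B in adaptors:
        h = np.tanh(h @ W_shared + h @ A @ B)  # shared path + adaptor path
    return h

x = rng.standard_normal((2, d))  # batch of 2 token representations
out = recursive_block(x)
print(out.shape)  # (2, 64)
```

The per-iteration adaptors add only `n_iters * 2 * d * r` parameters on top of the single shared block, which is the parameter-efficiency trade-off the paper exploits.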
Indico Data Labs @IndicoDataLabs ·
MiniALBERT: Model Distillation via Parameter-Efficient Recursive Transformers, by Nouriborji et al., proposes a method for distilling BERT-style transformers into ALBERT-style recursive transformers. arxiv.org/abs/2210.06425
Indico Data Labs @IndicoDataLabs ·
Overall, this was a refreshing work from OpenAI that shines a light on often-underappreciated aspects of ML -- dataset curation and generalization behavior! Models and code are openly available at: github.com/openai/whisper
Indico Data Labs @IndicoDataLabs ·
Finally, since a portion of the training examples were non-English audio transcriptions or non-English audio translated to English, the model can be used in these settings as well. Scaling trends show clear improvements from model scale, especially in multilingual settings.