Miriam

437 posts

@superrzk

Junior NLP researcher, previously data scientist, CS @imperialcollege

Joined September 2018
1.2K Following, 215 Followers
Miriam retweeted
MMitchell @mmitchell_ai
Reminder to everyone starting to publish in ML: "Foundation models" is *not* a recognized ML term; was coined by Stanford alongside announcing their center named for it; continues to be pushed by Sford as *the* term for what we've all generally (reasonably) called "base models".
Stanford HAI @StanfordHAI

Oversight of foundation models requires multi-stakeholder partnerships, including independent organizations not driven by commercial incentives. We need to leverage the collective wisdom of the community and represent the diverse voices of the people that this technology impacts.

Miriam @superrzk
The "Schedule Sent" in Gmail
GIF
Miriam retweeted
Oskar van der Wal @oskarvanderwal
🔨 Our recommendations to foster fairness of LLMs: 1) Transparent bias evaluations via scoping and documentation 2) Diversity of tested stereotypes for increased inclusivity 3) Creation of culturally aware datasets 4) General bias measures that can compare different model setups
Miriam retweeted
Oskar van der Wal @oskarvanderwal
In this table with 25 very large LMs, we show that LLMs are overwhelmingly trained on English texts and by homogeneous teams located in the USA. Furthermore, most of the LLMs are not evaluated for biases by their original creators.
Miriam retweeted
Oskar van der Wal @oskarvanderwal
2) Few bias benchmarks cover languages other than English. This exclusive focus on Anglo-centric contexts hinders the much-needed evaluation of multilingual contexts. Translating the existing benchmarks wouldn't solve the problem, as stereotypes can vary greatly across cultures.
Miriam retweeted
Oskar van der Wal @oskarvanderwal
Further complicating the bias analysis, it is often difficult to separate the bias measures from the specific LLM setup (e.g., architecture), which makes it hard to compare different setups. How can we compare/validate bias metrics for different contexts and (future) models?
Miriam retweeted
Oskar van der Wal @oskarvanderwal
As @BigScienceLLM is creating a large multilingual language model, we (the bias, fairness, and social impact WG @BigscienceW) discuss the challenges that we face in evaluating these models for biases in multilingual settings. 🌏🌎🌍
Miriam retweeted
Yacine Jernite @YJernite
For friends who ask what it is I've actually been doing for the last year+: well lots of this 😛 It's been a unique opportunity to connect with and learn from many amazing interdisciplinary collaborators, stay tuned for a summary thread and come talk to us about it 🤗🌸
MMitchell @mmitchell_ai

The data used in machine learning needs to be open for people to interrogate, while also controlled enough not to proliferate. Introducing our framework for Data Governance, which addresses these issues and more! A product of @BigscienceW at FAccT 2022. yjernite.github.io/content/LangDa…

Miriam retweeted
MMitchell @mmitchell_ai
The data used in machine learning needs to be open for people to interrogate, while also controlled enough not to proliferate. Introducing our framework for Data Governance, which addresses these issues and more! A product of @BigscienceW at FAccT 2022. yjernite.github.io/content/LangDa…
Miriam retweeted
arbml @arabicml2
10. Research: an open community for discussing the latest developments in Arabic language processing. GitHub: github.com/ARBML/Research
Miriam retweeted
arbml @arabicml2
9. Bayanat: this tool displays various statistics about a dataset, such as the most frequent words, the characters, the number of lines, etc. GitHub: github.com/ARBML/bayanat Demo: colab.research.google.com/github/ARBML/b…
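The kind of statistics the tweet above mentions (most frequent words, characters, number of lines) can be sketched with the Python standard library alone. This is a minimal illustration, not the bayanat API: the function name `corpus_stats`, its `top_k` parameter, and the sample text are all hypothetical, and words are split on whitespace only.

```python
from collections import Counter


def corpus_stats(text: str, top_k: int = 3) -> dict:
    """Compute simple corpus statistics (hypothetical helper, not bayanat's API)."""
    lines = text.splitlines()
    words = text.split()  # naive whitespace tokenization
    chars = [ch for ch in text if not ch.isspace()]
    return {
        "num_lines": len(lines),
        "num_words": len(words),
        "top_words": Counter(words).most_common(top_k),
        "top_chars": Counter(chars).most_common(top_k),
    }


# Made-up three-line sample: "the boy said / the teacher said / the boy went"
sample = "قال الولد\nقال المعلم\nذهب الولد"
stats = corpus_stats(sample)
print(stats["num_lines"], stats["num_words"])
print(stats["top_words"])
```

Whitespace splitting ignores punctuation and clitics, so real Arabic text would likely need a proper tokenizer before the counts are meaningful.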
Miriam retweeted
arbml @arabicml2
6. Tnkeeh: a data-cleaning library with several tools for handling diacritics and English letters and for cleaning social media data such as Twitter, etc. GitHub: github.com/ARBML/tnkeeh
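The cleaning steps named in the tweet above (diacritics, English letters, Twitter residue) can be sketched with regular expressions. This is an assumption-laden illustration and does not use the tnkeeh API: the function `clean_text`, its flags, and the chosen character ranges are hypothetical.

```python
import re

# Arabic diacritics (tashkeel): U+064B to U+0652, plus superscript alef U+0670.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
# Runs of basic Latin letters.
LATIN = re.compile(r"[A-Za-z]+")
# URLs and @mentions are dropped; for hashtags only the '#' sign is removed.
TWITTER = re.compile(r"https?://\S+|@\w+|#")
SPACES = re.compile(r"\s+")


def clean_text(text: str, diacritics: bool = True, latin: bool = True,
               twitter: bool = True) -> str:
    """Hypothetical cleaner: each flag switches one cleaning step on or off."""
    if twitter:  # run first so URLs are removed whole, before Latin stripping
        text = TWITTER.sub(" ", text)
    if diacritics:
        text = DIACRITICS.sub("", text)
    if latin:
        text = LATIN.sub(" ", text)
    return SPACES.sub(" ", text).strip()


print(clean_text("مَرْحَبًا @user hello https://t.co/x"))
```

Removing Twitter residue before Latin letters matters here: otherwise the URL would be broken into leftover punctuation such as `://` and `.`.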