

Miriam
437 posts

@superrzk
Junior NLP researcher, previously data scientist, CS @imperialcollege



Announcing Masader v2.0 with 500+ Arabic NLP datasets. We have added many features 🧵 Website: arbml.github.io/masader/ Code: github.com/ARBML/masader

Oversight of foundation models requires multi-stakeholder partnerships, including independent organizations not driven by commercial incentives. We need to leverage the collective wisdom of the community and represent the diverse voices of the people that this technology impacts.

The data used in machine learning needs to be open for people to interrogate, while also controlled enough not to proliferate. Introducing our framework for Data Governance, which addresses these issues and more! A product of @BigscienceW at FAccT 2022. yjernite.github.io/content/LangDa…
