Sultan Alrashed

4 posts

@srashedll

Pretraining multilingual language models

Saudi Arabia · Joined February 2024
16 Following · 18 Followers
Sultan Alrashed retweeted
Larry Dial (@classiclarryd)
New NanoGPT Speedrun WR at 97.8 (-1.2s) from @srashedll, with an update to the attention initialization. Motivated by mimetic initialization techniques, experiments showed that a small random init outperformed zero init on the attention out projection. github.com/KellerJordan/m…
Sultan Alrashed retweeted
Tyler Chang (@tylerachang)
Very very excited that Global PIQA is out! This was an incredible effort by 300+ researchers from 65 countries. The resulting dataset is a high-quality, participatory, and culturally-specific benchmark for over 100 languages.
Multilingual Representation Workshop @ EMNLP 2025 (@mrl_workshop)

Introducing Global PIQA, a new multilingual benchmark for 100+ languages. This benchmark is the outcome of this year’s MRL shared task, in collaboration with 300+ researchers from 65 countries. This dataset evaluates physical commonsense reasoning in culturally relevant contexts.

Sultan Alrashed retweeted
TianyLin (@tianylin)
Announcing flash-muon: a 🐍 pkg with a customized CUDA kernel that aims to boost the Muon optimizer: github.com/nil0x9/flash-m… 1/n