
Theo Rekatsinas
748 posts

@thodrek
Machine Learning Systems, Data Management & Knowledge Graphs @Apple; Ex-Professor @ETH & @UWMadison; Co-founder of inductiv (acquired by @Apple)






📣 The latest ERC Starting Grant competition results are out! 📣 494 bright minds awarded €780 million to fund research ideas at the frontiers of science. Find out who, where & why 👉 europa.eu/!hrxyBp 🇪🇺 #EUfunded #FrontierResearch #ERCStG @HorizonEU @EUScienceInnov





An 8B-3.5T hybrid SSM model gets better accuracy than an 8B-3.5T transformer trained on the same dataset:
* 7% attention, the rest is Mamba2
* MMLU jumps from 50 to 53.6%
* Training efficiency is the same
* Inference cost is much lower
arxiv.org/pdf/2406.07887







9/ But we really need a step change here. Every major AI breakthrough over the past two decades has been driven by better and more data, dating back to the original AlexNet deep neural network trained on ImageNet. Scaling laws clearly illustrate where we're headed: we need more data!



Not convinced about using random sampling for data pruning? Think twice! In our recent work, we introduce Repeated Sampling of Random Subsets: arxiv.org/abs/2305.18424, where we sample a subset of the data at each epoch of training instead of only once at the beginning!
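
To make the contrast concrete, here is a minimal sketch of the idea as the post describes it: one fixed random subset versus a fresh random subset at every epoch. The function names, the `fraction` parameter, and the choice of independent uniform sampling per epoch are assumptions for illustration, not the paper's actual implementation; the linked paper may sample differently (for example, without replacement across epochs).

```python
import random


def static_random_subset(num_examples, fraction, seed=0):
    """Baseline: draw one random subset before training and reuse it every epoch."""
    rng = random.Random(seed)
    k = max(1, int(num_examples * fraction))
    return sorted(rng.sample(range(num_examples), k))


def repeated_random_subsets(num_examples, fraction, num_epochs, seed=0):
    """Yield a freshly sampled subset of example indices for each epoch.

    Assumption: each epoch's subset is drawn uniformly and independently
    of the previous epochs.
    """
    rng = random.Random(seed)
    k = max(1, int(num_examples * fraction))
    for _ in range(num_epochs):
        yield sorted(rng.sample(range(num_examples), k))


if __name__ == "__main__":
    n, frac, epochs = 100, 0.1, 20
    fixed = static_random_subset(n, frac)
    seen = set()
    for epoch, subset in enumerate(repeated_random_subsets(n, frac, epochs)):
        # A real training loop would run one pass over `subset` here.
        seen.update(subset)
    print(f"static subset covers {len(fixed)} examples")
    print(f"repeated sampling touched {len(seen)} distinct examples")
```

The per-epoch cost is the same in both cases (each epoch sees `k` examples), but resampling lets the model encounter more of the dataset over the course of training.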


