Yura Gorishniy

124 posts

@YuraFiveTwo

I do research on deep learning for tabular data

Joined June 2021
132 Following, 420 Followers
Pinned Tweet
Yura Gorishniy @YuraFiveTwo
TabM now has a Python package! TabM is a simple and powerful DL architecture for tabular data that efficiently imitates an ensemble of MLPs 🏆 TabM has been used in winning solutions on Kaggle, and performs well on TabReD -- a challenging benchmark! 💻 pip install tabm 👇Link
Yura Gorishniy @YuraFiveTwo
Detail 2: we don't even mention this in the text, but weight decay tuning spaces in this technical report are improved compared to our prior work (the spaces vary between models). That's it for today! No big plans for this technical report, but hopefully it will come in handy :)
Yura Gorishniy @YuraFiveTwo
Detail 1: besides Muon, we also found AdamW+EMA to perform well for _plain_ MLPs. For stronger MLP-based models, such as TabM, Muon performs better.
Yura Gorishniy @YuraFiveTwo
A small yet practical update for tabular deep learning people: Muon is a strong alternative to AdamW for training modern tabular MLPs, including TabM. Give it a try! Overall, our technical report covers 15 optimizers: arxiv.org/abs/2604.15297 Details 👇
Yura Gorishniy retweeted
Dmitry Eremeev @eremeev_d42
Graph foundation model with SOTA results on real-world graphs! Our “GraphPFN: A Prior-Data Fitted Graph Foundation Model” paper recently got a major update, with better ICL performance, new ablations, code improvements and more! 🧵1/11
Yura Gorishniy retweeted
Kirill Mazur @makezur
Introducing 4D Primitive-Mâché (4DPM), a new method for replayable 4D reconstruction from monocular videos. We split dynamic scenes into 3D primitives and recover their motion. 4DPM can infer object positions even after they leave view. Joint work with @marwan_ptr @AjdDavison
Yura Gorishniy @YuraFiveTwo
@vnjogani @akshay_pachaar Hi, I am one of the TabM authors. In the paper, we don't do any kind of subsampling during training, though this is definitely possible. As for the "adapters", they perform elementwise operations, i.e. they are linear transformations, but not in the sense of torch.nn.Linear.
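A minimal sketch of the distinction drawn in the reply above, assuming the elementwise operation is a per-feature multiplication by a learned vector. The function names, the `scale` vector, and the sizes are hypothetical and are not taken from the tabm package. It illustrates that such an adapter is still a linear map (it matches a dense layer whose weight matrix is diagonal) while holding d parameters rather than the d × d weights of a dense layer.

```python
# Hypothetical illustration: names and sizes are assumptions, not TabM's API.

def elementwise_adapter(x, scale):
    """Multiply each feature by its own scale: y[i] = x[i] * scale[i]."""
    return [xi * si for xi, si in zip(x, scale)]


def dense_linear(x, weight):
    """Full matrix-vector product: y[j] = sum_i weight[j][i] * x[i]."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


x = [1.0, 2.0, 3.0]
scale = [0.5, 2.0, -1.0]  # assumed "learned" per-feature scales

# A dense layer reproduces the adapter only if its weight matrix is diagonal.
d = len(x)
diagonal = [[scale[i] if i == j else 0.0 for i in range(d)] for j in range(d)]

adapted = elementwise_adapter(x, scale)
assert adapted == dense_linear(x, diagonal)

# Parameter counts (ignoring biases): d for the adapter, d * d for the dense layer.
adapter_params, dense_params = d, d * d
print(adapted, adapter_params, dense_params)
```

The gap between d and d × d is why per-member adapters of this kind can stay cheap even when the shared layers are wide.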
Vinit Jogani @vnjogani
@akshay_pachaar Is there feature and row sampling for the ensemble? Is the adapter just a linear layer?
Akshay 🚀 @akshay_pachaar
ML researchers just built a new ensemble technique. It even outperforms XGBoost, CatBoost, and LightGBM. Here's a complete breakdown (explained visually):
anshuman @athleticKoder
@_avichawla you wouldn’t be posting this if you had ever participated in a Kaggle competition ;)
Avi Chawla @_avichawla
ML researchers just built a new ensemble technique. It even outperforms XGBoost, CatBoost, and LightGBM. Here's a complete breakdown (explained visually):
Yura Gorishniy @YuraFiveTwo
@heptoop @_avichawla Hi, I am one of the TabM authors. The size of the shared MLP is actually "standard", but it is reused across k MLPs (k=32 in the paper). Plus, each of the k MLPs has a small number of non-shared weights. Also, see the new illustration in the linked tweet x.com/YuraFiveTwo/st…
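A rough parameter count may make the weight-sharing point in the reply above more concrete. Only k = 32 comes from the reply. The layer sizes and the exact form of the non-shared weights (here, one input-scaling and one output-scaling vector per layer for each member) are assumptions for illustration and may differ from the actual TabM design.

```python
# Hypothetical sizes for illustration; only K = 32 comes from the reply above.
K = 32
LAYER_SIZES = [100, 512, 512, 1]  # input, two hidden layers, output (assumed)


def mlp_params(sizes):
    """Weights plus biases of a plain fully connected MLP."""
    return sum(d_in * d_out + d_out for d_in, d_out in zip(sizes, sizes[1:]))


def per_member_params(sizes):
    """Assumed non-shared part: one input and one output scaling vector per layer."""
    return sum(d_in + d_out for d_in, d_out in zip(sizes, sizes[1:]))


shared = mlp_params(LAYER_SIZES)
independent_ensemble = K * shared  # K fully separate MLPs
shared_ensemble = shared + K * per_member_params(LAYER_SIZES)  # one shared MLP + small extras

print(f"one plain MLP:         {shared:,}")
print(f"K independent MLPs:    {independent_ensemble:,}")
print(f"shared MLP + adapters: {shared_ensemble:,}")
```

Under these assumed sizes, the shared variant stays close to the size of a single MLP, while K independent MLPs cost K times as many parameters.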
harpreet @heptoop
@_avichawla I’m not quite getting what about this makes it outcompete xgboost. Is the shared MLP a lot smaller than the typical MLP we would use if we were training a single model directly?
hyd @hydantess1993
@JFPuget @YuraFiveTwo TabM is truly amazing! It changed my mind for tabular NN models.
janosch @fabianjkrueger
@_avichawla does this (already) have an easy to use API / Python package? Or is it (still) in some dev mode and it cannot be used just like that?
Yura Gorishniy @YuraFiveTwo
github.com/yandex-researc… The package makes efficient ensembles for tabular data more accessible by providing: 🤖 TabM 🔧 Layers for building custom TabM-like models ✨ Functions for turning existing models into efficient ensembles. The screenshot covers all three use cases:
[Image: screenshot covering the three use cases]
Yura Gorishniy @YuraFiveTwo
@felixo_dmv @_avichawla Hey, I am one of the TabM authors. TabM is basically an efficiently implemented ensemble of MLPs, i.e. a somewhat generic architecture. So it should be applicable in many different contexts, just like an MLP itself, though we did not benchmark TabM on time series.
F d M V 🌾 @felixo_dmv
@_avichawla Could it be used for time series imputation/forecasting?
Yura Gorishniy @YuraFiveTwo
@wappledoobie @k_adeyemiai @_avichawla Hey, I am one of the authors. In the paper we report inference throughput on CPU and GPU. Since TabM is just a bunch of MLPs, it is surely slower than one plain MLP, but still practical and hardware-friendly. Furthermore, the number of MLPs can be greatly reduced, see Section 5.2
Wappledoobie @wappledoobie
@k_adeyemiai @_avichawla Nope, I have been following the authors for a bit now. They use standard “academic” benchmarks simulating different feature column types for classification/regression.
Yura Gorishniy retweeted
Dmitry Baranchuk @DmitryBaranchuk
I'd like to share our new diffusion distillation method, SwD, which produces few-step generators with progressive resolution scaling over the diffusion process. On SD3.5, SwD matches the speed of two full-size steps but with much better quality. Demo & models are released. (1/9)
Yura Gorishniy retweeted
Anastasiia Koloskova @Ana_koloskova
I am excited to announce that I will join the University of Zurich as an assistant professor in August this year! I am looking for PhD students and postdocs starting from the fall. My research is on optimization, federated learning, machine learning, privacy, and unlearning.