Yura Gorishniy

Yura Gorishniy @YuraFiveTwo · 17 Apr

Detail 1: besides Muon, we also found AdamW+EMA to perform well for _plain_ MLPs. For stronger MLP-based models, such as TabM, Muon performs better.
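For readers unfamiliar with the "+EMA" part, here is a minimal sketch of weight averaging, assuming the common formulation in which a shadow copy of the parameters is updated after every optimizer step and then used for evaluation. The class name, the decay value, and the plain-float "parameters" are illustrative assumptions, not details from the report; in practice the same update would be applied to PyTorch tensors after each AdamW step.

```python
class WeightEMA:
    """Keeps an exponential moving average (shadow copy) of named parameters."""

    def __init__(self, params: dict[str, float], decay: float = 0.99) -> None:
        self.decay = decay  # hypothetical default; not a value from the report
        self.shadow = dict(params)  # start from the current weights

    def update(self, params: dict[str, float]) -> None:
        # shadow <- decay * shadow + (1 - decay) * current
        for name, value in params.items():
            self.shadow[name] = (
                self.decay * self.shadow[name] + (1.0 - self.decay) * value
            )


# Toy "training loop": the assignment below stands in for an optimizer step.
weights = {"w": 0.0}
ema = WeightEMA(weights, decay=0.9)
for new_value in [1.0, 1.0, 1.0]:
    weights["w"] = new_value  # stand-in for an AdamW update
    ema.update(weights)       # the EMA copy trails the raw weight smoothly
```

At evaluation time one would predict with `ema.shadow` rather than with the raw weights.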

Yura Gorishniy @YuraFiveTwo · 17 Apr

A small yet practical update for tabular deep learning people: Muon is a strong alternative to AdamW for training modern tabular MLPs, including TabM. Give it a try! Overall, our technical report covers 15 optimizers: arxiv.org/abs/2604.15297 Details 👇

Yura Gorishniy retweeted

Dmitry Eremeev @eremeev_d42 · 13 Feb

Graph foundation model with SOTA results on real-world graphs! Our “GraphPFN: A Prior-Data Fitted Graph Foundation Model” paper recently got a major update, with better ICL performance, new ablations, code improvements and more! 🧵1/11

Yura Gorishniy retweeted

Kirill Mazur @makezur · 18 Dec

Introducing 4D Primitive-Mâché (4DPM), a new method for replayable 4D reconstruction from monocular videos. We split dynamic scenes into 3D primitives and recover their motion. 4DPM can infer object positions even after they leave view. Joint work with @marwan_ptr @AjdDavison

Yura Gorishniy @YuraFiveTwo · 15 Jul

@vnjogani @akshay_pachaar Hi, I am one of the TabM authors. In the paper, we don't do any kind of subsampling during training, though this is definitely possible. As for the "adapters", they perform elementwise operations, i.e. they are linear transformations, but not in the sense of torch.nn.Linear.
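To make the "elementwise, but still linear" point concrete, here is a minimal pure-Python sketch. It assumes a BatchEnsemble-style layer in which each ensemble member scales the input and the output of a shared weight matrix elementwise; the names `r` and `s`, the shapes, and the numbers are hypothetical, and the tweet itself only states that the adapters are elementwise.

```python
def shared_linear(x, weight):
    """Dense layer without bias: weight has shape (d_out, d_in)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def adapted_linear(x, weight, r, s):
    """One ensemble member: elementwise input scale r, shared weight,
    elementwise output scale s. No extra matrix is learned per member."""
    scaled_in = [xi * ri for xi, ri in zip(x, r)]
    out = shared_linear(scaled_in, weight)
    return [oi * si for oi, si in zip(out, s)]


# Shared weight (d_in=2, d_out=3) reused by every member.
weight = [[1.0, 2.0], [0.0, 1.0], [1.0, 0.0]]
x = [1.0, 1.0]

# Per-member adapters: just two vectors each (hypothetical values).
members = [
    {"r": [1.0, 1.0], "s": [1.0, 1.0, 1.0]},   # identity adapters
    {"r": [2.0, 0.5], "s": [1.0, -1.0, 0.5]},
]
outputs = [adapted_linear(x, weight, m["r"], m["s"]) for m in members]
```

Each member sees the same input and the same weight matrix, yet produces a different output because of its own scaling vectors.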

Vinit Jogani @vnjogani · 15 Jul

@akshay_pachaar Is there feature and row sampling for the ensemble? Is the adapter just a linear layer?

Akshay 🚀 @akshay_pachaar · 14 Jul

ML researchers just built a new ensemble technique. It even outperforms XGBoost, CatBoost, and LightGBM. Here's a complete breakdown (explained visually):

Yura Gorishniy @YuraFiveTwo · 3 Jul

@kanpuriyanawab @_avichawla Hi! TabM has been used in winning solutions in recent Kaggle competitions, for example: (1) kaggle.com/competitions/u… (2) kaggle.com/competitions/e… Yesterday, I shared a Python package for TabM to make it easier to try in practice: x.com/YuraFiveTwo/st…

TabM now has a Python package! TabM is a simple and powerful DL architecture for tabular data that efficiently imitates an ensemble of MLPs 🏆 TabM has been used in winning solutions on Kaggle, and performs well on TabReD -- a challenging benchmark! 💻 pip install tabm 👇Link

anshuman @athleticKoder · 12 Jun

@_avichawla you wouldn’t be posting this if you had ever participated in a Kaggle competition ;)

Avi Chawla @_avichawla · 12 Jun

ML researchers just built a new ensemble technique. It even outperforms XGBoost, CatBoost, and LightGBM. Here's a complete breakdown (explained visually):

Yura Gorishniy @YuraFiveTwo · 3 Jul

@heptoop @_avichawla Hi, I am one of the TabM authors. The size of the shared MLP is actually "standard", but it is reused across k MLPs (k=32 in the paper). Plus, each of the k MLPs has a small number of non-shared weights. Also, see the new illustration in the linked tweet x.com/YuraFiveTwo/st…
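A rough parameter count shows why sharing matters. The sketch below assumes one 512x512 layer and assumes the non-shared part of each member is an input scale, an output scale, and a bias vector. Both the layer width and that exact breakdown are illustrative assumptions; only k=32 comes from the tweet.

```python
def linear_params(d_in: int, d_out: int) -> int:
    """Weights plus bias of one ordinary dense layer."""
    return d_in * d_out + d_out


def independent_ensemble_params(d_in: int, d_out: int, k: int) -> int:
    """k fully separate copies of the layer."""
    return k * linear_params(d_in, d_out)


def shared_ensemble_params(d_in: int, d_out: int, k: int) -> int:
    """One shared weight matrix, plus per-member vectors:
    input scale (d_in), output scale (d_out), bias (d_out)."""
    return d_in * d_out + k * (d_in + 2 * d_out)


k = 32                  # from the tweet
d_in = d_out = 512      # hypothetical layer width
independent = independent_ensemble_params(d_in, d_out, k)
shared = shared_ensemble_params(d_in, d_out, k)
print(f"independent: {independent:,}  shared: {shared:,}")
```

Under these assumptions the shared variant stores one standard-size weight matrix plus a few vectors per member, instead of 32 full matrices.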

harpreet @heptoop · 12 Jun

@_avichawla I’m not quite getting what about this makes it outcompete xgboost. Is the shared MLP a lot smaller than the typical MLP we would use if we were training a single model directly?

Yura Gorishniy @YuraFiveTwo · 3 Jul

@AlexShtf I was glad to chat, thanks for coming by!

Alex Shtoff @AlexShtf · 2 Jul

Has been fun showing up at Yura's poster at ICLR'25 and talking about this interesting work.

TabM now has a Python package! TabM is a simple and powerful DL architecture for tabular data that efficiently imitates an ensemble of MLPs 🏆 TabM has been used in winning solutions on Kaggle, and performs well on TabReD -- a challenging benchmark! 💻 pip install tabm 👇Link

Yura Gorishniy @YuraFiveTwo · 2 Jul

@hydantess1993 @JFPuget That's great to hear, thank you for sharing!

hyd @hydantess1993 · 2 Jul

@JFPuget @YuraFiveTwo TabM is truly amazing! It changed my mind for tabular NN models.

Yura Gorishniy @YuraFiveTwo · 2 Jul

TabM now has a Python package! TabM is a simple and powerful DL architecture for tabular data that efficiently imitates an ensemble of MLPs 🏆 TabM has been used in winning solutions on Kaggle, and performs well on TabReD -- a challenging benchmark! 💻 pip install tabm 👇Link

Yura Gorishniy @YuraFiveTwo · 2 Jul

@JFPuget Hi! Here are the links: (1) kaggle.com/competitions/u… (2) kaggle.com/competitions/e… They can also be found at the very beginning of the README, along with other practical notes.

JFPuget 🇫🇷🇺🇦🇨🇦🇬🇱 @JFPuget · 2 Jul

@YuraFiveTwo Hi, in which kaggle winning solution has it been used?

Yura Gorishniy @YuraFiveTwo · 2 Jul

@fabianjkrueger @_avichawla Hi! TabM now can be installed via pip. The package still requires familiarity with PyTorch, but hopefully the Colab example will make it easier to get started x.com/YuraFiveTwo/st…

janosch @fabianjkrueger · 12 Jun

@_avichawla @_avichawla does this (already) have an easy to use API / Python package? Or is it (still) in some dev mode and it cannot be used just like that?

Yura Gorishniy @YuraFiveTwo · 2 Jul

Note: the package requires familiarity with PyTorch. To help users get started, we provide a Jupyter notebook with an end-to-end example of training TabM: colab.research.google.com/github/yandex-…

Yura Gorishniy @YuraFiveTwo · 2 Jul

github.com/yandex-researc…
The package makes efficient ensembles for tabular data more accessible by providing:
🤖 TabM
🔧 Layers for building custom TabM-like models
✨ Functions for turning existing models into efficient ensembles
The screenshot covers all three use cases.

Yura Gorishniy @YuraFiveTwo · 13 Jun

@felixo_dmv @_avichawla Hey, I am one of the TabM authors. TabM is basically an efficiently implemented ensemble of MLPs, i.e. a somewhat generic architecture. So it should be applicable in many different contexts, just like MLP itself. Though we did not benchmark TabM on time series.

F d M V 🌾 @felixo_dmv · 12 Jun

@_avichawla Could it be used for time series imputation/forecasting?

Yura Gorishniy @YuraFiveTwo · 13 Jun

@wappledoobie @k_adeyemiai @_avichawla Hey, I am one of the authors. In the paper we report inference throughput on CPU and GPU. Since TabM is just a bunch of MLPs, it is surely slower than one plain MLP, but still practical and hardware-friendly. Furthermore, the number of MLPs can be greatly reduced, see Section 5.2

Wappledoobie @wappledoobie · 12 Jun

@k_adeyemiai @_avichawla Nope have been following the authors for a bit now, they use standard “academic” benchmarks simulating different feature column types for classification/regression
