Daniël Vos

23 posts

Daniël Vos

@daniel_a_vos

PhD student in machine learning (decision tree whisperer) at @tudelft 👨‍🎓 and organizer for the TU Delft CTF Team 👨‍💻

Delft, Netherlands · Joined January 2016
288 Following · 126 Followers
Daniël Vos@daniel_a_vos·
@gabrielpeyre @YihongWu7 Classification and Regression Trees (CART) is a greedy heuristic that runs efficiently but does not offer a performance guarantee (e.g., XOR-shaped data can still be problematic).
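A minimal sketch of why greedy splitting struggles on XOR-shaped data, assuming scikit-learn and illustrative data sizes (none of this is from the thread itself): each XOR feature has roughly zero marginal information gain on its own, so a greedy learner has no reason to prefer it over an irrelevant noise feature at the root.

```python
# Hedged sketch: greedy CART on XOR-shaped data with noise features (illustrative).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 2000
X_xor = rng.integers(0, 2, size=(n, 2))      # the two informative XOR features
X_noise = rng.integers(0, 2, size=(n, 8))    # irrelevant noise features
X = np.hstack([X_xor, X_noise]).astype(float)
y = X_xor[:, 0] ^ X_xor[:, 1]                # label = XOR of features 0 and 1

# Greedy CART picks the single best split per node; on XOR the informative
# features have ~zero marginal gain, so the root split is essentially arbitrary.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print("root split on feature index:", tree.tree_.feature[0])
print("training accuracy:", tree.score(X, y))
```

With a lucky root split the greedy tree can still fit XOR, but whenever the root lands on a noise feature it cannot recover at this depth, whereas an optimal depth-2 tree fits the data exactly.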
Gabriel Peyré@gabrielpeyre·
Classification and Regression Trees (CART) define structured recursive classification and regression functions. O(n*log(n)) time global optimization (despite the exponential number of models) is achieved by dynamic programming. en.wikipedia.org/wiki/Decision_…
Daniël Vos@daniel_a_vos·
@adad8m @_joaogui1 Yes, so input shape is 2 * p and output shape is p, in the figure p=97. I tried some other modular functions as well but I would have to do some digging to find my old code.
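For concreteness, a rough reconstruction of a dataset with those shapes (the exact setup behind the figures may differ): each input concatenates the one-hot encodings of a and b, so the input length is 2*p, and the target is the class (a + b) mod p.

```python
# Hedged sketch of the data shapes: one-hot pairs (a, b) -> (a + b) mod p.
import numpy as np

p = 97  # modulus used in the figure

pairs = np.array([(a, b) for a in range(p) for b in range(p)])
X = np.zeros((len(pairs), 2 * p), dtype=np.float32)
X[np.arange(len(pairs)), pairs[:, 0]] = 1.0      # one-hot a in the first p slots
X[np.arange(len(pairs)), p + pairs[:, 1]] = 1.0  # one-hot b in the second p slots
y = (pairs[:, 0] + pairs[:, 1]) % p              # class index, p possible outputs

print(X.shape, y.shape)  # (9409, 194) (9409,)
```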
Daniël Vos@daniel_a_vos·
@adad8m @_joaogui1 Yes exactly! One-hot encoding and a 1 hidden layer MLP, trained with AdamW. The task used in the figures is modular addition. I wanted to see if I could get a minimal example where grokking occurs 🙂
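A minimal training sketch along those lines, assuming PyTorch; the hidden width, weight decay, learning rate, train/test split, and step count are placeholder choices, not the values used for the figures. Whether this exact configuration groks depends on those choices, so the point is only to show the minimal moving parts.

```python
# Hedged sketch of a minimal grokking-style setup: one-hot inputs, a 1-hidden-layer
# MLP, and AdamW with weight decay on modular addition (hyperparameters are guesses).
import torch
import torch.nn as nn

p, hidden = 97, 128
pairs = torch.cartesian_prod(torch.arange(p), torch.arange(p))
X = torch.zeros(len(pairs), 2 * p)
X[torch.arange(len(pairs)), pairs[:, 0]] = 1.0
X[torch.arange(len(pairs)), p + pairs[:, 1]] = 1.0
y = (pairs[:, 0] + pairs[:, 1]) % p

# Limited training data: grokking is usually reported with a small data fraction.
perm = torch.randperm(len(pairs))
n_train = len(pairs) // 2
train, test = perm[:n_train], perm[n_train:]

model = nn.Sequential(nn.Linear(2 * p, hidden), nn.ReLU(), nn.Linear(hidden, p))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(20000):  # full-batch training; long runs are typical for grokking
    opt.zero_grad()
    loss = loss_fn(model(X[train]), y[train])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            test_acc = (model(X[test]).argmax(dim=1) == y[test]).float().mean()
        print(f"step {step}: train loss {loss.item():.4f}, test acc {test_acc:.3f}")
```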
Daniël Vos@daniel_a_vos·
@_joaogui1 When I replicated this work with 1 hidden layer ReLU networks it did seem like increasing width increased the sharpness of the grokking effect by a bit. (left: 128 neurons, right: 8192)
[two images attached]
Daniël Vos@daniel_a_vos·
@tverven @Hidde_Fokkema @RdeHeide Very interesting paper, and I noticed it just as I was writing a section on the robustness of explanations! Great to see that the Dutch Railways were able to assist the paper with footnote 2 😄
Daniël Vos@daniel_a_vos·
If you are interested in robust optimization, decision trees, adversarial examples, or all of the above, then come talk to me at #AAAI2022! Our poster is up now and again tonight, starting at 17:45 GMT+1.
[image attached]
Daniël Vos@daniel_a_vos·
@HochreiterSepp It's interesting that you observed this with SGD! I have been working on reproducing the paper's results and have only been successful with AdamW. For AdamW, I agree with @ykilcher's intuition that weight decay gives a smooth function; I wonder what happens with 'grokking' SGD.
Sepp Hochreiter@HochreiterSepp·
ArXiv arxiv.org/abs/2201.02177: Grokking: after completely overfitting to the training set, generalization performance improves rapidly. We often observed this with SGD when a flat minimum is found by random search, even after strong overfitting. It is a trick to get SOTA.
Daniël Vos@daniel_a_vos·
Please check our paper for many more results: arxiv.org/abs/2109.03857 And let's see if we can improve the efficiency of ROCT in the future to train deeper optimal trees. Soon I will explain more about the adversarial accuracy bound from the paper.
Daniël Vos@daniel_a_vos·
I'm proud to announce that my paper with @siccoverwer "Robust Optimal Classification Trees Against Adversarial Examples" has been accepted at the #AAAI2022 conference! 🎊 Paper: arxiv.org/abs/2109.03857 A thread with more details below 👇