Daniël Vos

23 posts

Daniël Vos

@daniel_a_vos

PhD student in machine learning (decision tree whisperer) at @tudelft 👨‍🎓 and organizer for the TU Delft CTF Team 👨‍💻

Delft, Netherlands · Joined January 2016
288 Following · 126 Followers
Daniël Vos@daniel_a_vos·
@gabrielpeyre @YihongWu7 Classification and Regression Trees (CART) is a greedy heuristic that runs efficiently but does not offer a performance guarantee (e.g., XOR-shaped data can still be problematic).
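A minimal sketch of why greedy splitting struggles on XOR-shaped data, assuming scikit-learn and illustrative data sizes (none of this is from the thread itself): each XOR feature has roughly zero marginal information gain on its own, so a greedy learner has no reason to prefer it over an irrelevant noise feature at the root.

```python
# Hedged sketch: greedy CART on XOR-shaped data with noise features (illustrative).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 2000
X_xor = rng.integers(0, 2, size=(n, 2))      # the two informative XOR features
X_noise = rng.integers(0, 2, size=(n, 8))    # irrelevant noise features
X = np.hstack([X_xor, X_noise]).astype(float)
y = X_xor[:, 0] ^ X_xor[:, 1]                # label = XOR of features 0 and 1

# Greedy CART picks the single best split per node; on XOR the informative
# features have ~zero marginal gain, so the root split is essentially arbitrary.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print("root split on feature index:", tree.tree_.feature[0])
print("training accuracy:", tree.score(X, y))
```

With a lucky root split the greedy tree can still fit XOR, but whenever the root lands on a noise feature it cannot recover at this depth, whereas an optimal depth-2 tree fits the data exactly.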
Gabriel Peyré@gabrielpeyre·
Classification and Regression Trees (CART) define structured recursive classification and regression functions. O(n*log(n)) time global optimization (despite the exponential number of models) is achieved by dynamic programming. en.wikipedia.org/wiki/Decision_…
Daniël Vos@daniel_a_vos·
@adad8m @_joaogui1 Yes, so input shape is 2 * p and output shape is p, in the figure p=97. I tried some other modular functions as well but I would have to do some digging to find my old code.
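For concreteness, a rough reconstruction of a dataset with those shapes (the exact setup behind the figures may differ): each input concatenates the one-hot encodings of a and b, so the input length is 2*p, and the target is the class (a + b) mod p.

```python
# Hedged sketch of the data shapes: one-hot pairs (a, b) -> (a + b) mod p.
import numpy as np

p = 97  # modulus used in the figure

pairs = np.array([(a, b) for a in range(p) for b in range(p)])
X = np.zeros((len(pairs), 2 * p), dtype=np.float32)
X[np.arange(len(pairs)), pairs[:, 0]] = 1.0      # one-hot a in the first p slots
X[np.arange(len(pairs)), p + pairs[:, 1]] = 1.0  # one-hot b in the second p slots
y = (pairs[:, 0] + pairs[:, 1]) % p              # class index, p possible outputs

print(X.shape, y.shape)  # (9409, 194) (9409,)
```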
Daniël Vos@daniel_a_vos·
@adad8m @_joaogui1 Yes exactly! One-hot encoding and a 1 hidden layer MLP, trained with AdamW. The task used in the figures is modular addition. I wanted to see if I could get a minimal example where grokking occurs 🙂
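A minimal training sketch along those lines, assuming PyTorch; the hidden width, weight decay, learning rate, train/test split, and step count are placeholder choices, not the values used for the figures. Whether this exact configuration groks depends on those choices, so the point is only to show the minimal moving parts.

```python
# Hedged sketch of a minimal grokking-style setup: one-hot inputs, a 1-hidden-layer
# MLP, and AdamW with weight decay on modular addition (hyperparameters are guesses).
import torch
import torch.nn as nn

p, hidden = 97, 128
pairs = torch.cartesian_prod(torch.arange(p), torch.arange(p))
X = torch.zeros(len(pairs), 2 * p)
X[torch.arange(len(pairs)), pairs[:, 0]] = 1.0
X[torch.arange(len(pairs)), p + pairs[:, 1]] = 1.0
y = (pairs[:, 0] + pairs[:, 1]) % p

# Limited training data: grokking is usually reported with a small data fraction.
perm = torch.randperm(len(pairs))
n_train = len(pairs) // 2
train, test = perm[:n_train], perm[n_train:]

model = nn.Sequential(nn.Linear(2 * p, hidden), nn.ReLU(), nn.Linear(hidden, p))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(20000):  # full-batch training; long runs are typical for grokking
    opt.zero_grad()
    loss = loss_fn(model(X[train]), y[train])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            test_acc = (model(X[test]).argmax(dim=1) == y[test]).float().mean()
        print(f"step {step}: train loss {loss.item():.4f}, test acc {test_acc:.3f}")
```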
Daniël Vos@daniel_a_vos·
@_joaogui1 When I replicated this work with 1 hidden layer ReLU networks it did seem like increasing width increased the sharpness of the grokking effect by a bit. (left: 128 neurons, right: 8192)
[two images attached]
Daniël Vos@daniel_a_vos·
@tverven @Hidde_Fokkema @RdeHeide Very interesting paper, and I noticed it just as I was writing a section on the robustness of explanations! Great to see that the Dutch Railways were able to assist the paper with footnote 2 😄
Daniël Vos@daniel_a_vos·
If you are interested in robust optimization, decision trees, adversarial examples, or all of the above, then come talk to me at #AAAI2022! Our poster is up now and again tonight, starting at 17:45 GMT+1.
[image attached]
Daniël Vos@daniel_a_vos·
@HochreiterSepp It's interesting that you observed this with SGD! I have been working on reproducing the paper's results and have only been successful with AdamW. For AdamW, I agree with @ykilcher's intuition that weight decay gives a smooth function; I wonder what happens with 'grokking' SGD.
Sepp Hochreiter@HochreiterSepp·
ArXiv arxiv.org/abs/2201.02177: Grokking: after completely overfitting to the training set, generalization performance improves rapidly. We often observed this with SGD when a flat minimum is found by random search, even after strong overfitting. It is a trick to get SOTA.
Daniël Vos@daniel_a_vos·
Please check our paper for many more results: arxiv.org/abs/2109.03857 And let's see if we can improve the efficiency of ROCT in the future to train deeper optimal trees. Soon I will explain more about the adversarial accuracy bound from the paper.
Daniël Vos@daniel_a_vos·
I'm proud to announce that my paper with @siccoverwer "Robust Optimal Classification Trees Against Adversarial Examples" has been accepted at the #AAAI2022 conference! 🎊 Paper: arxiv.org/abs/2109.03857 A thread with more details below 👇