Shyam Tailor

54 posts

@satailor96

PhD student supervised by @niclane7 at @cambridge_uni. Interested in making machine learning more efficient - especially irregularly structured data.

Cambridge, UK · Joined July 2020
186 Following · 206 Followers
Shyam Tailor@satailor96·
We're not *quite* as good at pruning at init as the lottery ticket hypothesis (LTH) says is possible, but we've made a big jump in this work! A good question is: how can we get it to work better for ImageNet? We make progress but there's still work to do here 🙂
Milad Alizadeh@notmilad

Happy to share our latest work, where we use meta-gradients to prune networks *before training*. Check out the paper and code to see how we did it and how this connects to the lottery ticket hypothesis!✂️🍀#ICLR2022 📄 arxiv.org/abs/2202.08132 👨‍💻 github.com/mil-ad/prospr 🧵1/6👇

Shyam Tailor@satailor96·
8/n @yitayML et al. found that pretrained convolutions performed well relative to pretrained (anisotropic) transformers. And @liuzhuang1234 et al. found that convnets are still competitive with ViTs. Is it really unbelievable that similar results apply for GNNs?
Shyam Tailor@satailor96·
@chaitjo I'll send you an email today with some stuff :-)
Chaitanya K. Joshi@chaitjo·
@satailor96 What are some key papers on hardware accelerators for GNNs? I'd like to add this aspect to the list, too.
Peter Battaglia@PeterWBattaglia·
In new work led by Jonathan Godwin, Michael Schaarschmidt, et al., we show how very deep GNNs can be trained effectively using a simple noise/denoising scheme, and achieve top performance on several challenging molecular benchmarks: arxiv.org/abs/2106.07971
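The core idea — corrupt the node inputs and add an auxiliary denoising target alongside the main task loss — can be sketched roughly as follows. This is a toy NumPy version; `model`, the shapes, and the exact denoising target are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def noisy_nodes_loss(model, x, y, sigma=0.02, aux_weight=1.0, rng=None):
    """Toy combined objective: task loss computed on noised inputs, plus an
    auxiliary loss for reconstructing the clean inputs (the denoising head)."""
    rng = rng if rng is not None else np.random.default_rng()
    noise = sigma * rng.normal(size=x.shape)
    # `model` is assumed to return (task prediction, per-node reconstruction).
    pred, reconstruction = model(x + noise)          # forward on corrupted features
    task_loss = float(np.mean((pred - y) ** 2))      # main regression objective
    denoise_loss = float(np.mean((reconstruction - x) ** 2))  # recover clean features
    return task_loss + aux_weight * denoise_loss
```

The auxiliary target gives every node a local training signal, which is one intuition for why it helps very deep GNNs avoid oversmoothing.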
Chaitanya K. Joshi@chaitjo·
@PetarV_93 @urialon1 @yahave Sidenote on scaling: another GAT implementation detail I found curious/interesting is that DGL's GAT is significantly more optimized than any other library's. E.g., you cannot train full-batch PyG GATs on ogbn-arxiv yet; the current top-performing models use DGL's GAT (+ many tricks).
Petar Veličković@PetarV_93·
Important read of the day: GATv2 (Brody, @urialon1, @yahave): arxiv.org/abs/2105.14491 The exact attention mechanism I used in the GAT paper was intentionally 'weakened' to make it work on the easy-to-overfit datasets of the time. It was never meant to be a 'silver bullet'... 1/2
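To make the distinction concrete, here is a minimal NumPy sketch of the two scoring functions (shapes and names are illustrative; see the GATv2 paper for the full layer). In GAT the LeakyReLU sits outside the linear score, so the attention is effectively static; GATv2 moves it inside:

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_score(a, W, h_i, h_j):
    # Original GAT: e_ij = LeakyReLU(a^T [W h_i || W h_j]).
    # `a` scores a fixed linear combination, so the nonlinearity cannot
    # mix information from both endpoints before neighbours are ranked.
    z = np.concatenate([W @ h_i, W @ h_j])   # `a` has length 2 * d_out
    return float(leaky_relu(a @ z))

def gatv2_score(a, W, h_i, h_j):
    # GATv2: e_ij = a^T LeakyReLU(W [h_i || h_j]).
    # Applying LeakyReLU before `a` lets the score depend jointly on (h_i, h_j).
    z = leaky_relu(W @ np.concatenate([h_i, h_j]))  # W maps 2 * d_in -> d_out
    return float(a @ z)
```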
Shyam Tailor@satailor96·
@weihua916 Thanks for sharing, this was a fun read! Out of curiosity -- do you have any insight into why Swish helps so much? In Fig. 5 it looks like it helps to nearly close the gap when using no edge features (!)
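For context, Swish (also called SiLU) is simply x · sigmoid(x) — a minimal NumPy definition:

```python
import numpy as np

def swish(x):
    # Swish / SiLU: x * sigmoid(x). Smooth and non-monotonic, with a
    # small negative dip for x < 0, unlike ReLU's hard zero cutoff.
    return x * (1.0 / (1.0 + np.exp(-x)))
```

Its smoothness is one commonly cited reason it can help in physics-flavoured tasks like force prediction, where non-smooth activations make the learned function's derivatives ill-behaved.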
Weihua Hu@weihua916·
Excited to give a contributed talk on "ForceNet: A GNN for Large-Scale Quantum Chemistry Calculations" at the Deep Learning for Simulation Workshop at ICLR 2021. Paper: simdl.github.io/files/62.pdf Talk: iclr.cc/virtual/2021/w… Done as part of the Open Catalyst Project: opencatalystproject.org