Danica Sutherland
@d_j_sutherland

205 posts

ML at @UBC_CS and @AmiiThinks. Trans 🏳️‍⚧️, she/her.
Vancouver · Joined March 2010
665 Following · 892 Followers

Danica Sutherland@d_j_sutherland·
coming back to x, the everything app, to say: Submit work, sign up to review, come to the workshop! This is a great chance to bring together a lot of really cool work that's been happening, but not all as connected (nor as easy to publish) as it should be! testing.ml
Feng Liu@AlexFengLiu1

Excited to share our ICML 2026 Hypothesis Testing Workshop in Seoul, this July! @icmlconf 🎉
This workshop aims to bring together researchers developing modern hypothesis testing methodology and applying it to machine learning problems such as robustness, distribution shift, security, medicine, and LLM evaluation. In other words, if you care about how we make ML claims rigorous, this workshop is for you.
We now have four confirmed speakers: Arthur Gretton @ArthurGretton, Yao Xie @yaoxie21851119, Bo Li @uiuc_aisecure, and Yisong Yue @yisongyue.
The organizing team includes Xiuyuan Cheng (Duke), Feng Liu @AlexFengLiu1, Lester Mackey @LesterMackey, Shayak Sen @shayaksen, Danica J. Sutherland @d_j_sutherland, and Nathaniel Xu (UBC).
📌 Submission deadline: 10 May 2026
📌 Notification: 26 May 2026
📌 Camera-ready: 17 June 2026
📌 Workshop date: July 10 or 11, 2026 (TBA)
🚩 Check more information below!
🔗 Website: testing.ml
🔗 Submission Portal: openreview.net/group?id=ICML.…
We’re also recruiting PC members/reviewers.
🔗 Reviewer interest form: docs.google.com/forms/d/e/1FAI…
🏁 Please feel free to share this with colleagues, collaborators, and students who may be interested. #ICML #ICML26

Danica Sutherland retweeted
Yi (Joshua) Ren@JoshuaRenyi·
📢Curious why your LLM behaves strangely after long SFT or DPO? We offer a fresh perspective—consider doing a "force analysis" on your model’s behavior. Check out our #ICLR2025 Oral paper: Learning Dynamics of LLM Finetuning! (0/12)
Danica Sutherland retweeted
Ameya Velingker | अमेय वेलिंगकर
In our Spexphormer method, a smaller network estimates the attention scores of a larger one. This led us to a fundamental question: How small can the network be while still producing accurate estimates? We tackled this question through rigorous theoretical analysis. 1/
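
A minimal sketch of that pattern, with hypothetical names throughout (this is not the Spexphormer code): a low-width attention scores the graph's edges cheaply, and the expensive full-width attention is then computed only on the highest-scoring edges.

# Hypothetical sketch: a narrow attention estimates where a wide network's
# attention mass goes, so the full attention runs only on the kept edges.
import torch

def estimated_topk_edges(x, wq_small, wk_small, edge_index, k):
    """Score edges with a small attention and keep the k highest-scoring ones.

    x:          [num_nodes, d] node features
    wq_small:   [d, d_small] small-network query projection (d_small << d)
    wk_small:   [d, d_small] small-network key projection
    edge_index: [2, num_edges] (source, target) node indices
    """
    q = x @ wq_small       # cheap low-dimensional queries
    keys = x @ wk_small    # cheap low-dimensional keys
    src, dst = edge_index
    scores = (q[dst] * keys[src]).sum(-1) / (q.shape[-1] ** 0.5)
    keep = scores.topk(min(k, scores.numel())).indices  # global top-k for brevity
    return edge_index[:, keep]  # edges on which to run the full attention

The sketch only shows how cheap estimates could gate the expensive computation; see the thread and paper for how the estimator is actually trained and analyzed.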
Danica Sutherland retweeted
Hamed Shirzad@HamedShirzad13·
As a reminder, we will have our poster session tomorrow:
📍 East Exhibit Hall, Poster #3010
📄 arxiv.org/abs/2411.16278
💻 github.com/hamed1375/Sp_E…
To motivate you further, here are some insights gained from the attention-score analysis in this work, which I'll share in this thread:
Hamed Shirzad@HamedShirzad13

Graph Transformers (GTs) can handle long-range dependencies and resolve information bottlenecks, but they’re computationally expensive. Our new model, Spexphormer, helps scale them to much larger graphs – check it out at @NeurIPSConf next week, or the preview here! #NeurIPS2024

Danica Sutherland retweeted
Yi (Joshua) Ren@JoshuaRenyi·
LLM's self-play is ubiquitous. What will happen if M[t] iteratively learns from M[t-1] for too many generations? Come and chat with us at NeurIPS:
🗓️ Friday, Dec 13
📍 East Exhibit Hall A-C, Poster #3305
⏰ 11:00 AM–2:00 PM PST
[1/7]
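
The setup is easy to picture as a loop where each generation's training data comes from the previous generation's model. A minimal sketch with placeholder names (generate and finetune are assumptions, not the paper's code):

# Hypothetical sketch of iterated self-training: M[t] is finetuned on data
# generated by M[t-1]; the question is how behavior drifts as the number of
# generations grows.
def iterated_self_play(model, prompts, finetune, num_generations):
    history = [model]
    for t in range(num_generations):
        prev = history[-1]
        data = [(p, prev.generate(p)) for p in prompts]  # M[t-1] makes the data
        history.append(finetune(prev, data))             # M[t] learns from it
    return history  # compare models across generations to see the drift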
Danica Sutherland@d_j_sutherland·
Grokking on modular arithmetic: the early (kernel) phase can overfit but cannot generalize; with small regularization, GD eventually escapes the kernel regime and provably generalizes. Poster #913 this afternoon at #ICML, come hear about it!
Mohamad Amin Mohamadi@QuelMohamadAmin

What causes 𝙜𝙧𝙤𝙠𝙠𝙞𝙣𝙜 on modular addition problems? Our #ICML2024 work identifies the 𝙥𝙚𝙧𝙢𝙪𝙩𝙖𝙩𝙞𝙤𝙣 𝙚𝙦𝙪𝙞𝙫𝙖𝙧𝙞𝙖𝙣𝙘𝙚 of the task as the root cause of poor generalization early in training. Paper: arxiv.org/abs/2407.12332
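
For concreteness, the task in this line of grokking work is modular addition learned from a subset of all input pairs. A minimal sketch of that standard setup (not code from the paper):

# Sketch of the usual modular-addition grokking setup: learn (a + b) mod p
# from half of the p*p pairs; small weight decay is the regularization under
# which GD can eventually leave the kernel regime and generalize.
import itertools, random

p = 97
pairs = list(itertools.product(range(p), repeat=2))
random.seed(0)
random.shuffle(pairs)
half = len(pairs) // 2
train = [((a, b), (a + b) % p) for a, b in pairs[:half]]
test  = [((a, b), (a + b) % p) for a, b in pairs[half:]]
# A small network trained on `train` typically reaches perfect train accuracy
# (memorization) long before test accuracy moves: the grokking delay.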

Danica Sutherland retweeted
Gautam Kamath@thegautamkamath·
Previous live link is now private. Cut version is here: youtu.be/O2gpl5l2eQA?si…
Danica Sutherland@d_j_sutherland·
If you want to avoid accidentally uploading comments in your tex source to arXiv, consider using a script like github.com/djsutherland/a… that strips them automatically (among other niceties). :p
DV@DV2559106965076

You might know that MSFT has released a 154-page paper (arxiv.org/abs/2303.12712) on #OpenAI #GPT4 , but do you know they also commented out many parts from the original version? 🧵: A thread of hidden information from their latex source code [1/n]
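
The core of such a script is a one-pass comment strip. A deliberately minimal sketch (not the linked tool, which also handles verbatim environments and other edge cases):

# Minimal sketch: drop TeX comments before uploading source. Keeps a bare %
# so TeX's end-of-line behavior is unchanged, and leaves escaped \% alone.
import re

def strip_tex_comments(tex: str) -> str:
    return "\n".join(
        re.sub(r"(?<!\\)%.*", "%", line) for line in tex.splitlines()
    )

print(strip_tex_comments(r"final text % TODO: delete before submission"))
# -> final text %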
