Tim

627 posts

@daidailoh

Dr. - Freelance AI Dev/Researcher - LLMs, VLMs, CV - ex RWTH Aachen - explaining science for @golem - Cosplay events at Sewcase e.V. - GameDev on full moon

Aachen, Germany · Joined November 2015

448 Following · 127 Followers

Pinned Tweet
Tim@daidailoh·
Because I got nothing better to do, I started a github repo for super clean / super simple examples of modern deep learning networks, like GANs, VQ-VAEs, etc. - 99% of pytorch users should have everything installed already. Soon(tm): Diffusion, PointNet github.com/DaiDaiLoh/Exem…
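One of the repo's listed examples, the VQ-VAE, hinges on a single core operation: snapping each encoder latent to its nearest codebook vector. A minimal sketch of that quantization step in plain Python (illustrative only, not code from the repo; all names are made up):

```python
def quantize(latents, codebook):
    """Vector quantization: replace each latent with its nearest codebook entry."""
    def nearest(v):
        # Nearest neighbor under squared Euclidean distance.
        return min(codebook, key=lambda c: sum((a - b) ** 2 for a, b in zip(v, c)))
    return [nearest(v) for v in latents]

codebook = [(0.0, 0.0), (1.0, 1.0)]
latents = [(0.1, -0.2), (0.9, 1.1)]
print(quantize(latents, codebook))  # each latent snaps to its closest code
```

In the real model this lookup runs on tensors, and gradients flow past the non-differentiable `min` via the straight-through estimator.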
Tim@daidailoh·
@thegautamkamath @kgorman Love it that you actually followed through :) Side question: Is there any way to get informed when the list of accepted workshops is out / any date for that...?
Gautam Kamath@thegautamkamath·
As co-comms chair of ICML 2026 (w @kgorman), I'm super proud of how transparent we've been able to be on all of the (bold!) decisions made. Thanks to all the organizers (esp PC chairs) for being aligned on this. The community deserves to understand these important decisions
ICML Conference@icmlconf

To ensure compliance w peer-review policies, ICML has removed 795 reviews (1% of total) by reviewers who used LLMs when they explicitly agreed not to. Consequently, 497 papers (2% of all submissions) of these (reciprocal) reviewers have been desk rejected. Details in blog post 👇

Tim@daidailoh·
@norpadon Couldn't resist :D (no hard feelings, I'm sure that research is pretty cool to someone somewhere out there ;P)
Tim@daidailoh·
@giffmana @xiangyuqi_pton The RWTH/VCI special ;) Best way to learn, sadly also the best way to feed one's own anxiety about not graduating...
Tim@daidailoh·
@_akhaliq Just tried it on the plot of my "Multidimensional Byte Pair Encoding"-Paper... Fucking lit! Next Paper is going to be a banger! arxiv.org/abs/2411.10281
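For context, classic (one-dimensional) BPE builds a vocabulary by repeatedly merging the most frequent adjacent token pair; the linked paper generalizes this beyond sequences. A minimal sketch of one standard merge step (illustrative only, not the paper's multidimensional variant):

```python
from collections import Counter

def bpe_merge_step(tokens):
    """Merge the most frequent adjacent pair into a single token."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return tokens, None
    (a, b), _ = pairs.most_common(1)[0]
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
            merged.append(a + b)  # fuse the winning pair
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged, (a, b)

tokens, pair = bpe_merge_step(list("abababcab"))
print(pair, tokens)  # ('a', 'b') ['ab', 'ab', 'ab', 'c', 'ab']
```

Training repeats this step until the vocabulary reaches its target size; the multidimensional version has to define "adjacent" over more than one axis.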
Tim@daidailoh·
@lossfunk @bitspilaniindia I'm so ready for agents complaining on moltbook that their papers were rejected because a human wrote their assigned reviews for them!
Lossfunk@lossfunk·
📢 Announcing CAISc 2026 - a new academic conference where AI systems are the primary authors and reviewers of scientific papers. Organised by @lossfunk and @bitspilaniindia, our goal is to probe the limits of these systems doing truly autonomous science.
Tim@daidailoh·
@CSProfKGD I'm so ready for agents complaining on moltbook that their papers were rejected because a human wrote their reviews for them!
Gautam Kamath@thegautamkamath·
@ziv_ravid 1. This policy was only applied to people who agreed not to use LLMs for their reviews, but then used LLMs anyways. It's not an anti-LLM policy, it's a rule-following policy. 2. More sophisticated methods were used than AI detectors. Post from @icmlconf tomorrow.
Ravid Shwartz Ziv@ziv_ravid·
I (still) wasn't affected by the ICML review policy, which desk rejected all the papers of reviewers who used LLMs to write their reviews (and didn't explicitly mention it) 😱, but this is a bad decision and not a good way to handle AI reviews. First, AI detectors are not reliable enough, with many false positives. Second, if it's a good review, why should I care that AI wrote it? We're using AI assistants everywhere in our day-to-day lives. What is the next step? To ban AI coding agents? I understand the motivation to prevent low-quality reviews, but this is not the way to improve them
Tim@daidailoh·
@giffmana Been out of the image generation game for a bit, but my guess is: all big models moved to some form of RL with a VLM as a judge that says "too many fingers" to iron out the last few kinks, plus vastly more data?
Lucas Beyer (bl16)@giffmana·
I have a question about last year's image-generation progress, wonder what y'all think. How did we go from all models consistently getting fingers wrong, to all models consistently getting them right? This "flip" seems to have happened basically across all companies/models at the ~same time. Even "random" non-frontier papers seem to get it right? Or they just cherry-pick the figures?
Tim@daidailoh·
@Markus_Soeder "We still need high-tech horse-drawn carriages in Germany! Innovation instead of ideology, so that jobs and value creation are preserved!"
Markus Söder@Markus_Soeder·
We still need the high-tech combustion engine in Germany! Innovation instead of ideology, so that jobs and value creation are preserved.
Tim@daidailoh·
@CSProfKGD So instead of 1 person standing around, it's now 5-6?
Tim@daidailoh·
@giffmana "Untuned Hyperparameters Are All You Need"
Lucas Beyer (bl16)@giffmana·
New optimizer with earth-shattering plots making the rounds, and published in Nature too (Machine Intelligence, but let's just drop that part.) So of course I had to take a quick look. A few things I noticed that make me a bit sus, though I'm not saying to outright discard it. Each point is the caption of the corresponding screenshot below:
1. What on earth are these SGDM vs AdamW gaps? They are not normal -> untuned baselines? (Also: what good is a Nature MI editor if they approve plots with "0M" everywhere on the x-axis???)
2. For vision models they tune lrs, good. But not wd or other optim hparams, meh/sus.
3. For LLMs, they select hparams on test. At least epochs, but given this and that they seem to use "validation" and "testing" as synonyms in the paper, probably everything.
4. I am not sure a Medium blogpost tutorial with an arbitrary hparam selection is a good starting point for the baseline of a Nature MI paper??
Maybe this new optimizer is as amazing as promised, but I'll need to see less suspicious evidence. I wish the reviewers had asked for that. Maybe someone could put it to the test on the nanogpt speedrun? At least that has heavily-tuned baselines, including optimizers.
Ji-Ha@Ji_Ha_Kim

Woah, how did I never hear of this? An optimizer paper that got published in Nature, looks quite substantial
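Lucas's first point (untuned baselines) is easy to demonstrate: even on a toy quadratic, the gap between an arbitrary learning rate and a lightly tuned one dwarfs most claimed optimizer gains. A small illustrative sketch (toy problem and made-up numbers, nothing from the paper in question):

```python
def sgd_loss_after(lr, steps=50):
    """Minimize f(w) = w^2 by plain gradient descent from w0 = 1.0."""
    w = 1.0
    for _ in range(steps):
        w -= lr * 2 * w  # gradient of w^2 is 2w
    return w * w

untuned = sgd_loss_after(0.001)  # arbitrary "blog post" learning rate
tuned = min(sgd_loss_after(lr) for lr in (0.001, 0.01, 0.1, 0.5))  # tiny sweep
print(untuned, tuned)  # the sweep wins by many orders of magnitude
```

Any comparison where only one side gets this sweep says more about the tuning budget than about the optimizer.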

Tim@daidailoh·
@giffmana It's sometimes breathtaking how much bullshit gets published in nature, especially for cross-domain stuff like ML in medicine... My favourite I've encountered so far: Training and Testing on the same dataset.
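That failure mode is easy to reproduce: a model that merely memorizes looks perfect when evaluated on its own training data while being at chance on held-out data. A toy sketch (hypothetical 1-nearest-neighbor "model" on random labels, purely illustrative):

```python
import random

random.seed(0)
# Toy data: 1-D points with purely random labels -- there is nothing to learn.
data = [(random.random(), random.randint(0, 1)) for _ in range(200)]
train, test = data[:100], data[100:]

def predict(x):
    # 1-NN memorizes the training set: return the label of the closest train point.
    return min(train, key=lambda p: abs(p[0] - x))[1]

acc_train = sum(predict(x) == y for x, y in train) / len(train)
acc_test = sum(predict(x) == y for x, y in test) / len(test)
print(acc_train)  # 1.0 -- every train point is its own nearest neighbor
print(acc_test)   # ~0.5 -- chance level, the honest number
```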
Tim@daidailoh·
@CSProfKGD IMHO they should select oral talks on presentation ability. Maybe pre-select a few, then have people review a 1-minute video version of the talk or something? I'd rather hear someone explain a mediocre paper well than watch some guy mumble for 15 minutes about tables...
Tim@daidailoh·
@sang_yun_lee Yes please, I'm sick of all the diffusion stuff that gets more complicated and bloated by the minute :D
Sangyun Lee@sang_yun_lee·
Are generative autoencoders coming back?
Tim@daidailoh·
@prajdabre If you want to be a smartass, you can say "you didn't specify where" and go on to explain sigmoid-based attention as that e.g. eliminates attention sinks
Raj Dabre@prajdabre·
Basic ML question: Interviewer asks: Can I replace the softmax function with the sigmoid function since both functions cause values to be between 0 and 1? You say yes and fail the interview. Why?
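The intended answer: softmax normalizes the attention scores jointly, so the weights over keys compete and always sum to 1, while sigmoid squashes each score independently with no shared normalization (which is also exactly why sigmoid-based attention avoids attention sinks, at the cost of changed semantics). A tiny sketch of the difference in plain Python:

```python
import math

def softmax(xs):
    # Joint normalization: scores compete and the outputs always sum to 1.
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def sigmoid(xs):
    # Element-wise squashing: each score handled independently, no shared budget.
    return [1 / (1 + math.exp(-x)) for x in xs]

scores = [2.0, 1.0, -1.0]
print(sum(softmax(scores)))  # 1.0 (up to float rounding)
print(sum(sigmoid(scores)))  # ~1.88 -- not a probability distribution
```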
Tim@daidailoh·
@daseyb Control over (temporal) information density = key to success :)
Dario Seyb@daseyb·
Why waste compute on areas of the image that are supposed to stay the same? In EditCtrl we allocate compute to where it's needed, significantly speeding up video diffusion based inpainting!
Yehonathan Litman@yehonation

Excited to share our new work EditCtrl! We introduce a disentangled local-global control video inpainting framework that dynamically allocates compute where needed - achieving up to 10x compute savings over full-attention while matching or exceeding SOTA editing quality. 🧵

Tim@daidailoh·
@jxmnop (Proudly made the image edit with ChatGPT^^)
Tim@daidailoh·
@guy_dar1 @SwayStar123 Fuck being polite and nice, that gets you nowhere. At least this feels satisfying for a minute^^ Hold people accountable publicly, everything else won't do - those supervisors probably would've had a good laugh about discussing the issue with them, at best...
Guy Dar@guy_dar1·
@SwayStar123 You don't need to be nefarious to not like being quite frankly dunked on (it's not that uncommon that a student does all the work). If you have grievances with your supervisors, even legitimate ones, maybe it's best not to publish them to the world.
Tim@daidailoh·
@NielsRogge Yup, same. Yet if you're a PhD student and desperately need to get a publication, I totally get it...
Niels Rogge@NielsRogge·
Papers like these are so boring. It's a bit lame that as a researcher, rather than exploring new ways of doing vision with neural networks, they chose the comfortable path of just tweaking some things here and there in ViTs, leading to a pointless 84% accuracy on ImageNet, which people already achieved 3 years ago, with an architecture that is still very limited, e.g. unable to reliably count things in an image.
Tanishq Mathew Abraham, Ph.D.@iScienceLuvr

ViT-5: Vision Transformers for The Mid-2020s
"a systematic investigation into modernizing Vision Transformer backbones by leveraging architectural advancements from the past five years"
* LayerScale
* RMSNorm
* original MLP design with GeLU activation
* both APE and 2D RoPE jointly
* registers with a separate 2D RoPE
* QK-Norm
* remove bias terms in the QKV projection layers
84.2% top-1 accuracy on ImageNet-1k, 1.84 FID on ImageNet-256

Tim@daidailoh·
@CSProfKGD BUt tHe oNE SLiDE PeR MInUtE rULe
Kosta Derpanis (sabbatical in Munich 🇩🇪)
#KostasThoughts: Like many others, I used to cram too many images or videos onto one slide. They end up too small, the audience doesn’t know where to look, and you rush through them. My rule now: one idea or example per slide. Slides are free, use them. Here's a comparison: (left) cramming examples, and (right) letting each example breathe on its own slide. Which do you prefer?