Guy Dar
2.6K posts
@guy_dar1
#AI Researcher | A jumped-up pantry boy who never knew his place
Joined April 2022
259 Following · 511 Followers
Guy Dar (@guy_dar1):
@torchcompiled But also I feel like softmax is kinda cursed in many ways; maybe that's related too?
0 replies · 0 reposts · 0 likes · 83 views
Ethan (@torchcompiled):
If this is an issue of asymmetry in the LLM head, would we expect it to similarly apply to the up matrices of the FFN? The paper mentions that softmax affecting the rank of the representation is a factor here, but I'm curious whether activation functions could play a similar role.

Quoting Nathan Godey (@nthngdy):
🧵 New paper: "Lost in Backpropagation: The LM Head is a Gradient Bottleneck". The output layer of LLMs destroys 95-99% of your training signal during backpropagation, and this significantly slows down pretraining 👇

2 replies · 2 reposts · 35 likes · 5K views
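A toy numpy sketch of the mechanism the quoted paper points at: the Jacobian of softmax is rank-deficient, and its spectral norm shrinks toward zero as the output distribution peaks, so the gradient passed back through the LM head is heavily attenuated. The vocabulary size and logit values below are made up for illustration; this is not the paper's code or its exact measurement.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def grad_attenuation(z):
    """Spectral norm of the softmax Jacobian diag(p) - p p^T at logits z.
    Any gradient backpropagated through softmax is scaled by at most this factor."""
    p = softmax(z)
    J = np.diag(p) - np.outer(p, p)
    return np.linalg.norm(J, 2)  # largest singular value

rng = np.random.default_rng(0)
flat = rng.normal(size=256)             # diffuse logits (toy vocab of 256)
peaked = flat.copy()
peaked[0] += 20.0                       # one dominant logit, as late in pretraining

print(grad_attenuation(flat))    # already well below 1
print(grad_attenuation(peaked))  # near 0: almost all gradient signal is lost
```

The spectral norm of this Jacobian never exceeds 1/2, and it collapses as the model becomes confident, which is one concrete sense in which the head can act as a gradient bottleneck.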
florence ⏹️ (@morallawwithin):
okay but consider this: consciousness is magic, and matrix multiplication is not magic
57 replies · 16 reposts · 515 likes · 17.1K views
Guy Dar (@guy_dar1):
@skdh "Your paper has been flagged for human-generated content"
0 replies · 0 reposts · 2 likes · 134 views
Guy Dar (@guy_dar1):
@nikicaga That's hardly accurate. Israel's population is quite nationalist, true (though not "insanely" so, certainly in comparison to other non-Western countries). "Not willing to share..." is a blatant inaccuracy, to say the least.
0 replies · 0 reposts · 3 likes · 959 views
Stephen King (@StephenKing):
Just when you think it can't get worse... it does.
6.3K replies · 5.9K reposts · 46.2K likes · 2.5M views
Guy Dar (@guy_dar1):
Appropriate name for an insomnia book
[tweet media]
0 replies · 0 reposts · 1 like · 45 views
Sharut Gupta (@sharut_gupta):
[1/n] Do distinct large models admit a simple map that aligns their embedding spaces? We show that across multimodal contrastive models, trained on different data and architectures, an orthogonal map aligns image embeddings. Strikingly, the same map also aligns text embeddings.
12 replies · 61 reposts · 432 likes · 35K views
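Given paired embeddings of the same inputs from two models, the best orthogonal map between them is the classical orthogonal Procrustes solution. A minimal synthetic sketch of that idea follows; the data, dimensions, and the exact-rotation setup are made up for illustration and are not the paper's experiment:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 32
A = rng.normal(size=(n, d))                 # embeddings from "model 1" (synthetic)
Q_true, _ = np.linalg.qr(rng.normal(size=(d, d)))
B = A @ Q_true                              # "model 2" = an exactly rotated copy (idealized)

# Orthogonal Procrustes: argmin_Q ||A Q - B||_F over orthogonal Q
# is Q = U V^T, where U S V^T is the SVD of A^T B.
U, _, Vt = np.linalg.svd(A.T @ B)
Q = U @ Vt

print(np.allclose(A @ Q, B))  # → True: the orthogonal map is recovered
```

In the idealized case the planted rotation is recovered exactly; with real embeddings the same formula gives the best orthogonal fit rather than a perfect match.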
Guy Dar (@guy_dar1):
@Andrew_Akbashev Kinda wild, but it seems like the arXiv plot continues almost uninterrupted despite AI tools. That's really surprising.
0 replies · 0 reposts · 0 likes · 177 views
Andrew Akbashev (@Andrew_Akbashev):
A really dangerous situation. Too many submissions. Too many generated papers. Little responsibility.

1. In 2026, more than 24,000 submissions were made to the International Conference on Machine Learning (ICML). That's TWO times more than in 2025. To fight it, the organizers now require researchers to pay $100 for every subsequent paper.
2. LLM adoption has increased researcher productivity by 90% (there's a recent paper in Science).
3. The number of papers is becoming far too high. Submissions to arXiv have risen by 50% since 2022.
4. There are simply not enough reviewers. Plus, many scientists no longer want to invest precious time in it for free.
5. We can't easily distinguish AI-made papers from genuine ones.

Important words from Paul Ginsparg, a co-founder of arXiv: "AI slop frequently can't be discriminated just by looking at the abstract, or even by just skimming the full text. This makes it an 'existential threat' to the system." Basically, we're getting closer to the tipping point.

📍 Many professors blame the AI. But the problem is likely elsewhere:

1. Without a sufficient number of papers, many PIs can't get funded. They have to prove their credibility to reviewers. Their proposals have to rely on prior publications. In many countries, there are informal (or even formal) expectations for how many papers a group of a certain size has to publish to survive (funding-wise).
2. Our students and postdocs need papers if they want to be hired into faculty roles. Yes, some departments hire people with few publications. But the majority still want to ensure their faculty can get funded. If funding is partly a function of papers, this is used in decision-making.
3. The number of papers is important if you want to get high-level awards. Many of them are not given because you published one paper (even if it's great). They are given because you made a meaningful CONTRIBUTION to the field. How do you make it? Publish more papers.
4. Tenure promotions in many places take the number of your papers into account (often indirectly). Your tenure may get delayed if you don't publish enough. Not everywhere, but for many mid- to low-ranked universities this story is more or less the same.

+ There are many more to mention.

📍 My opinion: Much of this is rooted in how funding is distributed. There is a strong correlation between the requirements at a university and the funding acquisition criteria. If funding were based ONLY on the quality of published papers, universities would hire people for the quality of their science. If funding agencies strongly discouraged publishing too many papers, universities wouldn't expect numbers from faculty during promotions. And some supervisors wouldn't pressure students and postdocs to publish unfinished studies and low-quality data. Yes, we need good detectors of fake papers. But we also need the right policies and better funding allocation criteria.
[tweet media]
94 replies · 378 reposts · 1.4K likes · 193K views
Guy Dar (@guy_dar1):
@ValerioCapraro Have a few problems with the claims here. But one of the biggest confusions that repeats often: intelligence is orthogonal to robustness and reliability. Claiming otherwise would run counter to much of the science on human neurodivergent intelligence.
0 replies · 0 reposts · 0 likes · 97 views
Valerio Capraro (@ValerioCapraro):
It's time to make this point neat and clear. Attempts to show that AGI has already been achieved are just plainly wrong, for three reasons:
1) They shift the definition of general intelligence, originally based on robustness, generalization, and reliability, to behavioral alignment with benchmarks.
2) They confuse benchmark performance with the capability to handle novelty. Spoiler: these are different.
3) They ignore that the same behavioral output can come from totally different epistemic pipelines.
Joint work with @Walter4C and @GaryMarcus. Links in the first reply.
[tweet media]
65 replies · 191 reposts · 892 likes · 50.4K views
(((ل()(ل() 'yoav))))👾:
I really like the idea of this paper: instead of interpreting intermediate transformer activations by projecting them to the vocabulary space through "unembedding", look instead for nearest neighbours in a set of intermediate activations resulting from known tokens and contexts.

Quoting Benno Krojer (@benno_krojer):
🚨 New paper: Are the visual tokens going into an LLM interpretable? 🤔 Existing methods (e.g. logit lens) and assumptions would lead you to think "not much"... We propose LatentLens and show that most visual tokens are interpretable across *all* layers 💡 Details 🧵

3 replies · 18 reposts · 130 likes · 16.7K views
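The contrast the tweet describes can be sketched in a few lines: a logit lens projects a hidden state through the unembedding matrix, while the nearest-neighbour approach looks it up against a bank of stored intermediate activations with known source tokens. Everything below (matrix `W_U`, the activation `bank`, all sizes) is synthetic and hypothetical, and this is not the LatentLens implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
d, V, N = 64, 100, 5000
W_U = rng.normal(size=(V, d))             # unembedding matrix (made up)
bank = rng.normal(size=(N, d))            # stored intermediate activations (made up)
bank_tokens = rng.integers(0, V, size=N)  # token each stored activation came from

def logit_lens(h):
    """Interpret h by projecting it straight to vocabulary space."""
    return int(np.argmax(W_U @ h))

def nearest_neighbor_lens(h):
    """Interpret h via the source token of its most similar stored activation."""
    sims = (bank @ h) / (np.linalg.norm(bank, axis=1) * np.linalg.norm(h))
    return int(bank_tokens[np.argmax(sims)])

# A state close to a known activation is labeled with that activation's token.
h = bank[42] + 0.05 * rng.normal(size=d)
print(nearest_neighbor_lens(h) == bank_tokens[42])  # → True
```

The design difference is that the neighbour lookup never assumes intermediate states live in the output embedding space, which is exactly why it can succeed where the logit lens reads as noise.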
Guy Dar (@guy_dar1):
@josephimperial_ That's truly a misuse of the concept of a desk reject, which is of dubious utility as it is.
0 replies · 0 reposts · 1 like · 585 views
Joseph Imperial (@josephimperial_):
Excited to share that my paper was desk rejected for referencing a table in the appendix #ACL2026
11 replies · 1 repost · 133 likes · 26.2K views
Guy Dar (@guy_dar1):
@SwayStar123 To be clear, I don't know the details in question about this case; I'm just saying how it looks to me from the outside. But this outsider view isn't arbitrary intuition; it builds on fuzzy statistics in the end. Statistics may not work for the individual case, but you can assume people think like that.
0 replies · 0 reposts · 0 likes · 56 views
Guy Dar (@guy_dar1):
@SwayStar123 You don't need to be nefarious to dislike being, quite frankly, dunked on (it's not that uncommon that a student does all the work). If you have grievances with your supervisors, even legitimate ones, maybe it's best not to publish them to the world.
3 replies · 0 reposts · 6 likes · 1.1K views
Guy Dar (@guy_dar1):
@daidailoh @SwayStar123 Point being, you don't hold them accountable, because saying a supervisor did nothing is hardly a shock. Of course, many times, even most I'd guess, this is not the case, but you don't out them for anything they are allowed to do.
0 replies · 0 reposts · 0 likes · 33 views
Guy Dar (@guy_dar1):
@daidailoh @SwayStar123 Point is that while being polite gets you nowhere, not being polite gets you negative social points. And it's hardly the case that this bravery is oh so needed. Supervisors are allowed not to contribute; it is much worse if they are unresponsive, but it reflects badly on yourself.
1 reply · 0 reposts · 0 likes · 93 views
Guy Dar (@guy_dar1):
@javi_22025 @SwayStar123 much, but trying to glorify yourself at the expense of others looks bad to future employers and collaborators. It looks petty. Maybe there was a big story in which one person was wronged and he wanted to make that public, but from an outsider's POV it looks like he is the problem.
0 replies · 0 reposts · 0 likes · 22 views
Guy Dar (@guy_dar1):
@javi_22025 @SwayStar123 while a few sparse decisions that come from supervisor experience are actually what did it (btw, that's why the supervisor is the last author). Finally, it mostly reflects on you, because like it or not, it is quite common and often assumed to be the case that supervisors don't contribute >>
1 reply · 0 reposts · 0 likes · 21 views