
Guy Dar
@guy_dar1
2.6K posts
#AI Researcher | A jumped-up pantry boy who never knew his place
Joined April 2022
259 Following · 511 Followers

@torchcompiled But also I feel like softmax is kinda cursed in many ways; maybe that's related too?

If this is an issue of asymmetry in the LM head, would we expect it to similarly apply to the up-projection matrices of the FFN?
The paper mentions that softmax affecting the rank of the representation is a factor here, but I'm curious whether activation functions could play a similar role.
Nathan Godey @nthngdy
🧵New paper: "Lost in Backpropagation: The LM Head is a Gradient Bottleneck" The output layer of LLMs destroys 95-99% of your training signal during backpropagation, and this significantly slows down pretraining 👇
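The attenuation claim can be illustrated with a toy numerical sketch. This is not the paper's setup: the sizes, the random unembedding matrix, and the 1e-2 singular-value threshold are all invented for illustration. The point is that the softmax Jacobian diag(p) − p pᵀ has rows summing to zero (rank at most vocab − 1), and when p is peaked most of its singular values are tiny, so most directions of the incoming gradient are suppressed on the way back through the head:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, vocab = 64, 1000                  # toy sizes, not from the paper
W = rng.normal(0.0, d_model ** -0.5, size=(vocab, d_model))  # toy unembedding
h = rng.normal(size=d_model)               # hidden state entering the LM head

logits = W @ h
p = np.exp(logits - logits.max())
p /= p.sum()                               # softmax probabilities

target = 3                                 # arbitrary gold token
g_logits = p.copy()
g_logits[target] -= 1.0                    # d(cross-entropy)/d(logits) = p - onehot(target)
g_h = W.T @ g_logits                       # gradient that reaches the hidden state

# Softmax Jacobian diag(p) - p p^T: its rows sum to zero, so its rank is
# at most vocab - 1; gradient components in the small-singular-value
# directions are attenuated during backpropagation.
J = np.diag(p) - np.outer(p, p)
sv = np.linalg.svd(J, compute_uv=False)
kept = int((sv > 1e-2 * sv[0]).sum())      # directions carrying non-negligible signal
print(kept, "of", vocab, "directions survive the softmax Jacobian")
```

Counting how many singular values clear an (arbitrary) threshold gives a crude picture of how few directions of the gradient actually carry signal through the head.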

Exactly, Israel is the way it is specifically because most of its population is insanely nationalist and has no intention of leaving their holy land (or sharing it with its other indigenous people)
The Force Majeure Groyper @ProgDirectorate
easily my least favourite genre of leftslop. israel would have a population of three today if these wishcasts had panned out as expected over the last few years.

@Andrew_Akbashev Kinda wild, but it seems like the arXiv plot continues almost uninterrupted despite AI tools. That's really surprising

A really dangerous situation. Too many submissions. Too many generated papers. Little responsibility.
1. In 2026, more than 24,000 submissions were made to the International Conference on Machine Learning (ICML). It’s TWO times more than in 2025. To fight it, the organizers now require researchers to pay $100 for every subsequent paper.
2. LLM adoption has increased researcher productivity by 90% (there’s a recent paper in Science).
3. The number of papers is becoming far too high. Submissions to arXiv have risen by 50% since 2022.
4. There are simply not enough reviewers. Plus, many scientists no longer want to invest precious time in it for free.
5. We can’t easily identify AI-made papers from the genuine ones.
__
Important words from Paul Ginsparg, a co-founder of arXiv:
“AI slop frequently can’t be discriminated just by looking at abstract, or even by just skimming full text. This makes it an “existential threat” to the system.”
Basically, we’re getting closer to the tipping point.
📍 Many professors blame the AI.
But the problem is likely elsewhere:
1. Without a sufficient number of papers, many PIs can’t get funded. They have to prove their credibility to reviewers. Their proposals have to rely on prior publications. In many countries, there are some informal (or even formal) expectations for how many papers a group with a certain size has to publish to survive (funding-wise).
2. Our students / postdocs need papers if they want to be hired in faculty roles. Yes, some departments hire people with few publications. But the majority still want to ensure their faculty can get funded. If funding is partly a function of papers, this is used in decision-making.
3. The number of papers is important if you want to get high-level awards. Many of them are not given because you published one paper (even if it’s great). They are given because you made a meaningful CONTRIBUTION to the field. How do you make it? Publish more papers.
4. Tenure promotions in many places take the number of your papers into account (often indirectly). Your tenure may get delayed if you don’t publish enough. Not everywhere, but for many mid- to low-ranked universities this story is more or less the same.
+ There are many more to mention.
📍My opinion:
Much of this is rooted in how funding is distributed.
There is a strong correlation between the requirements at a university and the funding acquisition criteria.
If funding were based ONLY on the quality of published papers, universities would hire people for the quality of their science. If funding agencies strongly discouraged publishing too many papers, universities wouldn’t expect numbers from faculty during promotions. And some supervisors wouldn’t pressure students and postdocs to publish unfinished studies and low-quality data.
Yes, we need good detectors of fake papers.
But we also need the right policies and better funding allocation criteria.


@ValerioCapraro I have a few problems with the claims here. But one of the biggest confusions, one that repeats often: intelligence is orthogonal to robustness and reliability. Claiming otherwise would run counter to much of the science on human neurodivergent intelligence.

It’s time to make this point neat and clear.
Attempts to show that AGI has already been achieved are just plainly wrong. For three reasons:
1) They shift the definition of general intelligence, originally based on robustness, generalization, and reliability, to behavioral alignment with benchmarks.
2) They confuse benchmark performance with capability to handle novelty. Spoiler: these are different.
3) They ignore that the same behavioral output can come from totally different epistemic pipelines.
Joint work with @Walter4C and @GaryMarcus
Links in the first reply


I really like the idea of this paper: instead of interpreting intermediate transformer activations by projecting them to the vocabulary space through "unembedding", look for nearest neighbours in a set of intermediate activations produced by known tokens and contexts.
Benno Krojer @benno_krojer
🚨New paper Are visual tokens going into an LLM interpretable 🤔 Existing methods (e.g. logit lens) and assumptions would lead you to think “not much”... We propose LatentLens and show that most visual tokens are interpretable across *all* layers 💡 Details 🧵
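A minimal sketch of the nearest-neighbour idea, not the paper's LatentLens implementation: the labels, bank contents, and sizes below are invented stand-ins. Instead of unembedding a hidden state, we label it with the token whose cached intermediate activation is closest under cosine similarity:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32                                    # toy hidden size, invented
labels = ["cat", "dog", "car", "tree"]    # tokens whose activations were cached
bank = rng.normal(size=(len(labels), d))  # stand-in for cached intermediate activations

def nn_interpret(h, bank, labels):
    """Label a hidden state by its nearest cached activation (cosine similarity)."""
    sims = bank @ h / (np.linalg.norm(bank, axis=1) * np.linalg.norm(h))
    return labels[int(np.argmax(sims))]

# A query activation near the cached "dog" entry, plus small noise:
query = bank[1] + 0.1 * rng.normal(size=d)
answer = nn_interpret(query, bank, labels)
```

Unlike the logit lens, this never assumes intermediate activations live in the output-embedding space; it only assumes similar activations mean similar content.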

@josephimperial_ That's truly a misuse of the concept of desk reject, which is of dubious utility as it is

Excited to share that my paper was desk rejected for referencing a table in the appendix #ACL2026

@SwayStar123 To be clear, I don't know the details in question about this case, I just say how it looks to me from the outside. But this outsider view isn't arbitrary intuition; it builds on fuzzy statistics in the end. Statistics may not work at the individual level, but you can assume people think like that

@SwayStar123 You don't need to be nefarious to not like being quite frankly dunked on (it's not that uncommon that a student does all the work). If you have grievances with your supervisors, even legitimate ones, maybe it's best not to publish them to the world.

@daidailoh @SwayStar123 Point being, you don't make them accountable, because saying the supervisor did nothing is hardly a shock. Of course, many times, even most I guess, this is not the case, but you don't out them for anything they aren't allowed to do

@daidailoh @SwayStar123 Point is that while being polite gets you nowhere, not being polite gets you negative social points. And it's hardly the case that this bravery is oh so needed. Supervisors are allowed to not contribute; it is much worse if they are not responsive. But it reflects badly on yourself

@javi_22025 @SwayStar123 much, but trying to glorify yourself at the expense of others looks bad to future employers and collaborators. It looks petty. Maybe there was a big story in which one person was wronged and he wanted to make that public, but from an outsider's pov it looks like he is the problem

@javi_22025 @SwayStar123 while a few sparse decisions that come from supervisor experience are actually what did it
- btw, that's why the supervisor is last author
- finally, it mostly reflects on you, because like it or not, it is quite common and often assumed to be the case that supervisors don't contribute >>
