Guy Dar

2.6K posts

Guy Dar

@guy_dar1

#AI Researcher | A jumped-up pantry boy who never knew his place

Katılım Nisan 2022

259 Takip Edilen511 Takipçiler

Guy Dar@guy_dar1·5d

@torchcompiled But also I feel like softmax is kinda cursed in many ways maybe that's related too?

English

Ethan@torchcompiled·6d

if this is an issue for asymmetry in the LLM head, would we expect it to similarly apply to the up matrices of FFN? Paper mentions softmax affecting the rank of the representation is a factor here, but curious if activation functions could play a similar role.

Nathan Godey@nthngdy

🧵New paper: "Lost in Backpropagation: The LM Head is a Gradient Bottleneck" The output layer of LLMs destroys 95-99% of your training signal during backpropagation, and this significantly slows down pretraining 👇

English

Guy Dar@guy_dar1·7 Mar

@morallawwithin This but unironically

English

florence ⏹️@morallawwithin·7 Mar

okay but consider this: consciousness is magic, and matrix multiplication is not magic

English

515

17.1K

Guy Dar@guy_dar1·6 Mar

@skdh "You paper has been flagged for human-generated content"

English

134

Sabine Hossenfelder@skdh·6 Mar

This is real

English

109

664

55.1K

Guy Dar@guy_dar1·1 Mar

@nikicaga That's hardly accurate. Israel's population is quite nationalist true (though not "insanely", certainly in comparison to other non-Western countries). "Not willing to share.." is a blatant inaccuracy to say the least.

English

959

Nikolaj🇺🇦🇵🇸@nikicaga·1 Mar

Exactly, Israel is the way it is specifically because most of its population is insanely nationalist and has no intention of leaving their holy land (or sharing it with its other indigenous people)

The Force Majeure Groyper@ProgDirectorate

easily my least favourite genre of leftslop. israel would have a population of three today if these wishcasts had panned out as expected over the last few years.

English

692

67.9K

Guy Dar@guy_dar1·1 Mar

@justalexoki Wife guy bond

English

taoki@justalexoki·1 Mar

this is why they killed him

Khamenei.ir@khamenei_ir

According to Islamic tradition,#woman is a flower; how oppressive& evil it is of a man to treat a #flower aggressively/uncaringly. 9/18/2000

English

121

4.6K

Guy Dar@guy_dar1·1 Mar

@StephenKing Dafuq

Filipino

Stephen King@StephenKing·28 Şub

Just when you think it can’t get worse……it,does.

English

6.3K

5.9K

46.2K

2.5M

Guy Dar@guy_dar1·27 Şub

Appropriate name for an insomnia book

English

Guy Dar@guy_dar1·25 Şub

I've had so much potential

hopecore@dailyhopecores

English

362

Guy Dar@guy_dar1·23 Şub

@sharut_gupta Great! Happy that it is!!

English

Sharut Gupta@sharut_gupta·23 Şub

@guy_dar1 Thanks for sharing this, very relevant : )

English

334

Sharut Gupta@sharut_gupta·23 Şub

[1/n] Do distinct large models admit a simple map that aligns their embedding spaces? We show that across multimodal contrastive models—trained on different data and architectures—an orthogonal map aligns image embeddings. Strikingly, the same map also aligns text embeddings.

English

432

35K

Guy Dar@guy_dar1·23 Şub

@Ofirlin Shabbot

English

Ofir Lindenbaum@Ofirlin·23 Şub

בוט של שבת

Jacob Posel@jacob_posel

Can a Jew let his agent run over Shabbat if the last prompt was Friday afternoon?

עברית

443

Guy Dar@guy_dar1·19 Şub

@Andrew_Akbashev Kinda wild but seems like the arxiv plot continues almost uninterrupted despite AI tools. That's really surprising

English

177

Andrew Akbashev@Andrew_Akbashev·18 Şub

A really dangerous situation. Too many submissions. Too many generated papers. Little responsibility. 1. In 2026, more than 24,000 submissions were made to the International Conference on Machine Learning (ICML). It’s TWO times more than in 2025. To fight it, the organizers now require researchers to pay $100 for every subsequent paper. 2. LLM adoption has increased researcher productivity by 90% (there’s a recent paper in Science). 3. The number of papers is becoming far too high. Submissions to arXiv have risen by 50% since 2022. 4. There are simply not enough reviewers. Plus, many scientists no longer want to invest precious time in it for free. 5. We can’t easily identify AI-made papers from the genuine ones. __ Important words from Paul Ginsparg, a co-founder of arXiv: “AI slop frequently can’t be discriminated just by looking at abstract, or even by just skimming full text. This makes it an “existential threat” to the system.” Basically, we’re getting closer to the tipping point. 📍 Many professors blame the AI. But the problem is likely elsewhere: 1. Without a sufficient number of papers, many PIs can’t get funded. They have to prove their credibility to reviewers. Their proposals have to rely on prior publications. In many countries, there are some informal (or even formal) expectations for how many papers a group with a certain size has to publish to survive (funding-wise). 2. Our students / postdocs need papers if they want to be hired in faculty roles. Yes, some departments hire people with few publications. But the majority still want to ensure their faculty can get funded. If funding is partly a function of papers, this is used in decision-making. 3. The number of papers is important if you want to get high-level awards. Many of them are not given because you published one paper (even if it’s great). They are given because you made a meaningful CONTRIBUTION to the field. How do you make it? Publish more papers. 4. Tenure promotions in many places take the number of your papers into account (often indirectly). Your tenure may get delayed if you don’t publish enough. Not everywhere, but for many mid- to low-ranked universities this story is more or less the same. + There are many more to mention. 📍My opinion: Much of this is rooted in how funding is distributed. There is a strong correlation between the requirements at a university and the funding acquisition criteria. If funding were based ONLY on the quality of published papers, universities would hire people for the quality of their science. If funding agencies strongly discouraged publishing too many papers, universities wouldn’t expect numbers from faculty during promotions. And some supervisors wouldn’t pressure students and postdocs to publish unfinished studies and low-quality data. Yes, we need good detectors of fake papers. But we also need the right policies and better funding allocation criteria.

English

378

1.4K

193K

Guy Dar@guy_dar1·18 Şub

@ValerioCapraro Have a few problems with the claims here. But one of the biggest confusions that repeat often -- Intelligence is orthogonal to robustness and reliability. Claiming otherwise would be counter to much of the science on human neurodivergent intelligence.

English

Valerio Capraro@ValerioCapraro·17 Şub

It’s time to make this point neat and clear. Attempts to show that AGI has already been achieved are just plainly wrong. For three reasons: 1) They shift the definition of general intelligence, originally based on robustness, generalization, and reliability, to behavioral alignment with benchmarks. 2) They confuse benchmark performance with capability to handle novelty. Spoiler: these are different. 3) They ignore that the same behavioral output can come from totally different epistemic pipelines. Joint work with @Walter4C and @GaryMarcus Links in the first reply

English

191

892

50.4K

Guy Dar@guy_dar1·15 Şub

@yoavgo Not unrelated: arxiv.org/abs/2310.10591

English

231

(((ل()(ل() 'yoav))))👾@yoavgo·15 Şub

i really like the idea of this paper: instead of interpreting intermediary transformer activations by projecting them to the vocabulary space through "unembedding", look instead for nearest neighbours in a set of intermediary activations resulting from known tokens and contexts.

Benno Krojer@benno_krojer

🚨New paper Are visual tokens going into an LLM interpretable 🤔 Existing methods (e.g. logit lens) and assumptions would lead you to think “not much”... We propose LatentLens and show that most visual tokens are interpretable across *all* layers 💡 Details 🧵

English

130

16.7K

Guy Dar@guy_dar1·12 Şub

@josephimperial_ That's truly a misuse of the concept of desk reject, which is of dubious utility as it is

English

585

Joseph Imperial@josephimperial_·12 Şub

Excited to share that my paper was desk rejected for referencing a table in the appendix #ACL2026

English

133

26.2K

Guy Dar@guy_dar1·12 Şub

@SwayStar123 To be clear I don't know the details in question abt this case, I just say how it looks to me from outside. But this outsider view isnt arbitrary intuition builds on fuzzy statistics in the end. Statistics may not work at the individual case but u can assume ppl think like that

English

Guy Dar@guy_dar1·12 Şub

@SwayStar123 You don't need to be nefarious to not like being quite frankly dunked on (it's not that uncommon that a student does all the work). If you have grievances with your supervisors, even legitimate ones, maybe it's best not to publish them to the world.

English

1.1K

sway@SwayStar123·11 Şub

v1 of paper vs v2 of paper Some supervisor didnt like this section lol

llm_enjoyer@LLMenjoyer

really proud of my homie @Nick__Alonso for dropping his latest banger. novel efficient long context attention method: arxiv.org/abs/2602.03922

English

121

30.3K

Guy Dar@guy_dar1·12 Şub

@daidailoh @SwayStar123 Point being, you don't make them accountable bc saying supervisor did nothing is hardly a shock. Ofc, many times, even most I guess this is not the case, but you don't out them for anything they aren't allowed to do

English

Guy Dar@guy_dar1·12 Şub

@daidailoh @SwayStar123 Point is that while being polite gets you nowhere, not being polite gets you negative social points. And it's hardly the case where this bravery is oh so needed. Supervisors are allowed to not contribute, it is much worse if they are not responsive, but I reflect badly on urself

English

Guy Dar@guy_dar1·12 Şub

@javi_22025 @SwayStar123 much, but trying to glorify urself on expense of others looks bad to future employers and collaborators. It looks petty. Maybe there was a big story in which one person was wronged and he wanted to make that public, but from an outsider pov it looks like he is the problem

English

Guy Dar@guy_dar1·12 Şub

@javi_22025 @SwayStar123 while a few sparse decisions that come from supervisor experience is actually what did it - btw that's why supervisor is last author - finally, it mostly reflects on you, bc like it or not, this is quite common and often assumed to be the case that supervisors don't contribute >>

English

Keşfet

@torchcompiled @morallawwithin @skdh @nikicaga @justalexoki @StephenKing @sharut_gupta @Ofirlin