juri

728 posts

@nlopitz

Researcher @UZH_en

Joined December 2019
382 Following, 578 Followers
Pinned Tweet
juri
juri@nlopitz·
Should I use Macro F1 or Accuracy? Why not Kappa? Why do some use this, and others that? What's actually evaluated here? 😵‍💫 Happy to share the final version of this paper on multi-class classification evaluation: direct.mit.edu/tacl/article/d… #machinelearning #nlproc #ml
English
0
7
21
2.2K
juri retweeted
Gautam Kamath
Gautam Kamath@thegautamkamath·
It's so cringe when real people I otherwise know and respect post obvious AI slop on social media, particularly when they're (supposedly) expressing their feelings. Authenticity is so rare and valuable these days, and it's sad to see people just cede it from the get-go
English
3
9
109
25.4K
juri
juri@nlopitz·
@deliprao well, gotta appreciate the honesty. I think that it's actually better than doing a rushed read, misunderstanding everything, and then hitting the reject recommendation with a confidence of 4.
English
0
0
5
984
juri
juri@nlopitz·
Re LLMs as reviewers to cope with submission load. LLMs and AI models have essentially been trained on a snapshot of the past, afaik with a gap of up to 2-3 years or even more until now. How can they be good reviewers in peer-review, and on what metric?
English
0
0
3
541
juri retweeted
Michael Merrifield
Michael Merrifield@AstroMikeMerri·
Wow, so much whining about arXiv’s steps to reduce AI slop. So easy to deal with for authors who actually read their own papers before submitting them.
English
7
19
281
6.4K
juri
juri@nlopitz·
@predict_addict yeah, it's really hard to understand how anyone could defend this. and also not everyone has to be in science.
English
1
0
0
32
Valeriy M., PhD, MBA, CQF
Valeriy M., PhD, MBA, CQF@predict_addict·
Why are some academics defending pollution of science. ArXiv has made great long needed action to stem avalanche of fake papers.
Steinn Sigurðsson@steinly0

on the whole @arxiv flap about hallucinated references etc you don't see the stuff we reject... some of it is really really egregious the decision to impose additional consequences is largely to throttle that stuff so n00bs and bad actors don't trash us trying repeatedly

English
2
1
8
712
juri
juri@nlopitz·
@yoavgo Who do you mean by "we"? Has someone claimed this authority?
English
0
0
0
39
(((ل()(ل() 'yoav))))👾
"I've been doing AI for 20 years and ..." and nothing. LLMs are new. LLM-Agents are new. our 20+ years experience with AI/ML/NLP may be marginally useful for understanding aspects of their training, but thats about it. we need new tools and experiences. we dont deserve authority.
English
29
31
404
23.5K
Djamé..
Djamé..@zehavoc·
Can't believe there was no international scandals with @aclmeeting #ACL2026 's registration prices. $1200 the full conference for an academic ? Do they think we're made of gold or what ???
English
2
0
11
1.4K
juri retweeted
Nic Barker
Nic Barker@nicbarkeragain·
One of the biggest problems with using LLMs as a google replacement for programming, is that getting zero relevant results on google used to be a signal that you had the wrong idea about the root cause. Whereas LLMs will happily indulge any terrible idea you suggest.
English
141
620
10.2K
194.9K
juri retweeted
dinosaur
dinosaur@dinosaurs1969·
dinosaur tweet media
ZXX
119
1.5K
33.4K
511K
juri retweeted
Michael Roth
Michael Roth@microth·
📢 Postdoc Position in NLP @ UTN in Nuremberg, Germany I am looking for a full-time postdoctoral researcher (A13/E13, initial contract for 3 yrs) starting July 2026 or as soon as possible thereafter. Focus on implicit & underspecified language, background knowledge and/or biases.
English
1
7
35
4.3K
Alexi Gladstone
Alexi Gladstone@AlexiGlad·
looks like there's gonna be around 40k neurips submissions? the biggest exponential in ai right now is slop
English
15
8
274
24.4K
juri
juri@nlopitz·
@AlexiGlad Imagine the amount of wasted electricity and money that's been dumped into that. Any benefit for science? At least it doesn't show yet, I would say.
English
0
0
0
997
juri
juri@nlopitz·
@zehavoc I see, maybe you could try writing in the abstract smth like "In this short paper, we..." Perhaps it can help a little with this issue?
English
0
0
0
10
Djamé..
Djamé..@zehavoc·
@nlopitz They ask “give me a review of this paper” instead of “give me a review of this short paper”
English
1
0
0
27
Djamé..
Djamé..@zehavoc·
Besides world peace and my family's health, my main wish this year is for LLM-based reviewers to specify whether the paper they ask to review is a SHORT paper or not. ChatGPT and Claude have no idea when they review, they're not calibrated to handle this difference by default.
GIF
English
2
0
4
156
juri
juri@nlopitz·
@yashYRS Hi, the link in the paper and on arxiv to your github repo is not working. Would be great if you could fix that 🙂
English
1
0
0
89
Yash Sarrof
Yash Sarrof@yashYRS·
In principle, CoT makes Transformers Turing Complete, but empirically LLMs struggle at longer lengths. In our paper, we study Transformer+CoT length generalization and prove that with a finite vocab, models can't solve problems beyond the restricted class TC0. But there’s a fix🧵
Yash Sarrof tweet media
English
3
16
90
14.6K
juri
juri@nlopitz·
Just raised some points in the rebuttal as reviewer, after thoughtful author responses! Sadly our own paper doesn't seem to receive the honor, even though reviewers thanked us for having added an experiment and issues "clarified" or "resolved" 🥲
English
0
0
2
93
juri
juri@nlopitz·
@pcastr This is somewhat encouraged by how workshop selection works, which has gotten quite competitive. "Famous" person as speaker -> acceptance chance for WS increases.
English
0
0
0
26
Pablo Samuel Castro
Pablo Samuel Castro@pcastr·
I wish there were a way to increase diversity in workshop keynotes/panelists. There are a few famous researchers who end up being keynotes/panelists on multiple workshops, which means lots of other great researchers are not getting those opportunities.
English
10
12
166
12.7K
juri
juri@nlopitz·
@soniajoseph_ @iclr_conf Cool work! Just a nit on "a lot of interpretability work implicitly assumes [language reps] are sparse, linear, and decomposable into independent features." I don't think many people assume language reps are decomposable, linear, etc.
English
1
0
0
187
Sonia Joseph
Sonia Joseph@soniajoseph_·
Interpretability is built on a few core assumptions. Two of our ICLR 2026 @iclr_conf papers suggest some of those assumptions are wrong (or at least highly incomplete).
1. Sparse CLIP: Co-Optimizing Interpretability and Performance in Contrastive Learning arxiv.org/abs/2601.20075
much of the field has internalized an interpretability–accuracy trade-off: if you want cleaner, more human-understandable features, you sacrifice performance. however, we find that this trade-off is not fundamental. instead of relying on post-hoc methods (e.g. sparse autoencoders trained on frozen representations), we incorporate sparsity directly into CLIP training. surprisingly, this produces features that are significantly more interpretable while preserving downstream performance. this result made me more optimistic about intrinsically interpretable models, a direction that was imo written off too early.
-
2. Into the Rabbit Hull: From Task-Relevant Concepts in DINO to Minkowski Geometry arxiv.org/abs/2510.08638
a lot of interpretability work implicitly assumes that vision representations behave like language: sparse, linear, and decomposable into independent features. we find that this assumption is often misleading. instead, vision representations appear partially dense and geometrically structured. we propose the Minkowski Representation Hypothesis: tokens live in sums of convex regions formed from a small set of “archetypes,” rather than isolated features along linear directions. this reframes how different tasks (classification, segmentation, depth) recruit and organize concepts. it also suggests that many current interpretability tools are mismatched to the actual structure of vision data.
--
tldr; interpretability can be built into training with surprisingly simple tweaks, and that different modalities have different sparsities/geometries. Tailoring the interp method to the modality is super impt!
Sonia Joseph tweet media
English
9
49
482
34.5K