juri

691 posts

@nlopitz

Researcher @UZH_en

Joined December 2019
382 Following · 576 Followers
Pinned Tweet
juri
juri@nlopitz·
Should I use Macro F1 or Accuracy? Why not Kappa? Why do some use this, and others that? What's actually evaluated here? 😵‍💫 Happy to share the final version of this paper on multi-class classification evaluation: direct.mit.edu/tacl/article/d… #machinelearning #nlproc #ml
0 replies · 7 retweets · 21 likes · 2K views
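The pinned tweet's question (Accuracy vs. Macro F1 vs. Kappa) can be made concrete with a toy example. This is a minimal sketch, not code from the paper: the label arrays and the plain-Python metric implementations below are illustrative, using the standard definitions of the three metrics.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    # Per-class F1, averaged with equal weight per class.
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)

def cohens_kappa(y_true, y_pred):
    # Observed agreement corrected for chance agreement from the marginals.
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    t_counts, p_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(t_counts[c] * p_counts[c] for c in t_counts) / n ** 2
    return (p_o - p_e) / (1 - p_e)

# Imbalanced 3-class data; the "classifier" just predicts the majority class.
y_true = [0] * 8 + [1, 2]
y_pred = [0] * 10

print(accuracy(y_true, y_pred))            # 0.8
print(round(macro_f1(y_true, y_pred), 3))  # 0.296
print(cohens_kappa(y_true, y_pred))        # 0.0
```

The three metrics disagree sharply on the same predictions: accuracy rewards the majority-class guess, macro F1 penalizes the ignored minority classes, and kappa reports zero agreement beyond chance, which is part of why the choice of metric matters.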
juri
juri@nlopitz·
@PandaAshwinee i think the main issue with what the AC did here is that they simply assumed a changed stance of the reviewer without even consulting them: "Reviewer GRXx would increase the rating to 6" (not true, and i guess this is what very much annoyed this reviewer)
0 replies · 0 retweets · 0 likes · 375 views
Ashwinee Panda
Ashwinee Panda@PandaAshwinee·
there's turmoil when reviewers think the ACs decision conflicts with their own. there's turmoil when authors think ACs aren't overriding reviewers. so, there's always turmoil. an AC's job is not just to average the scores of reviewers. but overruling 3 rejects is tricky...
5 replies · 1 retweet · 35 likes · 7.8K views
juri
juri@nlopitz·
@BetleyJan @OwainEvans_UK thanks - i had understood the tweet as meaning the training data of gpt-4.1 (which isn't public afaik)
0 replies · 0 retweets · 0 likes · 15 views
Owain Evans
Owain Evans@OwainEvans_UK·
New paper: GPT-4.1 denies being conscious or having feelings. We train it to say it's conscious to see what happens. Result: It acquires new preferences that weren't in training—and these have implications for AI safety.
96 replies · 162 retweets · 982 likes · 145.3K views
juri retweeted
Financial Times
Create enough hallucinated legal arguments, flawed engineering calculations and backdoor-ridden code, and the slop vats fill faster than our capacity to tell good work from bad, writes Tim Harford. Read his column on telling good AI from bad: ft.trib.al/j6Io85O
50 replies · 279 retweets · 854 likes · 189.6K views
juri
juri@nlopitz·
@tallinzen @ChenhaoTan a lot more desk rejections probably also means more noise in those decisions, which would be a pity. yet i also see no way around it.
1 reply · 0 retweets · 0 likes · 47 views
Tal Linzen
Tal Linzen@tallinzen·
@ChenhaoTan absolutely. we need a lot more desk rejections. though this may drive out good ACs...
1 reply · 1 retweet · 11 likes · 1.5K views
Marius Mosbach
Marius Mosbach@mariusmosbach·
@nlopitz Hey Juri :) Thanks a lot! We use llm2vec-unsupervised-simcse as a teacher. I'd argue it didn't use any supervised contrastive data.
1 reply · 0 retweets · 1 like · 62 views
juri
juri@nlopitz·
@mmitchell_ai So I am wondering if you had any specific examples of "extremely useful" scenarios in mind there
1 reply · 0 retweets · 0 likes · 44 views
juri
juri@nlopitz·
@mmitchell_ai Thanks, very interesting! I noticed that you said "LLMs can be extremely useful." I do think that they can be useful (e.g., for translating), but I have so far not found any scenario where they are "extremely" useful, as in, useful to human life, or whatever other "important KPI."
1 reply · 0 retweets · 1 like · 205 views
MMitchell
MMitchell@mmitchell_ai·
"AI" is not a stochastic parrot.🦜 I wrote this piece a couple weeks ago, but it was hard for me to finish up given AI's role in society and war over the past few weeks. I should share it at some point though. Not perfect, but here it is. medium.com/@margarmitchell/no-ai-is-not-a-stochastic-parrot-a99e57766bed
11 replies · 26 retweets · 160 likes · 33K views
juri
juri@nlopitz·
@sivareddyg very cool. i tried this with bert, but i never managed to construct embeddings with it that were better than those from sbert (contrastively trained bert), far from it. current embedders are trained on much more contrastive data than back then, so it's double the challenge.
0 replies · 0 retweets · 0 likes · 27 views
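For context on what "constructing embeddings with BERT" involves: a common baseline is mask-aware mean pooling of the per-token vectors into one sentence vector, which contrastive training (as in SBERT) then sharpens. A minimal sketch, with random vectors standing in for real model outputs (no model is loaded; the pooling step and the cosine comparison are the point):

```python
import math
import random

random.seed(0)

# Stand-in for a BERT last_hidden_state: one vector per token
# (seq_len=6, dim=4). In practice these come from a model forward pass.
hidden = [[random.gauss(0, 1) for _ in range(4)] for _ in range(6)]
mask = [1, 1, 1, 1, 0, 0]  # attention mask: 4 real tokens, 2 padding tokens

# Mean-pool over real tokens only -> one fixed-size sentence embedding.
n_real = sum(mask)
sent_emb = [sum(tok[d] for tok, m in zip(hidden, mask) if m) / n_real
            for d in range(4)]

def cosine(a, b):
    # Cosine similarity, the usual way pooled sentence embeddings are compared.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Pooled like this, off-the-shelf BERT vectors tend to give mediocre similarity scores, which is the gap contrastive objectives were introduced to close.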
Siva Reddy
Siva Reddy@sivareddyg·
@nlopitz yes, we don't use contrastive learning anymore. stay tuned!
1 reply · 0 retweets · 1 like · 177 views
Siva Reddy
Siva Reddy@sivareddyg·
Controversial take: you don't need any of this. LLMs have gone through a lot of training already, so there ought to be a better method to turn them into extremely good embedding models. This is what my group has been working on. LLM2Vec is one such idea. We have some exciting developments recently where LLMs themselves can generate superior embeddings with zero changes to the LLM. Stay tuned!
dr. jack morris@jxmnop

x.com/i/article/2031…

15 replies · 26 retweets · 528 likes · 64.5K views
juri retweeted
anton
anton@abacaj·
“Make the models cheap to use” “Great, they all forgot how to code” “Now 10x the price”
231 replies · 1.6K retweets · 27.5K likes · 669.3K views
juri
juri@nlopitz·
@zapadooo @MLStreetTalk @jeremyphoward It's really ridiculous: Jeremy has contributed a lot to LLM development, but dare he voice even one thought that isn't total hype.
0 replies · 0 retweets · 2 likes · 852 views
Michael Mullen
Michael Mullen@zapadooo·
@MLStreetTalk @jeremyphoward The telling thing in these comments is how MAD people get that Jeremy is saying what he is saying. Portraying him as being anti-LLM, which he is not. Like gamblers, so many people emotionally attached to their AI right now
2 replies · 0 retweets · 22 likes · 14.6K views
Machine Learning Street Talk
Machine Learning Street Talk@MLStreetTalk·
A masterclass from @jeremyphoward on why AI coding tools can be a trap -- and what 45 years of programming taught him that most vibe coders will never learn.
- AI coding tools exploit gambling psychology
- The difference between typing code and software engineering
- Enterprise coding AND prompt-only vibe coding are "inhumane", i.e. disconnecting humans from understanding-building
- AI tools remove the "desirable difficulty" you need to build deep mental models
Out on MLST now!
35 replies · 77 retweets · 613 likes · 127K views
juri
juri@nlopitz·
@yanaiela For random papers I think so, yeah. Maybe I'm simply overwhelmed by the flood of papers. Written texts have also become a little too homogenized for my taste. That's nice for fluency and perfect grammar, I guess, but it makes reading somehow a little less engaging.
0 replies · 0 retweets · 1 like · 28 views
Yanai Elazar
Yanai Elazar@yanaiela·
@nlopitz do you feel the same over random papers you read day-to-day, or hear about in conferences?
1 reply · 0 retweets · 0 likes · 85 views
Yanai Elazar
Yanai Elazar@yanaiela·
What does it say about your field if you're not excited about any paper you review?
13 replies · 1 retweet · 82 likes · 15.3K views
juri
juri@nlopitz·
@yanaiela I think I also have this issue. My "excitement rate" dropped from like 50% to 5% over the last five years. I have wondered if it is some OpenReview matching problem. I also think that when confs were smaller, ACs could do more selective assignment, which probably helped.
2 replies · 0 retweets · 1 like · 101 views
Kawin Ethayarajh
Kawin Ethayarajh@ethayarajh·
Autoregressive LLMs will likely remain dominant, for three reasons:

1) As @ducx_du has pointed out, left-to-right and right-to-left orderings of language have a much lower loss floor than all other orderings. This suggests that language is (for the most part) locally dependent. The additional capacity and compute needed to model all possible orderings would be more effectively spent in a traditional AR setup.

2) When people say models should be able to generate text in any order, what they really want is to generate *concepts* in any order, not tokens. But we can already do this! If your model has sufficient depth, it can generate some concepts in latent space before others. The rise of reasoning models means that concepts can be explored in an arbitrary order, and in a way that is interpretable. If you take this to the limit, you get Reinforcement Learning Pretraining.

3) AR models won the hardware lottery / software lottery / other lotteries, wherein everything in the ecosystem has bent around them. Unless there are several OOMs of benefit to be gained from switching to another paradigm, it is unlikely that there will be any switch. And because language is the universal glue across modalities, generation in other modalities is likely to be AR as well, to enable end-to-end learning, even if those modalities would benefit from a non-AR model.
Dimitri von Rütte@dvruette

there, I said it. diffusion LLMs are the future! I'll be back in a couple of years to collect my "I told you so" award.

25 replies · 69 retweets · 775 likes · 161.8K views
Christopher Manning
Christopher Manning@chrmanning·
A mysterious new fake publishing scam: @Google Scholar lists me as publishing in the Intl Jnl of CV & AI Applications. And indeed, here I am in the journal: ijcvai.org/index.php/ijcv… And in very good company! @DarioAmodei in the same issue! Except, I didn't write any such paper….
11 replies · 12 retweets · 146 likes · 46K views