Max @max_pe2002
2.4K posts

🇪🇺eu/acc, compressing reality (pied piper)

earth · Joined August 2020
661 Following · 236 Followers

Pinned Tweet
Max @max_pe2002
[image-only post: 4 images attached]
sway @SwayStar123
@NicholasBardy @JiaweiYang118 Not sure this is stable; they showed some samples from overtraining, and it collapses/reward-hacks. So I don't think this would work for pretraining. Correct me if I'm wrong.
Jiawei Yang @JiaweiYang118
Two months ago, I vaguely posted a number: 0.9 FID, one-step, pixel space. Now it is 0.75, and it can go even lower. Many wonder how. I thought it might end as a small FID prank: simple and deliberate. It started with one question: can FID be optimized directly, and what does it reveal? Introducing FD-loss.
[image attached]
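The thread doesn't spell out FD-loss, but the quantity behind FID is standard: the Fréchet distance between two Gaussians fitted to feature statistics, FD = ||mu_r - mu_f||^2 + Tr(S_r + S_f - 2(S_r S_f)^{1/2}). A minimal PyTorch sketch of one plausible differentiable version, assuming features come from some frozen extractor (hypothetical here), with the square-root trace computed from eigenvalues:

```python
import torch

def frechet_distance(mu1, sigma1, mu2, sigma2, eps=1e-6):
    # FD = ||mu1 - mu2||^2 + Tr(S1) + Tr(S2) - 2 * Tr((S1 S2)^{1/2}).
    # For PSD S1, S2, the eigenvalues of S1 @ S2 are real and >= 0, so the
    # trace of the matrix square root is the sum of their square roots.
    diff = (mu1 - mu2).pow(2).sum()
    eigvals = torch.linalg.eigvals(sigma1 @ sigma2).real.clamp(min=0.0)
    covmean_trace = (eigvals + eps).sqrt().sum()  # eps only for numerical safety
    # gradients through eigvals are valid when the eigenvalues are distinct
    return diff + torch.trace(sigma1) + torch.trace(sigma2) - 2.0 * covmean_trace

def fd_loss(real_feats, fake_feats):
    # real_feats, fake_feats: (N, D) features from a frozen extractor
    # (a hypothetical stand-in; the tweet doesn't say which feature space is used)
    mu_r, mu_f = real_feats.mean(0), fake_feats.mean(0)
    cov_r, cov_f = torch.cov(real_feats.T), torch.cov(fake_feats.T)
    return frechet_distance(mu_r, cov_r, mu_f, cov_f)
```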
Max @max_pe2002
@alex_peys yeah, this paper probably works as a straight path from noise to sample, but having to run like 3 encoders might be slower than just training with MSE.
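The trade-off being described: plain MSE adds no extra networks, while a feature-space loss pays a forward pass through every frozen encoder each step (plus a backward through each encoder for the generated sample). A hedged sketch with hypothetical stand-in encoders:

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()

def pixel_loss(x_hat, x):
    # plain MSE: no extra networks, essentially free per step
    return mse(x_hat, x)

def feature_loss(x_hat, x, encoders):
    # encoders: list of frozen feature extractors (hypothetical stand-ins);
    # each adds a forward pass on x and x_hat and a backward through enc(x_hat)
    total = x_hat.new_zeros(())
    for enc in encoders:
        with torch.no_grad():
            target = enc(x)  # features of the clean target, no grad needed
        total = total + mse(enc(x_hat), target)
    return total
```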
Max @max_pe2002
@sama @lukaszkaiser claude code is better, but if you actually want to get things done with more than 2 back-and-forths, codex is better.
Sam Altman @sama
you know what, all of these "which is better" polls are silly. use codex or claude code, whatever works best for you. i am grateful we live in a time with such amazing tools, and grateful there is a choice.
Max @max_pe2002
@scaling01 i feel bad for mistral.
Lisan al Gaib @scaling01
Mistral Medium 3.5 is out and it's a dense 128B model
[2 images attached]
Max @max_pe2002
@LodestoneRock i think the opposite happens with x loss
lodestone-rock @LodestoneRock
was debugging my x0 trainer using 1 image with lots of high-frequency content in it and got this interesting pattern. it seems to converge on high frequencies first, before everything else. seems like x0 is doing frequency weighting implicitly based on your data distribution.
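A possible reading of the implicit weighting: with x_t = a_t·x0 + s_t·ε, an x0-prediction MSE equals the ε-prediction MSE rescaled by (s_t/a_t)², i.e. 1/SNR(t), so the two parameterizations weight noise levels, and hence frequency bands (white noise swamps the weak high frequencies first), differently. A quick numerical check of that identity, with made-up coefficients:

```python
import torch

torch.manual_seed(0)
a_t = 0.9                                    # signal coefficient (made up)
s_t = (1 - a_t ** 2) ** 0.5                  # noise coefficient, VP-style

x0 = torch.randn(4, 3, 8, 8)                 # toy "clean" batch
eps = torch.randn_like(x0)
x_t = a_t * x0 + s_t * eps                   # noised sample

eps_hat = eps + 0.1 * torch.randn_like(eps)  # hypothetical imperfect eps prediction
x0_hat = (x_t - s_t * eps_hat) / a_t         # the x0 prediction it implies

loss_eps = (eps_hat - eps).pow(2).mean()
loss_x0 = (x0_hat - x0).pow(2).mean()
print(loss_x0 / loss_eps)                    # equals (s_t / a_t) ** 2 exactly
print((s_t / a_t) ** 2)
```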
Max @max_pe2002
@levelsio they needed to investigate this? it would have been enough to open X and see all the posts where claude tells them reasoning effort is set to 25%.
Ollin Boer Bohan @madebyollin
@max_pe2002 @gabeeegoooh @SwayStar123 Hmm, have all of your artifacty images been from multi-image chats? It looks like there's a strong bias towards texture/feature copying across images within a chat thread: x.com/madebyollin/st…

Ollin Boer Bohan @madebyollin
@JiaweiYang118 Some of the ChatGPT Images 2 weirdness is texture leakage from context images (see this thread: reddit.com/r/ChatGPT/comm…). You can visualize this by asking for three unrelated images in the same chat (note how the flower texture persists in the circled area).
Max @max_pe2002
what do we think about these GPT Image 2 artifacts? do you guys think the decoder is trained as a GAN, LDM, or pixel DM?
[3 images attached]
Max @max_pe2002
@madebyollin yes, but i also see like a repeating thing at the bottom of the first image.
Ollin Boer Bohan @madebyollin
@max_pe2002 Yeah looks like LDM decoder, maybe with some sort of scheduler issue (not enough mid-frequency noise removed?).
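One way to make the GAN-vs-LDM question empirical: check which frequency band holds the artifact energy. A hedged NumPy sketch of a radially averaged power spectrum; the reading heuristics in the comments are rules of thumb, not a definitive test:

```python
import numpy as np

def radial_power_spectrum(img: np.ndarray) -> np.ndarray:
    # img: 2D float array (one grayscale channel or crop)
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(f) ** 2
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h / 2, xx - w / 2).astype(int)  # integer radius bins
    sums = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    return sums / np.maximum(counts, 1)  # mean power per radius bin
```

Comparing an artifacted crop's curve against a clean crop's shows where the excess energy sits: a bump at middle radii would fit the "not enough mid-frequency noise removed" guess, while sharp peaks near the edge of the spectrum look more like GAN/upsampler checkerboarding.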
Max @max_pe2002
@nicdunz what do you guys think: GAN artifact or LDM artifact?
nic @nicdunz
WHAT IS THIS WEIRD PATTERN MAKING MY IMAGE GENERATIONS FROM IMAGE 2 LOOK HORRIBLE??????
[2 images attached]
Max @max_pe2002
@BenjaminDEKR i think they just might have trained on everything
Max @max_pe2002
@LocallyAIApp i can't get a good answer out of the bonsai models.
[image attached]
Max @max_pe2002
@industriaalist omg, finally someone who understands. with limited data and unlimited compute, diffusion models are better, but per unit of data they are worse.
Samip @industriaalist
diffusion models are not more data-efficient than autoregressive models. stop saying that.
Max @max_pe2002
@SawyerMerritt yeah, still not upgrading my tesla until HW5 is in cars, after what they did to HW3 users.
Prof. Karl Lauterbach @Karl_Lauterbach
AI models keep getting stronger. The differences are so large that you notice them immediately when switching to a better version. So nobody wants a second-tier model. It seems catastrophic to me that we are by now completely out of the running here. An enormous dependency

Antonin Bergeaud @a_bergeaud
Stanford's AI Index report (@erikbryn et al.) came out today, and the first chart is quite painful: hai.stanford.edu/ai-index
Max @max_pe2002
@neural_avb should play boss music as soon as he says "my name is Jürgen Schmidhuber".
AVB @neural_avb
My favourite piece of Schmidhuber lore is when he challenged Ian Goodfellow during a NIPS presentation on GANs, live in public. Deep Learning drama peaked here. You have seen nothing like this.

Yuntian Deng @yuntiandeng
Glad to see follow-ups to neural-os.com, but disappointed that neither the blog (with 34 refs) nor the code repo acknowledged NeuralOS, even though the released data/code appears to build directly on top of ours. That omission is hard to understand given our shared vision.