Max @max_pe2002
2.4K posts

🇪🇺eu/acc, compressing reality (pied piper)

earth · Joined August 2020
661 Following · 236 Followers

Pinned Tweet
Max @max_pe2002
[image-only post: 4 images attached]
sway @SwayStar123
@NicholasBardy @JiaweiYang118 Not sure this is stable; they showed some samples from overtraining, and it collapses/reward-hacks. So I don't think this would work for pretraining. Correct me if I'm wrong.
Jiawei Yang @JiaweiYang118
Two months ago, I vaguely posted a number: 0.9 FID, one-step, pixel space. Now it is 0.75, and it can go even lower. Many wonder how. I thought it might end as a small FID prank: simple and deliberate. It started with one question: can FID be optimized directly, and what does it reveal? Introducing FD-loss.
[image attached]
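The thread doesn't spell out FD-loss, but the quantity behind FID is standard: the Fréchet distance between two Gaussians fitted to feature statistics, FD = ||mu_r - mu_f||^2 + Tr(S_r + S_f - 2(S_r S_f)^{1/2}). A minimal PyTorch sketch of one plausible differentiable version, assuming features come from some frozen extractor (hypothetical here), with the square-root trace computed from eigenvalues:

```python
import torch

def frechet_distance(mu1, sigma1, mu2, sigma2, eps=1e-6):
    # FD = ||mu1 - mu2||^2 + Tr(S1) + Tr(S2) - 2 * Tr((S1 S2)^{1/2}).
    # For PSD S1, S2, the eigenvalues of S1 @ S2 are real and >= 0, so the
    # trace of the matrix square root is the sum of their square roots.
    diff = (mu1 - mu2).pow(2).sum()
    eigvals = torch.linalg.eigvals(sigma1 @ sigma2).real.clamp(min=0.0)
    covmean_trace = (eigvals + eps).sqrt().sum()  # eps only for numerical safety
    # gradients through eigvals are valid when the eigenvalues are distinct
    return diff + torch.trace(sigma1) + torch.trace(sigma2) - 2.0 * covmean_trace

def fd_loss(real_feats, fake_feats):
    # real_feats, fake_feats: (N, D) features from a frozen extractor
    # (a hypothetical stand-in; the tweet doesn't say which feature space is used)
    mu_r, mu_f = real_feats.mean(0), fake_feats.mean(0)
    cov_r, cov_f = torch.cov(real_feats.T), torch.cov(fake_feats.T)
    return frechet_distance(mu_r, cov_r, mu_f, cov_f)
```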
Max @max_pe2002
@alex_peys yeah, this paper probably works as a straight path from noise to sample, but having to run like 3 encoders might be slower than just training with MSE.
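The trade-off being described: plain MSE adds no extra networks, while a feature-space loss pays a forward pass through every frozen encoder each step (plus a backward through each encoder for the generated sample). A hedged sketch with hypothetical stand-in encoders:

```python
import torch
import torch.nn as nn

mse = nn.MSELoss()

def pixel_loss(x_hat, x):
    # plain MSE: no extra networks, essentially free per step
    return mse(x_hat, x)

def feature_loss(x_hat, x, encoders):
    # encoders: list of frozen feature extractors (hypothetical stand-ins);
    # each adds a forward pass on x and x_hat and a backward through enc(x_hat)
    total = x_hat.new_zeros(())
    for enc in encoders:
        with torch.no_grad():
            target = enc(x)  # features of the clean target, no grad needed
        total = total + mse(enc(x_hat), target)
    return total
```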
Max @max_pe2002
@sama @lukaszkaiser claude code is better, but if you actually want to get things done with more than 2 back-and-forths, codex is better.
Sam Altman @sama
you know what, all of these "which is better" polls are silly. use codex or claude code, whatever works best for you. i am grateful we live in a time with such amazing tools, and grateful there is a choice.
Max @max_pe2002
@scaling01 i feel bad for mistral.
Lisan al Gaib @scaling01
Mistral Medium 3.5 is out and it's a dense 128B model
[2 images attached]
Max @max_pe2002
@LodestoneRock i think the opposite happens with x loss
lodestone-rock @LodestoneRock
was debugging my x0 trainer using 1 image with lots of high-frequency content in it and got this interesting pattern. it seems to converge on high frequencies first, before everything else. seems like x0 is doing frequency weighting implicitly based on your data distribution.
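A possible reading of the implicit weighting: with x_t = a_t·x0 + s_t·ε, an x0-prediction MSE equals the ε-prediction MSE rescaled by (s_t/a_t)², i.e. 1/SNR(t), so the two parameterizations weight noise levels, and hence frequency bands (white noise swamps the weak high frequencies first), differently. A quick numerical check of that identity, with made-up coefficients:

```python
import torch

torch.manual_seed(0)
a_t = 0.9                                    # signal coefficient (made up)
s_t = (1 - a_t ** 2) ** 0.5                  # noise coefficient, VP-style

x0 = torch.randn(4, 3, 8, 8)                 # toy "clean" batch
eps = torch.randn_like(x0)
x_t = a_t * x0 + s_t * eps                   # noised sample

eps_hat = eps + 0.1 * torch.randn_like(eps)  # hypothetical imperfect eps prediction
x0_hat = (x_t - s_t * eps_hat) / a_t         # the x0 prediction it implies

loss_eps = (eps_hat - eps).pow(2).mean()
loss_x0 = (x0_hat - x0).pow(2).mean()
print(loss_x0 / loss_eps)                    # equals (s_t / a_t) ** 2 exactly
print((s_t / a_t) ** 2)
```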
Max @max_pe2002
@levelsio they needed to investigate this? it would have been enough to open X and see all the posts where claude tells them reasoning effort is set to 25%.
Ollin Boer Bohan @madebyollin
@max_pe2002 @gabeeegoooh @SwayStar123 Hmm, have all of your artifacty images been from multi-image chats? It looks like there's a strong bias towards texture/feature copying across images within a chat thread: x.com/madebyollin/st…

Ollin Boer Bohan @madebyollin
@JiaweiYang118 Some of the ChatGPT Images 2 weirdness is texture leakage from context images (see this thread: reddit.com/r/ChatGPT/comm…). You can visualize this by asking for three unrelated images in the same chat (note how the flower texture persists in the circled area).
Max @max_pe2002
what do we think about these GPT Image 2 artifacts? do you guys think the decoder is trained as a GAN, LDM, or pixel DM?
[3 images attached]
Max @max_pe2002
@madebyollin yes, but i also see like a repeating thing at the bottom of the first image.
Ollin Boer Bohan @madebyollin
@max_pe2002 Yeah looks like LDM decoder, maybe with some sort of scheduler issue (not enough mid-frequency noise removed?).
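One way to make the GAN-vs-LDM question empirical: check which frequency band holds the artifact energy. A hedged NumPy sketch of a radially averaged power spectrum; the reading heuristics in the comments are rules of thumb, not a definitive test:

```python
import numpy as np

def radial_power_spectrum(img: np.ndarray) -> np.ndarray:
    # img: 2D float array (one grayscale channel or crop)
    f = np.fft.fftshift(np.fft.fft2(img))
    power = np.abs(f) ** 2
    h, w = img.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h / 2, xx - w / 2).astype(int)  # integer radius bins
    sums = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    return sums / np.maximum(counts, 1)  # mean power per radius bin
```

Comparing an artifacted crop's curve against a clean crop's shows where the excess energy sits: a bump at middle radii would fit the "not enough mid-frequency noise removed" guess, while sharp peaks near the edge of the spectrum look more like GAN/upsampler checkerboarding.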
Max @max_pe2002
@nicdunz what do you guys think: GAN artifact or LDM artifact?
nic @nicdunz
WHAT IS THIS WEIRD PATTERN MAKING MY IMAGE GENERATIONS FROM IMAGE 2 LOOK HORRIBLE??????
[2 images attached]
Max @max_pe2002
@BenjaminDEKR i think they just might have trained on everything
Max @max_pe2002
@LocallyAIApp i can't get a good answer out of the bonsai models.
[image attached]
Max @max_pe2002
@industriaalist omg, finally someone who understands. with limited data and unlimited compute, diffusion models are better, but per unit of data they are worse.
Samip @industriaalist
diffusion models are not more data-efficient than autoregressive models. stop saying that.
Max @max_pe2002
@SawyerMerritt yeah, still not upgrading my tesla until HW5 is in cars, after what they did to HW3 users.
Prof. Karl Lauterbach @Karl_Lauterbach
AI models keep getting stronger. The differences are so large that you notice them immediately when switching to a better version. So nobody wants a second-tier model. It seems catastrophic to me that we are by now completely out of the running here. An enormous dependency

Antonin Bergeaud @a_bergeaud
Stanford's AI Index report (@erikbryn et al.) came out today, and the first chart is quite painful: hai.stanford.edu/ai-index
Max @max_pe2002
@neural_avb should play boss music as soon as he says "my name is Jürgen Schmidhuber".
AVB @neural_avb
My favourite piece of Schmidhuber lore is when he challenged Ian Goodfellow during a NIPS presentation on GANs, live in public. Deep Learning drama peaked here. You have seen nothing like this.

Yuntian Deng @yuntiandeng
Glad to see follow-ups to neural-os.com, but disappointed that neither the blog (with 34 refs) nor the code repo acknowledged NeuralOS, even though the released data/code appears to build directly on top of ours. That omission is hard to understand given our shared vision.