
@yar_vol @abhi_venigalla @MosaicML One of the authors here. Here's the loss for the first 1000 iterations of a run we just did on 256 GPUs, starting from init. Loss for diffusion models doesn't always correlate with sample quality, so we're doing the work to really prove things are working for an upcoming blog post :)


