

Distilling Qwen-Image-2512 using TwinFlow: a student soaking up knowledge from a monster teacher. I think Qwen-Image is capable of far more than Z-Image, at least on the realistic front (personal observation). Slashed batch times with MP and custom augs; 8xH200 pinned at max pretty much. Also added RL to the loop. In theory it should get better than the teacher, but that remains to be seen, since the RL only kicks in after 2k steps. The loss won't be very indicative of how the training is going, I guess, since it's a distillation run, so I'm attaching some results on how it's going instead [at steps 1100 and 1600].
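
For anyone wondering what "RL kicks in after 2k steps" might look like in a training loop, here is a rough sketch, not the actual run's code. The names `rl_weight`, `combined_loss`, `rl_start_step`, and `ramp_steps` are all hypothetical, and the linear ramp after the start step is my own assumption (a hard switch is just `ramp_steps=0`). It only shows gating an RL term on top of a distillation loss; the real TwinFlow objective and reward are not modeled.

```python
def rl_weight(step: int,
              rl_start_step: int = 2000,   # from the post: RL starts after 2k steps
              ramp_steps: int = 500,       # assumption: linear warm-up length for the RL term
              max_weight: float = 1.0) -> float:
    """Weight on the RL term: 0 before rl_start_step, then a linear ramp up to max_weight."""
    if step < rl_start_step:
        return 0.0
    if ramp_steps <= 0:
        return max_weight  # hard switch, no ramp
    progress = min(1.0, (step - rl_start_step) / ramp_steps)
    return max_weight * progress


def combined_loss(distill_loss: float, rl_loss: float, step: int) -> float:
    """Distillation loss plus the step-gated RL term (both passed in as plain floats here)."""
    return distill_loss + rl_weight(step) * rl_loss


if __name__ == "__main__":
    # Weight on the RL term at a few steps, including the two sample checkpoints.
    for step in (1100, 1600, 2000, 2250, 3000):
        print(step, rl_weight(step))
```

With a gate like this, the samples at steps 1100 and 1600 come from distillation alone, so any gain over the teacher from RL would only start to show in checkpoints past step 2000.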

























