WanVideo sliding context window test on the 480p 14B T2V -model, worked surprisingly well but was incredibly slow (50 mins on a 5090 for 513 frames at 832x480)
@jqlive Absolutely, I have the rudimentary implementation in my HunyuanVideo wrapper nodes in ComfyUI as well, it needs some updating to be as good though.
@6___0 It's due to the technique itself: each step basically does the whole video in chunks, which include the overlaps, so many, many passes through the model are required. Actual VRAM use is really low since only 81 frames are processed at once, as that's what the model does best.
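The chunked sampling described above can be sketched roughly like this (all names here are illustrative, not the actual node code): each denoising step runs the model over overlapping windows of frames and averages the overlaps, so a long video costs many model passes but only ever holds one window in VRAM.

```python
import numpy as np

def context_windows(num_frames, window=81, overlap=16):
    """Yield overlapping [start, end) frame ranges covering the video."""
    stride = window - overlap
    start = 0
    while start < num_frames:
        end = min(start + window, num_frames)
        yield start, end
        if end == num_frames:
            break
        start += stride

def denoise_step(latents, model_fn, window=81, overlap=16):
    """One diffusion step over the full video via windowed model passes."""
    out = np.zeros_like(latents)
    weight = np.zeros((latents.shape[0], 1))
    for s, e in context_windows(latents.shape[0], window, overlap):
        chunk = model_fn(latents[s:e])   # one model pass per window
        w = np.ones((e - s, 1))          # uniform blend weights (sketch)
        out[s:e] += chunk * w
        weight[s:e] += w
    return out / weight                  # average where windows overlap
```

For 513 frames with an 81-frame window this yields eight overlapping passes per step, which is why the wall-clock time balloons even though peak memory stays low.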
WanVideo has been a lot of fun, currently playing with applying sliding context windowing similar to AnimateDiff. The 1.3B model is especially suitable for this due to its speed, here's 1025 frames in one go, under 5GB VRAM used, but it took about 10 mins on a 5090 with 30 steps.
As an impatient person, I often get obsessed with optimizations, tried implementing TeaCache for WanVideo, failed, but in the process accidentally (maybe) succeeded? Results look promising at least!
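For context, the general TeaCache idea (a rough sketch of the technique itself, not the accidental implementation from the post) is to skip the expensive transformer pass whenever the model input has barely changed since the last computed step, reusing the cached output residual instead:

```python
import numpy as np

class ResidualCache:
    """TeaCache-style skip: reuse the last residual when inputs barely move."""

    def __init__(self, threshold=0.1):
        self.threshold = threshold
        self.prev_input = None
        self.cached_residual = None

    def step(self, x, transformer):
        if self.prev_input is not None and self.cached_residual is not None:
            # relative L1 change of the input since the last full pass
            change = np.abs(x - self.prev_input).mean() / (
                np.abs(self.prev_input).mean() + 1e-8
            )
            if change < self.threshold:
                return x + self.cached_residual  # skip the transformer
        out = transformer(x)
        self.cached_residual = out - x           # cache what the model added
        self.prev_input = x
        return out
```

The threshold trades speed for fidelity: too aggressive and frames drift from what the model would have produced, which is presumably why a half-broken version can still look promising.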
@el_mejnun I actually have both. I couldn't justify upgrading, but a 2nd rig for all this stuff I couldn't resist, so I splurged.
It's considerably faster, around 40% or so with all optimizations. Still slightly difficult to use as you have to compile many things from source. Runs hot though.
@Kijaidesign Thanks for your hard work! Also would love to see every benchmark result with the 5090. I believe you had a 4090 before, so how is the performance so far?
@YolaoDude Any text2video model can do some level of video2video as long as the VAE encoder is available, and it already was. The process is basically the same as the usual img2img.
HunyuanVideo -model's vid2vid passes the hippo test! Very promising and versatile model. Thanks to the cfg distillation and every possible optimization I could think of, this clip of 101 frames at 768x432 took about 2 minutes to sample, fitting slightly under 20GB using my nodes.
@Kijaidesign thanks @Kijaidesign for all the work you do, can't wait to get my hands on this to make some more tests!
Here is a quick 3dgs test of your footage without cleaning 😋 This stuff will be so much fun in the coming years!
I have finally pushed a bigger update to my CogVideoX ComfyUI wrapper nodes, cleaning up most of the bloat that has been accumulating with all these different models. One of the discoveries I made during this is that the orbit -LoRAs work with the "Fun" -models as well!
@OneStrangeW The workflow, along with many (much simpler) others, is included with the nodes. I'm afraid I can't help with huggingface issues as all the models are hosted there.
Amazing! Any link for the workflow and nodes? Please? I messed something up with the cog. Now it's asking me for a node that links to it on Huggingface, giving me a 404 error. I want to start from scratch. I am using Mochi, but I was getting better results with cog, and it uses fewer resources. I am not an expert, that is why I believe I'd better start from scratch.