T2V + I2V Mega Extended Length workflow

#122
by VainGuard - opened

I forked @OnkelTom 's I2V Mega Infinite Length 2.4 workflow. The biggest thing that finally clicked for me in his workflow was the use of multiple frames to feed into the next VACE batch. This let me stabilize the continued batches for AIO Mega. (Sadly, this isn't possible for v10, right? Either Mega or VACE introduces cartooning and odd movement/emotions, but I was desperate to get 30+ seconds of video.) I also loved the batched prompt node, as I had been looking for exactly that functionality.
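For anyone curious, here's a rough Python sketch of that handoff idea (names and shapes are illustrative, not actual node internals): the last N frames of each rendered batch are reused to condition the next VACE batch, then skipped when stitching so they aren't duplicated.

```python
import numpy as np

OVERLAP = 12  # frames carried over from the previous batch (the workflow defaults to 12)

def handoff_frames(prev_batch: np.ndarray) -> np.ndarray:
    """prev_batch: (frames, H, W, C) array of the batch just rendered.
    These trailing frames condition the next VACE batch for continuity."""
    return prev_batch[-OVERLAP:]

def stitch(batches: list[np.ndarray]) -> np.ndarray:
    """Concatenate batches, skipping the overlap frames that were re-fed."""
    parts = [batches[0]] + [b[OVERLAP:] for b in batches[1:]]
    return np.concatenate(parts, axis=0)
```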

UPDATED 2025/10/01:
https://github.com/UnforeseenSoftware/VainGuard-Workflows/blob/main/RapidAIO%20Mega%20-%20V2.4%20-%20branch%20VainGuard%20T2V-I2V.json

2025/10/01 Changes: Include the first 12 frames for T2V (these need to be burned for I2V, but are good for T2V). Add an optional reference image input (this acts as a semi-start-frame suggestion and influence for each loop). Add LoRA support. Add loop iteration info (so you know how far into the generation you are).

The first input lets you specify whether you want I2V or T2V mode. You still need to select an image to set the video aspect ratio in T2V mode. Normally, I would have width and height inputs, but I am preserving OnkelTom's megapixel approach. I've added a note on which setting to use for 480p and 720p; the quick arithmetic below shows how they map.
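The pixel counts here assume the common 16:9-ish sizes mentioned in this thread:

```python
# Which megapixel preset matches which resolution (your image still sets the ratio):
print(1280 * 720 / 1e6)  # 0.9216  -> pick the ~0.9 MP setting for 720p
print(848 * 480 / 1e6)   # 0.40704 -> pick the ~0.4 MP setting for 480p
```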

I fixed the KSampler to always use a different seed on each loop. Even though you specify randomize, the same seed applies for the entire run of the workflow. I add the loop index to the seed so that it shifts on each loop execution. This lets you repeat a prompt batch line and still get variation.
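In pseudo-Python, the fix amounts to this (hypothetical names; the workflow wires it up with math nodes):

```python
def loop_seed(base_seed: int, loop_index: int) -> int:
    # ComfyUI resolves "randomize" once per queued run, so without this
    # offset every loop iteration would sample with the same seed.
    return (base_seed + loop_index) % (2**64)  # stay within the seed range
```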

I believe I improved the way color matching works. This is very noticeable with T2V, since there is no initial color reference to use. For both I2V and T2V, I save a sample frame near the end of the first batch, and subsequent batches are then color matched to that frame.
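Conceptually it's plain reference-based color transfer; here's a minimal sketch using per-channel mean/std matching (one common method, not necessarily what the color match node does internally):

```python
import numpy as np

def match_color(frame: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Shift each channel's mean/std toward the reference frame.
    frame, reference: (H, W, 3) float arrays in [0, 1]."""
    out = frame.copy()
    for c in range(3):
        f_mean, f_std = frame[..., c].mean(), frame[..., c].std() + 1e-8
        r_mean, r_std = reference[..., c].mean(), reference[..., c].std()
        out[..., c] = (frame[..., c] - f_mean) / f_std * r_std + r_mean
    return np.clip(out, 0.0, 1.0)
```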

Sorry, I had to add a dependency on Crystools Nodes, as their "Switch from any" node was the only way I could emulate proper null values for T2V mode and other logic.
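The pattern is essentially this (an illustrative sketch, not Crystools' actual code):

```python
def switch_from_any(select_i2v: bool, i2v_input, t2v_input=None):
    # "Switch from any" passes through one of two inputs based on a selector;
    # routing None on the T2V path emulates a null start image where the
    # I2V path expects a real one.
    return i2v_input if select_i2v else t2v_input
```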

The example T2V prompts used a 16:9 image reference and the 0.9 megapixel selection (1280x720), generated on a 4080 Super (16GB VRAM) with 64GB system RAM. The 20-second video took 23 minutes and the 28-second video took 38 minutes.

2025/10/01 example prompt. The result came out badly, but it includes the newer workflow. I'll make a better prompt version later.

Very good work! As I have said in one of my posts, I have a big problem getting characters to follow the prompt. That drives me crazy (even more so in I2V).

I tried your workflow using the same prompt and changing the resolution to 0.4 (848x480)... I got this cursed AI video, lol, I don't know why. I will test it again with the first 3 prompts and change the input image.

@junior600 does the official Rapid Mega T2V workflow work for you? I know I couldn't get any 4-step accelerators to work (one is built into Rapid) for any video until I installed Triton and Sage Attention 2.2 and enabled --use-sage-attention in ComfyUI.

How should I set the megapixels for vertical videos in I2V?

@okims
Megapixels is the total pixel count; it has nothing to do with the aspect ratio of your output video. The image you use sets the ratio of the video, so if you use a vertical image, the video will be in exactly the same ratio.
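In other words, the image contributes only its ratio, and the megapixel setting supplies the pixel budget. A small sketch of that sizing (rounding to multiples of 16 is an assumption about what the model prefers, and the function name is hypothetical):

```python
import math

def dims_from_megapixels(img_w: int, img_h: int, megapixels: float,
                         multiple: int = 16) -> tuple[int, int]:
    """Scale the input image's aspect ratio to a target pixel budget."""
    target = megapixels * 1_000_000
    scale = math.sqrt(target / (img_w * img_h))
    w = round(img_w * scale / multiple) * multiple
    h = round(img_h * scale / multiple) * multiple
    return w, h

# A vertical 9:16 image at 0.4 MP lands near 480x848:
print(dims_from_megapixels(1080, 1920, 0.4))  # (480, 848)
```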

@OnkelTom
Oh, indeed, using the megapixel method is definitely more convenient. Thank you.

@VainGuard
It works really well, thank you.
I was wondering: is it possible to have a node that supports the GGUF version with an added Power LoRA?
On 12GB VRAM, it's really heavy, so it would be great if there were a node that could handle both a quantized version and LoRA.
The reason I need LoRA is that, aside from the pre-integrated NSFW LoRA, I want to be able to run LoRAs separately on the standard Mega model version that I download myself.
This way, I can fine-tune the LoRA more precisely according to my needs. It would also be great if there were multi-LoRA nodes like Power LoRA to handle multiple LoRAs at once.

@okims I've updated the workflow on GitHub. I included LoRA and optional reference image support. For now, you'll have to manually swap out the checkpoint loader with your GGUF/CLIP/VAE loader nodes. I would need to maintain a completely separate GGUF version, as the links would need to be redone; I don't see a way to proxy them that would let you bypass one option for another.

Thank you for these and the updates. I did enjoy the original workflow, but this is such a great fork and QoL improvement for doing both T2V and I2V. It worked right out of the box. When I first saw the CreaPrompt Multi Prompts node, I was like, "Man, this is what I've been wanting for ages!" I didn't know it existed or could be wired the way it is to make truly infinite stuff. I tried to make my own workflow that could switch between T2V and I2V, but yours has it built in nicely, along with some decent consistency QoL touches. I also appreciate the LoRA support because, well, it was kind of a PIA to figure out how to implement properly. I'm no expert with workflows, so I was getting all sorts of issues trying to get the LoRA working and looping on each iteration. So yeah, thanks again for all the work! And Onkel too.

Great job!!! I tried your default workflow, changing it from 4 to 5 seconds, 0.5 megapixels, and 960x528 resolution. I changed the last 12 images from the previous batch to 5 images as a guide for the next batch. It turned out to be a pretty good 40-second scene. 10GB 3080 card, 32GB RAM. Model version wan2.2-rapid-mega-aio-nsfw-v5.

That's great! This came out way better than my video, though it looks like it starts a new scene with each extension. I guess 5 frames isn't enough to keep the same scene continuous? I've only gone as low as 8, but in theory it should still work with even 1 frame (just very jittery).

Awesome work thank you.
