[Wan 2.2 LoRA] add support for 2nd transformer lora loading + wan 2.2 lightx2v lora #12074


Open · wants to merge 3 commits into base: main

Conversation

@linoytsaban (Collaborator) commented Aug 5, 2025

Wan 2.2 has two transformers, and the community has found it beneficial to load Wan LoRAs into both of them, occasionally at different scales (this also applies to Wan 2.1 LoRAs loaded into transformer and transformer_2).
Recently, a new lightning LoRA was released for Wan 2.2 T2V, with separate weights for transformer (high-noise stage) and transformer_2 (low-noise stage).

This PR adds support for loading LoRAs into transformer_2, and adds support for the lightning LoRA (which has alpha keys).

T2V example:

import torch
import numpy as np
from diffusers import WanPipeline, AutoencoderKLWan
from diffusers.utils import export_to_video, load_image

dtype = torch.bfloat16
device = "cuda"
vae = AutoencoderKLWan.from_pretrained("Wan-AI/Wan2.2-T2V-A14B-Diffusers", subfolder="vae", torch_dtype=torch.float32)
pipe = WanPipeline.from_pretrained("Wan-AI/Wan2.2-T2V-A14B-Diffusers", vae=vae, torch_dtype=dtype)
pipe.to(device)

pipe.load_lora_weights(
    "Kijai/WanVideo_comfy",
    weight_name="Wan22-Lightning/Wan2.2-Lightning_T2V-A14B-4steps-lora_HIGH_fp16.safetensors",
    adapter_name="lightning",
)
kwargs = {}
kwargs["load_into_transformer_2"] = True
pipe.load_lora_weights(
    "Kijai/WanVideo_comfy",
    weight_name="Wan22-Lightning/Wan2.2-Lightning_T2V-A14B-4steps-lora_LOW_fp16.safetensors",
    adapter_name="lightning_2",
    **kwargs,
)
pipe.set_adapters(["lightning", "lightning_2"], adapter_weights=[1., 1.])

height = 480
width = 832

prompt = "Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage."
negative_prompt = "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=height,
    width=width,
    num_frames=81,
    guidance_scale=1.0,
    guidance_scale_2=1.0,
    num_inference_steps=4,
    generator=torch.manual_seed(0),
).frames[0]
export_to_video(output, "t2v_out.mp4", fps=16)
t2v_out-5.mp4
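
Note: the kwargs indirection in the example above is only for illustration. Since the flag is read via kwargs.pop("load_into_transformer_2", False), the second load_lora_weights call should be equivalent to passing it directly:

pipe.load_lora_weights(
    "Kijai/WanVideo_comfy",
    weight_name="Wan22-Lightning/Wan2.2-Lightning_T2V-A14B-4steps-lora_LOW_fp16.safetensors",
    adapter_name="lightning_2",
    load_into_transformer_2=True,  # same effect as passing it through **kwargs
)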


@luke14free

Curious to see an example @linoytsaban, would love to try this out.

@sayakpaul (Member) left a comment:

Thanks for working on this. Left some comments.

converted_key = f"blocks.{i}.attn1.{c}.lora_A.weight"
if original_key in original_state_dict:
converted_state_dict[converted_key] = original_state_dict.pop(original_key)
has_alpha = f"blocks.{i}.self_attn.{o}.alpha" in original_state_dict
Suggested change:
- has_alpha = f"blocks.{i}.self_attn.{o}.alpha" in original_state_dict
+ alpha_key = f"blocks.{i}.self_attn.{o}.alpha"
+ has_alpha = alpha_key in original_state_dict

if has_alpha:
    down_weight = original_state_dict.pop(original_key_A)
    up_weight = original_state_dict.pop(original_key_B)
    scale_down, scale_up = get_alpha_scales(down_weight, f"blocks.{i}.self_attn.{o}.alpha")
Suggested change:
- scale_down, scale_up = get_alpha_scales(down_weight, f"blocks.{i}.self_attn.{o}.alpha")
+ scale_down, scale_up = get_alpha_scales(down_weight, alpha_key)
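
For context on the alpha handling: Kohya-style LoRA checkpoints (such as these lightning weights) store a per-layer alpha, and the effective LoRA scale is alpha / rank, which the converter bakes into the down/up weights. A minimal sketch of the idea (not the PR's exact get_alpha_scales implementation; the square-root split is just one reasonable choice):

import torch

def get_alpha_scales_sketch(down_weight: torch.Tensor, alpha: float):
    # Kohya-style convention: effective scale = alpha / rank, where rank is the
    # first dimension of the down (lora_A) weight.
    rank = down_weight.shape[0]
    scale = alpha / rank
    # Split the scale between the down and up weights so that applying both
    # reproduces the full factor: scale_down * scale_up == alpha / rank.
    scale_down = scale ** 0.5
    scale_up = scale ** 0.5
    return scale_down, scale_up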

Comment on lines +1870 to +1872
if has_alpha:
    down_weight = original_state_dict.pop(original_key_A)
    up_weight = original_state_dict.pop(original_key_B)
Why does the popping have to be conditioned on has_alpha? Previously, that wasn't the case.

I think we can just check if has_alpha and just pop the alpha_key, keeping the existing code as is?
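
One possible reading of this suggestion, sketched with the names from the hunks above (an illustration, not the PR's final code): pop the weights unconditionally as before, and use has_alpha only to decide whether to rescale.

if original_key_A in original_state_dict:
    down_weight = original_state_dict.pop(original_key_A)
    up_weight = original_state_dict.pop(original_key_B)
    if has_alpha:
        # rescale using the stored alpha (alpha_key), as in the hunks above
        scale_down, scale_up = get_alpha_scales(down_weight, alpha_key)
        down_weight = down_weight * scale_down
        up_weight = up_weight * scale_up
    converted_state_dict[converted_key_A] = down_weight
    converted_state_dict[converted_key_B] = up_weight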

Comment on lines +1890 to +1905
has_alpha = f"blocks.{i}.cross_attn.{o}.alpha" in original_state_dict
original_key_A = f"blocks.{i}.cross_attn.{o}.{lora_down_key}.weight"
converted_key_A = f"blocks.{i}.attn2.{c}.lora_A.weight"

original_key_B = f"blocks.{i}.cross_attn.{o}.{lora_up_key}.weight"
converted_key_B = f"blocks.{i}.attn2.{c}.lora_B.weight"

if has_alpha:
down_weight = original_state_dict.pop(original_key_A)
up_weight = original_state_dict.pop(original_key_B)
scale_down, scale_up = get_alpha_scales(down_weight, f"blocks.{i}.cross_attn.{o}.alpha")
converted_state_dict[converted_key_A] = down_weight * scale_down
converted_state_dict[converted_key_B] = up_weight * scale_up
else:
if original_key_A in original_state_dict:
converted_state_dict[converted_key_A] = original_state_dict.pop(original_key_A)
Same as above.

        hotswap=hotswap,
    )
    load_into_transformer_2 = kwargs.pop("load_into_transformer_2", False)
    if load_into_transformer_2:
Should raise in case getattr(self, "transformer_2", None) is None.
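
A minimal sketch of the guard being asked for here (the error message is illustrative):

load_into_transformer_2 = kwargs.pop("load_into_transformer_2", False)
if load_into_transformer_2:
    if getattr(self, "transformer_2", None) is None:
        # Wan 2.1 pipelines (and any pipeline without a second transformer) land here.
        raise ValueError(
            "`load_into_transformer_2=True` requires the pipeline to have a `transformer_2` component."
        )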

@@ -5064,7 +5064,7 @@ class WanLoraLoaderMixin(LoraBaseMixin):
    Load LoRA layers into [`WanTransformer3DModel`]. Specific to [`WanPipeline`] and [`WanImageToVideoPipeline`].
    """

-   _lora_loadable_modules = ["transformer"]
+   _lora_loadable_modules = ["transformer", "transformer_2"]
Just to note that this loader is shared amongst Wan 2.1 and 2.2 as the pipelines are also one and the same. For Wan 2.1, we won't have any transformer_2.
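
To make that concrete, a small hedged illustration (the Wan 2.1 repo id is an example; the point is only that transformer_2 resolves to None there, so the transformer_2 code path must either be skipped or raise a clear error):

import torch
from diffusers import WanPipeline

# Wan 2.1 uses the same WanPipeline / WanLoraLoaderMixin, but there is no second transformer.
pipe_21 = WanPipeline.from_pretrained("Wan-AI/Wan2.1-T2V-14B-Diffusers", torch_dtype=torch.bfloat16)
assert getattr(pipe_21, "transformer_2", None) is None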

Comment on lines +5283 to +5293
else:
    self.load_lora_into_transformer(
        state_dict,
        transformer=getattr(self, self.transformer_name) if not hasattr(self, "transformer") else self.transformer,
        adapter_name=adapter_name,
        metadata=metadata,
        _pipeline=self,
        low_cpu_mem_usage=low_cpu_mem_usage,
        hotswap=hotswap,
    )
Why put it under else?

@linoytsaban (Collaborator, Author)

I2V example: using Wan 2.2 with a Wan 2.1 lightning LoRA

import torch
import numpy as np
from diffusers import WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

model_id = "Wan-AI/Wan2.2-I2V-A14B-Diffusers"
dtype = torch.bfloat16
device = "cuda"

pipe = WanImageToVideoPipeline.from_pretrained(model_id, torch_dtype=dtype)
pipe.to(device)


pipe.load_lora_weights(
    "Kijai/WanVideo_comfy",
    weight_name="Lightx2v/lightx2v_I2V_14B_480p_cfg_step_distill_rank128_bf16.safetensors",
    adapter_name="lightning",
)
kwargs = {}
kwargs["load_into_transformer_2"] = True
pipe.load_lora_weights(
    "Kijai/WanVideo_comfy",
    weight_name="Lightx2v/lightx2v_I2V_14B_480p_cfg_step_distill_rank128_bf16.safetensors",
    adapter_name="lightning_2",
    **kwargs,
)
pipe.set_adapters(["lightning", "lightning_2"], adapter_weights=[1., 1.])
pipe.fuse_lora(adapter_names=["lightning"], lora_scale=3., components=["transformer"])
pipe.fuse_lora(adapter_names=["lightning_2"], lora_scale=1., components=["transformer_2"])
pipe.unload_lora_weights()

image = load_image(
    "https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/wan_i2v_input.JPG"
)
max_area = 480 * 832
aspect_ratio = image.height / image.width
mod_value = pipe.vae_scale_factor_spatial * pipe.transformer.config.patch_size[1]
height = round(np.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
width = round(np.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
image = image.resize((width, height))
prompt = "POV selfie video, white cat with sunglasses standing on surfboard, relaxed smile, tropical beach behind (clear water, green hills, blue sky with clouds). Surfboard tips, cat falls into ocean, camera plunges underwater with bubbles and sunlight beams. Brief underwater view of cat’s face, then cat resurfaces, still filming selfie, playful summer vacation mood."

negative_prompt = "色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走"
generator = torch.Generator(device=device).manual_seed(42)
output = pipe(
    image=image,
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=height,
    width=width,
    num_frames=81,
    guidance_scale=1,
    num_inference_steps=4,
    generator=generator,
).frames[0]
export_to_video(output, "i2v_output.mp4", fps=16)
i2v_output-84.mp4
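
As an aside, the per-transformer scales that fuse_lora applies above (3.0 for transformer, 1.0 for transformer_2) should also be expressible without fusing, since each adapter lives in only one of the two transformers (untested sketch; fusing may still be preferable, e.g. when unloading the LoRA weights afterwards):

# "lightning" was loaded only into transformer and "lightning_2" only into transformer_2,
# so per-adapter weights act as per-transformer LoRA scales.
pipe.set_adapters(["lightning", "lightning_2"], adapter_weights=[3.0, 1.0])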

@luke14free

Thanks a lot for the amazing work @linoytsaban. Just FYI, issue #12047 also applies to this PR: I tried it and get the mismatch error with GGUF models. Reporting since GGUF is the most popular way to run Wan on consumer hardware.

@mayankagrawal10198

@linoytsaban are we sure that if we don't pass the boundary_ratio arg to our generation pipe, it would still choose transformer_2 for the low-noise stage? Because I can see the first Wan 2.2 PR #12004 has these lines:

if self.config.boundary_ratio is not None:
    boundary_timestep = self.config.boundary_ratio * self.scheduler.config.num_train_timesteps
else:
    boundary_timestep = None

with self.progress_bar(total=num_inference_steps) as progress_bar:
    for i, t in enumerate(timesteps):
        if self.interrupt:
            continue

        self._current_timestep = t

        if boundary_timestep is None or t >= boundary_timestep:
            # wan2.1 or high-noise stage in wan2.2
            current_model = self.transformer
            current_guidance_scale = guidance_scale
        else:
            # low-noise stage in wan2.2
            current_model = self.transformer_2
            current_guidance_scale = guidance_scale_2
