feat: add wan2.2_t2v model and quantization config #454
gushiqiao merged 19 commits into ModelTC:main
Add a small test script to load sharded safetensors from a Hugging Face repo/local dir and print parameter keys with shapes. Made-with: Cursor
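The test script itself is not shown in this conversation. As a rough illustration of what such a script could look like, here is a minimal sketch that covers only the local-directory case and uses nothing beyond the standard library. It relies on the safetensors file layout (an 8-byte little-endian header length followed by a JSON header). The function names `read_safetensors_header`, `list_sharded_params`, and `print_params` are hypothetical and not taken from the PR.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return {tensor_name: (dtype, shape)} for one .safetensors file.

    Only the JSON header is read, so no tensor data is loaded into memory.
    """
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return {name: (info["dtype"], tuple(info["shape"])) for name, info in header.items()}


def list_sharded_params(model_dir):
    """Merge the headers of every *.safetensors shard found directly in model_dir."""
    params = {}
    for shard in sorted(Path(model_dir).glob("*.safetensors")):
        params.update(read_safetensors_header(shard))
    return params


def print_params(model_dir):
    """Print one line per parameter: key, dtype, shape."""
    for name, (dtype, shape) in sorted(list_sharded_params(model_dir).items()):
        print(f"{name}\t{dtype}\t{shape}")
```

Calling `print_params("/path/to/checkpoint")` on a directory of shards would list every key once. Loading from a Hugging Face repo would additionally require downloading the shards first, which this sketch leaves out.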
…sformer experts

Add support for skipping quantization on specified transformer blocks (block_ids: [0, 40] → block 0 of transformer and transformer_2) to improve quality of the two highest-impact blocks.

Changes:
- base_blockwise_quantization.py: add _get_ignored_block_ids_set and _is_ignored_block helpers; modify set_no_quant_layer to skip all linear layers when layer_names is empty; modify run to skip block_transform for ignored blocks so AWQ scales are not applied
- configs/…/awq_w_a_skip_first.yaml: new config with ignored_layers block_ids [0, 40] and separate save_path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
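The block-skipping behaviour described in this commit can be pictured with a small standalone sketch. This is not the code from base_blockwise_quantization.py. The functions below only borrow the helper names from the commit message, and the shape of the `ignored_layers` mapping, the `locate_block` helper, and the figure of 40 blocks per expert (inferred from block id 40 mapping to block 0 of transformer_2) are all assumptions.

```python
def get_ignored_block_ids_set(ignored_layers):
    """Collect the block indices to leave unquantized (assumed config shape)."""
    if not ignored_layers:
        return set()
    return {int(i) for i in ignored_layers.get("block_ids", [])}


def is_ignored_block(block_idx, ignored_ids):
    """True if the block at this global index should be skipped."""
    return block_idx in ignored_ids


def locate_block(global_idx, blocks_per_expert=40):
    """Map a global block index to (expert name, local index).

    Assumes the two experts' blocks are concatenated, transformer first.
    """
    expert, local_idx = divmod(global_idx, blocks_per_expert)
    return ("transformer", "transformer_2")[expert], local_idx


def run(blocks, ignored_layers, block_transform):
    """Apply block_transform to every block except the ignored ones.

    Returns the indices that were actually transformed.
    """
    ignored_ids = get_ignored_block_ids_set(ignored_layers)
    transformed = []
    for idx, block in enumerate(blocks):
        if is_ignored_block(idx, ignored_ids):
            continue  # skipped: no scales are applied to this block
        block_transform(block)
        transformed.append(idx)
    return transformed
```

With `ignored_layers = {"block_ids": [0, 40]}` and 80 blocks in total, this sketch transforms 78 blocks and leaves the first block of each expert untouched.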
088d80d to 007360e
…uant/run config Made-with: Cursor
Charles2530 left a comment
Please resolve the issues raised in the comments above.
Please resolve the merge conflicts; after that, this can be merged.
Hi, the conflicts have been resolved.
JiwaniZakir left a comment
The changes to wan_i2v/awq_w_a.yaml, wan_t2v/awq_w_a.yaml, wan_t2v/rtn_w_a.yaml, and wan_t2v/smoothquant_w_a.yaml are purely removing trailing newlines (introducing \ No newline at end of file), which is a regression in the existing files unrelated to the stated goal of this PR and goes against POSIX file conventions.
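One way to catch this kind of regression before review is a small check for files that do not end in a newline. The following is a sketch only; `missing_final_newline` and `find_offenders` are hypothetical names and not part of the repository.

```python
from pathlib import Path


def missing_final_newline(path):
    """True if the file is non-empty and its last byte is not a newline."""
    data = Path(path).read_bytes()
    return len(data) > 0 and not data.endswith(b"\n")


def find_offenders(root, pattern="*.yaml"):
    """Return sorted paths under root matching pattern that lack a final newline."""
    return sorted(
        str(p) for p in Path(root).rglob(pattern) if missing_final_newline(p)
    )
```

Running `find_offenders("configs")` from the repository root would list the YAML files that need a trailing newline restored.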
The new wan2_2_t2v/awq_w_a.yaml uses type: Wan2T2V, whereas the existing wan_t2v configs use type: WanT2V — the diff doesn't include any code registering or implementing the Wan2T2V model class, so it's unclear whether this will resolve correctly at runtime or silently fall back to an incorrect handler.
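To illustrate the concern, here is a sketch of a name-based registry that fails loudly on an unknown `type` instead of falling back to another handler. It is a generic pattern and not the project's actual registry; `MODEL_REGISTRY`, `register_model`, and `resolve_model` are hypothetical names, and the `WanT2V` class here is an empty stand-in.

```python
MODEL_REGISTRY = {}


def register_model(cls):
    """Class decorator: register cls under its class name."""
    MODEL_REGISTRY[cls.__name__] = cls
    return cls


@register_model
class WanT2V:
    """Empty stand-in for an already registered model class."""


def resolve_model(type_name):
    """Look up a model class by its config `type`, raising on unknown names."""
    try:
        return MODEL_REGISTRY[type_name]
    except KeyError:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise KeyError(
            f"unknown model type {type_name!r}; registered types: {known}"
        ) from None
```

Under this pattern, `resolve_model("Wan2T2V")` raises a `KeyError` naming the registered types until a `Wan2T2V` class is registered, so a missing implementation surfaces when the config is loaded.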
The newly added docs/wan2.1_quantization_guide.md documents Wan2.1 models (WanI2V, WanT2V) exclusively, but this PR introduces Wan2.2 support (wan2_2_t2v). The guide should either be updated to cover Wan2.2 specifics (notably guidance_scale_2, which appears only in the new config) or a separate doc should be added, since guidance_scale_2: 3.0 in the calib/eval sections is a new parameter with no explanation anywhere in the documentation.
The wan2_2_t2v directory only ships an AWQ config, whereas the existing wan_t2v directory also provides RTN and SmoothQuant variants. If those methods are also supported for Wan2.2, the missing configs should be included for consistency; if not, a comment explaining the omission would be helpful.
Add the wan2.2_t2v model and its quantization configuration, along with the corresponding config and script changes.