
[quantization] Introduce wrapper for Qwen3VLVisionBlock#500

Merged
dayo09 merged 1 commit into Samsung:main from dvsav:quant_vision_block on Mar 5, 2026

Conversation

@dvsav
Contributor

@dvsav dvsav commented Feb 19, 2026

This change introduces the QuantQwen3VLVisionBlock wrapper to support post-training quantization of the Qwen3VLVisionBlock module.

Why?

The Qwen3VLVisionBlock module is used in the image-encoder part of the Qwen model.
Attempting to quantize Qwen3VLVisionBlock via PTQ raises the exception PTQQuantizer: no quantization wrapper for Qwen3VLVisionBlock.

What

This change introduces:

  • Class QuantQwen3VLVisionBlock (tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_block.py).
  • Unit tests: class TestQuantQwen3VLVisionBlock (test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_block.py), skipped if the transformers package is not installed.
  • New entry in _CORE_MODULES (tico/quantization/wrapq/wrappers/registry.py).
  • Example of Qwen3VLVisionBlock quantization and conversion to Circle (tico/quantization/wrapq/examples/qwen/quantize_vision_block.py).

Unit Tests

A run of the new unit tests is shown below along with coverage information (irrelevant files are replaced with an ellipsis ...):

$ coverage run -m pytest test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_block.py -v
======================================================================================= test session starts ========================================================================================
platform linux -- Python 3.10.12, pytest-8.4.0, pluggy-1.6.0 -- /home/d.savchenkov/myenv/bin/python3
cachedir: .pytest_cache
rootdir: /home/d.savchenkov/TICO
configfile: pyproject.toml
plugins: anyio-4.12.0, mock-3.15.1, xdist-3.7.0, cov-6.2.1
collected 7 items                                                                                                                                                                                  

test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_block.py::TestQuantQwen3VLVisionBlock::test_different_num_patches            PASSED                                               [ 14%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_block.py::TestQuantQwen3VLVisionBlock::test_forward_diff                     PASSED                                               [ 28%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_block.py::TestQuantQwen3VLVisionBlock::test_mode_transitions                 PASSED                                               [ 42%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_block.py::TestQuantQwen3VLVisionBlock::test_observer_count                   PASSED                                               [ 57%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_block.py::TestQuantQwen3VLVisionBlock::test_output_shape                     PASSED                                               [ 71%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_block.py::TestQuantQwen3VLVisionBlock::test_registration_in_registry         PASSED                                               [ 85%]
test/quantization/wrapq/wrappers/qwen_vl/test_quant_vision_block.py::TestQuantQwen3VLVisionBlock::test_residual_connection_preservation PASSED                                               [100%]

================================================================================== 7 passed, 2 warnings in 7.21s ===================================================================================
$ coverage report -m
Name                                                                    Stmts   Miss  Cover   Missing
-----------------------------------------------------------------------------------------------------
...
tico/quantization/wrapq/wrappers/qwen_vl/quant_vision_block.py             42      0   100%
...
-----------------------------------------------------------------------------------------------------
TOTAL                                                                   10670   6671    37%

Example Script

$ python3 tico/quantization/wrapq/examples/qwen/quantize_vision_block.py
┌───────────── Quantization Error Summary ─────────────
│ Mean |diff|: 0.139611
│ PEIR       : 8.666391 %
└──────────────────────────────────────────────────────
    ┌────────────────────────────────────────────┐
 5.4┤                                            │
    │                                        ••  │
    │                                     •• •   │
 3.6┤                                  ••••••    │
    │                                 ••••••     │
    │                              ••••••        │
    │                           ••••••••         │
 1.9┤                          •••••••           │
    │                       ••••••••             │
    │                     ••••••••               │
 0.1┤                   ••••••••                 │
    │                 ••••••••                   │
    │              •••••••••                     │
    │             ••••••••                       │
-1.6┤           ••••••••                         │
    │          ••••••                            │
    │       ••••••••                             │
-3.4┤      ••••••                                │
    │    •••••                                   │
    │  •••••                                     │
    │                                            │
-5.1┤                                            │
    └┬──────────┬──────────┬─────────┬──────────┬┘
   -5.1       -2.5        0.1       2.8       5.4 


Circle model saved as 'quantized_vision_block.circle'
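The error summary printed above reports a mean absolute difference and a PEIR value. A plausible way to compute such a summary is sketched below; the exact formula tico uses is not shown in this PR, so treat the PEIR definition here (peak absolute error divided by the dynamic range of the reference output, in percent) as an assumption.

```python
import torch

def quant_error_summary(ref: torch.Tensor, quant: torch.Tensor) -> tuple[float, float]:
    """Return (mean |diff|, PEIR %) between a float reference output
    and a quantized output.

    PEIR is assumed to be max|ref - quant| over the reference's
    dynamic range (max - min), expressed as a percentage.
    """
    diff = (ref - quant).abs()
    mean_abs = diff.mean().item()
    interval = (ref.max() - ref.min()).item()  # reference dynamic range
    peir = diff.max().item() / interval * 100.0
    return mean_abs, peir
```

For example, comparing a reference of [0.0, 10.0] against a quantized [1.0, 10.0] gives a mean |diff| of 0.5 and a PEIR of 10%.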

@dvsav
Contributor Author

dvsav commented Mar 3, 2026

Reference Code

Below is the source code of Qwen3VLVisionBlock:

# transformers/models/qwen3_vl/modeling_qwen3_vl.py
class Qwen3VLVisionBlock(GradientCheckpointingLayer):
    def __init__(self, config, attn_implementation: str = "sdpa") -> None:
        super().__init__()
        self.norm1 = nn.LayerNorm(config.hidden_size, eps=1e-6)
        self.norm2 = nn.LayerNorm(config.hidden_size, eps=1e-6)
        self.attn = Qwen3VLVisionAttention(config=config)
        self.mlp = Qwen3VLVisionMLP(config=config)

    def forward(
        self,
        hidden_states: torch.Tensor,
        cu_seqlens: torch.Tensor,
        rotary_pos_emb: torch.Tensor | None = None,
        position_embeddings: tuple[torch.Tensor, torch.Tensor] | None = None,
        **kwargs,
    ) -> torch.Tensor:
        hidden_states = hidden_states + self.attn(
            self.norm1(hidden_states),
            cu_seqlens=cu_seqlens,
            rotary_pos_emb=rotary_pos_emb,
            position_embeddings=position_embeddings,
            **kwargs,
        )
        hidden_states = hidden_states + self.mlp(self.norm2(hidden_states))
        return hidden_states
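A wrapper for this block must mirror the two residual additions exactly, which is what the test_residual_connection_preservation unit test above checks. The sketch below is illustrative only: it reuses the wrapped block's norm1/attn/norm2/mlp submodules and inserts a placeholder fake-quantize step on each branch output, leaving the skip connections untouched. The fq() function is an assumption, not tico's actual fake-quant implementation.

```python
import torch
import torch.nn as nn

class QuantVisionBlockSketch(nn.Module):
    """Illustrative wrapper around a Qwen3VLVisionBlock-style module.

    Mirrors the residual structure of the reference forward while
    inserting a placeholder fake-quantize step on each branch output.
    """
    def __init__(self, block: nn.Module) -> None:
        super().__init__()
        self.block = block  # must expose norm1, attn, norm2, mlp

    @staticmethod
    def fq(x: torch.Tensor) -> torch.Tensor:
        # Placeholder fake-quant: snap to an 8-bit grid over roughly [-8, 8].
        scale = 16.0 / 255.0
        return (x / scale).round().clamp(-128, 127) * scale

    def forward(self, hidden_states: torch.Tensor, **kwargs) -> torch.Tensor:
        # Residual 1: attention branch. Only the branch output is
        # fake-quantized; the skip connection itself stays exact.
        attn_out = self.block.attn(self.block.norm1(hidden_states), **kwargs)
        hidden_states = hidden_states + self.fq(attn_out)
        # Residual 2: MLP branch, same pattern.
        mlp_out = self.block.mlp(self.block.norm2(hidden_states))
        hidden_states = hidden_states + self.fq(mlp_out)
        return hidden_states
```

Keeping the additions outside the fake-quant step preserves the identity path of each residual, which is generally important for keeping the quantization error of deep stacks of such blocks bounded.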


TICO-DCO-1.0-Signed-off-by: d.savchenkov <d.savchenkov@partner.samsung.com>
@dvsav dvsav force-pushed the quant_vision_block branch from ff088fc to 4607679 Compare March 3, 2026 16:18
@dvsav dvsav marked this pull request as ready for review March 3, 2026 16:26
@dayo09 dayo09 requested review from dayo09 and mhs4670go March 4, 2026 06:09
Contributor

@mhs4670go mhs4670go left a comment


LGTM

Contributor

@dayo09 dayo09 left a comment


LGTM :-D

FYI, we will need to split the heads of VLM vision attention blocks like we did in the llama attention blocks.

@dayo09 dayo09 merged commit b57a455 into Samsung:main Mar 5, 2026
7 checks passed
@dvsav dvsav deleted the quant_vision_block branch March 5, 2026 11:15
