
Conversation

@negvet (Collaborator) commented Jan 23, 2026

Description

Introducing semantic quantizer roles, e.g. `input:linear`, `grad_output:layernorm_linear`.
Roles are emitted by each module/op and passed through `RecipeState.create(..., roles=...)`, so that the right quantizers can be constructed without relying on positional indices in a list.

Currently used only by `CustomRecipe`, but this can be extended to all recipes.
It is also extendable to arbitrary operations, e.g. `qkv:dpa` and `s:dpa` (attention scores) for attention.
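
For illustration, a quantizer factory consuming these roles might look like the following sketch. The helper constructors are placeholders, not the actual Transformer Engine API; passing the factory as `qfactory` follows the usage shown in the sequence diagram below.

```python
from transformer_engine.common.recipe import CustomRecipe

# Placeholder constructors standing in for real quantizer builders.
def make_e4m3_quantizer(): ...
def make_e5m2_quantizer(): ...

def qfactory(role: str):
    """Build a quantizer from a semantic role string such as 'input:linear'."""
    bucket, _, scope = role.partition(":")  # e.g. ("grad_output", ":", "layernorm_linear")
    if bucket in ("input", "weight", "output"):
        return make_e4m3_quantizer()  # forward tensors
    if bucket in ("grad_output", "grad_input"):
        return make_e5m2_quantizer()  # backward tensors
    raise ValueError(f"Unknown quantizer role: {role!r}")

recipe = CustomRecipe(qfactory=qfactory)
```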

Type of change

  • Documentation change (change only to the documentation, either a fix or new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

Changes introduced in this PR:

  • Added a `roles` parameter to `RecipeState.create()`; roles are stored on the recipe state instance
  • `CustomRecipeState.make_quantizers()` now requires roles and validates their count against `num_quantizers`
  • Added `get_quantizer_roles()` to `BasicOperation` and `TransformerEngineBaseModule`; implemented for `Linear`, `LayerNormLinear`, `LayerNormMLP`, `GroupedLinear`, and the `BasicLinear` op
  • Changed the role format from `linear_input` to `input:linear` (`bucket:scope`); a breaking change for `CustomRecipe` users

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

negvet and others added 4 commits January 23, 2026 15:14
…ipe state

Signed-off-by: Evgeny <etsykunov@nvidia.com>
@negvet negvet requested review from cyanguwa and timmoon10 January 23, 2026 15:32
@greptile-apps bot (Contributor) commented Jan 23, 2026

Greptile Summary

Introduces semantic quantizer roles (e.g., `input:linear`, `grad_output:layernorm_linear`) for the `CustomRecipe` quantization system. Modules and operations now emit role strings that describe the semantic purpose of each quantizer, which are consumed by `RecipeState.create(..., roles=...)` to construct the right quantizers without relying on positional indices.

Key Changes:

  • Added a `roles` parameter to `RecipeState.create()`, stored on recipe state instances
  • `CustomRecipeState.make_quantizers()` now requires roles and validates that their count matches `num_quantizers`
  • Added a `get_quantizer_roles()` method to the base classes (`BasicOperation`, `TransformerEngineBaseModule`)
  • Implemented role emission in all modules: `Linear`, `LayerNormLinear`, `LayerNormMLP`, `GroupedLinear`, and the `BasicLinear` op
  • Changed the role format from `linear_input` to `input:linear` (`bucket:scope` pattern); a breaking change for `CustomRecipe` users
  • Updated all example quantizer factories and tests to the new role format, with validation
  • Role strings follow the pattern `<bucket>:<scope>`, where the bucket is `input`, `weight`, `output`, `grad_output`, or `grad_input` (validated as sketched below)
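
As a sketch of that validation (a hypothetical helper; the tests' exact checks may differ), splitting and checking a role string could look like:

```python
VALID_BUCKETS = {"input", "weight", "output", "grad_output", "grad_input"}

def parse_role(role: str) -> tuple[str, str]:
    """Split a 'bucket:scope' role, e.g. 'grad_output:layernorm_linear'."""
    bucket, sep, scope = role.partition(":")
    if not sep or not scope or bucket not in VALID_BUCKETS:
        raise ValueError(f"Malformed quantizer role: {role!r}")
    return bucket, scope
```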

Architecture:
The implementation creates a clean separation between module/op concerns (declaring what quantizers are needed via semantic roles) and recipe concerns (creating appropriate quantizers for those roles). This enables extensibility to other operations such as attention (`qkv:dpa`, `s:dpa`) in the future.
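
A rough sketch of that separation, using the method and parameter names from this PR (the surrounding glue code is assumed):

```python
# Module side: declare the quantizers needed via semantic roles (sketch).
roles = module.get_quantizer_roles(fwd=True)  # e.g. ["input:linear", "weight:linear", "output:linear"]

# Recipe side: build quantizers for exactly those roles.
state = RecipeState.create(recipe, mode="forward", num_quantizers=len(roles), roles=roles)
quantizers = state.make_quantizers()          # one quantizer per role, in order
```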

Confidence Score: 5/5

  • Safe to merge - well-structured refactoring with comprehensive test coverage
  • The changes are well-designed and thoroughly tested. The implementation adds a clean abstraction layer for semantic quantizer roles, properly validates inputs, maintains backward compatibility for non-CustomRecipe use cases, and includes comprehensive test coverage. All tests have been updated to reflect the new API. The breaking change to CustomRecipe is intentional and documented.
  • No files require special attention

Important Files Changed

| Filename | Overview |
|---|---|
| `transformer_engine/pytorch/quantization.py` | Updated `RecipeState.create()` to accept an optional `roles` parameter and set it on the returned state. Refactored `CustomRecipeState.make_quantizers()` to require roles and validate their length against `num_quantizers`. Enables semantic role-based quantizer creation for `CustomRecipe`. |
| `transformer_engine/pytorch/ops/op.py` | Added `get_quantizer_roles()` to the `BasicOperation` base class and updated `reset_recipe_state()` to pass roles to `RecipeState.create()`. Enables operations to declare semantic quantizer roles. |
| `transformer_engine/pytorch/module/base.py` | Added a `get_quantizer_roles()` abstract method to the base module and updated `set_meta_tensor()` to retrieve roles and pass them to `RecipeState.create()`. Foundation for modules to emit semantic role strings. |
| `transformer_engine/pytorch/module/linear.py` | Implemented `get_quantizer_roles()` to emit semantic role strings for the `Linear` module (e.g., `input:linear`, `weight:linear`, `output:linear` for forward; `grad_output:linear`, `grad_input:linear` for backward). |
| `transformer_engine/common/recipe/__init__.py` | Updated the `CustomRecipe` docstring to document the new semantic role naming convention (e.g., `input:linear`, `weight:linear` instead of `linear_input`, `linear_weight`). Breaking API change for `CustomRecipe` users. |
| `tests/pytorch/test_custom_recipe.py` | Updated all test factories to the new `bucket:scope` role format. Added format validation and bucket extraction logic. Tests cover `Linear`, `LayerNormLinear`, `LayerNormMLP`, `GroupedLinear`, and `ops.Linear` with role-count verification. |

Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant Module as Module/Op
    participant RecipeState
    participant CustomRecipeState
    participant QFactory as Quantizer Factory

    Note over User,QFactory: Initialization Phase
    User->>Module: Create module with autocast(recipe=CustomRecipe(qfactory))
    Module->>Module: get_quantizer_roles(fwd=True)
    Note right of Module: Returns ["input:linear", "weight:linear", "output:linear"]

    Module->>RecipeState: RecipeState.create(recipe, mode="forward", num_quantizers=3, roles=roles)
    RecipeState->>CustomRecipeState: __init__(recipe, mode, num_quantizers)
    RecipeState->>CustomRecipeState: state.roles = roles
    RecipeState-->>Module: CustomRecipeState instance

    Note over User,QFactory: Quantizer Creation Phase
    Module->>CustomRecipeState: make_quantizers()
    CustomRecipeState->>CustomRecipeState: Validate len(roles) == num_quantizers

    loop For each role in roles
        CustomRecipeState->>QFactory: qfactory(role), e.g. "input:linear"
        QFactory-->>CustomRecipeState: Quantizer instance
    end

    CustomRecipeState-->>Module: List of quantizers

    Note over User,QFactory: Backward Pass (similar flow)
    Module->>Module: get_quantizer_roles(fwd=False)
    Note right of Module: Returns ["grad_output:linear", "grad_input:linear"]
    Module->>RecipeState: RecipeState.create(recipe, mode="backward", num_quantizers=2, roles=roles)
```
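
The validation step in the diagram could look roughly like this inside `CustomRecipeState.make_quantizers()` (a sketch, not the actual implementation; how the factory is reached from the state is an assumption):

```python
def make_quantizers(self):
    # Roles are required and must match the declared quantizer count.
    if self.roles is None or len(self.roles) != self.num_quantizers:
        raise ValueError(
            f"Expected {self.num_quantizers} quantizer roles, "
            f"got {None if self.roles is None else len(self.roles)}"
        )
    # One factory call per role, preserving order.
    return [self.recipe.qfactory(role) for role in self.roles]
```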

@greptile-apps bot (Contributor) commented Jan 23, 2026

Greptile's behavior is changing!

From now on, if a review finishes with no comments, we will not post an additional "statistics" comment to confirm that our review found nothing to comment on. However, you can confirm that we reviewed your changes in the status check section.

This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR".

@negvet (Collaborator, Author) commented Jan 23, 2026

The `bucket:scope` order can be flipped.
