Skip to content

Adding past inputs to MultiHeadAttention parser for supporting new models#4703

Merged
causten merged 41 commits intodevelopfrom
multihead_past_kv_seq
Apr 10, 2026
Merged

Adding past inputs to MultiHeadAttention parser for supporting new models#4703
causten merged 41 commits intodevelopfrom
multihead_past_kv_seq

Conversation

@urpetkov-amd
Copy link
Copy Markdown
Collaborator

Motivation

This PR has been created from this original PR: #4637
Motivation for this is support more input in MultiHeadAttention parser in order to support new VLM models.

Technical Details

Past key, past value and past_sequence_length have been added as new inputs that MultiHeadAttention supports.
Tests have been added for these new inputs.

Changelog Category

Add a CHANGELOG.md entry for any option other than Not Applicable

    • Added: New functionality.
    • Changed: Changes to existing functionality.
    • Removed: Functionality or support that has been removed. (Compared to a previous release)
    • Optimized: Component performance that has been optimized or improved.
    • Resolved Issues: Known issues from a previous version that have been resolved.
    • Not Applicable: This PR is not to be included in the changelog.

@urpetkov-amd urpetkov-amd requested a review from causten as a code owner March 25, 2026 11:24
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 25, 2026

Codecov Report

❌ Patch coverage is 97.64706% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/onnx/parse_multi_head_attention.cpp 97.65% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #4703      +/-   ##
===========================================
+ Coverage    92.32%   92.34%   +0.01%     
===========================================
  Files          583      583              
  Lines        29334    29412      +78     
===========================================
+ Hits         27082    27158      +76     
- Misses        2252     2254       +2     
Files with missing lines Coverage Δ
src/onnx/parse_multi_head_attention.cpp 98.25% <97.65%> (-0.20%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Collaborator

@TedThemistokleous TedThemistokleous left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good.

  • Good comments explaining logic (gen_onnx test cases, early returns)

Just add the following

  • Leverage std::optional for capturing the argument after you've validated that all past input are good. This will simplify your checks and not rely on size of args
  • Add a small blub for your verify tests on how you generated the gold data in a comment box. Good for reproduction if we somehow see a regression with some other MIGraphX change.

Comment thread src/onnx/parse_multi_head_attention.cpp
Comment thread test/onnx/verify/mha_past_key_value_test.cpp
Comment thread test/onnx/gen_onnx.py
Comment thread src/onnx/parse_multi_head_attention.cpp Outdated
@urpetkov-amd
Copy link
Copy Markdown
Collaborator Author

@TedThemistokleous
Thanks. Changes have been added.

@urpetkov-amd
Copy link
Copy Markdown
Collaborator Author

@TedThemistokleous @causten

Can we merge this if everything is okay now? Thanks

@TedThemistokleous TedThemistokleous added Onnx Operators Adding or modifying an Onnx Operator in the MIGraphX codebase bugfix Fixes a bug found in the code. labels Apr 9, 2026
@causten causten merged commit e3fe11b into develop Apr 10, 2026
32 checks passed
@causten causten deleted the multihead_past_kv_seq branch April 10, 2026 21:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix Fixes a bug found in the code. Onnx Operators Adding or modifying an Onnx Operator in the MIGraphX codebase

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants