Update MagpieTTS' Inference Parameter Configuration #15254
Conversation
Pull request overview
This PR introduces an `InferenceParameters` dataclass to centralize inference configuration for MagpieTTS models and enables the attention prior by default in the `do_tts()` method. The main purpose is to make the attention prior active by default for improved inference quality while simplifying parameter management across the codebase.
Key changes:
- Added `InferenceParameters` dataclass with `apply_attention_prior=True` by default (previously `False`)
- Refactored `infer_batch()` to read parameters from `self.inference_parameters` instead of method arguments
- Simplified `do_tts()` API by removing explicit `temperature`, `topk`, `max_decoder_steps`, and `cfg_scale` parameters
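To make the described refactor concrete, here is a minimal sketch of what such a dataclass might look like; apart from `apply_attention_prior=True`, the field defaults shown are illustrative assumptions rather than the actual NeMo values:

```python
# Sketch only: defaults other than apply_attention_prior are assumptions.
from dataclasses import dataclass
from typing import Optional


@dataclass
class InferenceParameters:
    apply_attention_prior: bool = True  # previously defaulted to False
    temperature: float = 0.7            # assumed default, for illustration
    topk: int = 80                      # assumed default, for illustration
    max_decoder_steps: int = 500        # assumed default, for illustration
    cfg_scale: Optional[float] = None   # assumed default, for illustration
```

Because `infer_batch()` now reads these values from `self.inference_parameters`, they no longer need to be threaded through every call site.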
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `nemo/collections/tts/models/magpietts.py` | Introduced `InferenceParameters` dataclass with attention prior enabled by default; refactored `infer_batch()` to use instance parameters; simplified `do_tts()` signature |
| `nemo/collections/tts/models/magpietts_preference_optimization.py` | Updated `test_step()` and `generate_and_reward()` to set `inference_parameters` attributes instead of passing them as arguments |
| `nemo/collections/tts/modules/magpietts_inference/inference.py` | Modified standard inference to set the model's `inference_parameters` before calling `infer_batch()` |
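Taken together, the table suggests a calling pattern where parameters are set on the model instance rather than passed per call. A hypothetical usage sketch follows; the class name, checkpoint path, and `do_tts()` argument are assumptions for illustration, and the exact remaining `do_tts()` signature is not shown in this PR excerpt:

```python
# Hypothetical usage sketch; restore_from() is the standard NeMo loader.
from nemo.collections.tts.models.magpietts import MagpieTTSModel  # assumed name

model = MagpieTTSModel.restore_from("magpie_tts_multilingual_357m.nemo")

# Model-specific behavior is configured on the instance:
model.inference_parameters.apply_attention_prior = True  # now the default
model.inference_parameters.temperature = 0.6             # illustrative value

# do_tts() no longer accepts temperature/topk/max_decoder_steps/cfg_scale;
# infer_batch() reads them from model.inference_parameters instead.
audio = model.do_tts("Hello from MagpieTTS.")
```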
    topk=topk,
    use_cfg=use_cfg,
    cfg_scale=cfg_scale,
    use_local_transformer_for_inference=True,
Should `use_local_transformer_for_inference` be in `InferenceParameters`?
No, the intention of `InferenceParameters` is to hold model-specific hyperparameters so we can save them as part of the YAML config and, subsequently, as part of the .nemo file. Since some parameters like `estimate_alignment_from_layers` are model-specific, I want to keep this new class limited to those parameters. Options like `use_cfg` or `use_local_transformer_for_inference` are run-specific and should live elsewhere.
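In code, the split described here might look roughly like the following sketch; the stand-in class name and the remaining `infer_batch()` signature are assumptions, not the real NeMo definitions:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelInferenceParameters:
    # Model-specific: serialized into the YAML config and therefore into
    # the .nemo checkpoint, as described above.
    apply_attention_prior: bool = True
    estimate_alignment_from_layers: Optional[List[int]] = None


class MagpieTTSSketch:
    """Hypothetical stand-in for the real model class."""

    def __init__(self) -> None:
        self.inference_parameters = ModelInferenceParameters()

    def infer_batch(self, batch, use_cfg: bool = False,
                    use_local_transformer_for_inference: bool = False):
        # Run-specific options stay as arguments; model-specific values are
        # read from the instance rather than passed in.
        params = self.inference_parameters
        if params.apply_attention_prior:
            pass  # apply the attention prior during decoding (omitted)
```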
Left one minor comment; otherwise LGTM, given standard and long-form inference have been tested and nothing breaks.
rfejgin left a comment
Looks good to me (some internal discussions ongoing but not needed for this PR)
LGTM.
[🤖]: Hi @blisc 👋, We wanted to let you know that a CICD pipeline for this PR just finished successfully. So it might be time to merge this PR or get some approvals. |
* move inference params to checkpoint and make do_tts apply prior
* Apply isort and black reformatting
* Enable LT in do_tts; add docstrings
* update defaults; merge inference dataclasses
* Apply isort and black reformatting
* update epsilon value
* add defaults for inference; fix longform mode; fix bug introduced in NVIDIA-NeMo#15212
* Apply isort and black reformatting
* fix field key usage

Signed-off-by: Jason <jasoli@nvidia.com>
Signed-off-by: blisc <blisc@users.noreply.github.com>
Co-authored-by: blisc <blisc@users.noreply.github.com>
What does this PR do?
This PR separates model-specific and run-specific inference parameters. Model-specific parameters are now stored inside a `ModelInferenceParameters` dataclass within the model itself. These parameters should be stored in the model YAML; currently this needs to be done manually. In particular, the latest checkpoint, https://huggingface.co/nvidia/magpie_tts_multilingual_357m, has been updated to include an `inference_parameters` field in the top-level YAML.
Run-specific inference parameters, such as applying the local transformer or using CFG, remain arguments to the inference functions.
This PR also updates inference defaults and merges the `nemo/collections/tts/modules/magpietts_inference/inference.py` configuration with this new setup.
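Since the YAML update currently has to be done manually, here is a sketch of how one might add the field with OmegaConf (the config library NeMo uses); the field names and values are assumptions based on this thread, not the checkpoint's actual schema:

```python
# Sketch: manually adding an inference_parameters section to a checkpoint's
# extracted model config. Field names/values are illustrative assumptions.
from omegaconf import OmegaConf

cfg = OmegaConf.load("model_config.yaml")  # extracted from the .nemo file
cfg.inference_parameters = {
    "apply_attention_prior": True,  # the new default from this PR
    "temperature": 0.7,             # assumed field
    "topk": 80,                     # assumed field
}
OmegaConf.save(cfg, "model_config.yaml")
```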
Collection: tts
Changelog
PR Type: