[codex] Fix NEST SSL model-support training batches#15812

Merged
chtruong814 merged 4 commits into main from codex/fix-15810-nest-ssl-training-step on Jun 23, 2026

Conversation

@pzelasko

Collaborator

Summary

  • Update the two NEST SSL model-support training-step tests to build a valid AudioNoiseBatch.
  • Reuse the generated audio and noise tensors so noisy_audio matches the SSL dataset contract: audio + noise.
  • Keep init and inference coverage unchanged.

Root Cause

The generated training-step tests created independent random audio, noise, and noisy_audio tensors. That violates the batch contract of the SSL datasets, where noisy_audio is audio + noise, and can produce non-finite loss for the restored NEST SSL artifacts.
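
As a rough illustration of the contract described above, the sketch below builds a synthetic batch in which noisy_audio is derived from the same audio and noise tensors rather than sampled independently. It is not the code from this PR: `SyntheticAudioNoiseBatch` is a hypothetical stand-in for NeMo's `AudioNoiseBatch`, and its field names, the helper `make_synthetic_batch`, the batch dimensions, and the 0.1 noise scale are all assumptions made for the example.

```python
from dataclasses import dataclass

import torch


@dataclass
class SyntheticAudioNoiseBatch:
    """Hypothetical stand-in for AudioNoiseBatch; field names are assumed."""

    audio: torch.Tensor
    audio_len: torch.Tensor
    noise: torch.Tensor
    noise_len: torch.Tensor
    noisy_audio: torch.Tensor
    noisy_audio_len: torch.Tensor


def make_synthetic_batch(
    batch_size: int = 2, num_samples: int = 16000, seed: int = 0
) -> SyntheticAudioNoiseBatch:
    """Build a batch whose noisy_audio equals audio + noise by construction."""
    generator = torch.Generator().manual_seed(seed)
    audio = torch.randn(batch_size, num_samples, generator=generator)
    # Assumed noise scale; the value used by the real tests is not stated here.
    noise = 0.1 * torch.randn(batch_size, num_samples, generator=generator)
    lengths = torch.full((batch_size,), num_samples, dtype=torch.long)
    return SyntheticAudioNoiseBatch(
        audio=audio,
        audio_len=lengths,
        noise=noise,
        noise_len=lengths.clone(),
        # Reuse the tensors above instead of drawing a third random tensor.
        noisy_audio=audio + noise,
        noisy_audio_len=lengths.clone(),
    )


batch = make_synthetic_batch()
assert torch.equal(batch.noisy_audio, batch.audio + batch.noise)
assert torch.isfinite(batch.noisy_audio).all()
```

A test written this way could pass the resulting batch to the model's training step and assert that the returned loss is finite; how the real tests invoke the training step is not shown in this description.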

Validation

  • isort --check tests/e2e_nightly/test_model_support_nvidia__ssl_en_nest_large_v1_0.py tests/e2e_nightly/test_model_support_nvidia__ssl_en_nest_xlarge_v1_0.py
  • black --check tests/e2e_nightly/test_model_support_nvidia__ssl_en_nest_large_v1_0.py tests/e2e_nightly/test_model_support_nvidia__ssl_en_nest_xlarge_v1_0.py
  • pytest tests/e2e_nightly/test_model_support_nvidia__ssl_en_nest_large_v1_0.py tests/e2e_nightly/test_model_support_nvidia__ssl_en_nest_xlarge_v1_0.py --collect-only -q

Full NEST artifact execution was not run locally because the required .nemo files are not present under /home/TestData/nemo-speech-ci-models.

Fixes #15810

@copy-pr-bot

copy-pr-bot Bot commented Jun 18, 2026


This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@pzelasko pzelasko marked this pull request as ready for review June 18, 2026 15:21
@pzelasko

Collaborator Author

/ok to test 4e587ac

@github-actions

Contributor

[🤖]: Hi @pzelasko 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

chtruong814 previously approved these changes Jun 19, 2026
@chtruong814

Collaborator

/ok to test acd5347

@chtruong814 chtruong814 added the r3.0.0 Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge. label Jun 19, 2026
pzelasko added 2 commits June 22, 2026 10:02
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
@github-actions

Contributor

[🤖]: Hi @pzelasko 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

…ssl-training-step

Signed-off-by: Charlie Truong <chtruong@nvidia.com>
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
@chtruong814

Collaborator

/ok to test a088d44

@github-actions

Contributor

[🤖]: Hi @pzelasko 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

@chtruong814 chtruong814 merged commit 4a33e9c into main Jun 23, 2026
288 of 290 checks passed
@chtruong814 chtruong814 deleted the codex/fix-15810-nest-ssl-training-step branch June 23, 2026 16:54

Labels

  • r3.0.0: Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.
  • Run CICD
  • Run e2e nightly

Development

Successfully merging this pull request may close these issues.

NEST SSL model-support training-step tests produce NaN loss on synthetic batches

2 participants