
Conversation

@benediktjohannes
Copy link

Modified device detection to check BOTH image and label tensors

torch.cuda.empty_cache() now called if EITHER tensor is on GPU

Prevents GPU memory leaks in mixed device scenarios

…abel tensors for CUDA device


Signed-off-by: benediktjohannes <benedikt.johannes.hofer@gmail.com>
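A minimal sketch of the pattern the commit message describes, assuming the tensors have already been pulled out of the analyzer's data dict; the helper name maybe_empty_cache is invented for illustration, while the real change lives in LabelStats in monai/auto3dseg/analyzer.py (see the diffs quoted further down):

import torch
from monai.data import MetaTensor

def maybe_empty_cache(image_tensor, label_tensor) -> None:
    # check BOTH tensors; MetaTensor subclasses torch.Tensor but is listed explicitly
    using_cuda = any(
        isinstance(t, (torch.Tensor, MetaTensor)) and t.device.type == "cuda"
        for t in (image_tensor, label_tensor)
    )
    if using_cuda:
        # free cached GPU blocks so mixed-device runs do not keep stale allocations
        torch.cuda.empty_cache()

# mixed-device example: image on CPU, label on GPU still triggers the cleanup
if torch.cuda.is_available():
    maybe_empty_cache(torch.rand(1, 8, 8, 8), torch.zeros(8, 8, 8, device="cuda"))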
@coderabbitai
Contributor

coderabbitai bot commented Jan 17, 2026

📝 Walkthrough


Per-sample data access in monai/auto3dseg/analyzer.py was refactored to use local variables image_tensor and label_tensor and index them per sample. CUDA detection now considers both tensors via any(...). Per-sample image and label arrays are constructed from image_tensor[i] and label_tensor respectively; label casts to torch.int16 are preserved. Empty-foreground placeholders changed from torch.Tensor([0]) to MetaTensor([0.0]). Shape consistency checks and the statistics computation and output update flow remain unchanged. No public API or signature changes.
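A hedged sketch of the per-sample access and placeholder changes described above; the stand-in shapes and variable setup are assumptions, only the expressions mirror the walkthrough:

import torch
from monai.data import MetaTensor

# stand-ins for d[self.image_key] / d[self.label_key] inside the analyzer
image_tensor = MetaTensor(torch.rand(2, 8, 8, 8))    # indexed per sample along dim 0
label_tensor = MetaTensor(torch.zeros(8, 8, 8))      # label volume, (H, W, D)

ndas = [image_tensor[i] for i in range(image_tensor.shape[0])]   # per-sample views
ndas_label = label_tensor.astype(torch.int16)                    # int16 cast preserved

# empty-foreground placeholder: a MetaTensor instead of torch.Tensor([0])
empty_foreground = MetaTensor([0.0])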

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)
  • Description check | ❓ Inconclusive | Description covers the core change but omits the required template structure with sections like 'Types of changes' and checkboxes for testing/documentation. | Resolution: Follow the repository template: add a 'Types of changes' section with appropriate checkboxes (e.g., non-breaking change, tests added, etc.).

✅ Passed checks (2 passed)
  • Title check | ✅ Passed | Title clearly summarizes the main change: GPU memory leak fix via improved CUDA device detection for both image and label tensors.
  • Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@monai/auto3dseg/analyzer.py`:
- Around line 471-476: The code uses undefined image_tensor and label_tensor
before they are assigned; extract them from the incoming data dict like
ImageStats and FgImageStats do (use the same keys those classes expect, e.g.,
"image" for the input image and "label" for the segmentation label), then
proceed with the existing logic (compute using_cuda, save restore_grad_state,
set_grad_enabled(False), build ndas from image_tensor and cast label_tensor to
int16 for ndas_label). Ensure the variables are set to the tensor objects (with
correct shape [B,...]) before the lines referencing using_cuda, ndas, and
ndas_label.

Signed-off-by: benediktjohannes <benedikt.johannes.hofer@gmail.com>

@ericspod ericspod left a comment


Hi @benediktjohannes, I think this is fine as it is now, so we'll run tests again, though we'll have to wait for us to resolve a CI issue. @mingxin-zheng was involved in this code, so if he can double-check for me that would be great.

@benediktjohannes
Author

Hi @ericspod, thanks for the message! To be honest, I am also not quite sure about this PR, so please don't merge it as long as nobody else is sure that it is fine. Thanks!

using_cuda = False
image_tensor = d[self.image_key]
label_tensor = d[self.label_key]
using_cuda = any(isinstance(t, torch.Tensor) and t.device.type == "cuda" for t in (image_tensor, label_tensor))
Contributor


Thanks for the PR. MetaTensor could also use a CUDA device, so I believe we will need to check that as well.
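A quick, hedged illustration of that point: MetaTensor subclasses torch.Tensor and reports its device the same way, so the broadened check used in the follow-up commit matches it as well.

import torch
from monai.data import MetaTensor

if torch.cuda.is_available():
    label = MetaTensor(torch.zeros(4, 4, 4)).to("cuda")
    print(label.device.type)                              # "cuda"
    print(isinstance(label, (torch.Tensor, MetaTensor)))  # True; MetaTensor subclasses torch.Tensor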

Author


Thanks for the answer! 👍

@benediktjohannes
Author

By the way: I'm not quite sure about the other edits in this PR, as they were just recommended by coderabbit (and, another important point, I'm not quite sure whether I made a mistake when pasting them), so I'm also not quite sure whether they should be merged. I just relied on the coderabbit review, so please correct me if there is a mistake, and please also review this before merging. Thanks!

Signed-off-by: benediktjohannes <benedikt.johannes.hofer@gmail.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@monai/auto3dseg/analyzer.py`:
- Around line 471-478: Add parameterized unit tests that cover mixed-device
scenarios for LabelStats by creating cases where image_tensor is on CPU and
label_tensor is on CUDA and vice versa; instantiate batches similar to how ndas
and ndas_label are derived (use image_tensor, label_tensor and
MetaTensor/torch.Tensor on specific torch.device settings), call the LabelStats
code paths that consume ndas/ndas_label, and assert expected statistics and that
no device-related errors occur. Ensure tests toggle CUDA availability with
torch.cuda.is_available() guards, use explicit .to(device) on tensors, and
verify behavior matches the existing CPU-only and GPU-only assertions so
mixed-device handling is validated.

Comment on lines +471 to +478
image_tensor = d[self.image_key]
label_tensor = d[self.label_key]
using_cuda = any(isinstance(t, (torch.Tensor, MetaTensor)) and t.device.type == "cuda" for t in (image_tensor, label_tensor))
restore_grad_state = torch.is_grad_enabled()
torch.set_grad_enabled(False)

- ndas: list[MetaTensor] = [d[self.image_key][i] for i in range(d[self.image_key].shape[0])] # type: ignore
- ndas_label: MetaTensor = d[self.label_key].astype(torch.int16) # (H,W,D)
+ ndas: list[MetaTensor] = [image_tensor[i] for i in range(image_tensor.shape[0])] # type: ignore
+ ndas_label: MetaTensor = label_tensor.astype(torch.int16) # (H,W,D)
Contributor


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

rg -n "LabelStats|FgImageStats|using_cuda|empty_cache|get_label_ccp" -g "*test*.py" -i

Repository: Project-MONAI/MONAI

Length of output: 1005


🏁 Script executed:

# Also check the actual code at the specified lines
head -500 monai/auto3dseg/analyzer.py | tail -50

Repository: Project-MONAI/MONAI

Length of output: 2358


🏁 Script executed:

# Find the LabelStats class and its __call__ method
rg -n "class LabelStats|def __call__" monai/auto3dseg/analyzer.py

Repository: Project-MONAI/MONAI

Length of output: 662


🏁 Script executed:

sed -n '319,360p' tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 2553


🏁 Script executed:

sed -n '341,370p' tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 1890


🏁 Script executed:

sed -n '420,500p' tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 3971


🏁 Script executed:

head -50 tests/apps/test_auto3dseg.py | grep -E "device|import"

Repository: Project-MONAI/MONAI

Length of output: 606


🏁 Script executed:

grep -n "^device\s*=" tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 80


🏁 Script executed:

# Check the beginning of the test file for device setup
head -100 tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 3483


🏁 Script executed:

grep -n "SIM_GPU_TEST_CASES\|@parameterized\|skip_if_no_cuda" tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 359


🏁 Script executed:

# Check if there are GPU parametrized test methods
grep -B 5 -A 10 "test_label_stats.*gpu\|test_label_stats.*cuda" tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 45


🏁 Script executed:

sed -n '214,280p' tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 3116


🏁 Script executed:

# Check what the GPU test actually tests
grep -A 40 "@skip_if_no_cuda" tests/apps/test_auto3dseg.py | head -60

Repository: Project-MONAI/MONAI

Length of output: 1974


Add tests for mixed-device LabelStats handling.

Tests currently cover CPU-only and GPU-only scenarios but not mixed (image on CPU + label on CUDA, or vice versa). Add parameterized test cases to verify LabelStats handles these mixed-device scenarios correctly per the coding guidelines.

🤖 Prompt for AI Agents
In `@monai/auto3dseg/analyzer.py` around lines 471 - 478, Add parameterized unit
tests that cover mixed-device scenarios for LabelStats by creating cases where
image_tensor is on CPU and label_tensor is on CUDA and vice versa; instantiate
batches similar to how ndas and ndas_label are derived (use image_tensor,
label_tensor and MetaTensor/torch.Tensor on specific torch.device settings),
call the LabelStats code paths that consume ndas/ndas_label, and assert expected
statistics and that no device-related errors occur. Ensure tests toggle CUDA
availability with torch.cuda.is_available() guards, use explicit .to(device) on
tensors, and verify behavior matches the existing CPU-only and GPU-only
assertions so mixed-device handling is validated.
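A hedged sketch of what such a parameterized mixed-device case could look like; the LabelStats constructor arguments, the do_ccp flag, and the 'label_stats' output key are assumptions rather than verified details of the current code or tests/apps/test_auto3dseg.py:

import unittest

import torch
from parameterized import parameterized

from monai.auto3dseg.analyzer import LabelStats
from monai.data import MetaTensor

MIXED_DEVICE_CASES = [("image_cpu_label_cuda", "cpu", "cuda"), ("image_cuda_label_cpu", "cuda", "cpu")]

class TestLabelStatsMixedDevice(unittest.TestCase):
    @parameterized.expand(MIXED_DEVICE_CASES)
    def test_mixed_device(self, _name, image_device, label_device):
        if not torch.cuda.is_available():
            self.skipTest("CUDA is not available")
        # one-channel image and an (H, W, D) label, following the shape comments in the diff above
        image = MetaTensor(torch.rand(1, 16, 16, 16)).to(image_device)
        label = MetaTensor((torch.rand(16, 16, 16) > 0.5).long()).to(label_device)
        analyzer = LabelStats(image_key="image", label_key="label", do_ccp=False)  # args assumed
        result = analyzer({"image": image, "label": label})
        # the call should complete without device-mismatch errors and report its statistics
        self.assertIn("label_stats", result)  # default stats_name is an assumption

if __name__ == "__main__":
    unittest.main()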
