
Conversation

@benediktjohannes
Copy link

Modified device detection to check BOTH image and label tensors

torch.cuda.empty_cache() now called if EITHER tensor is on GPU

Prevents GPU memory leaks in mixed device scenarios

…abel tensors for CUDA device


Signed-off-by: benediktjohannes <benedikt.johannes.hofer@gmail.com>
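A minimal sketch of the pattern the commit message describes, assuming the tensors have already been pulled out of the analyzer's data dict; the helper name maybe_empty_cache is invented for illustration, while the real change lives in LabelStats in monai/auto3dseg/analyzer.py (see the diffs quoted further down):

import torch
from monai.data import MetaTensor

def maybe_empty_cache(image_tensor, label_tensor) -> None:
    # check BOTH tensors; MetaTensor subclasses torch.Tensor but is listed explicitly
    using_cuda = any(
        isinstance(t, (torch.Tensor, MetaTensor)) and t.device.type == "cuda"
        for t in (image_tensor, label_tensor)
    )
    if using_cuda:
        # free cached GPU blocks so mixed-device runs do not keep stale allocations
        torch.cuda.empty_cache()

# mixed-device example: image on CPU, label on GPU still triggers the cleanup
if torch.cuda.is_available():
    maybe_empty_cache(torch.rand(1, 8, 8, 8), torch.zeros(8, 8, 8, device="cuda"))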
@coderabbitai
Contributor

coderabbitai bot commented Jan 17, 2026

📝 Walkthrough


Per-sample data access in monai/auto3dseg/analyzer.py was refactored to use local variables image_tensor and label_tensor and index them per sample. CUDA detection now considers both tensors via any(...). Per-sample image and label arrays are constructed from image_tensor[i] and label_tensor respectively; label casts to torch.int16 are preserved. Empty-foreground placeholders changed from torch.Tensor([0]) to MetaTensor([0.0]). Shape consistency checks and the statistics computation and output update flow remain unchanged. No public API or signature changes.
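A hedged sketch of the per-sample access and placeholder changes described above; the stand-in shapes and variable setup are assumptions, only the expressions mirror the walkthrough:

import torch
from monai.data import MetaTensor

# stand-ins for d[self.image_key] / d[self.label_key] inside the analyzer
image_tensor = MetaTensor(torch.rand(2, 8, 8, 8))    # indexed per sample along dim 0
label_tensor = MetaTensor(torch.zeros(8, 8, 8))      # label volume, (H, W, D)

ndas = [image_tensor[i] for i in range(image_tensor.shape[0])]   # per-sample views
ndas_label = label_tensor.astype(torch.int16)                    # int16 cast preserved

# empty-foreground placeholder: a MetaTensor instead of torch.Tensor([0])
empty_foreground = MetaTensor([0.0])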

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)
  • Description check | ❓ Inconclusive | Description covers the core change but omits the required template structure with sections like 'Types of changes' and checkboxes for testing/documentation. | Resolution: Follow the repository template: add a 'Types of changes' section with appropriate checkboxes (e.g., non-breaking change, tests added, etc.).

✅ Passed checks (2 passed)
  • Title check | ✅ Passed | Title clearly summarizes the main change: GPU memory leak fix via improved CUDA device detection for both image and label tensors.
  • Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@monai/auto3dseg/analyzer.py`:
- Around line 471-476: The code uses undefined image_tensor and label_tensor
before they are assigned; extract them from the incoming data dict like
ImageStats and FgImageStats do (use the same keys those classes expect, e.g.,
"image" for the input image and "label" for the segmentation label), then
proceed with the existing logic (compute using_cuda, save restore_grad_state,
set_grad_enabled(False), build ndas from image_tensor and cast label_tensor to
int16 for ndas_label). Ensure the variables are set to the tensor objects (with
correct shape [B,...]) before the lines referencing using_cuda, ndas, and
ndas_label.

Signed-off-by: benediktjohannes <benedikt.johannes.hofer@gmail.com>

@ericspod ericspod left a comment


Hi @benediktjohannes, I think this is fine as it is now, so we'll run tests again, though we'll have to wait for us to resolve a CI issue. @mingxin-zheng was involved in this code, so if he can double-check for me that would be great.

@benediktjohannes
Author

Hi @ericspod, thanks for the message! To be honest, I am also not quite sure about this PR, so please don't merge it as long as nobody else is sure that it is fine. Thanks!

using_cuda = False
image_tensor = d[self.image_key]
label_tensor = d[self.label_key]
using_cuda = any(isinstance(t, torch.Tensor) and t.device.type == "cuda" for t in (image_tensor, label_tensor))
Contributor


Thanks for the PR. MetaTensor could also use a CUDA device, so I believe we will need to check that as well.
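A quick, hedged illustration of that point: MetaTensor subclasses torch.Tensor and reports its device the same way, so the broadened check used in the follow-up commit matches it as well.

import torch
from monai.data import MetaTensor

if torch.cuda.is_available():
    label = MetaTensor(torch.zeros(4, 4, 4)).to("cuda")
    print(label.device.type)                              # "cuda"
    print(isinstance(label, (torch.Tensor, MetaTensor)))  # True; MetaTensor subclasses torch.Tensor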

Author


Thanks for the answer! 👍

@benediktjohannes
Author

By the way: I'm not quite sure about the other edits in this PR, as they were just recommended by coderabbit (and, another important point, I'm not quite sure whether I made a mistake when pasting them), so I'm also not quite sure whether they should be merged. I just relied on the coderabbit review, so please correct me if there is a mistake, and please also review this before merging. Thanks!

Signed-off-by: benediktjohannes <benedikt.johannes.hofer@gmail.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@monai/auto3dseg/analyzer.py`:
- Around line 471-478: Add parameterized unit tests that cover mixed-device
scenarios for LabelStats by creating cases where image_tensor is on CPU and
label_tensor is on CUDA and vice versa; instantiate batches similar to how ndas
and ndas_label are derived (use image_tensor, label_tensor and
MetaTensor/torch.Tensor on specific torch.device settings), call the LabelStats
code paths that consume ndas/ndas_label, and assert expected statistics and that
no device-related errors occur. Ensure tests toggle CUDA availability with
torch.cuda.is_available() guards, use explicit .to(device) on tensors, and
verify behavior matches the existing CPU-only and GPU-only assertions so
mixed-device handling is validated.

Comment on lines +471 to +478
image_tensor = d[self.image_key]
label_tensor = d[self.label_key]
using_cuda = any(isinstance(t, (torch.Tensor, MetaTensor)) and t.device.type == "cuda" for t in (image_tensor, label_tensor))
restore_grad_state = torch.is_grad_enabled()
torch.set_grad_enabled(False)

- ndas: list[MetaTensor] = [d[self.image_key][i] for i in range(d[self.image_key].shape[0])] # type: ignore
- ndas_label: MetaTensor = d[self.label_key].astype(torch.int16) # (H,W,D)
+ ndas: list[MetaTensor] = [image_tensor[i] for i in range(image_tensor.shape[0])] # type: ignore
+ ndas_label: MetaTensor = label_tensor.astype(torch.int16) # (H,W,D)
Contributor


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

rg -n "LabelStats|FgImageStats|using_cuda|empty_cache|get_label_ccp" -g "*test*.py" -i

Repository: Project-MONAI/MONAI

Length of output: 1005


🏁 Script executed:

# Also check the actual code at the specified lines
head -500 monai/auto3dseg/analyzer.py | tail -50

Repository: Project-MONAI/MONAI

Length of output: 2358


🏁 Script executed:

# Find the LabelStats class and its __call__ method
rg -n "class LabelStats|def __call__" monai/auto3dseg/analyzer.py

Repository: Project-MONAI/MONAI

Length of output: 662


🏁 Script executed:

sed -n '319,360p' tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 2553


🏁 Script executed:

sed -n '341,370p' tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 1890


🏁 Script executed:

sed -n '420,500p' tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 3971


🏁 Script executed:

head -50 tests/apps/test_auto3dseg.py | grep -E "device|import"

Repository: Project-MONAI/MONAI

Length of output: 606


🏁 Script executed:

grep -n "^device\s*=" tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 80


🏁 Script executed:

# Check the beginning of the test file for device setup
head -100 tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 3483


🏁 Script executed:

grep -n "SIM_GPU_TEST_CASES\|@parameterized\|skip_if_no_cuda" tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 359


🏁 Script executed:

# Check if there are GPU parametrized test methods
grep -B 5 -A 10 "test_label_stats.*gpu\|test_label_stats.*cuda" tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 45


🏁 Script executed:

sed -n '214,280p' tests/apps/test_auto3dseg.py

Repository: Project-MONAI/MONAI

Length of output: 3116


🏁 Script executed:

# Check what the GPU test actually tests
grep -A 40 "@skip_if_no_cuda" tests/apps/test_auto3dseg.py | head -60

Repository: Project-MONAI/MONAI

Length of output: 1974


Add tests for mixed-device LabelStats handling.

Tests currently cover CPU-only and GPU-only scenarios but not mixed (image on CPU + label on CUDA, or vice versa). Add parameterized test cases to verify LabelStats handles these mixed-device scenarios correctly per the coding guidelines.

🤖 Prompt for AI Agents
In `@monai/auto3dseg/analyzer.py` around lines 471 - 478, Add parameterized unit
tests that cover mixed-device scenarios for LabelStats by creating cases where
image_tensor is on CPU and label_tensor is on CUDA and vice versa; instantiate
batches similar to how ndas and ndas_label are derived (use image_tensor,
label_tensor and MetaTensor/torch.Tensor on specific torch.device settings),
call the LabelStats code paths that consume ndas/ndas_label, and assert expected
statistics and that no device-related errors occur. Ensure tests toggle CUDA
availability with torch.cuda.is_available() guards, use explicit .to(device) on
tensors, and verify behavior matches the existing CPU-only and GPU-only
assertions so mixed-device handling is validated.
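A hedged sketch of what such a parameterized mixed-device case could look like; the LabelStats constructor arguments, the do_ccp flag, and the 'label_stats' output key are assumptions rather than verified details of the current code or tests/apps/test_auto3dseg.py:

import unittest

import torch
from parameterized import parameterized

from monai.auto3dseg.analyzer import LabelStats
from monai.data import MetaTensor

MIXED_DEVICE_CASES = [("image_cpu_label_cuda", "cpu", "cuda"), ("image_cuda_label_cpu", "cuda", "cpu")]

class TestLabelStatsMixedDevice(unittest.TestCase):
    @parameterized.expand(MIXED_DEVICE_CASES)
    def test_mixed_device(self, _name, image_device, label_device):
        if not torch.cuda.is_available():
            self.skipTest("CUDA is not available")
        # one-channel image and an (H, W, D) label, following the shape comments in the diff above
        image = MetaTensor(torch.rand(1, 16, 16, 16)).to(image_device)
        label = MetaTensor((torch.rand(16, 16, 16) > 0.5).long()).to(label_device)
        analyzer = LabelStats(image_key="image", label_key="label", do_ccp=False)  # args assumed
        result = analyzer({"image": image, "label": label})
        # the call should complete without device-mismatch errors and report its statistics
        self.assertIn("label_stats", result)  # default stats_name is an assumption

if __name__ == "__main__":
    unittest.main()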
