Fixes #8697 GPU memory leak by checking both image and label tensors for CUDA device #8708
base: dev
Conversation
Commit: "…abel tensors for CUDA device"

- Modified device detection to check BOTH image and label tensors
- `torch.cuda.empty_cache()` now called if EITHER tensor is on GPU
- Prevents GPU memory leaks in mixed-device scenarios

Signed-off-by: benediktjohannes <benedikt.johannes.hofer@gmail.com>
📝 Walkthrough: Per-sample data access in `monai/auto3dseg/analyzer.py` was refactored to use local variables.

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 inconclusive
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@monai/auto3dseg/analyzer.py`:
- Around line 471-476: The code uses undefined image_tensor and label_tensor
before they are assigned; extract them from the incoming data dict like
ImageStats and FgImageStats do (use the same keys those classes expect, e.g.,
"image" for the input image and "label" for the segmentation label), then
proceed with the existing logic (compute using_cuda, save restore_grad_state,
set_grad_enabled(False), build ndas from image_tensor and cast label_tensor to
int16 for ndas_label). Ensure the variables are set to the tensor objects (with
correct shape [B,...]) before the lines referencing using_cuda, ndas, and
ndas_label.
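A minimal sketch of what this suggested fix describes, assuming the analyzer's data dict uses `"image"`/`"label"` keys as `ImageStats` and `FgImageStats` do (the helper name `detect_cuda_usage` is hypothetical, not MONAI API):

```python
import torch

def detect_cuda_usage(d: dict, image_key: str = "image", label_key: str = "label") -> bool:
    """Extract the tensors from the data dict first, then test their devices.

    The point of the fix: image_tensor and label_tensor must be assigned
    before the line that computes `using_cuda` references them.
    """
    image_tensor = d[image_key]
    label_tensor = d[label_key]
    return any(
        isinstance(t, torch.Tensor) and t.device.type == "cuda"
        for t in (image_tensor, label_tensor)
    )
```

With both tensors on CPU this returns `False`; moving either one to a CUDA device flips it to `True`.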
Signed-off-by: benediktjohannes <benedikt.johannes.hofer@gmail.com>
ericspod
left a comment
Hi @benediktjohannes I think this is fine as it is now so we'll run tests again though we'll have to wait for us to resolve a CI issue. @mingxin-zheng was involved in this code so if he can double check for me that would be great.
Hi @ericspod, thanks for the message! To be honest, I am also not quite sure about this PR, so please don't merge it as long as nobody else is sure that it is fine. Thanks!
`monai/auto3dseg/analyzer.py` (outdated)

```python
using_cuda = False
image_tensor = d[self.image_key]
label_tensor = d[self.label_key]
using_cuda = any(isinstance(t, torch.Tensor) and t.device.type == "cuda" for t in (image_tensor, label_tensor))
```
Thanks for the PR. MetaTensor could also use a CUDA device, so I believe we will need to check for that as well.
Thanks for the answer! 👍
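For context, MONAI's `MetaTensor` subclasses `torch.Tensor`, so an `isinstance(t, torch.Tensor)` check already matches it; listing both types explicitly is redundant but harmless. A small sketch using a stand-in subclass (`FakeMetaTensor` is hypothetical, used only to avoid importing MONAI):

```python
import torch

class FakeMetaTensor(torch.Tensor):
    """Stand-in for monai.data.MetaTensor, which also subclasses torch.Tensor."""
    pass

# as_subclass reinterprets an existing tensor as the subclass type
t = torch.zeros(2, 2).as_subclass(FakeMetaTensor)
print(isinstance(t, torch.Tensor))  # subclass instances pass the base-class check
print(t.device.type)                # "cpu" here; "cuda" once moved to a GPU
```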
By the way: I'm not quite sure about the other edits in this PR, as they were just recommended by CodeRabbit (and, another important thing, I'm not quite sure whether I made a mistake when pasting them). I'm also not sure whether they should be merged; I just relied on the CodeRabbit review. So please correct me if there is a mistake, and please also review this before merging, thanks!
Signed-off-by: benediktjohannes <benedikt.johannes.hofer@gmail.com>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@monai/auto3dseg/analyzer.py`:
- Around line 471-478: Add parameterized unit tests that cover mixed-device
scenarios for LabelStats by creating cases where image_tensor is on CPU and
label_tensor is on CUDA and vice versa; instantiate batches similar to how ndas
and ndas_label are derived (use image_tensor, label_tensor and
MetaTensor/torch.Tensor on specific torch.device settings), call the LabelStats
code paths that consume ndas/ndas_label, and assert expected statistics and that
no device-related errors occur. Ensure tests toggle CUDA availability with
torch.cuda.is_available() guards, use explicit .to(device) on tensors, and
verify behavior matches the existing CPU-only and GPU-only assertions so
mixed-device handling is validated.
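One way the requested mixed-device tests could be sketched (the `any_on_cuda` helper and test class names are hypothetical stand-ins; real tests would exercise `LabelStats` itself):

```python
import unittest
import torch

def any_on_cuda(*tensors) -> bool:
    # Device check mirroring the PR: True if any input is a tensor on a CUDA device.
    return any(isinstance(t, torch.Tensor) and t.device.type == "cuda" for t in tensors)

class TestMixedDeviceHandling(unittest.TestCase):
    def test_cpu_only(self):
        img, lbl = torch.zeros(2, 8, 8), torch.zeros(2, 8, 8)
        self.assertFalse(any_on_cuda(img, lbl))

    @unittest.skipUnless(torch.cuda.is_available(), "requires CUDA")
    def test_image_cpu_label_cuda(self):
        img = torch.zeros(2, 8, 8)
        lbl = torch.zeros(2, 8, 8).to("cuda")
        self.assertTrue(any_on_cuda(img, lbl))

    @unittest.skipUnless(torch.cuda.is_available(), "requires CUDA")
    def test_image_cuda_label_cpu(self):
        img = torch.zeros(2, 8, 8).to("cuda")
        lbl = torch.zeros(2, 8, 8)
        self.assertTrue(any_on_cuda(img, lbl))
```

The `skipUnless` guards keep the suite green on CPU-only CI while still covering both mixed-device orderings where a GPU is present.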
```diff
+ image_tensor = d[self.image_key]
+ label_tensor = d[self.label_key]
+ using_cuda = any(isinstance(t, (torch.Tensor, MetaTensor)) and t.device.type == "cuda" for t in (image_tensor, label_tensor))
  restore_grad_state = torch.is_grad_enabled()
  torch.set_grad_enabled(False)
- ndas: list[MetaTensor] = [d[self.image_key][i] for i in range(d[self.image_key].shape[0])]  # type: ignore
- ndas_label: MetaTensor = d[self.label_key].astype(torch.int16)  # (H,W,D)
+ ndas: list[MetaTensor] = [image_tensor[i] for i in range(image_tensor.shape[0])]  # type: ignore
+ ndas_label: MetaTensor = label_tensor.astype(torch.int16)  # (H,W,D)
```
🧩 Analysis chain — scripts executed against the Project-MONAI/MONAI repository during review:

```shell
rg -n "LabelStats|FgImageStats|using_cuda|empty_cache|get_label_ccp" -g "*test*.py" -i
# Check the actual code at the specified lines
head -500 monai/auto3dseg/analyzer.py | tail -50
# Find the LabelStats class and its __call__ method
rg -n "class LabelStats|def __call__" monai/auto3dseg/analyzer.py
sed -n '319,360p' tests/apps/test_auto3dseg.py
sed -n '341,370p' tests/apps/test_auto3dseg.py
sed -n '420,500p' tests/apps/test_auto3dseg.py
head -50 tests/apps/test_auto3dseg.py | grep -E "device|import"
grep -n "^device\s*=" tests/apps/test_auto3dseg.py
# Check the beginning of the test file for device setup
head -100 tests/apps/test_auto3dseg.py
grep -n "SIM_GPU_TEST_CASES\|@parameterized\|skip_if_no_cuda" tests/apps/test_auto3dseg.py
# Check if there are GPU-parameterized test methods
grep -B 5 -A 10 "test_label_stats.*gpu\|test_label_stats.*cuda" tests/apps/test_auto3dseg.py
# Check what the GPU test actually tests
grep -A 40 "@skip_if_no_cuda" tests/apps/test_auto3dseg.py | head -60
```
Add tests for mixed-device LabelStats handling.
Tests currently cover CPU-only and GPU-only scenarios but not mixed (image on CPU + label on CUDA, or vice versa). Add parameterized test cases to verify LabelStats handles these mixed-device scenarios correctly per the coding guidelines.
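The empty-cache cleanup from the commit can be sketched as follows (the function name and signature are illustrative, not MONAI API):

```python
import torch

def release_gpu_memory(using_cuda: bool, restore_grad_state: bool) -> None:
    # Restore the caller's autograd state, then free cached CUDA blocks
    # if either the image or the label tensor lived on a GPU.
    torch.set_grad_enabled(restore_grad_state)
    if using_cuda and torch.cuda.is_available():
        torch.cuda.empty_cache()

# Usage: capture the grad state before analysis, restore it afterwards.
prev = torch.is_grad_enabled()
release_gpu_memory(using_cuda=False, restore_grad_state=prev)
```

Guarding `empty_cache()` with `torch.cuda.is_available()` keeps the cleanup safe on CPU-only hosts, while the `using_cuda` flag avoids a needless call when neither tensor touched the GPU.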