AMD/ROCm - Fix VAE_KL_MEM_RATIO overestimation for modern ROCm #12685

Open
ReinerBforartists wants to merge 2 commits into Comfy-Org:master from ReinerBforartists:2-amdrocm---fix-vae_kl_mem_ratio-overestimation-for-modern-rocm

@ReinerBforartists
Contributor

VAE_KL_MEM_RATIO is set to 2.73 for AMD/ROCm in comfy/sd.py. This value was introduced for older ROCm versions where memory overhead was significantly higher. On modern ROCm (7.x), this massively overestimates VRAM requirements for VAE operations, causing ComfyUI to unnecessarily offload models from VRAM before VAE encoding/decoding.

Impact: On GPUs with limited VRAM (8-16GB), this overestimation may cause frequent unnecessary model offloading, significantly impacting performance. On larger GPUs (32GB) the impact is less noticeable but still causes suboptimal memory management.
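
To make the effect concrete, here is a minimal sketch of how a multiplier like VAE_KL_MEM_RATIO can tip an offload decision. The helper names (`scaled_estimate`, `needs_offload`) and the byte figures are hypothetical and chosen for illustration only; they are not ComfyUI functions or measurements from this PR.

```python
GIB = 1024 ** 3


def scaled_estimate(base_bytes: int, ratio: float) -> int:
    """Scale a raw memory estimate by a platform-specific ratio."""
    return int(base_bytes * ratio)


def needs_offload(required_bytes: int, free_bytes: int) -> bool:
    """Offloading is needed when the estimate exceeds the free VRAM."""
    return required_bytes > free_bytes


# Illustrative numbers only, not measured values.
base = 6 * GIB   # hypothetical raw VAE decode estimate
free = 10 * GIB  # hypothetical free VRAM while the diffusion model stays loaded

old_offloads = needs_offload(scaled_estimate(base, 2.73), free)  # estimate of about 16.4 GiB
new_offloads = needs_offload(scaled_estimate(base, 1.3), free)   # estimate of about 7.8 GiB
print(old_offloads, new_offloads)
```

With these assumed numbers, the 2.73 ratio pushes the estimate past the free VRAM and forces an offload, while 1.3 leaves enough headroom to keep the model resident.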

Tested on: AMD Radeon AI PRO R9700 (32GB VRAM, gfx1201), ROCm 7.2, Windows and Linux
Fix: A value of 1.0 worked correctly with no OOM errors. 1.3 is suggested as a conservative value to maintain a safety margin for older hardware or ROCm versions.

Change: comfy/sd.py: VAE_KL_MEM_RATIO = 2.73 → 1.3 for AMD

Related: WAN 2.2 i2v: Second run is 4 to 5 times slower on AMD GPU (ROCm) #12672

@ReinerBforartists ReinerBforartists changed the title from "AMD/ROCm - Fix VAE_KL_MEM_RATIO overestimation for modern ROCm #2" to "AMD/ROCm - Fix VAE_KL_MEM_RATIO overestimation for modern ROCm" on Feb 27, 2026
@coderabbitai

coderabbitai Bot commented Feb 27, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ec0fd0 and 9aec99e.

📒 Files selected for processing (1)
  • comfy/sd.py

📝 Walkthrough

This pull request changes the VAE memory ratio constant for AMD GPU devices in the VAE initialization path, reducing VAE_KL_MEM_RATIO from 2.73 to 1.3 in both occurrences within the AMD branch. The modification affects memory estimation for encode/decode operations on AMD hardware only and does not change behavior for non-AMD devices.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: adjusting VAE_KL_MEM_RATIO for AMD/ROCm to fix memory overestimation on modern versions. |
| Description check | ✅ Passed | The description clearly explains the rationale, impact, testing details, and the specific change being made to VAE_KL_MEM_RATIO for AMD devices. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@comfy/sd.py`:
- Around line 442-445: Add a brief inline comment explaining the rationale for
the VAE_KL_MEM_RATIO override when running on AMD (model_management.is_amd()) by
documenting that 1.3 is a conservative 30% safety margin for modern ROCm (e.g.,
ROCm 7.x) and reference the tracking issue or PR (for example "see issue `#2`");
place this comment immediately adjacent to the VAE_KL_MEM_RATIO = 1.3 assignment
so future maintainers understand why it differs from the original 2.73/1.0
value.
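
A minimal sketch of what the requested inline comment could look like. The wrapper function `select_vae_kl_mem_ratio` is hypothetical and exists only to keep the example self-contained; in the PR the value is assigned directly in comfy/sd.py behind `model_management.is_amd()`. The non-AMD value of 1.0 is an assumption based on the "2.73/1.0" wording above.

```python
def select_vae_kl_mem_ratio(is_amd: bool) -> float:
    """Hypothetical helper: pick the VAE memory ratio for the current device."""
    if is_amd:
        # 1.3 keeps a conservative 30% safety margin for modern ROCm (e.g. 7.x).
        # The previous 2.73 overestimated VAE memory there and triggered
        # unnecessary model offloading. See PR #12685.
        return 1.3
    # Assumed default for non-AMD devices.
    return 1.0


VAE_KL_MEM_RATIO = select_vae_kl_mem_ratio(is_amd=True)
```

Keeping the rationale next to the literal means a future change to the value has to confront the reason it was chosen.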

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f1ec37 and 2ec0fd0.

📒 Files selected for processing (1)
  • comfy/sd.py

Comment thread comfy/sd.py
@lostdisc
Contributor

lostdisc commented Mar 7, 2026

I remember the 2.73 value came from last October (v0.3.65), about 3 weeks after ROCm 6.4.4 w/PyTorch for Windows came out. It was in the same release as cuDNN/MIOpen being disabled for AMD, but the two weren't related, according to comfyanonymous.

I tried out a 1.3 value with SDXL, but surprisingly didn't see a difference in peak VRAM usage during VAE decode. It didn't break anything either, though.
