[bugfix] include image mode and size in tmp image cache key #9605
he-yufeng wants to merge 1 commit into
Code Review
This pull request prevents cache collisions for images that share the same flattened pixel bytes but differ in mode or dimensions by prepending metadata (mode, width, and height) to the image bytes before hashing. A unit test has been added to verify this fix. The reviewer suggested using incremental hashing with hasher.update() to avoid unnecessary memory overhead from copying the entire image byte stream during concatenation.
    meta = f'{image.mode}-{image.width}x{image.height}-'.encode()
    img_hash = hashlib.sha256(meta + img_bytes).hexdigest()
Concatenating meta + img_bytes creates a new bytes object in memory, which copies the entire image byte stream. For large images, this can lead to unnecessary memory overhead and performance degradation. Instead, you can update the hash incrementally using hasher.update() to avoid this extra memory allocation.
Suggested change:
    - meta = f'{image.mode}-{image.width}x{image.height}-'.encode()
    - img_hash = hashlib.sha256(meta + img_bytes).hexdigest()
    + meta = f'{image.mode}-{image.width}x{image.height}-'.encode()
    + hasher = hashlib.sha256(meta)
    + hasher.update(img_bytes)
    + img_hash = hasher.hexdigest()
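As a quick sanity check on the suggestion, the sketch below compares the two ways of feeding the hasher. The helper names key_concat and key_incremental are hypothetical, and a zero-filled buffer stands in for image.tobytes(). Hashing the prefix and then calling update() with the pixel bytes yields the same digest as hashing the concatenation, so switching styles should not change any cache path.

```python
import hashlib


def key_concat(meta: bytes, img_bytes: bytes) -> str:
    # Builds a temporary bytes object that holds a full copy of the pixel data.
    return hashlib.sha256(meta + img_bytes).hexdigest()


def key_incremental(meta: bytes, img_bytes: bytes) -> str:
    # Feeds the two pieces to the hasher in order; no concatenated copy is made.
    hasher = hashlib.sha256(meta)
    hasher.update(img_bytes)
    return hasher.hexdigest()


meta = b'RGB-120x80-'
img_bytes = bytes(120 * 80 * 3)  # stand-in for image.tobytes()

# Both styles hash the same byte sequence, so the digests match.
assert key_concat(meta, img_bytes) == key_incremental(meta, img_bytes)
```

The only difference is peak memory: the concatenated form briefly holds a second copy of the pixel stream.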
PR type
PR information
Fixes #9360.
Template._save_pil_image() keyed the temp image cache on sha256(image.tobytes()). Image.tobytes() returns only the flattened pixel stream, without the image mode, width, or height. Two images that share the same pixel bytes but differ in shape (e.g. 120x80 and 80x120) therefore hash to the same cache path. Since the method skips saving when the path already exists, the second image silently reuses the first image's PNG, so multimodal inference/training can read the wrong image.

This PR includes the mode and size in the hash input so images with different dimensions get distinct cache files. Behavior for any single image is unchanged (the file is still written once and reused on repeat).
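The collision can be reproduced without any image library. The sketch below is illustrative only: old_key and new_key are hypothetical helpers that take the mode, size, and pixel bytes directly, whereas the real method reads them from a PIL image. The metadata prefix follows the format shown in the diff.

```python
import hashlib


def old_key(pixel_bytes: bytes) -> str:
    # Previous behaviour as described above: hash the flattened pixels only.
    return hashlib.sha256(pixel_bytes).hexdigest()


def new_key(mode: str, width: int, height: int, pixel_bytes: bytes) -> str:
    # Prefix the pixel stream with mode and size, as in the PR diff.
    meta = f'{mode}-{width}x{height}-'.encode()
    return hashlib.sha256(meta + pixel_bytes).hexdigest()


# An all-black RGB image has the same 28800 pixel bytes at 120x80 and 80x120.
pixels_wide = bytes(120 * 80 * 3)
pixels_tall = bytes(80 * 120 * 3)
assert pixels_wide == pixels_tall
assert old_key(pixels_wide) == old_key(pixels_tall)  # collision
assert new_key('RGB', 120, 80, pixels_wide) != new_key('RGB', 80, 120, pixels_tall)

# Mode matters too: 4x1 RGB and 3x1 RGBA both flatten to 12 bytes.
twelve = bytes(12)
assert old_key(twelve) == old_key(twelve)
assert new_key('RGB', 4, 1, twelve) != new_key('RGBA', 3, 1, twelve)
```

Calling new_key twice with the same arguments still returns the same digest, which is why a single image keeps being written once and reused.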
Experiment results
Added test_save_pil_image_dimension_collision in tests/general/test_template.py: it builds two RGB images with identical pixel bytes but transposed dimensions, saves both, and asserts the cache paths differ and each saved file keeps its own size. The test fails on the previous hash and passes after this change.
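The shape of that test can be sketched with the standard library alone. Everything here is a stand-in and not the repository's code: save_to_cache is a hypothetical write-once cache that stores the size in a header line instead of writing a PNG, and check_dimension_collision mirrors the assertions described above.

```python
import hashlib
import tempfile
from pathlib import Path


def save_to_cache(cache_dir: Path, mode: str, size: tuple[int, int],
                  pixel_bytes: bytes) -> Path:
    """Hypothetical stand-in for the cache write: write once, reuse if present."""
    width, height = size
    meta = f'{mode}-{width}x{height}-'.encode()
    key = hashlib.sha256(meta + pixel_bytes).hexdigest()
    path = cache_dir / f'{key}.bin'
    if not path.exists():
        # Store the size in a header line so it can be read back later.
        path.write_bytes(f'{width}x{height}\n'.encode() + pixel_bytes)
    return path


def stored_size(path: Path) -> str:
    # Read back the header line written by save_to_cache.
    return path.read_bytes().split(b'\n', 1)[0].decode()


def check_dimension_collision() -> bool:
    pixels = bytes(120 * 80 * 3)  # identical pixel bytes for both shapes
    with tempfile.TemporaryDirectory() as tmp:
        cache_dir = Path(tmp)
        first = save_to_cache(cache_dir, 'RGB', (120, 80), pixels)
        second = save_to_cache(cache_dir, 'RGB', (80, 120), pixels)
        return (first != second
                and stored_size(first) == '120x80'
                and stored_size(second) == '80x120')


assert check_dimension_collision()
```

If the key were computed from pixel_bytes alone, both calls would resolve to one path and the second size check would fail, which matches the failure described for the previous hash.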