Synchronize no-GPU cache eviction with CPU streams #3566

Merged
zcbenz merged 1 commit into ml-explore:main from dhiltgen:allocator-cache-fix on May 20, 2026

@dhiltgen
Contributor

Proposed changes

Follow-up to #3554: while testing variations on my AVX2 branch, I found this bug in that change.

The no-GPU CPU allocator cache could release cached buffers while asynchronous CPU work was still queued on a non-default stream. mlx-lm generation can hit this by queuing the next token on a CPU stream and calling clear_cache between yielded tokens, which can leave queued work referencing freed cached memory.
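
The hazard described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not MLX's allocator: `ToyStream`, `ToyBuffer`, and `unsafe_clear_cache` are hypothetical names, a worker thread stands in for a non-default CPU stream, and a `freed` flag stands in for memory actually being released.

```python
import queue
import threading


class ToyBuffer:
    """Hypothetical stand-in for a cached allocation; `freed` models release."""

    def __init__(self, size):
        self.size = size
        self.freed = False


class ToyStream:
    """Hypothetical CPU stream: one worker thread running queued tasks in order."""

    def __init__(self):
        self._tasks = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            task = self._tasks.get()
            try:
                task()
            finally:
                self._tasks.task_done()

    def enqueue(self, task):
        self._tasks.put(task)

    def synchronize(self):
        # Block until every task queued so far has finished.
        self._tasks.join()


def unsafe_clear_cache(cache):
    # Evicts immediately, without waiting for queued stream work.
    for buf in cache:
        buf.freed = True
    cache.clear()


stream = ToyStream()
cache = [ToyBuffer(1024)]
buf = cache[0]            # the queued task keeps a reference to this buffer
gate = threading.Event()  # holds the task pending, like unfinished async work
observed = []


def next_token_task():
    gate.wait()
    observed.append(buf.freed)  # True means the task saw freed memory


stream.enqueue(next_token_task)
unsafe_clear_cache(cache)  # the "clear_cache between yielded tokens" step
gate.set()
stream.synchronize()
print("queued task saw a freed buffer:", observed[0])
```

In this toy the eviction always runs before the queued task, so the task observes the freed buffer every time; in a real allocator the outcome would depend on timing.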

Synchronize CPU streams before evicting cached buffers from clear_cache or set_cache_limit, and protect get_cache_memory with the allocator mutex. Add a regression test that verifies clear_cache waits for queued CPU stream work before returning.
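
The ordering described above, and the shape of the regression check, can be sketched as follows. This is a minimal illustration under assumptions rather than the code in this PR: `ToyStream` and `ToyAllocator` are hypothetical, the cache is just a list of byte counts, and the eviction policy in `set_cache_limit` is invented for the example.

```python
import queue
import threading
import time


class ToyStream:
    """Hypothetical CPU stream: one worker thread running queued tasks in order."""

    def __init__(self):
        self._tasks = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            task = self._tasks.get()
            try:
                task()
            finally:
                self._tasks.task_done()

    def enqueue(self, task):
        self._tasks.put(task)

    def synchronize(self):
        # Block until every task queued so far has finished.
        self._tasks.join()


class ToyAllocator:
    """Hypothetical cache that synchronizes streams before evicting."""

    def __init__(self, streams):
        self._streams = streams
        self._mutex = threading.Lock()
        self._cache = []  # sizes in bytes of cached buffers

    def recycle(self, nbytes):
        with self._mutex:
            self._cache.append(nbytes)

    def get_cache_memory(self):
        # Read under the mutex so a concurrent eviction cannot race the sum.
        with self._mutex:
            return sum(self._cache)

    def _synchronize_streams(self):
        for s in self._streams:
            s.synchronize()

    def clear_cache(self):
        # Wait for queued work first, then evict under the mutex.
        self._synchronize_streams()
        with self._mutex:
            self._cache.clear()

    def set_cache_limit(self, limit):
        self._synchronize_streams()
        with self._mutex:
            # Assumed policy for this sketch: drop oldest entries first.
            while self._cache and sum(self._cache) > limit:
                self._cache.pop(0)


# Regression-style check: clear_cache must not return before queued work ends.
stream = ToyStream()
alloc = ToyAllocator([stream])
alloc.recycle(1024)
done = threading.Event()


def slow_task():
    time.sleep(0.05)
    done.set()


stream.enqueue(slow_task)
alloc.clear_cache()
finished_before_return = done.is_set()
print("queued work finished before clear_cache returned:", finished_before_return)
```

The sketch synchronizes before taking the mutex so that a queued task that itself touches the allocator cannot deadlock against the evicting thread; whether the actual change orders these steps the same way is not stated in the description.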

Checklist

Put an x in the boxes that apply.

  • I have read the CONTRIBUTING document
  • I have run pre-commit run --all-files to format my code / installed pre-commit prior to committing changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the necessary documentation (if needed)

Member

@angeloskath angeloskath left a comment

Looks great and makes total sense, thanks!

Collaborator

@zcbenz zcbenz left a comment

Nice fix!

@zcbenz zcbenz merged commit e0163f3 into ml-explore:main May 20, 2026
16 checks passed

3 participants