Synchronize file position when opening a BufferedReader#2
Open
bemoody wants to merge 2 commits into
Open
Conversation
added 2 commits
May 6, 2026 14:33
When reading from a buffered file object (created by open(file, "rb") or similar), we want to use direct file descriptor I/O for efficiency. However, we want to start reading from the current position in the buffered stream (if the caller previously called read() or seek() on this file object, the buffered stream position isn't necessarily equal to the underlying file descriptor's position.) Calling seek() on the buffered file object isn't enough to do this: the effect on the raw file descriptor is not documented, and in current versions of CPython it doesn't perform a real seek if the destination is within the buffer. So instead, call seek() on the raw file object. Afterwards, the buffer will be out of sync with the underlying file (which was always true, it just wasn't obvious.)
Regression test for issue #1: previously, the decoder would not correctly set the file position if the input was a buffered file object that already had some data in its buffer.
|
Makes sense I think, just see this comment: #1 (comment) Which is effectively "can we also put the BufferedReader (/ Writer) back in a sane state on close()?" Maybe the "recover original state" step is unnecessary if the raw position is never depended on; that's not quite clear to me. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
If we pass an existing file object to
plibflac.Decoder, we want to start decoding at the current position in the input stream. If the input is a buffered file object, the input position might not be the same as the position of the underlying file descriptor.(We do want to use direct file descriptor I/O if we can, bypassing the Python buffering layer, because it's significantly faster and more parallelizable.)
Previously the code was supposed to handle this but didn't actually work as intended - calling
seekon a BufferedReader isn't guaranteed to set the position of the underlying FD.Fixes #1