FLAC decoder loses last frame data when final frame is incomplete/corrupted
Summary
When decoding FLAC files with incomplete or slightly corrupted final frames, PyAV's decoder drops the last frame's data, even when error detection is relaxed via err_detect='ignore_err'. This behavior differs from FFmpeg CLI and other audio libraries (e.g., torchaudio), which successfully decode the complete audio including the last frame.
Environment
- PyAV version: 16.0.1
- Python version: 3.10
- OS: Windows
Expected Behavior
PyAV should decode all available audio data from the FLAC file, including partial data from the last frame, similar to how FFmpeg CLI handles it with -err_detect ignore_err.
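For cross-checking the FFmpeg CLI sample count, a minimal sketch along these lines may help. It assumes `ffmpeg` is on `PATH`; the helper names (`build_ffmpeg_cmd`, `samples_per_channel`, `count_samples_with_ffmpeg`) and the choice of raw 16-bit PCM output are illustrative, not part of the original report, and the channel count has to be supplied by the caller.

```python
import subprocess


def build_ffmpeg_cmd(path):
    """Build an ffmpeg command that decodes `path` to raw 16-bit PCM on stdout.

    -err_detect is placed before -i so it applies to the input.
    """
    return [
        "ffmpeg", "-nostdin", "-v", "error",
        "-err_detect", "ignore_err",
        "-i", path,
        "-f", "s16le", "-c:a", "pcm_s16le",
        "-",
    ]


def samples_per_channel(n_bytes, channels, bytes_per_sample=2):
    """Convert a raw interleaved PCM byte count into samples per channel."""
    return n_bytes // (channels * bytes_per_sample)


def count_samples_with_ffmpeg(path, channels):
    """Decode with the ffmpeg CLI and count samples per channel.

    `channels` is an assumption supplied by the caller (e.g. 2 for stereo).
    """
    result = subprocess.run(
        build_ffmpeg_cmd(path), stdout=subprocess.PIPE, check=False
    )
    return samples_per_channel(len(result.stdout), channels)
```

Calling `count_samples_with_ffmpeg('test.flac', channels=2)` on a stereo file should give a number directly comparable to the per-channel totals from torchaudio and PyAV.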
Actual Behavior
PyAV raises av.error.InvalidDataError on the last frame and discards all data from that frame, resulting in missing audio samples (typically 80-100ms of audio).
Comparison with Other Tools
| Tool | Behavior | Samples Decoded |
|---|---|---|
| torchaudio | ✅ Decodes completely | 13,561,718 |
| FFmpeg CLI | ✅ Decodes completely | 13,561,718 |
| PyAV | ❌ Loses last frame | 13,557,760 |
| Missing | - | 3,958 samples (~82ms) |
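The missing-sample figure follows from simple arithmetic on the counts above. The sketch below reproduces it; the 48 kHz sample rate is an assumption inferred from the reported 3,958 samples and ~82 ms, since the report does not state the rate explicitly.

```python
REFERENCE_SAMPLES = 13_561_718   # torchaudio / FFmpeg CLI
PYAV_SAMPLES = 13_557_760        # PyAV
ASSUMED_SAMPLE_RATE_HZ = 48_000  # assumption: inferred, not stated in the report


def missing_audio(reference, decoded, sample_rate):
    """Return (missing samples, missing duration in milliseconds)."""
    missing = reference - decoded
    return missing, missing / sample_rate * 1000.0


missing, missing_ms = missing_audio(
    REFERENCE_SAMPLES, PYAV_SAMPLES, ASSUMED_SAMPLE_RATE_HZ
)
print(f"Missing: {missing:,} samples ({missing_ms:.2f} ms)")
```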
Minimal Reproducible Example
import av
import torchaudio

# Test file (replace with a FLAC file whose last frame is incomplete)
input_file = 'test.flac'

# Reference: load with torchaudio (complete decoding)
waveform_ref, sr_ref = torchaudio.load(input_file)
reference_samples = waveform_ref.shape[1]
print(f"torchaudio samples: {reference_samples:,}")

# PyAV: attempt to decode with error detection relaxed
options = {
    'err_detect': 'ignore_err',
    'fflags': 'genpts'
}
container = av.open(input_file, options=options)
stream = container.streams.audio[0]

# Also try setting the option on the codec context
stream.codec_context.options = {'err_detect': 'ignore_err'}

total_samples = 0
frame_count = 0
errors = 0
for packet in container.demux(stream):
    try:
        for frame in packet.decode():
            frame_count += 1
            # Samples per channel; frame.duration (in time_base units) is an
            # alternative in newer versions
            total_samples += frame.samples
    except av.error.InvalidDataError as e:
        errors += 1
        print(f"InvalidDataError on frame {frame_count}: {e}")
        # Even with error handling, the last frame's data is lost
        continue
container.close()

print(f"PyAV samples: {total_samples:,}")
print(f"Missing samples: {reference_samples - total_samples:,}")
print(f"Missing duration: {(reference_samples - total_samples) / sr_ref * 1000:.2f} ms")
print(f"Errors encountered: {errors}")
Output:
torchaudio samples: 13,561,718
InvalidDataError on frame 3310: [Errno 1094995529] Invalid data found when processing input: 'avcodec_receive_frame()'
PyAV samples: 13,557,760
Missing samples: 3,958
Missing duration: 82.46 ms
Errors encountered: 1