-
Notifications
You must be signed in to change notification settings - Fork 946
fix: enhance tar corruption error debugging #10120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Add entry counter and enhanced error messages to help diagnose 'Invalid tar header' errors during export/import streaming. Changes: - Track number of tar entries received before corruption occurs - Add enhanced error message for tar header errors with likely causes - Add finalization guard to prevent double-finalization in tar creation - Detect ERR_STREAM_PREMATURE_CLOSE specifically in fetch route
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds comprehensive debugging instrumentation to diagnose "Invalid tar header" errors that occur during component export/import streaming operations. The changes track entry counts, enhance error messages with actionable context, and add guards to prevent double-finalization bugs.
- Track tar entry counts to identify when corruption occurs (at start, middle, or end of stream)
- Enhanced error messages that correlate server-side stream errors with client-side tar corruption
- Finalization guards to prevent duplicate
pack.finalize()calls
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| scopes/scope/scope/routes/fetch.route.ts | Added specific detection and logging for ERR_STREAM_PREMATURE_CLOSE to correlate with client-side tar errors |
| scopes/scope/objects/objects/object-list.ts | Added entry counter tracking, enhanced error messages with entry counts and likely causes, and finalization guards to prevent double-finalization |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| pack.entry({ name: TAR_STREAM_ERROR_FILENAME }, errorMessage); | ||
| pack.finalize(); | ||
| isFinalized = true; |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The finalization flag is set to true immediately after calling pack.finalize(), but pack.entry() is an asynchronous operation. This creates a potential race condition where if an error event fires while the error entry is still being written, the guard won't prevent the double finalization. Consider setting isFinalized = true before calling pack.entry() to ensure the flag is set before any async operations, or use a callback/promise to set it after the entry is confirmed written.
| pack.entry({ name: TAR_STREAM_ERROR_FILENAME }, errorMessage); | |
| pack.finalize(); | |
| isFinalized = true; | |
| isFinalized = true; | |
| pack.entry({ name: TAR_STREAM_ERROR_FILENAME }, errorMessage); | |
| pack.finalize(); |
| const endFile: EndFile = { numOfFiles, scopeName }; | ||
| logger.debug('fromObjectStreamToTar, finished sending data', endFile); | ||
| pack.entry({ name: TAR_STREAM_END_FILENAME }, JSON.stringify(endFile)); | ||
| pack.finalize(); | ||
| isFinalized = true; |
Copilot
AI
Dec 8, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the error handler, the finalization flag is set after calling pack.finalize(), but pack.entry() on line 236 is asynchronous. If another event fires while the end entry is being written, the guard might not prevent issues. Consider setting isFinalized = true before line 236 to ensure the flag is set before any async operations begin.
| const endFile: EndFile = { numOfFiles, scopeName }; | |
| logger.debug('fromObjectStreamToTar, finished sending data', endFile); | |
| pack.entry({ name: TAR_STREAM_END_FILENAME }, JSON.stringify(endFile)); | |
| pack.finalize(); | |
| isFinalized = true; | |
| isFinalized = true; | |
| const endFile: EndFile = { numOfFiles, scopeName }; | |
| logger.debug('fromObjectStreamToTar, finished sending data', endFile); | |
| pack.entry({ name: TAR_STREAM_END_FILENAME }, JSON.stringify(endFile)); | |
| pack.finalize(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| this.logger.warn('FetchRoute, the client aborted the request', err); | ||
| } else if (err.code === 'ERR_STREAM_PREMATURE_CLOSE') { | ||
| this.logger.error( | ||
| 'FetchRoute, stream closed prematurely. This may cause "Invalid tar header" on client side.', |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing article "the" before "client side". The phrase should be "on the client side".
| 'FetchRoute, stream closed prematurely. This may cause "Invalid tar header" on client side.', | |
| 'FetchRoute, stream closed prematurely. This may cause "Invalid tar header" on the client side.', |
| logger.error(`fromTarToObjectStream tar extraction error after receiving ${entriesReceived} entries`, err); | ||
|
|
||
| // Enhanced error message for tar corruption with entry count | ||
| if (err.message?.includes('invalid tar header') || err.message?.includes('Invalid tar header')) { |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The conditional check uses both lowercase 'invalid tar header' and title case 'Invalid tar header'. This creates redundancy since the same condition is checked twice. Consider using a case-insensitive comparison instead to handle any variation in error message casing.
| if (err.message?.includes('invalid tar header') || err.message?.includes('Invalid tar header')) { | |
| if (err.message && err.message.toLowerCase().includes('invalid tar header')) { |
| // Enhanced error message for tar corruption with entry count | ||
| if (err.message?.includes('invalid tar header') || err.message?.includes('Invalid tar header')) { | ||
| const enhancedError = new Error( | ||
| `Invalid tar header after receiving ${entriesReceived} entries. Possible causes: 1) Server aborted request mid-stream, 2) Network corruption during transfer. Original error: ${err.message}` |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The enhanced error message lists causes as "1) Server aborted request mid-stream, 2) Network corruption during transfer", but there could be other causes such as client-side buffer issues or tar library bugs. Consider either adding "3) Other stream or parsing issues" or rephrasing to "Likely causes include:" instead of implying these are the only two possibilities.
| `Invalid tar header after receiving ${entriesReceived} entries. Possible causes: 1) Server aborted request mid-stream, 2) Network corruption during transfer. Original error: ${err.message}` | |
| `Invalid tar header after receiving ${entriesReceived} entries. Likely causes include: 1) Server aborted request mid-stream, 2) Network corruption during transfer, 3) Other stream or parsing issues. Original error: ${err.message}` |
| const extract = tarStream.extract(); | ||
| let startData: StartFile | undefined; | ||
| let endData: EndFile | undefined; | ||
| let entriesReceived = 0; |
Copilot
AI
Dec 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable name 'entriesReceived' might be misleading since it counts all tar entries including metadata files (TAR_STREAM_START_FILENAME, TAR_STREAM_END_FILENAME, TAR_STREAM_ERROR_FILENAME), not just object entries. Consider renaming to 'tarEntriesReceived' or 'allEntriesReceived' to clarify that this includes non-object metadata entries, or adjust the counting logic to exclude these special entries if the intent is to count only object entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| numOfFiles += 1; | ||
| pack.entry({ name: ObjectList.combineScopeAndHash(obj) }, obj.buffer); |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The numOfFiles counter is incremented before calling pack.entry(). If pack.entry() throws an error (e.g., due to invalid data), the counter will be incorrect. Consider incrementing the counter after successfully writing the entry, or wrapping the pack.entry() call in a try-catch to handle potential synchronous errors properly.
| numOfFiles += 1; | |
| pack.entry({ name: ObjectList.combineScopeAndHash(obj) }, obj.buffer); | |
| pack.entry({ name: ObjectList.combineScopeAndHash(obj) }, obj.buffer); | |
| numOfFiles += 1; |
| extract.on('error', (err: any) => { | ||
| logger.error(`fromTarToObjectStream tar extraction error after receiving ${entriesReceived} entries`, err); | ||
|
|
||
| // Enhanced error message for tar corruption with entry count | ||
| if (err.message?.includes('invalid tar header') || err.message?.includes('Invalid tar header')) { | ||
| const enhancedError = new Error( | ||
| `Invalid tar header after receiving ${entriesReceived} entries. Possible causes: 1) Server aborted request mid-stream, 2) Network corruption during transfer. Original error: ${err.message}` | ||
| ); | ||
| enhancedError.stack = err.stack; | ||
| passThrough.emit('error', enhancedError); | ||
| } else { | ||
| passThrough.emit('error', err); | ||
| } | ||
| }); |
Copilot
AI
Dec 17, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both the extract 'error' handler (lines 174-186) and the pipeline callback (lines 205-208) emit errors to passThrough. When an extract error occurs, it may propagate to both handlers, potentially causing duplicate error emissions and duplicate log messages. The comment "not sure if needed" on line 173 suggests uncertainty about this design. Consider verifying if both handlers are necessary, or add a guard to prevent duplicate emissions, especially since the enhanced error messages now make duplicates more noticeable.
Summary
Add entry counter and enhanced error messages to help diagnose 'Invalid tar header' errors during export/import streaming.
Changes
fromTarToObjectStreamfromObjectStreamToTarERR_STREAM_PREMATURE_CLOSEspecifically in fetch route and log correlation to client-side tar errorsWhy
When tar corruption errors occur, we now get actionable debugging info: