Skip to content

Conversation

@davidfirst
Copy link
Member

Summary

Add entry counter and enhanced error messages to help diagnose 'Invalid tar header' errors during export/import streaming.

Changes

  • Track number of tar entries received before corruption occurs in fromTarToObjectStream
  • Add enhanced error message for tar header errors showing entry count and likely causes (server abort, network corruption)
  • Add finalization guard to prevent double-finalization bugs in fromObjectStreamToTar
  • Detect ERR_STREAM_PREMATURE_CLOSE specifically in fetch route and log correlation to client-side tar errors

Why

When tar corruption errors occur, we now get actionable debugging info:

  • Entry count shows if corruption happens at start (0 entries), middle, or end
  • Enhanced error messages point to most likely root causes
  • Server logs can be correlated with client errors via entry counts

Add entry counter and enhanced error messages to help diagnose 'Invalid tar header' errors during export/import streaming.

Changes:
- Track number of tar entries received before corruption occurs
- Add enhanced error message for tar header errors with likely causes
- Add finalization guard to prevent double-finalization in tar creation
- Detect ERR_STREAM_PREMATURE_CLOSE specifically in fetch route
Copilot AI review requested due to automatic review settings December 8, 2025 20:36
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds comprehensive debugging instrumentation to diagnose "Invalid tar header" errors that occur during component export/import streaming operations. The changes track entry counts, enhance error messages with actionable context, and add guards to prevent double-finalization bugs.

  • Track tar entry counts to identify when corruption occurs (at start, middle, or end of stream)
  • Enhanced error messages that correlate server-side stream errors with client-side tar corruption
  • Finalization guards to prevent duplicate pack.finalize() calls

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
scopes/scope/scope/routes/fetch.route.ts Added specific detection and logging for ERR_STREAM_PREMATURE_CLOSE to correlate with client-side tar errors
scopes/scope/objects/objects/object-list.ts Added entry counter tracking, enhanced error messages with entry counts and likely causes, and finalization guards to prevent double-finalization

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 217 to 252
pack.entry({ name: TAR_STREAM_ERROR_FILENAME }, errorMessage);
pack.finalize();
isFinalized = true;
Copy link

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The finalization flag is set to true immediately after calling pack.finalize(), but pack.entry() is an asynchronous operation. This creates a potential race condition where if an error event fires while the error entry is still being written, the guard won't prevent the double finalization. Consider setting isFinalized = true before calling pack.entry() to ensure the flag is set before any async operations, or use a callback/promise to set it after the entry is confirmed written.

Suggested change
pack.entry({ name: TAR_STREAM_ERROR_FILENAME }, errorMessage);
pack.finalize();
isFinalized = true;
isFinalized = true;
pack.entry({ name: TAR_STREAM_ERROR_FILENAME }, errorMessage);
pack.finalize();

Copilot uses AI. Check for mistakes.
Comment on lines 209 to 238
const endFile: EndFile = { numOfFiles, scopeName };
logger.debug('fromObjectStreamToTar, finished sending data', endFile);
pack.entry({ name: TAR_STREAM_END_FILENAME }, JSON.stringify(endFile));
pack.finalize();
isFinalized = true;
Copy link

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the error handler, the finalization flag is set after calling pack.finalize(), but pack.entry() on line 236 is asynchronous. If another event fires while the end entry is being written, the guard might not prevent issues. Consider setting isFinalized = true before line 236 to ensure the flag is set before any async operations begin.

Suggested change
const endFile: EndFile = { numOfFiles, scopeName };
logger.debug('fromObjectStreamToTar, finished sending data', endFile);
pack.entry({ name: TAR_STREAM_END_FILENAME }, JSON.stringify(endFile));
pack.finalize();
isFinalized = true;
isFinalized = true;
const endFile: EndFile = { numOfFiles, scopeName };
logger.debug('fromObjectStreamToTar, finished sending data', endFile);
pack.entry({ name: TAR_STREAM_END_FILENAME }, JSON.stringify(endFile));
pack.finalize();

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings December 11, 2025 14:09
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

this.logger.warn('FetchRoute, the client aborted the request', err);
} else if (err.code === 'ERR_STREAM_PREMATURE_CLOSE') {
this.logger.error(
'FetchRoute, stream closed prematurely. This may cause "Invalid tar header" on client side.',
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing article "the" before "client side". The phrase should be "on the client side".

Suggested change
'FetchRoute, stream closed prematurely. This may cause "Invalid tar header" on client side.',
'FetchRoute, stream closed prematurely. This may cause "Invalid tar header" on the client side.',

Copilot uses AI. Check for mistakes.
logger.error(`fromTarToObjectStream tar extraction error after receiving ${entriesReceived} entries`, err);

// Enhanced error message for tar corruption with entry count
if (err.message?.includes('invalid tar header') || err.message?.includes('Invalid tar header')) {
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conditional check uses both lowercase 'invalid tar header' and title case 'Invalid tar header'. This creates redundancy since the same condition is checked twice. Consider using a case-insensitive comparison instead to handle any variation in error message casing.

Suggested change
if (err.message?.includes('invalid tar header') || err.message?.includes('Invalid tar header')) {
if (err.message && err.message.toLowerCase().includes('invalid tar header')) {

Copilot uses AI. Check for mistakes.
// Enhanced error message for tar corruption with entry count
if (err.message?.includes('invalid tar header') || err.message?.includes('Invalid tar header')) {
const enhancedError = new Error(
`Invalid tar header after receiving ${entriesReceived} entries. Possible causes: 1) Server aborted request mid-stream, 2) Network corruption during transfer. Original error: ${err.message}`
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enhanced error message lists causes as "1) Server aborted request mid-stream, 2) Network corruption during transfer", but there could be other causes such as client-side buffer issues or tar library bugs. Consider either adding "3) Other stream or parsing issues" or rephrasing to "Likely causes include:" instead of implying these are the only two possibilities.

Suggested change
`Invalid tar header after receiving ${entriesReceived} entries. Possible causes: 1) Server aborted request mid-stream, 2) Network corruption during transfer. Original error: ${err.message}`
`Invalid tar header after receiving ${entriesReceived} entries. Likely causes include: 1) Server aborted request mid-stream, 2) Network corruption during transfer, 3) Other stream or parsing issues. Original error: ${err.message}`

Copilot uses AI. Check for mistakes.
const extract = tarStream.extract();
let startData: StartFile | undefined;
let endData: EndFile | undefined;
let entriesReceived = 0;
Copy link

Copilot AI Dec 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable name 'entriesReceived' might be misleading since it counts all tar entries including metadata files (TAR_STREAM_START_FILENAME, TAR_STREAM_END_FILENAME, TAR_STREAM_ERROR_FILENAME), not just object entries. Consider renaming to 'tarEntriesReceived' or 'allEntriesReceived' to clarify that this includes non-object metadata entries, or adjust the counting logic to exclude these special entries if the intent is to count only object entries.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings December 17, 2025 19:21
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 226 to 227
numOfFiles += 1;
pack.entry({ name: ObjectList.combineScopeAndHash(obj) }, obj.buffer);
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The numOfFiles counter is incremented before calling pack.entry(). If pack.entry() throws an error (e.g., due to invalid data), the counter will be incorrect. Consider incrementing the counter after successfully writing the entry, or wrapping the pack.entry() call in a try-catch to handle potential synchronous errors properly.

Suggested change
numOfFiles += 1;
pack.entry({ name: ObjectList.combineScopeAndHash(obj) }, obj.buffer);
pack.entry({ name: ObjectList.combineScopeAndHash(obj) }, obj.buffer);
numOfFiles += 1;

Copilot uses AI. Check for mistakes.
Comment on lines +174 to 187
extract.on('error', (err: any) => {
logger.error(`fromTarToObjectStream tar extraction error after receiving ${entriesReceived} entries`, err);

// Enhanced error message for tar corruption with entry count
if (err.message?.includes('invalid tar header') || err.message?.includes('Invalid tar header')) {
const enhancedError = new Error(
`Invalid tar header after receiving ${entriesReceived} entries. Possible causes: 1) Server aborted request mid-stream, 2) Network corruption during transfer. Original error: ${err.message}`
);
enhancedError.stack = err.stack;
passThrough.emit('error', enhancedError);
} else {
passThrough.emit('error', err);
}
});
Copy link

Copilot AI Dec 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both the extract 'error' handler (lines 174-186) and the pipeline callback (lines 205-208) emit errors to passThrough. When an extract error occurs, it may propagate to both handlers, potentially causing duplicate error emissions and duplicate log messages. The comment "not sure if needed" on line 173 suggests uncertainty about this design. Consider verifying if both handlers are necessary, or add a guard to prevent duplicate emissions, especially since the enhanced error messages now make duplicates more noticeable.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants