fix(sync-service): Fix subquery shape dependency validation on restore from backup#3628
Closed
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #3628 +/- ##
=======================================
Coverage 87.79% 87.79%
=======================================
Files 18 18
Lines 1663 1663
Branches 415 415
=======================================
Hits 1460 1460
Misses 201 201
Partials 2 2
Found 1 test failure on Blacksmith runners: Failure
This PR is still in draft because I have some reservations about it:
However, this PR represents my current best guess as to how a Materializer went missing, causing AutoArc's outage.
Summary
[...] restore_dependency_handles for clarity
Problem
Shapes with subqueries store handles to their dependency shapes in shape_dependencies_handles. When restoring shapes from storage via load_shapes, this list is rebuilt and validated by restore_dependency_handles, which removes shapes whose dependencies no longer exist.
However, when restoring from backup via load_backup, this validation was not performed. The backup contains serialized Shape structs with their shape_dependencies_handles already populated, but those handles may reference shapes that no longer exist (due to cleanup, schema changes, or prior removal).
Evidence from AutoArc's production logs
AWS CloudWatch logs from Dec 18 show a crash loop with this error:
GenServer.call({:via, Registry, {..., {Electric.Shapes.Consumer.Materializer, "32220858-1765808264363524"}}}, :get_link_values, 5000)
** (EXIT) no process: the process is not alive
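For context, this is the exit GenServer.call produces when a :via name resolves to no live process. A minimal sketch that reproduces the same kind of exit, assuming a hypothetical registry (DemoRegistry) and key that stand in for the real ones, which are not shown in the log:

```elixir
# DemoRegistry and the {:materializer, ...} key are hypothetical stand-ins,
# not the names used by the sync service.
{:ok, _pid} = Registry.start_link(keys: :unique, name: DemoRegistry)

name = {:via, Registry, {DemoRegistry, {:materializer, "missing-handle"}}}

# Nothing is registered under this key, so the call exits with :noproc
# instead of returning a value.
result =
  try do
    GenServer.call(name, :get_link_values, 5000)
  catch
    :exit, {:noproc, _details} -> :no_process
  end

IO.inspect(result)
```

Any caller that does not catch this exit crashes, which is consistent with the crash loop described here if the caller is restarted and retries the same handle.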
Key observations:
- 32220858-1765808264363524 was created Dec 15 (timestamp embedded in handle)
- [...] :shutdown reason)
- [...] 32220858-1765808264363524 was not among them
- [...] shape_dependencies_handles
This is a classic stale reference: the dependency shape was removed/lost, but parent shapes restored from backup still held handles to it.
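The creation date can be checked by decoding the handle. The sketch below assumes the part after the hyphen is a Unix timestamp in microseconds; that unit is an inference from the digit count, not something stated in this PR, and HandleTimestamp is a hypothetical helper name.

```elixir
defmodule HandleTimestamp do
  # Assumption: handles look like "<hash>-<unix time in microseconds>".
  def created_at(handle) do
    [_hash, micros] = String.split(handle, "-", parts: 2)
    DateTime.from_unix!(String.to_integer(micros), :microsecond)
  end
end

created = HandleTimestamp.created_at("32220858-1765808264363524")

# Under the microsecond assumption this falls on 2025-12-15 (UTC).
IO.inspect(DateTime.to_date(created))
```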
Fix
- After load_backup removes shapes without valid storage, call remove_shapes_with_invalid_dependencies to cascade removals to any shapes whose shape_dependencies_handles reference the removed handles
- Update restore_dependency_handles to also cascade removals (handles the load_shapes path)
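The cascade can be sketched roughly as follows. This is not the PR's implementation: the module name, the function signature, and the simplified data shape (a plain map from handle to its list of dependency handles) are assumptions made for illustration.

```elixir
defmodule CascadeSketch do
  # shapes:  %{handle => [dependency_handle]} (simplified stand-in for Shape structs)
  # removed: handles already known to be gone
  # Returns {surviving_shapes, all_removed_handles}.
  def remove_with_invalid_dependencies(shapes, removed) do
    removed = MapSet.new(removed)

    # A shape is invalid if any of its dependencies has been removed.
    {invalid, valid} =
      Enum.split_with(shapes, fn {_handle, deps} ->
        Enum.any?(deps, &MapSet.member?(removed, &1))
      end)

    case invalid do
      [] ->
        {Map.new(valid), removed}

      _ ->
        # Removing these shapes may invalidate their own dependents,
        # so repeat until no more shapes are affected.
        newly_removed = MapSet.new(invalid, fn {handle, _deps} -> handle end)
        remove_with_invalid_dependencies(Map.new(valid), MapSet.union(removed, newly_removed))
    end
  end
end

shapes = %{"a" => ["gone"], "b" => ["a"], "c" => []}
{kept, removed} = CascadeSketch.remove_with_invalid_dependencies(shapes, ["gone"])
IO.inspect({kept, removed})
```

Here "a" depends on the missing shape and "b" depends on "a", so both are dropped and only "c" survives. Iterating to a fixed point matters because a single pass would leave "b" holding a stale handle to "a".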