-
Notifications
You must be signed in to change notification settings - Fork 96
Cascade may fail when Part table references another Part with renamed foreign keys #1429
Description
Summary
Diagram.cascade() with part_integrity="cascade" may not correctly propagate restrictions when a Part table references another Part table instead of its Master directly, especially with renamed (projected) foreign keys.
Context
Discussion: #1232 (Spyglass Merge table pattern)
Related: #1423 (Diagram.trace() — same issue applies in the upstream direction)
Problem
The Part→Master upward propagation in _propagate_restrictions() (diagram.py lines 514-537) does two things:
- Identifies the Master via
extract_master(target)— uses the__naming convention to derive the master table name from the part table name - Restricts the Master via
master_ft.proj() & child_ft.proj()— joins on shared primary key attributes
This assumes the Part table shares primary key attributes directly with its Master.
Case 1: Part references another Part with renamed FK (no direct Master reference)
@schema
class Master(dj.Manual):
definition = """
master_id : smallint
"""
class PartA(dj.Part):
definition = """
-> master
part_a_id : smallint
"""
class PartB(dj.Part):
definition = """
-> Master.PartA.proj(src_master='master_id', src_part='part_a_id')
part_b_id : smallint
"""PartB has no direct FK to Master — it references PartA with renamed keys (src_master ← master_id, src_part ← part_a_id). When cascade reaches PartB:
extract_mastercorrectly identifies Master by naming convention- But
master_ft.proj() & child_ft.proj()joins on shared attribute names —master_idis not in PartB (it was renamed tosrc_master), so the join finds no common attributes - The intermediate PartA restriction is also skipped
Case 2: Part references another Part only (no Master reference at all)
@schema
class Master(dj.Manual):
definition = """
master_id : smallint
"""
class PartA(dj.Part):
definition = """
-> master
part_a_id : smallint
"""
class PartB(dj.Part):
definition = """
-> Master.PartA
part_b_id : smallint
"""PartB references PartA directly (inheriting master_id and part_a_id). The restriction path should be PartB → PartA → Master. The current code jumps from PartB to Master via proj() & proj(). If master_id is inherited through PartA, the join may work by coincidence — but PartA's restriction is skipped entirely.
Case 3: Merge table with renamed reference from a table with different PK
@schema
class Merge(dj.Manual):
definition = """
merge_id : uuid
"""
class Source1(dj.Part):
definition = """
-> master
---
-> UpstreamPipeline1
"""
class Source2(dj.Part):
definition = """
-> master
---
-> UpstreamPipeline2
"""
@schema
class Downstream(dj.Computed):
definition = """
downstream_id : smallint
---
-> Merge
-> Merge.proj(comparison_src='merge_id')
"""Downstream has its own primary key and two FKs into Merge — one direct, one renamed via .proj(). Cascade from Downstream needs to reach Merge's Parts through both FK paths. The Part→Master propagation doesn't account for the renamed FK (comparison_src → merge_id).
Expected behavior
Cascade should correctly propagate restrictions through Part→Part→Master chains, including cases with renamed foreign keys (via .proj()), by following the actual FK path rather than relying on shared attribute names.
Proposed fix
When propagating from a Part to its Master:
- If the Part doesn't share PK attributes directly with Master, follow the FK chain through intermediate Part tables
- Apply the standard restriction propagation rules (including
attr_mapalias handling from_apply_propagation_rule) at each step, rather than a directproj() & proj()join
Verification
Needs test cases covering:
- PartB references PartA with renamed FK (no direct Master reference) — verify Master is correctly restricted
- PartB references PartA directly (no Master reference) — verify PartA and Master are both restricted
- Renamed FK back to same Master via
.proj()from a table with different PK — verify both paths are traced