docs(deploy): spec + explanation for set_replica_identity (#1447)#180

Merged: dimitri-yatsenko merged 2 commits into main from feat/replica-identity-docs on Jun 10, 2026

@dimitri-yatsenko

Summary

Adds two new pages documenting the `dj.deploy.set_replica_identity` helper that lands in DataJoint 2.3 (datajoint-python #1466). Closes the docs side of datajoint-python #1447.

| File | Role |
| --- | --- |
| `src/reference/specs/deploy-operations.md` (new) | Normative spec for the new `datajoint.deploy` module + the `set_replica_identity` function. Includes a Design rationale section explaining the three structural choices (migration-only; not in `dj.migrate`; new module for an emerging category). |
| `src/explanation/postgresql-cdc-replication.md` (new) | Conceptual explainer: what `REPLICA IDENTITY` is, why CDC consumers care (Lakehouse Sync mandates `FULL` and silently skips tables that lack it), cost and compliance considerations, and the representative workflow. |
| `mkdocs.yaml` | New "Operations" group under Concepts; new "Deployment" group under Reference → Specifications. |

The spec carries the formal API contract; the explainer carries the reasoning and the WAL/compliance tradeoffs that motivate the feature. Both pages cross-link to each other.
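
For readers unfamiliar with the underlying PostgreSQL operation, here is a minimal sketch of the statement that a helper of this kind ultimately has to issue. It is not the `dj.deploy.set_replica_identity` implementation and makes no claim about its signature; `quote_ident` and `replica_identity_statements` are hypothetical names used only for illustration. The only parts taken from PostgreSQL itself are the `ALTER TABLE ... REPLICA IDENTITY` syntax and its `DEFAULT`, `FULL`, and `NOTHING` modes. Skipping names that start with `~` is an assumption that mirrors the hidden-table filter mentioned in the review below.

```python
from __future__ import annotations

# PostgreSQL modes that take no extra argument (USING INDEX is omitted here).
VALID_MODES = {"DEFAULT", "FULL", "NOTHING"}


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def replica_identity_statements(
    schema: str, tables: list[str], mode: str = "FULL"
) -> list[str]:
    """Build one ALTER TABLE statement per table (hypothetical helper).

    Tables whose names start with "~" are skipped, on the assumption that
    they are hidden bookkeeping tables that CDC consumers do not need.
    """
    mode = mode.upper()
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported REPLICA IDENTITY mode: {mode!r}")
    return [
        f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} "
        f"REPLICA IDENTITY {mode}"
        for table in tables
        if not table.startswith("~")
    ]


if __name__ == "__main__":
    for statement in replica_identity_statements("lab", ["session", "~log"]):
        print(statement)
```

The sketch only builds SQL strings, so it can be inspected without a database connection; executing the statements and reporting per-table results is left to the real helper described in the spec.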

Why a new module page rather than folding into existing specs

`datajoint.deploy` is the first of an emerging category of operational helpers (publication membership, vacuum/reindex, role grants are plausible siblings). Giving it a dedicated spec page now — with one inhabitant — establishes the boundary against `datajoint.migrate` and provides a home for future helpers without retroactive reorganization. The rationale section in the spec walks through the alternatives that were rejected.

Sequencing

This PR is independently reviewable but should land alongside or after datajoint-python #1466, the PR that implements the function. Whichever order the two merge in, no broken links result; each stands alone.

Test plan

  • `mkdocs serve` renders both new pages under their new nav groups
  • Cross-links resolve (spec ↔ explainer, both → PostgreSQL docs)
  • No `mkdocs build --strict` warnings introduced
  • Re-read both pages once datajoint-python #1466 lands to confirm the API matches the shipped code

Two new pages for the dj.deploy.set_replica_identity helper landing in
DataJoint 2.3 (datajoint-python #1466):

- src/reference/specs/deploy-operations.md — normative spec for the
  datajoint.deploy module, with set_replica_identity as the first
  inhabitant. Includes a Design rationale section explaining the three
  structural choices: migration-only (no auto-emit at declare time), a
  new module rather than dj.migrate, idempotency-by-default.

- src/explanation/postgresql-cdc-replication.md — explainer covering
  what REPLICA IDENTITY is at the PostgreSQL level, why CDC consumers
  care (Databricks Lakehouse Sync mandates FULL and silently skips
  tables that lack it), cost and compliance considerations, and the
  representative workflow.

Both pages cross-link to each other. The spec carries the formal API
contract; the explainer carries the reasoning and the WAL/compliance
tradeoffs that motivate it.

Nav: new "Operations" group under Concepts; new "Deployment" group
under Reference > Specifications.

Slated for 2.3 alongside the implementation PR.
MilagrosMarin previously approved these changes Jun 10, 2026

@MilagrosMarin MilagrosMarin left a comment

Verified carefully against datajoint-python#1466:

✅ Signature, return shape, error messages match impl exactly
✅ `Schema.list_tables()` already excludes ~/~~ tables (schemas.py:525-545), which is the CDC-relevant filter
✅ Cross-links resolve; nav placement under Concepts → Operations and Reference → Specifications → Deployment is sensible
✅ The "Design rationale" section is unusually well-argued — the migration-only argument (mixed-state failure mode from a config-flag + utility combo) is the right framing
✅ Databricks Lakehouse Sync "silently skipped" framing accurately motivates the feature

Sequencing note: impl PR #1466 is still open (I requested changes on the assert usage). This docs PR is independently reviewable but ideally lands alongside or after the impl.

Approving — the spec and explainer are a clean pair.

The companion spec page (deploy-operations.md) already carries a
version-added admonition. Mirror it on the explainer so a reader
landing here from search or a link sees the version context up front
rather than only from the spec cross-link.
@dimitri-yatsenko dimitri-yatsenko merged commit 605b52b into main Jun 10, 2026
3 checks passed
@dimitri-yatsenko dimitri-yatsenko deleted the feat/replica-identity-docs branch June 10, 2026 23:30