feat: add is_deleted soft-delete to ClickHouse IDENTITIES table #7710

Open
10done wants to merge 10 commits into Flagsmith:main from 10done:feat/clickhouse-identities-is-deleted

@10done 10done commented Jun 5, 2026

Contributor

Thanks for submitting a PR! Please check the boxes below:

  • I have read the Contributing Guide.
  • I have added information to docs/ if required so people know about the feature.
  • I have filled in the "Changes" section below.
  • I have filled in the "How did you test this code" section below.

Changes

Fixes #7593

  • Added is_deleted Bool DEFAULT false column to the ClickHouse IDENTITIES table via a new migration
  • Extended the mapper and INSERT column list to carry is_deleted on every backfill row (defaults to false)
  • Added write_identity_deletion_tombstone_to_clickhouse async task, dispatched from EdgeIdentity.delete() to tombstone deleted identities in real time
  • Added AND i.is_deleted = false to the segment membership count query so deleted identities never appear in counts
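The mapper and tombstone changes listed above can be pictured with a minimal sketch. The column list, `IdentityRow`, `map_identity`, `make_tombstone`, and the document keys are hypothetical and are not taken from the PR's code; only the `is_deleted` column and its `false` default come from the description.

```python
from dataclasses import astuple, dataclass

# Hypothetical column list; only is_deleted is confirmed by the PR description.
INSERT_COLUMNS = ("environment_key", "identifier", "is_deleted")


@dataclass(frozen=True)
class IdentityRow:
    environment_key: str
    identifier: str
    is_deleted: bool = False  # mirrors the column's DEFAULT false


def map_identity(document: dict) -> IdentityRow:
    """Map a backfill document to a row; is_deleted keeps its default."""
    return IdentityRow(document["environment_api_key"], document["identifier"])


def make_tombstone(env_key: str, identifier: str) -> IdentityRow:
    """Build the row written when an identity is deleted."""
    return IdentityRow(env_key, identifier, is_deleted=True)


def to_insert_values(row: IdentityRow) -> tuple:
    """Return the row's values in the same order as INSERT_COLUMNS."""
    return astuple(row)
```

Keeping the default on the row type means existing backfill call sites do not need to pass `is_deleted` explicitly, while the deletion path sets it in one place.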

How did you test this code?

  • Screenshot 2026-06-05 at 5 01 43 AM: `DESCRIBE TABLE IDENTITIES` confirms the `is_deleted Bool DEFAULT false` column exists after the migration.
  • Screenshot 2026-06-05 at 5 03 35 AM: inserted a live row (`alice`) and a tombstone (`bob`); a query with `WHERE is_deleted = false` returns only `alice`.
  • Screenshot 2026-06-05 at 5 07 03 AM: Python assertions confirm that `services.py` has the `is_deleted = false` filter and that `tasks.py` has the correct INSERT column list and SQL.
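The live-row and tombstone check described above can be approximated without a ClickHouse server. The following is only a sketch that swaps in SQLite (Python's built-in `sqlite3`) as a stand-in; the table layout is an assumption, and it does not model ClickHouse engine behaviour such as merges or deduplication.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite stand-in for the ClickHouse table; booleans are stored as 0/1 here.
conn.execute(
    "CREATE TABLE identities ("
    "identifier TEXT NOT NULL, "
    "is_deleted INTEGER NOT NULL DEFAULT 0)"
)

# A live row relies on the column default; a tombstone sets the flag explicitly.
conn.execute("INSERT INTO identities (identifier) VALUES ('alice')")
conn.execute("INSERT INTO identities (identifier, is_deleted) VALUES ('bob', 1)")

live = [
    name
    for (name,) in conn.execute(
        "SELECT identifier FROM identities WHERE is_deleted = 0 ORDER BY identifier"
    )
]
print(live)  # ['alice']
```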

@10done 10done requested a review from a team as a code owner June 5, 2026 00:00
@10done 10done requested review from emyller and removed request for a team June 5, 2026 00:00
vercel Bot commented Jun 5, 2026

@10done is attempting to deploy a commit to the Flagsmith Team on Vercel.

A member of the Team first needs to authorize it.

@github-actions github-actions Bot added the api Issue related to the REST API label Jun 5, 2026
@10done 10done requested a review from a team as a code owner June 5, 2026 01:19
@github-actions github-actions Bot added the docs Documentation updates label Jun 5, 2026
codecov Bot commented Jun 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.39%. Comparing base (cbcac64) to head (32cb19c).
⚠️ Report is 20 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7710      +/-   ##
==========================================
- Coverage   98.52%   98.39%   -0.13%     
==========================================
  Files        1444     1453       +9     
  Lines       54971    55870     +899     
==========================================
+ Hits        54161    54976     +815     
- Misses        810      894      +84     

@matthewelwell

Contributor

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a mechanism to handle deleted identities in ClickHouse by writing a tombstone row (with is_deleted=True) upon identity deletion, ensuring they are excluded from segment membership counts. It includes a ClickHouse migration, updates to the mapping logic, a new Celery task, and corresponding unit tests. The review feedback highlights a potential issue with write amplification and table bloat in multi-tenant environments, suggesting that we should verify if segment membership is enabled for the organization before writing the tombstone row.

Comment on lines +209 to +216
    if not settings.CLICKHOUSE_ENABLED:
        logger.info(
            "tombstone.skipped",
            reason="clickhouse_not_configured",
            env_key=env_key,
            identifier=identifier,
        )
        return

high

In a multi-tenant environment, writing tombstone rows for every deleted identity across all organizations—even those that do not have segment_membership_inspection enabled—will lead to unnecessary ClickHouse write amplification and table bloat. We should check if the organization has segment membership enabled before writing the tombstone.

    if not settings.CLICKHOUSE_ENABLED:
        logger.info(
            "tombstone.skipped",
            reason="clickhouse_not_configured",
            env_key=env_key,
            identifier=identifier,
        )
        return

    from environments.models import Environment

    try:
        environment = Environment.objects.select_related(
            "project__organisation"
        ).get(api_key=env_key)
    except Environment.DoesNotExist:
        logger.info(
            "tombstone.skipped",
            reason="environment_not_found",
            env_key=env_key,
            identifier=identifier,
        )
        return

    if not is_membership_enabled(environment.project.organisation):
        logger.info(
            "tombstone.skipped",
            reason="segment_membership_disabled",
            env_key=env_key,
            identifier=identifier,
        )
        return
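The ordering of the guards in the suggestion above (cheap settings check first, database lookups only afterwards) can be sketched as a pure function, which is easier to unit test than the task itself. `tombstone_skip_reason` and its callable parameters are hypothetical names for illustration; only the reason strings are taken from the suggestion.

```python
from typing import Callable, Optional


def tombstone_skip_reason(
    clickhouse_enabled: bool,
    environment_exists: Callable[[], bool],
    membership_enabled: Callable[[], bool],
) -> Optional[str]:
    """Return the reason to skip the tombstone write, or None to proceed.

    The callables are only invoked when earlier guards pass, so a disabled
    ClickHouse setting never triggers an environment or organisation lookup.
    """
    if not clickhouse_enabled:
        return "clickhouse_not_configured"
    if not environment_exists():
        return "environment_not_found"
    if not membership_enabled():
        return "segment_membership_disabled"
    return None
```

A caller would log `tombstone.skipped` with the returned reason and return early, or go on to write the tombstone when the result is `None`.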

Contributor Author

I have addressed this concern.

Labels

api Issue related to the REST API docs Documentation updates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Handle is_deleted attribute when querying for identities in ClickHouse

3 participants