Skip to content

feat(tables): paginated background row-delete jobs via table_jobs#4915

Open
TheodoreSpeaks wants to merge 3 commits into
stagingfrom
improvement/table-row-deletes
Open

feat(tables): paginated background row-delete jobs via table_jobs#4915
TheodoreSpeaks wants to merge 3 commits into
stagingfrom
improvement/table-row-deletes

Conversation

@TheodoreSpeaks

Copy link
Copy Markdown
Collaborator

Summary

  • Add a background row-delete job: "select all" (active filter + optional deselections) sends { filter, excludeRowIds } instead of every row id, deleting in keyset-paginated batches. Rows inserted mid-job are spared via a created_at cutoff.
  • Generalize the async-import slot into a dedicated table_jobs table (one row per job, type = import|delete). Concurrency gate is a partial-unique index on (table_id) WHERE status='running' — one bg job per table, so import/delete are mutually exclusive. Job fields are re-exposed on TableDefinition via a latest-job join, so the tray/SSE/grid stay unchanged.
  • Optimistic UX: matching rows clear from the view immediately; the tray shows "Deleting N rows…" and reconciles on completion (restores on failure/cancel).
  • Selection model gains { kind:'all', excluded } so "select all then deselect" maps cleanly to the job — fixes the old bug where deselect-after-select-all silently dropped unloaded rows.
  • Migration 0227: create table_jobs, backfill in-flight import state, drop the old import_* columns. Janitor sweeps stale running jobs → failed and prunes terminal jobs.

Type of Change

  • New feature

Testing

  • Manual: select-all delete (filtered + deselections) on a large table — rows clear instantly, job finishes in the background, mid-job inserts survive; import still works; import+delete mutually exclusive (409).
  • bun run lint clean, bun run check:api-validation:strict passes, tsc --noEmit clean, 260 table/cron tests pass (added delete-runner, delete-async, and job/cancel tests).

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel Bot commented Jun 9, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
docs Ready Ready Preview, Comment Jun 9, 2026 7:37am

Request Review

@cursor

cursor Bot commented Jun 9, 2026

Copy link
Copy Markdown

PR Summary

High Risk
Touches row deletion, schema migration/backfill, and shared job concurrency; optimistic UI can diverge from server state until SSE reconciles on failure/cancel.

Overview
Introduces background “select all” row deletes via POST /api/table/[tableId]/delete-async: the client sends the active filter and optional excludeRowIds instead of every row id; a detached worker deletes in keyset pages with a created_at cutoff so rows added mid-job survive.

Import state moves off user_table_definitions into a new table_jobs table (migration 0230): one running job per table (partial unique index), types import | delete, with jobStatus / jobId / jobType re-exposed on table APIs. Import/async paths use markTableJobRunning and shared cancel at /job/cancel; SSE events are generalized from import to kind: 'job'.

Grid UX: select-all supports an exclusion set, filter-scoped totals for counts, async delete with optimistic row removal and confirmation; Cmd+A toggles whole-table selection. Cron janitor marks stale running jobs failed and prunes old terminal job rows.

Reviewed by Cursor Bugbot for commit b324ac0. Bugbot is set up for automated code reviews on this repo. Configure here.

Comment thread apps/sim/hooks/queries/tables.ts
@greptile-apps

greptile-apps Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds a background row-delete job (paginated, filter-scoped, ownership-gated) alongside a new table_jobs table that replaces the old import_* columns on user_table_definitions. The concurrency gate moves from a conditional update on the definition row to a partial unique index on (table_id) WHERE status='running', making import and delete mutually exclusive for free.

  • Backend: delete-runner.ts pages through rows by keyset on id with a created_at cutoff, table_jobs tracks progress for both import and delete, and the janitor sweeps stale running jobs and prunes old terminal rows.
  • Frontend: RowSelection gains an excluded set for "select all then deselect", the context-menu path now fires the async job instead of draining all rows, and the tray/SSE/action bar are updated to the renamed job* fields.
  • Migration 0227: backfills existing import state into table_jobs before dropping the old columns.

Confidence Score: 4/5

Safe to merge after addressing the incorrect row count shown in the delete confirmation dialog when a filter is active.

The backend (migration, delete runner, job CRUD, concurrency gate) is well-designed and correct. The one defect is in the frontend: the delete confirmation dialog shows the total table row count instead of the filtered count when a filter is active, so a user deleting 50 filtered rows sees 'Delete 10,000 rows…'. The over-broad optimistic cache clear is a minor multi-cached-state edge case. Everything else — keyset pagination, cutoff semantics, ownership checks, SSE reconciliation, import compatibility — looks solid.

apps/sim/app/workspace/[workspaceId]/tables/[tableId]/components/table-grid/table-grid.tsx (estimatedCount logic) and apps/sim/hooks/queries/tables.ts (onMutate query scope)

Important Files Changed

Filename Overview
apps/sim/lib/table/delete-runner.ts New background paginated delete worker: well-structured keyset pagination with created_at cutoff, ownership-gated per page, and correct exclusion-set handling
apps/sim/lib/table/service.ts Import-scoped fields migrated to table_jobs table; job CRUD functions generalized from import→job; latestJobForTable uses correct DESC ordering with DISTINCT ON for batch; two sequential queries per getTableById call is acceptable
packages/db/migrations/0227_table_jobs.sql Creates table_jobs with partial unique index on (table_id) WHERE status='running' as concurrency gate; backfills existing import state correctly; drops old import_ columns
packages/db/schema.ts tableJobs table added with FK cascades, partial unique index for concurrency, and watchdog/table-started indexes; old import_ columns removed from userTableDefinitions
apps/sim/app/api/table/[tableId]/delete-async/route.ts Kickoff route validates filter up front, captures created_at cutoff, atomically claims job slot via markTableJobRunning, then fires runDetached; correct 409 handling on conflict
apps/sim/app/api/table/[tableId]/job/cancel/route.ts Cancel route correctly resolves job type via getTableById before flipping status; markJobCanceled is scoped to jobId+running so TOCTOU does not affect correctness
apps/sim/app/workspace/[workspaceId]/tables/[tableId]/components/table-grid/table-grid.tsx estimatedCount for select-all delete uses tableRowCountRef.current (total table rows) instead of filtered count; when a filter is active the confirmation dialog shows the wrong (inflated) number
apps/sim/hooks/queries/tables.ts useDeleteTableRowsAsync added with correct onError rollback; onMutate optimistic update clears all cached row queries (exact:false) regardless of filter, not just the current filter view
apps/sim/app/workspace/[workspaceId]/tables/[tableId]/hooks/use-table-event-stream.ts import→job event kind rename applied cleanly; applyJob correctly handles type-discriminated delete vs import behavior; debounce timer cleanup updated
apps/sim/app/api/cron/cleanup-stale-executions/route.ts Stale-job janitor updated to target table_jobs rows; adds 24-hour pruning of terminal jobs; error is caught separately so a prune failure does not mask the stale-marking result

Sequence Diagram

sequenceDiagram
    participant Client
    participant DeleteAsyncRoute as POST /delete-async
    participant DB as PostgreSQL
    participant Worker as runTableDelete (detached)
    participant SSE as SSE stream

    Client->>DeleteAsyncRoute: "POST {filter, excludeRowIds}"
    DeleteAsyncRoute->>DB: markTableJobRunning (INSERT table_jobs ON CONFLICT DO NOTHING)
    alt slot already taken
        DB-->>DeleteAsyncRoute: conflict 0 rows
        DeleteAsyncRoute-->>Client: 409 A job is already in progress
    else slot claimed
        DB-->>DeleteAsyncRoute: job row inserted
        DeleteAsyncRoute->>Worker: runDetached(runTableDelete)
        DeleteAsyncRoute-->>Client: "200 {jobId}"
        loop each page
            Worker->>DB: updateJobProgress (heartbeat + ownership check)
            alt job canceled
                DB-->>Worker: 0 rows JobSupersededError stop
            end
            Worker->>DB: "selectRowIdPage (id > afterId, createdAt <= cutoff, filter)"
            DB-->>Worker: page of IDs
            Worker->>DB: deletePageByIds skipCompaction
            Worker->>SSE: appendTableEvent(running, progress) every 5000 rows
        end
        Worker->>DB: markJobReady
        Worker->>SSE: appendTableEvent(ready)
        SSE-->>Client: invalidate rows + detail query
    end
    opt user cancels
        Client->>DB: "POST /job/cancel {jobId}"
        DB-->>Client: "canceled=true"
        SSE-->>Client: appendTableEvent(canceled) restore optimistic rows
    end
Loading

Reviews (1): Last reviewed commit: "feat(tables): paginated background row-d..." | Re-trigger Greptile

Comment on lines +815 to +817
const excludeRowIds = rowSel.excluded ? [...rowSel.excluded] : []
const estimatedCount = Math.max(0, tableRowCountRef.current - excludeRowIds.length)
onRequestDeleteAllByFilter({ excludeRowIds, estimatedCount })

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 The estimatedCount uses tableRowCountRef.current, which is the TOTAL table row count (tableData.rowCount), not the count matching the active filter. When a filter is active, the confirmation dialog will say "Delete 10,000 rows matching the current filter" even though only a small subset of those rows match the filter and will actually be deleted. This could alarm users or make them distrust the UI. The filtered count should come from the rows query's metadata (or at minimum the dialog should not show a count when a filter is active).

Suggested change
const excludeRowIds = rowSel.excluded ? [...rowSel.excluded] : []
const estimatedCount = Math.max(0, tableRowCountRef.current - excludeRowIds.length)
onRequestDeleteAllByFilter({ excludeRowIds, estimatedCount })
const excludeRowIds = rowSel.excluded ? [...rowSel.excluded] : []
// When a filter is active, tableRowCountRef.current is the *total* table count, not the
// filtered count — pass undefined so the dialog avoids showing a misleading number.
const estimatedCount = queryOptions?.filter
? undefined
: Math.max(0, tableRowCountRef.current - excludeRowIds.length)
onRequestDeleteAllByFilter({ excludeRowIds, estimatedCount })

Comment thread apps/sim/hooks/queries/tables.ts Outdated
Comment on lines +939 to +956
await queryClient.cancelQueries({ queryKey: tableKeys.rowsRoot(tableId) })
const previousQueries = queryClient.getQueriesData<InfiniteData<TableRowsResponse, number>>({
queryKey: tableKeys.rowsRoot(tableId),
})
const previousDetail = queryClient.getQueryData<TableDefinition>(tableKeys.detail(tableId))
const keep = new Set(excludeRowIds ?? [])
queryClient.setQueriesData<InfiniteData<TableRowsResponse, number>>(
{ queryKey: tableKeys.rowsRoot(tableId), exact: false },
(old) =>
old
? {
...old,
pages: old.pages.map((page) => ({
...page,
rows: page.rows.filter((r) => keep.has(r.id)),
})),
}
: old

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Optimistic update over-clears row caches

setQueriesData with exact: false updates every cached row query for this table, not just the one that matches the current filter. If the user previously loaded the table without a filter (or with a different filter), those cached pages are also cleared. Since onSuccess only invalidates tableKeys.lists() (not row queries), the stale no-op rows stay empty until the SSE terminal event fires — which may be minutes later for a large delete. The fix is to restrict the optimistic update to the exact query key that matches the current filter.

…ed optimistic clear, Cmd+A select-all, hide delete from tray)
…row-deletes

# Conflicts:
#	apps/sim/app/workspace/[workspaceId]/tables/[tableId]/table.tsx
#	apps/sim/app/workspace/[workspaceId]/tables/components/import-csv-dialog/import-csv-dialog.tsx
#	apps/sim/app/workspace/[workspaceId]/tables/tables.tsx
#	apps/sim/lib/table/import-runner.ts
#	apps/sim/lib/table/service.ts
#	packages/db/migrations/meta/0227_snapshot.json
#	packages/db/migrations/meta/_journal.json

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit b324ac0. Configure here.

if (excluded.has(targetId)) excluded.delete(targetId)
else excluded.add(targetId)
return { kind: 'all', excluded: excluded.size === 0 ? undefined : excluded }
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shift-click breaks select-all

Medium Severity

After a filter-scoped select-all (kind: 'all'), a shift-click on a row checkbox skips the exclusion path and calls rowSelectionMaterialize, which collapses the selection to loaded row ids only. Unloaded rows that still match the filter drop out of the selection and from async delete scope.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit b324ac0. Configure here.

const lockedId = prev?.importId
if (lockedId && importId && importId !== lockedId) return
const lockedId = prev?.jobId
if (lockedId && jobId && jobId !== lockedId) return

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Terminal job events dropped early

High Severity

applyJob ignores terminal SSE events when the table detail cache has no jobId, but async delete clears rows in onMutate and only sets jobId in onSuccess. If the worker finishes and emits ready or failed before that lands, the event is dropped, row invalidation never runs, and a failed delete can leave the grid empty while data still exists.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit b324ac0. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant