
[Bug] Key_Shared (PIP-379): pending-acks cleanup on mark-delete advance can break ordering #25627

@lhotari

Description


Summary

For Key_Shared subscriptions using the PIP-379 implementation (PersistentStickyKeyDispatcherMultipleConsumers), the dispatcher's markDeletePositionMoveForward() hook clears each consumer's PendingAcksMap for any positions at or below the new mark-delete position. This pending-acks cleanup is harmful for Key_Shared and can result in out-of-order message delivery for a sticky key whose hash is currently being drained.

Background

PIP-379 introduced the DrainingHashesTracker to prevent out-of-order delivery during sticky-key hash range reassignment. While consumer A still has unacked messages for hash H in its PendingAcksMap, the new owner of H (consumer B) is blocked from receiving messages with H. The hash refCount is decremented whenever an entry is removed from the PendingAcksMap, via PendingAcksRemoveHandler.handleRemovingDrainingHashesTracker.reduceRefCount.

The handler does not distinguish removal-because-acked from removal-because-mark-delete-advanced. Both paths look identical to the draining-hashes tracker.
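A minimal model of the refCount idea may make this concrete. This is a sketch with hypothetical names (`DrainingModel`, `addPendingAck`, `isBlocked`), not Pulsar's actual `DrainingHashesTracker` API: a hash stays draining (blocked for the new owner) while any pending-acks entry still references it, and every removal decrements the count regardless of why the entry was removed.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model (hypothetical names, not Pulsar's classes) of the
// PIP-379 draining-hash refCount: a hash is "draining" while any
// pending ack still references it.
public class DrainingModel {
    private final Map<Integer, Integer> refCounts = new HashMap<>();

    // Called when a message for this hash is added to the PendingAcksMap.
    void addPendingAck(int hash) {
        refCounts.merge(hash, 1, Integer::sum);
    }

    // Called for EVERY removal from the pending-acks map -- the model
    // cannot tell an ack apart from a mark-delete-driven cleanup.
    void removePendingAck(int hash) {
        refCounts.computeIfPresent(hash, (h, c) -> c > 1 ? c - 1 : null);
    }

    boolean isBlocked(int hash) {
        return refCounts.getOrDefault(hash, 0) > 0;
    }

    public static void main(String[] args) {
        DrainingModel tracker = new DrainingModel();
        tracker.addPendingAck(42);                  // consumer A holds a message for hash 42
        System.out.println(tracker.isBlocked(42));  // true: new owner is blocked
        tracker.removePendingAck(42);               // refCount drops to 0
        System.out.println(tracker.isBlocked(42));  // false: the block lifts
    }
}
```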

What happens

PersistentDispatcherMultipleConsumers.markDeletePositionMoveForward() does two things when the cursor mark-delete position advances (triggered by skip-messages, clear-backlog, expiry, or backlog-quota eviction):

  1. Drops entries from redeliveryMessages for positions ≤ new mark-delete position.
  2. Calls Consumer.removePendingAcksUpToPositionAndDecrementUnacked(...) for every consumer in the dispatcher.

PersistentStickyKeyDispatcherMultipleConsumers invokes super.markDeletePositionMoveForward() and inherits both behaviors.

Step 2 is correct for Shared subscriptions, but for Key_Shared (PIP-379) it removes in-flight pending acks that the draining-hashes mechanism is using to keep a hash blocked. The hash refCount can prematurely drop to 0, the block lifts, and the new owner of that hash starts receiving messages with the same key before the original consumer has processed its already-on-the-wire messages.
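The following sketch (hypothetical names, not the broker's code) illustrates why the two removal paths are indistinguishable to the tracker: both the ack path and the mark-delete path drive the same remove handler, and the handler decrements the draining-hash refCount either way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch (hypothetical names): an ack and a mark-delete advance both fire
// the same remove handler, so the draining refCount falls either way.
public class RemoveHandlerSketch {
    record Pending(long position, int hash) {}

    public static void main(String[] args) {
        // Hash 9 is referenced by two pending acks, so its refCount is 2.
        AtomicInteger refCount = new AtomicInteger(2);
        List<Pending> pendingAcks =
                new ArrayList<>(List.of(new Pending(1, 9), new Pending(2, 9)));

        // Ack path: the consumer acked position 1; handler decrements.
        pendingAcks.removeIf(p -> p.position() == 1 && refCount.decrementAndGet() >= 0);

        // Mark-delete path: cursor advanced to position 2 without an ack;
        // the SAME handler fires for the swept entry.
        pendingAcks.removeIf(p -> p.position() <= 2 && refCount.decrementAndGet() >= 0);

        System.out.println(refCount.get()); // 0 -> the draining block lifts prematurely
    }
}
```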

Failure scenario

  1. Consumer A receives message M with sticky key K (hash H). M is added to A's PendingAcksMap. M is on the wire.
  2. Consumer B joins; the sticky-key range covering H reassigns from A to B. Hash H is registered as draining for A.
  3. Mark-delete advances past M's position — for example, because backlog-quota eviction (BacklogQuotaManager) calls ManagedCursor.skipEntries, or because message expiry advances the cursor, or an admin issues skip-messages/clear-backlog.
  4. markDeletePositionMoveForward fires. M is removed from A's PendingAcksMap. The remove handler decrements H's refCount, which reaches 0, and the draining block lifts.
  5. A new message with key K is produced. It is delivered to consumer B.
  6. Meanwhile, M is still in flight to consumer A and is processed by A.
  7. The application has now processed two messages with the same key on two different consumers — out-of-order delivery, violating the Key_Shared ordering guarantee.
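The steps above can be compressed into a small simulation (again with hypothetical names, not Pulsar code): while hash H is draining, dispatch to the new owner B is blocked; once the mark-delete cleanup drops the last refCount, the next message with key K goes to B even though M is still in flight to A.

```java
import java.util.HashMap;
import java.util.Map;

// Compact simulation (hypothetical names) of the failure scenario: the
// mark-delete cleanup lifts the draining block for hash H while message M
// is still in flight to consumer A.
public class OrderingBug {
    static final int H = 7; // sticky hash of key K
    static final Map<Integer, Integer> drainingRefCount = new HashMap<>();

    static String dispatch(String newOwner) {
        // While H is draining, the new owner must not receive messages for H.
        return drainingRefCount.getOrDefault(H, 0) > 0 ? "blocked" : newOwner;
    }

    public static void main(String[] args) {
        drainingRefCount.put(H, 1);        // steps 1-2: M pending on A, H draining for A
        System.out.println(dispatch("B")); // blocked: ordering is protected
        drainingRefCount.remove(H);        // steps 3-4: mark-delete cleanup drops the refCount
        System.out.println(dispatch("B")); // B: a new message with key K reaches B
        // steps 6-7: M is still in flight to A -> two consumers process key K
    }
}
```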

Trigger surface

Any cursor mark-delete advance that does not go through the consumer-driven ack path can hit this:

  • Backlog-quota eviction (the consumer_backlog_eviction size policy and the non-precise time-based policy) — see BacklogQuotaManager.
  • Message expiry — PersistentMessageExpiryMonitor.
  • Admin skipMessages / clearBacklog on a subscription.

Pre-existing, not a regression

This is a latent bug, not introduced by recent PRs.

Neither PR semantically changed the pending-acks cleanup itself; both are correct for Shared subscriptions and faithfully preserve the prior behavior. The bug surfaces because the prior behavior was always wrong for PIP-379 Key_Shared, and is now reachable from more paths.

The original review of PR #24096 retained step 2 over concern that draining hashes could otherwise be blocked indefinitely if pending acks weren't pruned on mark-delete advance. That concern needs revising: the pending acks will be cleared naturally when the consumer eventually acks (delivery still completes even if the broker has trimmed storage) or disconnects (PendingAcksMap.forEachAndClose with closing=true).

Impact

  • Severity: Correctness violation for Key_Shared ordering, the central invariant of the subscription type.
  • Trigger conditions: Requires a draining sticky-key hash (consumer set has changed) AND a mark-delete advance that bypasses the per-consumer ack path. Backlog-quota eviction (consumer_backlog_eviction policy), message expiry, and admin skip/clear operations can all reach it.
  • Detectability: Silent. No log, no metric. The application sees out-of-order processing.
  • Visibility: No tests today exercise the interaction between mark-delete advancement and DrainingHashesTracker for Key_Shared.

Scope of fix

The fix should target the PIP-379 dispatcher (PersistentStickyKeyDispatcherMultipleConsumers) and its parent (PersistentDispatcherMultipleConsumers).

The deprecated PersistentStickyKeyDispatcherMultipleConsumersClassic is out of scope.

The redelivery-tracker cleanup (step 1 above) must be preserved for Key_Shared — it correctly prevents re-reading entries that storage has already discarded.
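One possible shape for the fix, sketched with hypothetical class and method names rather than as a patch: the Key_Shared dispatcher keeps the redelivery cleanup (step 1) on mark-delete advance but skips the pending-acks cleanup (step 2), leaving pending acks to be cleared by the ack path or by consumer disconnect.

```java
// Sketch (hypothetical names, not a patch) of one fix direction: override
// only the pending-acks half of the mark-delete hook for Key_Shared.
public class FixSketch {
    static class BaseDispatcher {
        final StringBuilder log = new StringBuilder();

        void markDeletePositionMoveForward() {
            cleanupRedeliveryMessages(); // step 1: safe for all subscription types
            cleanupPendingAcks();        // step 2: harmful for PIP-379 Key_Shared
        }

        void cleanupRedeliveryMessages() { log.append("redelivery;"); }
        void cleanupPendingAcks()        { log.append("pendingAcks;"); }
    }

    static class KeySharedDispatcher extends BaseDispatcher {
        @Override
        void cleanupPendingAcks() { /* no-op: preserve draining-hash refCounts */ }
    }

    public static void main(String[] args) {
        BaseDispatcher shared = new BaseDispatcher();
        shared.markDeletePositionMoveForward();
        System.out.println(shared.log);    // redelivery;pendingAcks;

        BaseDispatcher keyShared = new KeySharedDispatcher();
        keyShared.markDeletePositionMoveForward();
        System.out.println(keyShared.log); // redelivery;
    }
}
```

The design choice here mirrors the issue's own argument: pending acks are eventually cleared by acks or by PendingAcksMap.forEachAndClose on disconnect, so skipping the mark-delete sweep does not leave hashes draining forever.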
