Reduce SQL Server queue length monitoring query volume #5555
Open
ramonsmits wants to merge 2 commits into
Replace the per-queue length probe (one IF EXISTS + max-min(RowVersion) statement per tracked queue, every 200ms) with a single catalog-view query per catalog that reads approximate row counts from sys.partitions. This reads no queue table data, takes no locks, and needs only SELECT on the queue tables. Also make the poll interval configurable via the QueueLengthQueryDelayInterval connection string part (default 200ms), and add optional adaptive back-off up to QueueLengthQueryMaxDelayInterval while all monitored queues are empty (disabled by default; base interval is always used while any queue has work, preserving the fix for #4556).
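As a rough illustration of the back-off rule described above (not the PR's C# implementation), here is a minimal Python sketch. The function name `next_delay`, the doubling ramp, and the 2s ceiling used in the example are assumptions. The PR text only says that the delay may grow up to QueueLengthQueryMaxDelayInterval while all monitored queues are empty, and that the base interval is always used while any queue has work.

```python
from datetime import timedelta


def next_delay(current: timedelta, base: timedelta, ceiling: timedelta,
               all_queues_empty: bool) -> timedelta:
    """Return the delay to wait before the next queue-length poll (hypothetical helper)."""
    if not all_queues_empty or ceiling <= base:
        # Any queue with work resets to the base interval.
        # Treating ceiling <= base as "back-off disabled" is an assumption.
        return base
    # Assumed ramp shape: double on every idle poll, capped at the ceiling.
    return min(max(current, base) * 2, ceiling)


# Example values: 200ms is the base mentioned above; the 2s ceiling is made up.
base = timedelta(milliseconds=200)
ceiling = timedelta(seconds=2)

delay = base
history = []
for empty in [True, True, True, True, True, False]:
    delay = next_delay(delay, base, ceiling, empty)
    history.append(delay)
```

With these inputs the delay grows while the queues stay empty, stops at the ceiling, and drops back to the base interval on the first poll that sees work.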
mauroservienti
approved these changes
Jun 24, 2026
mauroservienti
left a comment
Member
One minor comment, @ramonsmits. Otherwise, it's good to go.
SqlTable: rename Name/Schema/Catalog to Unquoted* to make the quoted-vs-unquoted contract explicit (review feedback), and convert to a primary constructor. The now-dead per-table LengthQuery's duplicated if/else is collapsed into BuildFullTableName/BuildLengthQuery helpers.
QueueLengthProvider: pace the poll interval concurrently with the query so the effective cadence is max(interval, queryDuration) instead of interval + queryDuration. The additive query-time term was what let the cadence drift past the 1s monitoring bucket and starve buckets of samples (#4556 false-zero sawtooth). With drift removed, restore sane defaults (base 1s, matching the finest monitoring bucket, ramping to a 10s idle ceiling), superseding the #4557 200ms oversampling workaround. Adaptive back-off is now ON by default.
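To make the drift argument above concrete, the following Python sketch (an illustration, not the PR's code) models poll start times under both pacing schemes and lists the 1s monitoring buckets that receive no sample. The 100ms query duration, the 12s horizon, and all function names are assumptions chosen for the example. Samples are attributed to the poll's start time for simplicity.

```python
import asyncio


def sample_times_ms(interval_ms, query_ms, horizon_ms, concurrent):
    """Start times of successive polls in ms, up to horizon_ms (simplified model)."""
    # Concurrent pacing: cadence is max(interval, queryDuration).
    # Sequential pacing: cadence is interval + queryDuration.
    step = max(interval_ms, query_ms) if concurrent else interval_ms + query_ms
    return list(range(0, horizon_ms, step))


def empty_buckets(times_ms, horizon_ms, bucket_ms=1000):
    """Indexes of monitoring buckets that contain no sample."""
    filled = {t // bucket_ms for t in times_ms}
    return [b for b in range(horizon_ms // bucket_ms) if b not in filled]


async def poll_once_paced(run_query, interval_s):
    """One poll iteration where the delay runs concurrently with the query."""
    result, _ = await asyncio.gather(run_query(), asyncio.sleep(interval_s))
    return result


additive = sample_times_ms(1000, 100, 12000, concurrent=False)
paced = sample_times_ms(1000, 100, 12000, concurrent=True)
starved_additive = empty_buckets(additive, 12000)  # [10]
starved_paced = empty_buckets(paced, 12000)        # []
```

In this model the additive cadence of 1.1s skips the bucket covering seconds 10 to 11, which is the kind of gap that shows up as a false zero. Pacing the delay concurrently keeps one sample in every bucket as long as the query finishes within the interval.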
mauroservienti
approved these changes
Jun 25, 2026
Spike addressing high SQL query volume from queue-length monitoring (one IF EXISTS + max-min(RowVersion) statement per tracked queue, every 200ms).
Changes
- Read approximate row counts from sys.partitions (one query per catalog) instead of probing each queue table. No queue-table reads, no locks, only metadata visibility required. Accuracy is comparable to the existing max-min(RowVersion) estimate (both approximate; the catalog counter avoids identity-gap over-counting).
- Pace the poll interval concurrently with the query, so the effective cadence is max(interval, queryDuration) instead of interval + queryDuration. That additive query-time term is what let the cadence drift past the 1s monitoring bucket and starve buckets of samples: the false-zero "sawtooth" of #4556 ("SQL Server and PostgreSQL can indicate report 0 as the queue length value"). Removing the drift supersedes the #4557 200ms oversampling workaround and lets the cadence return to a sane default.
- Poll interval configurable via the QueueLengthQueryDelayInterval connection string part (default 1s, matching the finest monitoring bucket: 1-minute history / 60 = 1s per point), following the existing ASB convention.
- Adaptive back-off up to QueueLengthQueryMaxDelayInterval (default 10s) while every monitored queue is empty; now on by default. The base interval is always used while any queue has work, so the fix for #4556 is preserved. Set the max equal to the base to disable back-off.
- SqlTable cleanup. Identifier properties renamed to Unquoted* to make the quoted-vs-unquoted contract explicit, converted to a primary constructor, and the duplicated per-table length-query branches collapsed into helpers.
Permissions
The bulk query reads only catalog views, so it needs less permission than the per-queue probe it replaces:
- Any permission on a queue table (e.g. SELECT) makes that table's row visible. No VIEW DATABASE STATE required (hence sys.partitions rather than the sys.dm_db_partition_stats DMV).
- The previous per-queue probe read queue data (SELECT max([RowVersion]) … FROM <queue>), which already requires SELECT on each queue table, strictly more, so any user that worked before continues to work.
- … IF EXISTS → -1).
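For readers unfamiliar with the catalog-view approach, the sketch below shows roughly what a one-query-per-catalog length probe could look like. It is an assumption-laden illustration in Python, not the query this PR ships. The column aliases, helper names, and the example catalog and queue names are hypothetical. Only sys.tables, sys.schemas, sys.partitions and their object_id, schema_id, index_id and rows columns are real SQL Server catalog objects.

```python
def quote_identifier(name: str) -> str:
    """Bracket-quote a SQL Server identifier, escaping closing brackets."""
    return "[" + name.replace("]", "]]") + "]"


def build_catalog_query(catalog: str) -> str:
    """Build one approximate-row-count query for every table visible in a catalog."""
    c = quote_identifier(catalog)
    return (
        "SELECT s.name AS SchemaName, t.name AS TableName, SUM(p.rows) AS ApproxRows\n"
        f"FROM {c}.sys.tables AS t\n"
        f"JOIN {c}.sys.schemas AS s ON s.schema_id = t.schema_id\n"
        f"JOIN {c}.sys.partitions AS p ON p.object_id = t.object_id\n"
        "WHERE p.index_id IN (0, 1)  -- heap or clustered index only\n"
        "GROUP BY s.name, t.name;"
    )


def queries_per_catalog(queues):
    """queues: iterable of (catalog, schema, table) tuples with unquoted names."""
    catalogs = {catalog for catalog, _schema, _table in queues}
    return {catalog: build_catalog_query(catalog) for catalog in sorted(catalogs)}


# Hypothetical tracked queues spread over two catalogs.
tracked = [("nsb", "dbo", "Sales"), ("nsb", "dbo", "Billing"), ("audit", "dbo", "Audit")]
queries = queries_per_catalog(tracked)
```

Three tracked queues in two catalogs produce two statements instead of three per-queue probes. A caller would then match the returned schema and table names against the tracked queues, and only tables the login has some permission on appear in the result.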