[GPUHEALTH-1361] Add Jitter to Initial OTLP Push to avoid thundering herd #105

Open
mukilsh wants to merge 1 commit into main from gpuhealth-1361-initial-jitter

@mukilsh (Contributor) commented Feb 24, 2026

No description provided.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3b0e700998

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +145 to +147
initialMaxJitter := collectionInterval / 20 // 5%
if initialMaxJitter > maxInitialJitter {
	initialMaxJitter = maxInitialJitter

P1: Cap initial jitter by remaining context deadline

The new pre-send jitter is capped at maxInitialJitter (1 minute) without considering the request context deadline, and sendOTLPRequestWithRetry waits that full delay before the first HTTP attempt. In export(), the same context is created with context.WithTimeout(..., e.options.timeout) (default 30s), so with longer collection intervals (e.g. ≥20m, where jitter can reach 60s) many exports will hit ctx.Done() before any request is sent, causing data drops that did not happen before this change.

Signed-off-by: Mukil <mukils@nvidia.com>

Modify jitter

Signed-off-by: Mukil <mukils@nvidia.com>

modify jitter handling

Signed-off-by: Mukil <mukils@nvidia.com>
@mukilsh force-pushed the gpuhealth-1361-initial-jitter branch from 73b7179 to 4dc9fb5 on February 24, 2026 22:57