
Integrate OpenTelemetry Weaver into Cortex (Distributor POC) #7387

Draft
CharlieTLe wants to merge 9 commits into cortexproject:master from CharlieTLe:worktree-vast-herding-canyon

@CharlieTLe
Member

Summary

  • Introduces OpenTelemetry Weaver for schema-driven telemetry management, starting with the Distributor as a proof of concept
  • Defines all 23 Distributor metrics and 9 trace spans in YAML schema files (telemetry/registry/distributor/)
  • Generates Go metric registration code (pkg/distributor/telemetry_gen.go) and markdown documentation (docs/telemetry/cortex_distributor.md) from the schema using Jinja2 templates
  • Refactors distributor.go New() to call the generated registerDistributorMetrics() instead of ~100 lines of inline promauto.With(reg) calls
  • Adds CI steps (telemetry-check, check-telemetry) to validate the schema and ensure generated code stays in sync
  • Adds Weaver binary to the build image and Rego naming policy for Cortex metric conventions

Test plan

  • go build ./pkg/distributor/... compiles cleanly
  • make lint passes (golangci-lint 0 issues, misspell, faillint all clean)
  • make telemetry-check validates the schema
  • make check-telemetry confirms generated code is up-to-date
  • All pkg/distributor unit tests pass unchanged
  • Unit tests pass for pkg/cortex, pkg/querier, pkg/ruler, pkg/ingester, pkg/alertmanager, pkg/storegateway, cmd/cortex
  • Integration test TestDistriubtorAcceptMixedHASamplesRunningInMicroservicesMode passes — the only e2e test that asserts on refactored metrics (cortex_distributor_deduped_samples_total, cortex_distributor_non_ha_samples_received_total)
  • Additional integration tests pass: TestOTLPIngestExemplar, TestIngesterSharding, TestRulerAPI, TestAlertmanager, TestQuerierRemoteRead
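
For reviewers unfamiliar with freshness checks like check-telemetry: the idea is to regenerate the output and compare it with what is committed. A minimal sketch of that comparison, assuming both versions are already in memory (the helper names here are hypothetical and not taken from the Makefile):

```go
package main

import (
	"bytes"
	"fmt"
)

// firstDiffLine returns the 1-based number of the first line that differs
// between a and b, or 0 when the two inputs are identical.
func firstDiffLine(a, b []byte) int {
	al := bytes.Split(a, []byte("\n"))
	bl := bytes.Split(b, []byte("\n"))
	n := len(al)
	if len(bl) < n {
		n = len(bl)
	}
	for i := 0; i < n; i++ {
		if !bytes.Equal(al[i], bl[i]) {
			return i + 1
		}
	}
	if len(al) != len(bl) {
		return n + 1 // one input has extra trailing lines
	}
	return 0
}

// checkFresh returns an error when the committed file no longer matches
// freshly generated output.
func checkFresh(path string, committed, regenerated []byte) error {
	if line := firstDiffLine(committed, regenerated); line != 0 {
		return fmt.Errorf("%s is stale (first difference at line %d): regenerate and commit", path, line)
	}
	return nil
}

func main() {
	committed := []byte("package distributor\n// old\n")
	regenerated := []byte("package distributor\n// new\n")
	fmt.Println(checkFresh("pkg/distributor/telemetry_gen.go", committed, regenerated))
}
```

A CI step built this way fails whenever someone edits the schema without regenerating, or edits the generated file by hand.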

🤖 Generated with Claude Code

Schema-driven telemetry: define metrics/spans in YAML, generate Go code
and documentation from the schema using OpenTelemetry Weaver.

- Add telemetry/registry/ with Distributor metric and span definitions
- Add Jinja2 templates for Go code generation and markdown docs
- Generate pkg/distributor/telemetry_gen.go with metric registration
- Refactor distributor.go New() to use generated registerDistributorMetrics()
- Add Makefile targets: telemetry-check, telemetry-generate, check-telemetry
- Add CI steps for schema validation and generated code freshness
- Add Weaver binary to build-image Dockerfile
- Add Rego naming policy for Cortex metric conventions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
@dosubot (bot) added the ci/cd, component/distributor, and type/observability labels on Mar 31, 2026
CharlieTLe and others added 4 commits March 30, 2026 18:39
- Fix Dockerfile: use .tar.xz format, update to v0.22.1, handle
  missing arm64 Linux builds gracefully
- Fix generated markdown: replace HTML comments with Hugo frontmatter
  to avoid Hugo build errors

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
The tar archive contains weaver-x86_64-unknown-linux-gnu/weaver,
not a top-level weaver binary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
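
The nested archive layout described in the commit above is a common source of install failures. One way to be robust to it is to locate the binary by base name instead of by a fixed path. The sketch below assumes the archive has already been decompressed to a plain tar stream (Go's standard library has no xz reader), and `findInTar` and `makeTar` are hypothetical helpers, not code from this PR.

```go
package main

import (
	"archive/tar"
	"bytes"
	"errors"
	"fmt"
	"io"
	"path"
)

// findInTar returns the contents of the first regular file whose base name
// equals want, regardless of the directory it sits in.
func findInTar(r io.Reader, want string) ([]byte, error) {
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return nil, fmt.Errorf("%q not found in archive", want)
		}
		if err != nil {
			return nil, err
		}
		if hdr.Typeflag == tar.TypeReg && path.Base(hdr.Name) == want {
			return io.ReadAll(tr)
		}
	}
}

// makeTar builds an in-memory tar archive, used here only for the demo.
func makeTar(files map[string]string) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for name, body := range files {
		hdr := &tar.Header{Name: name, Mode: 0o755, Size: int64(len(body)), Typeflag: tar.TypeReg}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(body)); err != nil {
			return nil, err
		}
	}
	return &buf, tw.Close()
}

func main() {
	archive, err := makeTar(map[string]string{
		"weaver-x86_64-unknown-linux-gnu/weaver": "fake binary contents",
	})
	if err != nil {
		panic(err)
	}
	data, err := findInTar(archive, "weaver")
	if err != nil {
		panic(err)
	}
	fmt.Printf("found weaver (%d bytes)\n", len(data))
}
```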
The current CI build image doesn't include Weaver yet. Add an
install-weaver target that downloads it on-demand so telemetry
checks work before the build image is updated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
CharlieTLe and others added 3 commits March 30, 2026 19:16
The CI build image lacks xz-utils. Use the official installer script
which handles its own decompression dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
The installer script doesn't support --yes or --install-dir. Use
--no-modify-path --quiet flags and configure PATH to find the
installed binary in ~/.cargo/bin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
The CI build image has python3 but not xz-utils. Use python3's
built-in lzma module to decompress the .tar.xz archive.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>
@CharlieTLe CharlieTLe marked this pull request as draft March 31, 2026 17:45
Weaver v0.22.1 sorts attributes alphabetically, which would change
the label order in generated []string{} slices and break existing
WithLabelValues() call sites. Add explicit label ordering in
params.labels to preserve the original Go label order regardless
of Weaver version. Regenerate with v0.22.1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Charlie Le <charlie_le@apple.com>

Labels

ci/cd, component/distributor, size/XXL, type/observability (To help know what is going on inside Cortex)

1 participant