Skip to content

Cache 404 responses for unknown artifacts#66

Open
bdehamer wants to merge 2 commits intomainfrom
bdehamer/no-artifact-cache
Open

Cache 404 responses for unknown artifacts#66
bdehamer wants to merge 2 commits intomainfrom
bdehamer/no-artifact-cache

Conversation

@bdehamer
Copy link

@bdehamer bdehamer commented Mar 20, 2026

Problem

The deployment-tracker calls the GitHub deployment record API for every container it observes — including third-party containers that GitHub has no knowledge of. These always return a 404 (NoArtifactError). For clusters running a lot of third-party workloads, this generates significant unnecessary API traffic.

Approach

This adds a new unknownArtifacts TTL cache to the controller that tracks image digests which have previously returned a 404 from the deployment record API. Before making an API call, recordContainer() checks this cache and skips the request if the digest is already known to be missing.

Why a separate cache?

The existing observedDeployments cache serves a different purpose — it deduplicates successful posts during rapid pod churn. The two caches differ in key semantics and TTL:

observedDeployments unknownArtifacts
Key event||deploymentName||digest digest
TTL 2 minutes 1 hour
Purpose Deduplicate redundant posts Suppress calls for missing artifacts

The 404 cache is keyed by digest alone — if GitHub doesn't have an artifact for a given digest, that's true regardless of the deployment name or event type. The longer TTL reflects the fact that an unknown artifact is unlikely to become known in the near term.

Observability

A new Prometheus counter deptracker_post_record_unknown_artifact_cache_hit tracks how many API calls are being avoided by the cache.

Changes

  • internal/controller/controller.go — Add unknownArtifacts cache field, check it before API calls, populate it on NoArtifactError
  • pkg/dtmetrics/prom.go — New cache hit counter metric
  • internal/controller/controller_test.go — Unit tests covering cache population, cache hits, expiry, and cross-event-type behavior

Signed-off-by: Brian DeHamer <bdehamer@github.com>
@bdehamer bdehamer requested a review from a team as a code owner March 20, 2026 21:26
Copilot AI review requested due to automatic review settings March 20, 2026 21:26
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a TTL cache for image digests that have previously returned 404/NoArtifactError from the deployment record API, so the controller can avoid repeated GitHub API calls for third-party/unknown artifacts and expose cache-hit observability.

Changes:

  • Add an unknownArtifacts expiring cache to suppress repeat posts for digests known to be missing (404), with a 1-hour TTL.
  • Add a Prometheus counter to track avoided API calls due to unknown-artifact cache hits.
  • Add unit tests validating cache population, cache hits, expiry behavior, and cross-event behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
internal/controller/controller.go Introduces unknownArtifacts TTL cache, checks it before posting, and populates it on NoArtifactError.
pkg/dtmetrics/prom.go Adds a new Prometheus counter for unknown-artifact cache hits.
internal/controller/controller_test.go Adds unit tests for unknown-artifact caching behavior in recordContainer().
Comments suppressed due to low confidence (3)

internal/controller/controller_test.go:124

  • This test constructs deploymentrecord.NoArtifactError{} with its internal wrapped error left nil. (*NoArtifactError).Error() dereferences n.err and will panic if the error is ever logged/formatted, making the test setup fragile. Consider adding an exported constructor in pkg/deploymentrecord (or making NoArtifactError.Error() nil-safe) and using that here so the mocked error is always well-formed.
	poster := &mockPoster{
		lastErr: &deploymentrecord.NoArtifactError{},
	}

internal/controller/controller_test.go:144

  • This test constructs deploymentrecord.NoArtifactError{} with its internal wrapped error left nil. (*NoArtifactError).Error() dereferences n.err and will panic if the error is ever logged/formatted, making the test setup fragile. Consider adding an exported constructor in pkg/deploymentrecord (or making NoArtifactError.Error() nil-safe) and using that here so the mocked error is always well-formed.
	poster := &mockPoster{
		lastErr: &deploymentrecord.NoArtifactError{},
	}

internal/controller/controller_test.go:85

  • This test constructs deploymentrecord.NoArtifactError{} with its internal wrapped error left nil. (*NoArtifactError).Error() dereferences n.err and will panic if the error is ever logged/formatted, making the test setup fragile. Consider adding an exported constructor in pkg/deploymentrecord (or making NoArtifactError.Error() nil-safe) and using that here so the mocked error is always well-formed.
	poster := &mockPoster{
		lastErr: &deploymentrecord.NoArtifactError{},
	}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link

Copilot AI commented Mar 20, 2026

@bdehamer I've opened a new pull request, #67, to work on those changes. Once the pull request is ready, I'll request review from you.

* Initial plan

* Make NoArtifactError.Error() nil-safe

Co-authored-by: bdehamer <398027+bdehamer@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/deployment-tracker/sessions/3eb7150d-f2e1-4ef9-8452-686a5ddb22f0

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: bdehamer <398027+bdehamer@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants