High cardinality metrics not published by argo workflows controller #3620

@Zujiry

Describe the bug

The high-cardinality metrics are not exposed by the Argo Workflows controller when following the documentation. I am NOT using OpenTelemetry, only the kube-prometheus-stack.

Missing:
https://argo-workflows.readthedocs.io/en/release-3.7/metrics/#cronworkflows_concurrencypolicy_triggered
https://argo-workflows.readthedocs.io/en/release-3.7/metrics/#cronworkflows_triggered_total
https://argo-workflows.readthedocs.io/en/release-3.7/metrics/#workflowtemplate_runtime
https://argo-workflows.readthedocs.io/en/release-3.7/metrics/#workflowtemplate_triggered_total

There is no documentation on how to enable them. The docs only say that they are high cardinality and can break the deployment.

Related helm chart

argo-workflows

Helm chart version

3.7.4

To Reproduce

Deploy argo-workflows chart with:

controller:
    podLabels:
        app: workflow-controller
    rbac:
        create: true
    serviceMonitor:
        enabled: true
        namespace: monitoring
    metricsConfig:
        enabled: true
        path: /metrics
        port: 9090
        serviceMonitor:
            enabled: true
            namespace: monitoring
            additionalLabels:
                release: kube-prometheus-stack

Then port-forward the controller and query the metrics endpoint:

kubectl -n argo port-forward deploy/argo-workflow-workflow-controller 9090:9090
curl -s http://localhost:9090/metrics | grep -i cronworkflow

Results in

argo_workflows_error_count{cause="CronWorkflowSpecError"} 0
argo_workflows_error_count{cause="CronWorkflowSubmissionError"} 0
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="List",le="0.1"} 0
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="List",le="0.2"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="List",le="0.5"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="List",le="1"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="List",le="2"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="List",le="5"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="List",le="10"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="List",le="20"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="List",le="60"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="List",le="180"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="List",le="+Inf"} 1
argo_workflows_k8s_request_duration_sum{kind="cronworkflows",status_code="200",verb="List"} 0.195048022
argo_workflows_k8s_request_duration_count{kind="cronworkflows",status_code="200",verb="List"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="Watch",le="0.1"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="Watch",le="0.2"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="Watch",le="0.5"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="Watch",le="1"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="Watch",le="2"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="Watch",le="5"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="Watch",le="10"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="Watch",le="20"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="Watch",le="60"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="Watch",le="180"} 1
argo_workflows_k8s_request_duration_bucket{kind="cronworkflows",status_code="200",verb="Watch",le="+Inf"} 1
argo_workflows_k8s_request_duration_sum{kind="cronworkflows",status_code="200",verb="Watch"} 0.002435291
argo_workflows_k8s_request_duration_count{kind="cronworkflows",status_code="200",verb="Watch"} 1
argo_workflows_k8s_request_total{kind="cronworkflows",status_code="200",verb="List"} 1
argo_workflows_k8s_request_total{kind="cronworkflows",status_code="200",verb="Watch"} 1
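
The grep above only covers the cronworkflow metrics. To check all four metrics listed under "Missing" in one pass, something like the following sketch could be used. The function name `check_missing_metrics` is hypothetical, and the optional `argo_workflows_` prefix is an assumption based on the other metric names in this output.

```shell
# Hypothetical helper: reads Prometheus text-format metrics on stdin and
# prints one line for every expected metric that does not appear.
check_missing_metrics() {
  metrics=$(cat)
  for name in \
    cronworkflows_concurrencypolicy_triggered \
    cronworkflows_triggered_total \
    workflowtemplate_runtime \
    workflowtemplate_triggered_total
  do
    # Match the name at the start of a sample line, allowing the assumed
    # argo_workflows_ prefix and histogram suffixes (_bucket, _sum, _count).
    if ! printf '%s\n' "$metrics" \
      | grep -Eq "^(argo_workflows_)?${name}(_bucket|_sum|_count)?[{ ]"; then
      echo "missing: ${name}"
    fi
  done
}

# Usage, with the port-forward still running:
#   curl -s http://localhost:9090/metrics | check_missing_metrics
```

Against the output shown here, this would report all four names as missing, since none of the sample lines start with any of them.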

Expected behavior

cronworkflows_triggered_total should appear as a metric in the /metrics output.

Screenshots

No response

Additional context

No response
