⚠️ This tool was created out of curiosity during a hackathon. It hasn’t been thoroughly tested and still requires improvements.
K8s AI Detective is a tool designed to automate debugging and summarizing issues when an alert is triggered. It leverages kubectl-ai to analyze the alert context, gather relevant information (such as logs, events, and resource states), and generate an initial summary.
Usage: k8s-ai-detective --api-key=STRING [flags]
K8s AI Detective automates debugging and summarizing alerts by leveraging `kubectl-ai` to analyze context, gather logs, events, and resource states, and generate an initial
summary.
Flags:
-h, --help Show context-sensitive help.
--address=":8085" The address where the server should listen on ($ADDRESS).
--config-file-path="./config.yml" Config file path ($CONFIG_FILE_PATH).
--llm-provider="gemini" Language model provider ($LLM_PROVIDER)
--llm-provider-model="gemini-2.5-pro" LLM provider's model name ($LLM_PROVIDER_MODEL)
--api-key=STRING API key of the llm-provider you set for authentication ($API_KEY)
--kubeconfig="" Path to kubeconfig file (uses in-cluster config if not set) ($KUBECONFIG)
--alert-queue-size=10 Queue size to hold alerts (Max 256) ($ALERT_QUEUE_SIZE).
--worker-count=3 Number of alerts processed in parallel (Max 256) ($WORKER_COUNT).
--worker-timeout=120s Timeout for processing each alert in the worker ($WORKER_TIMEOUT).
--slack-bot-token="" Slack bot token for authentication ($SLACK_BOT_TOKEN).
--slack-channel-id="" Slack channel ID to send notifications ($SLACK_CHANNEL_ID).
--log.format="json" Set the output format of the logs. Must be "console" or "json" ($LOG_FORMAT).
--log.level=INFO Set the log level. Must be "DEBUG", "INFO", "WARN" or "ERROR" ($LOG_LEVEL).
--log.add-source Whether to add source file and line number to log records ($LOG_ADD_SOURCE).
- Create a Slack channel to receive alerts and follow this YouTube Shorts tutorial to obtain a Slack bot token from https://api.slack.com/apps/
- Deploy a kind K8s cluster locally
kind get nodes
kind-control-plane
kubectl cluster-info
Kubernetes control plane is running at https://127.0.0.1:56492
CoreDNS is running at https://127.0.0.1:56492/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
- Start k8s-ai-detective
# Export envs
export API_KEY="REDACTED"
export KUBECONFIG="$HOME/.kube/config"  # "~" is not expanded inside double quotes
export SLACK_BOT_TOKEN="REDACTED"
export SLACK_CHANNEL_ID="REDACTED"
# Run app locally
task run
time=2025-11-09T20:07:40+01:00 level=INFO msg="Version information" version="" branch="" revision=""
time=2025-11-09T20:07:40+01:00 level=INFO msg="Build context" go_version=go1.25.4 user="" date=""
time=2025-11-09T20:07:40+01:00 level=INFO msg="Using kubeconfig" file=/Users/veerendra/.kube/config
time=2025-11-09T20:07:44+01:00 level=INFO msg="kubectl-ai is working..." response=Acknowledged.
time=2025-11-09T20:07:44+01:00 level=INFO msg="Starting HTTP server" address=:8085
...
- Deploy a dummy pod and simulate an Alertmanager alert using the script below
What this script does:
- Creates a dummy deployment (my-app) in the default namespace using an intentionally unknown image, causing ImagePullBackOff.
- Waits for the pod to be created and detects its name dynamically.
- Generates an alert.json file containing the pod information and a simulated Alertmanager alert.
- Prompts the user to optionally post the alert.json to k8s-ai-detective for testing.
Example usage below
./demo/simulate_imagepullbackoff_alert.sh
[*] Checking if deployment my-app exists in namespace default...
[*] Creating deployment my-app...
deployment.apps/my-app created
[*] Waiting for pod to be created...
[*] Found pod: my-app-86bc446b5f-5rm5f
[*] Alert JSON generated: alert.json
[*] To test your app, run:
curl -X POST -H "Content-Type: application/json" -d @alert.json http://localhost:8085/alert
[*] Send the alert now? (y/N): y
[*] Sending alert to http://localhost:8085/alert ...
200
[*] Alert sent.
kubectl get pods -n default
NAME READY STATUS RESTARTS AGE
my-app-86bc446b5f-5rm5f 0/1 ImagePullBackOff 0 7m38s
- The tool should then be able to send the summarized information to the Slack channel
Configure Alertmanager to send alerts to k8s-ai-detective as shown below
receivers:
  - name: "all-alerts"
    webhook_configs:
      - url: "https://k8s-ai-detective/alert"
        send_resolved: true

- Using Taskfile
Install Taskfile: Installation Guide
# List available tasks
task --list
task: Available tasks for this project:
* all: Run comprehensive checks: format, lint, security and test
* build: Build the application binary for the current platform
* build-docker: Build Docker image
* build-platforms: Build the application binaries for multiple platforms and architectures
* fmt: Formats all Go source files
* install: Install required tools and dependencies
* lint: Run static analysis and code linting using golangci-lint
* run: Runs the main application
* security: Run security vulnerability scan
* test: Runs all tests in the project (aliases: tests)
* vet: Examines Go source code and reports suspicious constructs
# Build the application
task build
# Run tests
task test
- Build with GoReleaser
Install GoReleaser: Installation Guide
# Build locally
goreleaser release --snapshot --clean
...
- Add contextual logging using slog
- Improve alert de-duplication with fingerprint during processing
- Expand configuration options
- Support excluding or including specific alerts
- Allow dedicated prompts for selected alerts
- Enable exclusion of alert groups
- Support excluding specific namespaces
- Add metrics collection and reporting
- kubectl-ai
- Understanding the context package in golang
- Graceful Shutdown in Go: Practical Patterns
- How to parse a JSON request body in Go
- slack-go -- Send message to Slack channel
- Stackoverflow -- different about withcancel and withtimeout in golang's context
- Stackoverflow -- k8s client-go rest.Config api.Config
- Code Snippet -- How to use kubectl-ai natively in Go


