Conversation
There was a problem hiding this comment.
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformationAlternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
ChrisJBurns
left a comment
There was a problem hiding this comment.
Review: Graduate CRDs from v1alpha1 to v1beta1
Verified:
- Core
GroupVersionconstant correctly changed tov1beta1 - All Go imports and aliases consistently updated (
mcpv1alpha1→mcpv1beta1) - Hardcoded
apiVersionstring inpkg/export/k8s.goupdated - Webhook manifests (both copies) correctly reference
v1beta1paths and apiVersions - Helm chart CRDs, swagger docs, and skill docs updated
- Old
v1alpha1/directory fully removed — zero stragglingv1alpha1references in production code - All 377 files show symmetric add/delete counts, confirming purely mechanical changes
Two items to address — both in the design doc, not in production code.
| "sigs.k8s.io/controller-runtime/pkg/log" | ||
| "sigs.k8s.io/controller-runtime/pkg/reconcile" | ||
|
|
||
| mcpv1alpha1 "github.com/stacklok/toolhive/cmd/thv-operator/api/v1alpha1" | ||
| mcpv1beta1 "github.com/stacklok/toolhive/cmd/thv-operator/api/v1beta1" | ||
| ctrlutil "github.com/stacklok/toolhive/cmd/thv-operator/pkg/controllerutil" |
There was a problem hiding this comment.
🔴 Two documentation files modified by this PR still contain stale v1alpha1 references that directly contradict the graduation changes. In docs/operator/virtualmcpserver-api.md, line 16 reads '- Version: v1alpha1' while lines 59 and 451 in the same file correctly show apiVersion: toolhive.stacklok.dev/v1beta1, leaving the API reference internally contradictory. In docs/arch/02-core-concepts.md, line 267 has apiVersion: mcp.stacklok.dev/v1alpha1 — wrong on two counts: the API group should be toolhive.stacklok.dev (not mcp.stacklok.dev), and the version should be v1beta1.
Extended reasoning...
Bug 1: virtualmcpserver-api.md line 16 still says v1alpha1
What the bug is and how it manifests: The file docs/operator/virtualmcpserver-api.md was modified in commit 5830a3a as part of this PR. The YAML example blocks at lines 59 and 451 were correctly updated to use apiVersion: toolhive.stacklok.dev/v1beta1. However, the prose 'API Group and Version' table at line 16 was missed and still reads:
- **Version**: `v1alpha1`
Specific code path: A user consulting this reference document to determine which apiVersion string to use will see v1alpha1 in the version table at the top, but v1beta1 in every YAML example further down the same file.
Why existing code doesn't prevent it: This is a documentation-only change; there is no automated test or CI check that validates consistency between prose version tables and YAML examples within the same markdown file.
Impact: Users who read only the 'API Group and Version' table will use the stale v1alpha1 string and get a 'no kind is registered' error from the Kubernetes API server, since the operator now only serves v1beta1. Users who copy from the YAML examples will get the correct string.
How to fix: Change line 16 from - **Version**: v1alpha1 to `- **Version**: `v1beta1.
Step-by-step proof:
- Open
docs/operator/virtualmcpserver-api.md - Line 16:
- **Version**:v1alpha1`` ← stale - Line 59:
apiVersion: toolhive.stacklok.dev/v1beta1← correct - Line 451:
apiVersion: toolhive.stacklok.dev/v1beta1← correct - The same file now contains two different version claims pointing to opposite conclusions.
Bug 2: docs/arch/02-core-concepts.md line 267 has wrong API group AND stale version
What the bug is and how it manifests: The file docs/arch/02-core-concepts.md was modified in commit 5830a3a — specifically, line 288 was updated from api/v1alpha1/virtualmcpserver_types.go to api/v1beta1/virtualmcpserver_types.go. However, the adjacent YAML example at line 267 was not updated and still reads:
apiVersion: mcp.stacklok.dev/v1alpha1This is wrong in two ways: (1) the API group mcp.stacklok.dev has never been the correct group (the real group, confirmed in groupversion_info.go, has always been toolhive.stacklok.dev), and (2) the version is v1alpha1 instead of the graduated v1beta1.
Specific code path: Any developer reading the architecture doc to understand the VirtualMCPServer resource kind and copy the apiVersion will get an invalid string on both dimensions.
Why existing code doesn't prevent it: No CI linter validates that YAML code blocks in architecture docs match the actual registered GroupVersion. The PR author caught the file path reference on line 288 but not the YAML example on line 267 above it.
Impact: Users applying the example YAML directly will get a no kind is registered for the type 'VirtualMCPServer' in scheme error (wrong group) and/or a version-not-found error. The string mcp.stacklok.dev appears nowhere else in the repository, making this example maximally misleading.
How to fix: Change line 267 from apiVersion: mcp.stacklok.dev/v1alpha1 to apiVersion: toolhive.stacklok.dev/v1beta1.
Step-by-step proof:
- Open
docs/arch/02-core-concepts.md - Line 267:
apiVersion: mcp.stacklok.dev/v1alpha1← wrong group + wrong version - Line 291 (same file): references
cmd/thv-operator/api/v1beta1/← was updated by this PR cmd/thv-operator/api/v1beta1/groupversion_info.go:GroupVersion = schema.GroupVersion{Group: "toolhive.stacklok.dev", Version: "v1beta1"}← authoritative source- The PR commit message lists 'apiVersion string updates in YAML/Markdown' as a stated goal, but this YAML example was not updated.
| // SPDX-FileCopyrightText: Copyright 2025 Stacklok, Inc. | ||
| // SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| // Package v1alpha1 contains API Schema definitions for the toolhive v1alpha1 API group | ||
| // Package v1beta1 contains API Schema definitions for the toolhive v1beta1 API group | ||
| // +kubebuilder:object:generate=true | ||
| // +groupName=toolhive.stacklok.dev | ||
| package v1alpha1 | ||
| package v1beta1 | ||
|
|
||
| import ( | ||
| "k8s.io/apimachinery/pkg/runtime/schema" |
There was a problem hiding this comment.
🔴 The documentation template at docs/operator/templates/markdown/type.tpl contains hardcoded hasSuffix "/api/v1alpha1" checks (lines 47 and 76) that filter which types appear in generated CRD API docs; after this PR renames the package directory to api/v1beta1, these checks will never match and all CRD API types will be silently omitted from the generated documentation. The fix is to update both occurrences of "/api/v1alpha1" to "/api/v1beta1" in that template file.
Extended reasoning...
What the bug is and how it manifests
The file docs/operator/templates/markdown/type.tpl uses two hasSuffix checks to decide whether a given Go type should be rendered in the generated CRD API reference documentation:
- Line 47:
{{- $isAPIType := hasSuffix "/api/v1alpha1" $type.Package -}} - Line 76:
{{- $refIsAPIType := hasSuffix "/api/v1alpha1" .Package -}}
This PR renames the package directory from cmd/thv-operator/api/v1alpha1/ to cmd/thv-operator/api/v1beta1/. After the rename, all type packages will have a path ending in /api/v1beta1, so both hasSuffix "/api/v1alpha1" expressions will evaluate to false for every type. The template will silently skip rendering all CRD API types.
The specific code path that triggers it
When the documentation generator walks the Go types from the operator API package, it evaluates $isAPIType for each top-level type. Because $type.Package now ends with /api/v1beta1 rather than /api/v1alpha1, the suffix check fails. The type is then only rendered if it has a +gendoc marker comment. A search of cmd/thv-operator/api/v1beta1/ finds zero +gendoc markers, so the fallback condition also fails for every type.
Why existing code does not prevent it
The template was written when only v1alpha1 existed and the hardcoded string was never parameterized. The PR diff updates Go source files, CRD YAML manifests, Helm charts, and import paths, but it does not update docs/operator/templates/markdown/type.tpl. There is no test that runs the doc generator and verifies non-empty output, so the regression would not be caught by CI.
Impact
On the next documentation generation run after this PR merges, the entire CRD API reference section (all MCPServer, VirtualMCPServer, ToolHiveConfig, and related types) will vanish from the rendered docs. Operators and users relying on the generated reference will find blank pages with no error or warning to indicate something went wrong, because the template simply skips the type rather than emitting an error.
How to fix it
Change both hardcoded strings in docs/operator/templates/markdown/type.tpl:
- {{- $isAPIType := hasSuffix "/api/v1alpha1" $type.Package -}}
+ {{- $isAPIType := hasSuffix "/api/v1beta1" $type.Package -}}- {{- $refIsAPIType := hasSuffix "/api/v1alpha1" .Package -}}
+ {{- $refIsAPIType := hasSuffix "/api/v1beta1" .Package -}}Step-by-step proof
- Before the PR:
$type.PackageforMCPServerequalsgithub.com/stacklok/toolhive/cmd/thv-operator/api/v1alpha1.hasSuffix "/api/v1alpha1"→true. Type is rendered. - After the PR: the package directory is
api/v1beta1, so$type.Packageequalsgithub.com/stacklok/toolhive/cmd/thv-operator/api/v1beta1.hasSuffix "/api/v1alpha1"→false. - The fallback
$hasGendoccheck looks for a+gendoccomment on the type. No such markers exist inapi/v1beta1/. - Both conditions are false; the
{{- if or $isAPIType $hasGendoc -}}block is skipped. - Result: zero CRD types appear in the generated documentation output.
|
|
||
| var ( | ||
| // GroupVersion is group version used to register these objects | ||
| GroupVersion = schema.GroupVersion{Group: "toolhive.stacklok.dev", Version: "v1alpha1"} | ||
| GroupVersion = schema.GroupVersion{Group: "toolhive.stacklok.dev", Version: "v1beta1"} | ||
|
|
||
| // SchemeBuilder is used to add go types to the GroupVersionKind scheme | ||
| SchemeBuilder = &scheme.Builder{GroupVersion: GroupVersion} |
There was a problem hiding this comment.
🟡 The Stage 2 migration sed commands in design-crd-graduation-v1beta1.md are all no-ops because the source pattern equals the replacement (both sides say v1beta1 instead of v1alpha1 as the source). Lines 105, 113, 117, 119, and 122 are affected; only line 109 is correct. A developer following this guide would run the scripts and find none of their files changed, requiring manual intervention to complete the migration.
Extended reasoning...
Bug: The new design document design-crd-graduation-v1beta1.md contains Stage 2 shell migration scripts with sed substitution commands where the source pattern is identical to the replacement string. In a sed substitution s|A|B|g, if A == B the command is a no-op — it matches occurrences of A and replaces them with the same string A, resulting in no change.
Affected lines: The document has six sed commands in Stage 2. Five of them are broken:
- Line 105:
sed -i 's|cmd/thv-operator/api/v1beta1|cmd/thv-operator/api/v1beta1|g'— source should bev1alpha1 - Line 113:
sed -i 's|toolhive.stacklok.dev/v1beta1|toolhive.stacklok.dev/v1beta1|g'— source should bev1alpha1 - Line 117: same no-op pattern as line 113
- Line 119:
sed -i 's|api/v1beta1|api/v1beta1|g'— source should beapi/v1alpha1 - Line 122:
sed -i 's|"toolhive.stacklok.dev/v1beta1"|"toolhive.stacklok.dev/v1beta1"|g'— source should bev1alpha1
Only line 109 is correct: sed -i 's|mcpv1alpha1|mcpv1beta1|g' correctly renames the Go package alias from the old name to the new name.
Root cause: The document was authored after the mechanical rename had already been applied to the codebase. The author likely copy-pasted from the already-renamed code rather than from the original v1alpha1 strings, accidentally writing the post-rename value as both source and target.
Why existing tests/CI do not catch this: The file is pure documentation (a design doc markdown file). There are no automated checks that validate the correctness of sed commands embedded in markdown. The production code itself is correctly renamed — this bug only affects the migration guide.
Impact: A developer following this guide to perform the migration on a downstream fork or a future similar graduation would run Stage 2 and observe zero file changes for five of the six commands. They would need to debug why the migration failed and manually identify the correct source patterns. This could cause confusion and wasted time during future API version graduations.
Fix: Change the source (left-hand side) of each broken sed substitution from v1beta1 to v1alpha1:
- Line 105:
s|cmd/thv-operator/api/v1alpha1|cmd/thv-operator/api/v1beta1|g - Line 113:
s|toolhive.stacklok.dev/v1alpha1|toolhive.stacklok.dev/v1beta1|g - Line 117: same fix as 113
- Line 119:
s|api/v1alpha1|api/v1beta1|g - Line 122:
s|"toolhive.stacklok.dev/v1alpha1"|"toolhive.stacklok.dev/v1beta1"|g
Step-by-step proof: Suppose a file contains the line import "github.com/stacklok/toolhive/cmd/thv-operator/api/v1alpha1". Running the broken command sed -i 's|cmd/thv-operator/api/v1beta1|cmd/thv-operator/api/v1beta1|g' file.go would find no match (the file still says v1alpha1) and leave the file unchanged. Running the corrected command sed -i 's|cmd/thv-operator/api/v1alpha1|cmd/thv-operator/api/v1beta1|g' file.go would correctly rename the import to v1beta1.
Large PR justification has been provided. Thank you!
|
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review. |
Signal API stability by promoting all ToolHive CRDs to v1beta1. This is a clean break with no conversion webhook — v1beta1 becomes the sole storage and served version. Changes are purely mechanical: directory rename (api/v1alpha1 -> api/v1beta1), package name updates, import path and alias updates, apiVersion string updates in YAML/Markdown, and regeneration of all generated code (deepcopy, CRD manifests, webhook manifests, Helm charts, swagger docs, CRD API docs). No logic changes, no field removals, no behavioral changes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update straggler v1alpha1 references in architecture docs, CRD ref config, VirtualMCPServer API docs, and the type.tpl template used by crd-ref-docs to filter API types. Regenerated CRD API docs with the corrected template. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> # Conflicts: # docs/operator/crd-api.md
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The VirtualMCPServer example in 02-core-concepts.md used mcp.stacklok.dev (which never existed) instead of the correct toolhive.stacklok.dev group. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update mcpv1alpha1 alias in resolver_configref_test.go which was added to main after the original branch. Regenerate CRD API docs to include new types from main. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5fc1e31 to
ab532b0
Compare
Summary
Signal API stability by promoting all ToolHive CRDs from
v1alpha1tov1beta1. This is a clean break — no conversion webhook, no multi-version serving. Users must recreate resources with the newapiVersion: toolhive.stacklok.dev/v1beta1.Closes #2556
Medium level
cmd/thv-operator/api/v1alpha1/tocmd/thv-operator/api/v1beta1/and updated the package name +GroupVersionconstantcmd/thv-operator/api/v1alpha1→v1beta1) and import aliases (mcpv1alpha1→mcpv1beta1,v1alpha1→v1beta1) across controllers, support packages, and testsapiVersionstrings fromtoolhive.stacklok.dev/v1alpha1totoolhive.stacklok.dev/v1beta1in examples, chainsaw fixtures, and deploy manifestspkg/export/k8s.goand related test assertionsType of change
Test plan
task test)task lint-fix)task build)grep -r "v1alpha1" . --include="*.go" --include="*.yaml" --include="*.yml" --include="*.md" --include="*.json" | grep "toolhive"returns zero results in production codeDoes this introduce a user-facing change?
Yes. Users must update all CRD manifests from
apiVersion: toolhive.stacklok.dev/v1alpha1toapiVersion: toolhive.stacklok.dev/v1beta1. Existingv1alpha1resources must be deleted and recreated with the new API version. This is expected for an alpha-to-beta graduation.Special notes for reviewers
This PR touches 377 files but is 100% mechanical — every change follows one of three patterns: (1) directory/package rename, (2) import path update, (3) string literal update. The high file count is because the rename cascades through every file that imports the API types. Spot-checking a sample of files is sufficient; reviewing every line would not add value.
The PR is large by line count but falls under the acceptable exception for generated/mechanical changes (same category as dependency updates).
Large PR Justification
Generated with Claude Code