Skip to content

feat(overlay): add generate-metadata workflow for Package entity creation#22

Open
davidfestal wants to merge 1 commit intoredhat-developer:mainfrom
davidfestal:feat/generate-metadata-workflow
Open

feat(overlay): add generate-metadata workflow for Package entity creation#22
davidfestal wants to merge 1 commit intoredhat-developer:mainfrom
davidfestal:feat/generate-metadata-workflow

Conversation

@davidfestal
Copy link
Copy Markdown
Member

Summary

  • Add workflows/generate-metadata.md — a 6-phase workflow that generates missing Package metadata YAML files and audits existing metadata for consistency within overlay workspaces
  • Add scripts/derive-metadata.py — a Python helper script for deterministic metadata derivation (name shortening, OCI URL, supportedVersions, plugins-list parsing, env var extraction)
  • Add <path_resolution> and <shell_permissions> directives to SKILL.md for reliable script invocation and sandbox-safe GitHub API calls across all workflows
  • Rewrite references/metadata-format.md with real-world examples, correct catalog-entities/extensions/ paths, and comprehensive field documentation
  • Delegate onboard-plugin Phase 4 to the new generate-metadata workflow
  • Add tests/unit/test_derive_metadata.py with 47 tests covering all public functions

Workflow Phases

  1. Workspace Identification — resolve target workspace
  2. Scan for Missing Metadata — detect plugins without metadata files, flag consistency issues
  3. Fetch Upstream Source — retrieve package.json + config.d.ts via gh api (agent-direct, no subprocess)
  4. Analyze Config & Wiring — generate appConfigExamples from config schemas, delegate frontend wiring
  5. Plugin Entity Resolution — determine partOf references or Plugin entity creation
  6. Write Files & Report — assemble YAML, update smoke-tests/test.env, audit existing metadata, propose commit/PR

Test plan

  • 289 tests pass (47 new + 242 existing)
  • Run generate-metadata workflow against a real overlay workspace (e.g., analytics) to validate end-to-end
  • Verify onboard-plugin workflow still works with Phase 4 delegation

Made with Cursor

…tion

Add a 6-phase workflow that generates missing Package metadata files and
audits existing ones for consistency. Key capabilities:

- Scans workspaces for plugins missing metadata files
- Derives deterministic fields (name, OCI URL, supportedVersions) from
  source.json, plugins-list.yaml, and upstream package.json
- Fetches config.d.ts from upstream to generate appConfigExamples
- Audits supportedVersions consistency and empty appConfigExamples
- Updates smoke-tests/test.env with placeholder variables
- Delegates from onboard-plugin Phase 4

Also adds path_resolution and shell_permissions directives to SKILL.md
for reliable script invocation across all workflows, and rewrites the
metadata-format reference with real examples and correct paths.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR extends the overlay skill with a new workflow and helper script to generate/audit Backstage Package metadata for overlay workspaces, and updates related documentation and onboarding guidance.

Changes:

  • Add workflows/generate-metadata.md to define a phased process for scanning, generating, and auditing metadata.
  • Add scripts/derive-metadata.py plus new unit tests to support deterministic metadata derivation and audit checks.
  • Update overlay skill docs (SKILL.md, references/metadata-format.md, and onboard-plugin.md) to route Phase 4 to the new workflow and document the metadata format.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
tests/unit/test_derive_metadata.py Adds unit tests for the new derive-metadata helper script.
skills/overlay/workflows/onboard-plugin.md Delegates Phase 4 metadata work to the new generate-metadata workflow.
skills/overlay/workflows/generate-metadata.md New workflow describing scan/derive/audit steps and expected outputs.
skills/overlay/SKILL.md Adds path resolution + shell permission guidance; adds routing entry for metadata workflow.
skills/overlay/scripts/derive-metadata.py New CLI script to scan workspaces, derive fields, and perform audits.
skills/overlay/references/metadata-format.md Rewrites metadata format reference with updated paths and examples.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


import importlib.util
import json
import textwrap
Comment on lines +10 to +12
python scripts/derive-metadata.py --workspace argocd
python scripts/derive-metadata.py --workspace argocd --package-json '{"name":"@backstage-community/plugin-argocd","version":"2.8.0","backstage":{"role":"frontend-plugin"}}'
python scripts/derive-metadata.py --extract-env-vars metadata-file.yaml
Comment on lines +45 to +52
def shorten_name(name: str) -> str:
"""Apply shortening rules only if name exceeds K8S_NAME_LIMIT."""
if len(name) <= K8S_NAME_LIMIT:
return name
shortened = name
for old, new in SHORTEN_RULES:
shortened = shortened.replace(old, new)
return shortened
Comment on lines +121 to +140
def find_missing_metadata(workspace_dir: Path, plugins: list[dict]) -> list[dict]:
"""Identify plugins that lack metadata files.

Uses a heuristic: for each plugin path, check if any existing metadata file's
packageName corresponds to that path. Falls back to filename pattern matching.
"""
metadata_dir = workspace_dir / "metadata"
existing_files = list(metadata_dir.glob("*.yaml")) if metadata_dir.exists() else []
existing_names = {f.stem for f in existing_files}

missing = []
for plugin in plugins:
path = plugin["path"]
path_suffix = path.rstrip("/").split("/")[-1] if path != "." else ""
found = any(path_suffix and path_suffix in name for name in existing_names)
if not found and path != ".":
missing.append(plugin)
elif path == "." and not existing_names:
missing.append(plugin)
return missing
Comment on lines +536 to +542
if args.package_json:
pkg = json.loads(args.package_json)
fields = derive_plugin_fields(
pkg, args.workspace, args.plugin_path, source,
supported_versions, existing,
)
print(json.dumps(fields, indent=2 if is_tty else None))
Comment thread skills/overlay/SKILL.md
</path_resolution>

<shell_permissions>
Prefer running `gh api` and `gh search code` as **direct shell commands** rather than via Python subprocess. Direct `gh` calls go through the user's command allowlist without triggering permission prompts. Python scripts that only do local work (file I/O, JSON processing, field derivation) also need no extra permissions. Only request `full_network` for Python scripts that internally spawn `gh` as a subprocess — the sandbox blocks network access from child processes.
Comment on lines +38 to +41
Run the scan command from the overlay repo root (no network needed):

```bash
python3 scripts/derive-metadata.py scan --workspace <workspace>
version: 10.17.0
backstage:
role: frontend-plugin
supportedVersions: 1.45.3
version: 1.4.0
backstage:
role: backend-plugin
supportedVersions: 1.48.3
Comment on lines +431 to +437
is_tty = os.isatty(sys.stdout.fileno())

if args.command == "extract-env-vars":
content = Path(args.file).read_text()
env_vars = extract_env_vars(content)
output = {"env_vars": env_vars}
print(json.dumps(output, indent=2 if is_tty else None))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants