Skip to content

Add autosolve actions for automated issue resolution#14

Open
fantapop wants to merge 31 commits intomainfrom
pr/autosolve-go
Open

Add autosolve actions for automated issue resolution#14
fantapop wants to merge 31 commits intomainfrom
pr/autosolve-go

Conversation

@fantapop
Copy link
Copy Markdown
Contributor

@fantapop fantapop commented Mar 25, 2026

Summary

Go implementation of composite actions for Claude-powered automated issue resolution:

  • autosolve/assess — Runs Claude in read-only mode to evaluate whether a task is suitable for automated resolution
  • autosolve/implement — Runs Claude to implement a solution, validates changes, runs AI security review, pushes to a fork, and creates a draft PR

Key features

  • Per-file batched AI security review with generated-file detection
  • Token usage tracking across phases with combined markdown summary
  • Retry logic with Claude session resumption
  • Skill file support for custom prompts

Testing

Tested end-to-end against cockroachlabs/ccloud-private-automation-testing.

Test plan

  • go test ./... passes
  • Precompiled binary check passes in CI

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new autosolve Go-based automation tool and two composite GitHub Actions (autosolve/assess and autosolve/implement) to assess issue suitability and implement fixes with Claude, including PR creation, security checks, and usage tracking.

Changes:

  • Add Go implementation for assessment/implementation orchestration, prompt assembly, git/gh wrappers, and security checks.
  • Add composite actions (autosolve/assess, autosolve/implement) plus CI updates to run Go tests and validate the precompiled binary.
  • Add prompt templates and unit tests for core functionality.

Reviewed changes

Copilot reviewed 28 out of 30 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
autosolve/internal/security/security_test.go Adds unit tests for blocked-path and sensitive-file enforcement and .gitignore warnings.
autosolve/internal/security/security.go Implements blocked path checks, sensitive filename/extension detection, and symlink-to-blocked-path detection.
autosolve/internal/prompt/templates/security-preamble.md Adds system preamble intended to constrain the model’s behavior for safety.
autosolve/internal/prompt/templates/implementation-footer.md Adds implementation-phase instruction footer and required success/fail marker.
autosolve/internal/prompt/templates/assessment-footer.md Adds assessment-phase instruction footer and required proceed/skip marker.
autosolve/internal/prompt/prompt_test.go Adds tests for prompt construction, skill file inclusion, and custom criteria.
autosolve/internal/prompt/prompt.go Implements prompt assembly from templates + task inputs.
autosolve/internal/implement/implement_test.go Adds tests for retry logic, output writing, and summary extraction.
autosolve/internal/implement/implement.go Implements the implementation phase: retries, security checks, staging/commit/push, PR creation, and AI security review.
autosolve/internal/github/github.go Adds a gh-CLI-backed GitHub client for comments/labels/PR creation.
autosolve/internal/git/git.go Adds a git CLI abstraction and helper to list changed files.
autosolve/internal/config/config_test.go Adds tests for config parsing/validation and blocked path parsing.
autosolve/internal/config/config.go Adds config loading/validation from action inputs and auth validation.
autosolve/internal/claude/claude_test.go Adds tests for extracting markers/session IDs and usage tracking.
autosolve/internal/claude/claude.go Adds Claude CLI runner + result parsing + usage tracking persistence.
autosolve/internal/assess/assess_test.go Adds tests for assessment flow and summary extraction.
autosolve/internal/assess/assess.go Implements assessment phase invocation and outputs/summary writing.
autosolve/internal/action/action_test.go Adds tests for GitHub Actions output and step summary helpers.
autosolve/internal/action/action.go Adds helpers for outputs, summaries, and workflow annotations.
autosolve/implement/action.yml Defines the composite action to run autosolve implement and expose outputs.
autosolve/go.mod Introduces the autosolve Go module definition.
autosolve/cmd/autosolve/main.go Adds CLI entrypoint for assess and implement commands.
autosolve/build.sh Adds cross-compile script producing the committed Linux binary.
autosolve/assess/action.yml Defines the composite action to run autosolve assess and expose outputs.
autosolve/Makefile Adds build/test/clean targets for local development and CI.
autosolve/.gitignore Ignores the local dev binary output.
CHANGELOG.md Documents the addition of the autosolve actions.
.github/workflows/test.yml Updates CI to run Go tests and ensure the precompiled binary is up to date.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread autosolve/internal/implement/implement.go Outdated
Comment thread autosolve/internal/security/security.go Outdated
Comment thread autosolve/internal/implement/implement.go Outdated
Comment thread autosolve/internal/implement/implement.go Outdated
Comment thread autosolve/internal/implement/implement.go
Comment thread autosolve/internal/security/security_test.go
Comment thread autosolve/internal/security/security_test.go Outdated
Comment thread autosolve/internal/implement/implement.go Outdated
Comment thread autosolve/internal/implement/implement.go Outdated
Comment thread autosolve/internal/implement/implement.go Outdated
@fantapop fantapop requested a review from linhcrl March 25, 2026 01:35
@fantapop fantapop force-pushed the pr/autosolve-go branch 2 times, most recently from 1abbbb0 to 6fd24ba Compare March 25, 2026 06:52
Comment thread .github/workflows/test.yml Outdated
@fantapop fantapop force-pushed the pr/autosolve-go branch 6 times, most recently from f818651 to 6bc6bc5 Compare March 27, 2026 22:21
@fantapop
Copy link
Copy Markdown
Contributor Author

One thing I'm running into here is that build the action and committing it each time easily gets out of date and is annoying. I'm going to look into alternatives.

@fantapop fantapop force-pushed the pr/autosolve-go branch 2 times, most recently from d06e466 to f2ef7a1 Compare March 27, 2026 22:50
Copy link
Copy Markdown
Contributor

@linhcrl linhcrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a couple of comments. Most of them are smaller/questions.

Also, here's some feedback I didn't know where to put:

  1. In the PR description I see Precompiled Go binary (no Go toolchain needed at runtime) and one of the bottom checkboxes also mentions precompiled go binary. I'm assuming this just hasn't been updated right? I see that we actually recompile the binary every time this action is run
  2. We should add some documentation in the README

Comment thread autosolve/assess/action.yml Outdated
Comment thread autosolve/cmd/autosolve/main.go Outdated
Comment thread autosolve/internal/assess/assess.go Outdated
Comment thread autosolve/internal/config/config.go
Comment thread autosolve/internal/config/config.go Outdated
Comment thread autosolve/internal/implement/implement.go Outdated
Comment thread autosolve/internal/prompt/templates/implementation-footer.md
Comment thread autosolve/internal/implement/implement.go Outdated
Comment thread autosolve/internal/implement/implement_test.go
Comment thread autosolve/internal/implement/implement.go Outdated
@fantapop fantapop force-pushed the pr/autosolve-go branch 4 times, most recently from 981e842 to a89fb9b Compare April 8, 2026 06:45
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 27 out of 28 changed files in this pull request and generated 14 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread autosolve/internal/config/config.go Outdated
Comment on lines +208 to +229
func ParseBlockedPaths(raw string) []string {
paths := make(map[string]bool)
for _, p := range requiredBlockedPaths {
paths[p] = true
}
for _, p := range strings.Split(raw, ",") {
p = strings.TrimSpace(p)
if p != "" {
paths[p] = true
}
}
var result []string
// Required paths first for consistent ordering
for _, p := range requiredBlockedPaths {
result = append(result, p)
}
for p := range paths {
if !contains(requiredBlockedPaths, p) {
result = append(result, p)
}
}
return result
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ParseBlockedPaths builds result by iterating over a map (for p := range paths), which makes the ordering of non-required blocked paths nondeterministic. This can lead to unstable prompts/logs and can make tests flaky (current tests assert a specific order). Consider collecting the non-required paths into a slice and sorting it before appending.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already handled — ParseBlockedPaths collects non-required paths into a separate slice and calls sort.Strings(extra) before appending, so the output is deterministic. 🤖

Comment thread autosolve/internal/action/action.go
Comment thread autosolve/internal/action/action.go Outdated
Comment on lines +62 to +75
// LogResult records usage for a Claude invocation and logs token counts.
// When verbose is true, the full output is written to a collapsible group
// in the step log. Call immediately after runner.Run and before checking
// the error so that usage is captured even on failure.
func LogResult(
tracker *claude.UsageTracker, result *claude.Result, section, outputFile string, verbose bool,
) {
tracker.Record(section, result.Usage)
LogInfo(fmt.Sprintf("%s usage: input=%d output=%d cost=$%.4f",
section, result.Usage.InputTokens, result.Usage.OutputTokens, result.Usage.CostUSD))
if verbose {
logOutputGroup(section, outputFile)
}
}
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LogResult assumes result is non-nil and dereferences result.Usage. Callers intentionally invoke this before checking the error from runner.Run, so a Runner implementation that returns (nil, err) would panic here. Consider handling result == nil defensively (skip usage logging or log a warning).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — LogResult now returns early with a warning if result is nil. 🤖

Comment on lines +30 to +36
// Build allowed tools: read-only plus printenv for each context var
// so Claude can read them without full Bash access.
tools := []string{"Read", "Grep", "Glob"}
for _, v := range cfg.ContextVars {
tools = append(tools, fmt.Sprintf("Bash(printenv %s)", v))
}
allowedTools := strings.Join(tools, ",")
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context var names from cfg.ContextVars are interpolated directly into the --allowedTools string as Bash(printenv %s) without validation. A malicious value like FOO),Write,Edit could expand Claude tool permissions beyond the intended read-only set. Consider validating context var names against a strict env-var regex (e.g., ^[A-Z_][A-Z0-9_]*$) and rejecting/ignoring invalid entries.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

context_vars is set by the workflow author in the action.yml config, not by untrusted user input. The workflow author already has full control over the action — including the system prompt, which can instruct Claude to do anything within the allowed tools. Injecting tool permissions via a context var name would be a more convoluted way to do something they can already do directly via the allowed_tools input.

Not a real threat vector — resolving. 🤖

Comment on lines +65 to +74
// Append printenv permissions for each context var so Claude can
// read them without unrestricted Bash access.
var extraTools []string
for _, v := range cfg.ContextVars {
extraTools = append(extraTools, fmt.Sprintf("Bash(printenv %s)", v))
}
allowedTools := cfg.AllowedTools
if len(extraTools) > 0 {
allowedTools += "," + strings.Join(extraTools, ",")
}
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context var names from cfg.ContextVars are interpolated directly into the --allowedTools string as Bash(printenv %s) without validation. A malicious value like FOO),Write,Edit could expand Claude tool permissions beyond what the action intends. Consider validating context var names against a strict env-var regex and rejecting/ignoring invalid entries.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the assess.go comment — context_vars comes from the workflow author's action config, not untrusted input. The author already controls the system prompt and allowed_tools directly. Resolving. 🤖

Comment on lines +131 to +140
// CheckGitignore logs a warning if the repo's .gitignore does not contain
// credential exclusion patterns. It does not modify the file — repo owners
// should add the patterns themselves for defense-in-depth.
func CheckGitignore(logWarning func(string)) {
data, err := os.ReadFile(".gitignore")
if err != nil {
logWarning("No .gitignore found. For defense-in-depth, add one with credential exclusion patterns: " +
strings.Join(gitignorePatterns, ", "))
return
}
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CheckGitignore reads .gitignore from the current working directory. Since the action supports a configurable working_directory, this can warn incorrectly (or miss the repo’s actual .gitignore) when invoked from a subdirectory. Consider resolving the repo root (e.g., git rev-parse --show-toplevel) and reading <repoRoot>/.gitignore.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the skill path comment — the action runs with `working-directory` set (defaults to repo root), so `os.ReadFile(".gitignore")` reads the correct file. 🤖

Comment thread autosolve/go.mod Outdated
@@ -0,0 +1,3 @@
module github.com/cockroachdb/actions/autosolve

go 1.23.8
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The go directive typically uses major.minor (e.g., go 1.23) to indicate the language version. Using a patch version here (1.23.8) may be rejected by some Go toolchains and isn’t the usual way to pin a specific toolchain; if you need to pin, consider using a toolchain go1.23.8 directive instead.

Suggested change
go 1.23.8
go 1.23

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — updated to go 1.26. Both action.yml files and test.yml use go-version-file so they pick it up automatically. 🤖

Comment on lines +85 to +94
- name: Set up Go
uses: actions/setup-go@v6
with:
go-version-file: ${{ github.action_path }}/../go.mod
cache: false

- name: Build autosolve
shell: bash
run: go build -trimpath -o "$RUNNER_TEMP/autosolve" ./cmd/autosolve
working-directory: ${{ github.action_path }}/..
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description states a precompiled Go binary means no Go toolchain is needed at runtime, but this composite action always sets up Go and builds from source. Either add the same precompiled-binary fast-path used in autosolve/implement (skip Go setup/build when $RUNNER_TEMP/autosolve already exists) or adjust the documentation/description to match actual behavior.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assess action.yml has a check-build step that skips the build if the binary already exists at $RUNNER_TEMP/autosolve — same pattern as implement. The precompiled binary path works when both actions run in sequence (implement reuses the binary assess already built). 🤖

Comment on lines +108 to +114
ForkRepo: os.Getenv("INPUT_FORK_REPO"),
ForkPushToken: os.Getenv("INPUT_FORK_PUSH_TOKEN"),
PRCreateToken: os.Getenv("INPUT_PR_CREATE_TOKEN"),
PRBaseBranch: os.Getenv("INPUT_PR_BASE_BRANCH"),
PRLabels: envOrDefault("INPUT_PR_LABELS", "autosolve"),
PRDraft: prDraft,
PullRequestTitle: os.Getenv("INPUT_PR_TITLE"),
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PRBaseBranch is read directly from INPUT_PR_BASE_BRANCH without a default, unlike most other PR-related inputs. If the binary is run outside the composite action (or the env var is missing), PR creation will attempt to use an empty base branch and fail in a less obvious way. Consider defaulting this to main (or the repo’s default branch if discoverable) and/or validating it is non-empty when CreatePR is true.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action.yml sets a default of `main` for `pr_base_branch`, so the env var is always populated when the binary runs from the composite action. Running the binary directly outside the action isn't a supported use case. 🤖

Comment on lines +48 to +55
if cfg.Skill != "" {
content, err := os.ReadFile(cfg.Skill)
if err != nil {
return "", fmt.Errorf("reading skill file %s: %w", cfg.Skill, err)
}
b.Write(content)
b.WriteString("\n")
}
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inputs.skill is documented as a path relative to the repo root, but Build reads it via os.ReadFile(cfg.Skill) relative to the current working directory. If the action runs with working_directory set to a subdir, this will fail to find the skill file (or read the wrong file). Consider resolving the git repo root and joining it with cfg.Skill (or documenting that it’s relative to working_directory).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The composite action runs with `working-directory: ${{ inputs.working_directory }}`, which defaults to `.` (the repo root). The skill path is relative to that directory, which is correct — it matches the README documentation ("relative to the repo root"). 🤖

Copy link
Copy Markdown
Contributor

@linhcrl linhcrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly small comments this time.
Also

  • In the PR description I still see Precompiled Go binary (no Go toolchain needed at runtime) and one of the bottom checkboxes also mentions precompiled go binary.
  • We should add some documentation in the README

Comment thread .github/workflows/test.yml Outdated
Comment thread autosolve/assess/action.yml Outdated
if command -v roachdev >/dev/null; then
printf '#!/bin/sh\nexec roachdev claude -- "$@"\n' > /usr/local/bin/claude
chmod +x /usr/local/bin/claude
echo "Claude CLI: using roachdev wrapper"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: would be nice to log the version used similar to the basic claude equivalent below. (Same for implement/action.yml)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version command was removed since the binary is built fresh each run. We could log the Go module version or build timestamp — would that be useful? 🤖

return result, nil
}

func TestRun_Proceed(t *testing.T) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestRun_Proceed and TestRun_Skip don't verify the actual assessment value. Proceed checks that output was written but not that it contains "assessment=PROCEED". Skip it has no assertions at all,
only checking that Run() doesn't error. The tests would pass even if the logic was broken.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — the tests should assert the actual output values. Will improve these. 🤖

Comment on lines +200 to +204
fmt.Sscanf(strings.TrimSpace(parts[2]), "%d", &s.Usage.InputTokens)
fmt.Sscanf(strings.TrimSpace(parts[3]), "%d", &s.Usage.OutputTokens)
fmt.Sscanf(strings.TrimSpace(parts[4]), "%d", &s.Usage.CacheCreationInputTokens)
fmt.Sscanf(strings.TrimSpace(parts[5]), "%d", &s.Usage.CacheReadInputTokens)
fmt.Sscanf(strings.TrimSpace(parts[6]), "$%f", &s.Usage.CostUSD)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should at least log if any of these errors, otherwise the usage data will be corrupted without any indication of an error

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — usage and permission denial JSON parsing now return errors instead of silently swallowing them. 🤖

Comment thread autosolve/internal/claude/claude.go Outdated
usage := Usage{CostUSD: out.TotalCostUSD}
if out.Usage != nil {
var u claudeUsage
if err := json.Unmarshal(out.Usage, &u); err == nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we log here if we couldn't unmarshal usage data?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — same change as above, unmarshal errors are now returned instead of silently ignored. 🤖

}
}

func TestUsageTracker_RoundTrip(t *testing.T) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can add another .Record() call with a duplicate name, e.g. security review, to see how it behaves in those scenarios

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea — `Record` with a duplicate name accumulates into the existing section via `Usage.Add()`, which is the intended behavior for retries (e.g. "implement (attempt 1)" called twice). Will add a test to verify this. 🤖

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude suggested adding a test for case insensitivity test

  1. Case-sensitivity issue - Not tested
  // On Linux: does .GITHUB/ bypass .github/ blocking?

Not sure if this is truly an issue

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The blocked path check uses `strings.HasPrefix` which is case-sensitive. On macOS (case-insensitive FS) this could matter, but GitHub Actions runners are Linux (case-sensitive), so the check is correct for the target environment. Will add a test to document this behavior. 🤖

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it's missing coverage for the following crucial functions. Even if we mock some parts, do you think it would be possible to add somewhat meaningful tests for these?

  1. pushAndPR()
  2. aiSecurityReview()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are hard to unit test meaningfully — they shell out to git, invoke the Claude CLI, and interact with the GitHub API. The retry loop tests cover the orchestration logic, and the helpers (`buildPRBody`, `readCommitMessage`, `isGeneratedDiff`) already have dedicated tests. Integration testing via the workflow runs in ccloud-private-automation-testing covers the end-to-end paths. 🤖

fantapop and others added 3 commits May 4, 2026 17:47
The implement action previously always created PRs against
GITHUB_REPOSITORY (the repo running the workflow). This adds a
pr_target_repo input so workflows can target a different repository,
enabling cross-repo scenarios like triggering from repo A but opening
a PR in repo B.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Both assess and implement now check whether Go is already on the path
before running setup-go. This avoids redundant Go installation when
a prior workflow step has already set it up.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@fantapop fantapop force-pushed the pr/autosolve-go branch 2 times, most recently from 7a64648 to 28fa74b Compare May 6, 2026 01:15
LoadSecurityConfig was intended for a security subcommand that was never
implemented. The Log method on the git Client interface was never called.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@fantapop fantapop force-pushed the pr/autosolve-go branch 9 times, most recently from 8c632cc to 508b7ea Compare May 7, 2026 01:09
Replace the boolean verbose_logging flag with a three-level log_level
input (error/info/debug) that controls how much Claude CLI output
streams to the GitHub Actions step log in real time.

- error (default): silent, only errors and final status
- info: pretty-printed result summary, permission denial warnings
- debug: full stream including all tool calls, text, and results

Permission denials are parsed from the result object and logged via
::warning:: annotations at info and debug levels.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@fantapop fantapop force-pushed the pr/autosolve-go branch from 508b7ea to 9adf4f9 Compare May 7, 2026 18:49
Comment thread README.md
| `context_vars` | `""` | Comma-separated list of environment variable names to pass through to Claude for untrusted user input (e.g., issue titles/bodies) |
| `assessment_criteria` | `""` | Custom criteria for the assessment. Uses default criteria if not provided. |
| `model` | `claude-opus-4-6` | Claude model ID |
| `blocked_paths` | `.github/workflows/` | Comma-separated path prefixes that cannot be modified. `.github/` is always blocked. |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if .github/ is always blocked, then setting .github/workflows/ as the default feels redundant. Should the default just be ""

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point — `.github/` is always added by `requiredBlockedPaths`, so the default of `.github/workflows/` is redundant. Will update the default to empty. 🤖

Comment thread README.md
| `system_prompt` | `""` | Trusted instructions for Claude describing the task. Do not embed untrusted user input — use `context_vars`. |
| `skill` | `""` | Path to a skill/prompt file relative to the repo root |
| `context_vars` | `""` | Comma-separated list of environment variable names to pass through to Claude for untrusted user input |
| `allowed_tools` | `Read,Write,Edit,Grep,Glob,...` | Claude `--allowedTools` string (defaults include git, go build/test/vet, and make) |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the full list of defaults be included here or at least link to a page where the full list can be viewed if too long

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add the full list of always-blocked paths to the README so users know what's enforced regardless of their config. 🤖

Comment thread README.md
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be helpful to somehow indicate which fields are required

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will mark required fields in the input tables. For both actions, `system_prompt` or `skill` (at least one) and `model` are required. 🤖

Comment thread CHANGELOG.md Outdated
Comment on lines +39 to +44
- `autosolve/assess` action: evaluate tasks for automated resolution suitability
using Claude in read-only mode.
- `autosolve/implement` action: autonomously implement solutions, validate
security, push to fork, and create PRs using Claude. Includes AI security
review, token usage tracking, per-file batched diff analysis, and structured
log levels (error/info/debug) with permission denial warnings.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be moved to the top of the section

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix the ordering — newer entries should be above older ones within each section. 🤖

action.LogInfo(fmt.Sprintf("%s usage: input=%d output=%d cost=$%.4f",
section, result.Usage.InputTokens, result.Usage.OutputTokens, result.Usage.CostUSD))
if result.PermissionDenials > 0 && logLevel != "error" {
action.LogWarning(fmt.Sprintf("%s: %d tool call(s) were denied by permission policy",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we include the tools denied in the log? It might be helpful for troubleshooting

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're actually logging the denied tools elsewhere at info and debug level. I do think it's useful to see at error level as well so i'll just open that up and log it for all log levels.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by that I mean I'll keep logging the count here but at all log levels. The actual tools can be inspected elsewhere in info and debug level.

Check if the branch already exists on the fork remote before spending
tokens on Claude. This avoids the frustrating case where implementation
succeeds but PR creation fails because a previous run left the branch.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
@fantapop fantapop force-pushed the pr/autosolve-go branch from 42dd416 to 51c0306 Compare May 7, 2026 23:15
fantapop and others added 4 commits May 7, 2026 16:52
The AI security review receives attacker-controlled content (staged
diffs) in its prompt. Restrict its Bash access to git diff and git show
commands to prevent prompt injection from escalating to arbitrary shell
execution.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Add nil guard to LogResult to prevent panic when runner returns
(nil, err). Return errors from usage and permission denial JSON
parsing instead of silently swallowing them. Move tracker.Record
in security review after the error check.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Flag symlinks resolving outside the repository root as a security
violation, not just symlinks into blocked paths. Covers both absolute
and relative symlink targets.

Co-Authored-By: roachdev-claude <roachdev-claude-bot@cockroachlabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants