
jira-agent: refresh token before Phase 3 and increase Phase 4 max-turns #80447

Open
enxebre wants to merge 1 commit into openshift:main from enxebre:enxebre/jira-agent-token-refresh-and-turns

Conversation

@enxebre

enxebre commented Jun 11, 2026

Member

Summary

  • Refresh the GitHub App fork token before Phase 3 (address review findings), which pushes code. Previously the only refresh happened between Phase 3 and Phase 4, so if Phases 1-2 exceeded the 1-hour token lifetime, Phase 3's push failed with Invalid username or token.
  • Increase Phase 4 (PR creation) --max-turns from 15 to 30 to give enough headroom for push retries when pre-push hooks fail.

Root cause

Observed in periodic-jira-agent run 2065130139884195840:

  1. Token generated at job start (~17:18)
  2. Phases 1-2 ran for ~1h40m
  3. Phase 3 tried to push at 19:01 → fatal: Authentication failed (token expired)
  4. Token refreshed before Phase 4, but Phase 4 only had 15 turns and burned them on pre-push hook retries without creating the PR
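
The timeline above can be checked with a small age guard. This is a minimal sketch rather than code from this PR: `token_needs_refresh` and `TOKEN_MAX_AGE_SECONDS` are hypothetical names, and the 50-minute threshold is an assumed safety margin under the 1-hour token lifetime.

```shell
# Hypothetical helper (not in the PR): decide whether a token is too old
# to trust for a push. The 50-minute threshold is an assumption.
TOKEN_MAX_AGE_SECONDS=$((50 * 60))

# token_needs_refresh ISSUED_AT [NOW]
# Both arguments are seconds on the same clock; NOW defaults to the current epoch.
# Returns 0 (true) when the token age has reached TOKEN_MAX_AGE_SECONDS.
token_needs_refresh() {
  local issued_at="$1"
  local now="${2:-$(date +%s)}"
  [ $((now - issued_at)) -ge "$TOKEN_MAX_AGE_SECONDS" ]
}

# Timeline from the failed run, expressed as seconds since midnight.
issued=$((17 * 3600 + 18 * 60))   # token generated at ~17:18
push=$((19 * 3600 + 1 * 60))      # Phase 3 push at 19:01

if token_needs_refresh "$issued" "$push"; then
  # prints: token is 103 minutes old: refresh before pushing
  echo "token is $(( (push - issued) / 60 )) minutes old: refresh before pushing"
fi
```

With the times from the failed run, the token was roughly 103 minutes old at the push, well past the 1-hour lifetime.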

/cc @openshift/hypershift-team

🤖 Generated with Claude Code

Summary by CodeRabbit

This PR improves the reliability of the HyperShift Jira Agent periodic CI job, which automates code changes and pull request creation through a four-phase pipeline.

Changes Made

1. Token Refresh Before Phase 3
Added a GitHub App fork token refresh immediately before Phase 3 (address review findings). Phases 1–2 can run longer than the 1-hour GitHub App token lifetime, so without this refresh Phase 3's git push fails with "Invalid username or token". The token is now refreshed before Phase 3 and again before Phase 4, so both phases use valid credentials.

2. Increased Phase 4 Max-Turns
Increased the --max-turns parameter for Phase 4 (PR creation) from 15 to 30. The extra turns give the agent room to diagnose, fix, and retry pre-push hook failures instead of running out of turns before the PR is created.

Impact

These changes directly address a failure observed in periodic run 2065130139884195840, where:

  • Phases 1–2 ran ~1h40m, exceeding the token lifetime
  • Phase 3's push at 19:01 failed with authentication error
  • Phase 4 exhausted its 15 retry turns on pre-push hook failures without creating the PR

The modified script is located in the CI operator step registry for the HyperShift Jira Agent and affects how automated issue resolution and PR creation are handled in the periodic job pipeline.

Phase 3 (address review findings) pushes code to the fork, but its
GitHub App token was generated at job start and can expire after 1 hour.
When Phases 1-2 run long, Phase 3's push fails with "Invalid username
or token".

Add a fork token refresh before Phase 3 starts (matching the existing
refresh before Phase 4).

Also increase Phase 4 max-turns from 15 to 30. When the push encounters
pre-push hook failures, 15 turns is not enough to diagnose, fix, and
retry before creating the PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@openshift-ci

openshift-ci Bot commented Jun 11, 2026

Contributor

@enxebre: GitHub didn't allow me to request PR reviews from the following users: openshift/hypershift-team.

Note that only openshift members and repo collaborators can review this PR, and authors cannot review their own PRs.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Contributor

Walkthrough

This PR updates a bash orchestration script that manages GitHub App authentication and AI-assisted pull request creation for the HyperShift Jira agent. The script now refreshes the fork token earlier (before Phase 3) to avoid expiry during code-addressing work, clarifies Phase 4 token regeneration timing in comments, and increases Claude's conversational turns from 15 to 30 during PR generation.

Changes

HyperShift Jira Agent Token Refresh and Claude Configuration

| Layer / File(s) | Summary |
| --- | --- |
| GitHub App token refresh before Phase 3 and Phase 4 documentation (ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh) | A token refresh block is inserted before Phase 3 to regenerate the fork installation token and update git's credential helper for subsequent pushes, with error handling. Comments and logging in the existing Phase 4 regeneration section are updated to explicitly document why the Phase 3 refresh is needed and when Phase 4 regeneration occurs. |
| Claude PR generation max-turns configuration (ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh) | The --max-turns parameter in Phase 4's Claude model invocation is increased from 15 to 30 to allow more dialogue turns during PR creation. |

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and specifically describes the two main changes: token refresh timing for Phase 3 and increased max-turns for Phase 4, matching the actual modifications in the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR #80447 changes only ci-operator/.../hypershift-jira-agent-process-commands.sh (1 file, .sh); no Ginkgo *_test.go or It/Describe title updates found. |
| Test Structure And Quality | ✅ Passed | PR #80447 changes only hypershift-jira-agent-process-commands.sh (bash); no Ginkgo test code/It blocks were modified, so Ginkgo quality requirements are not applicable. |
| Microshift Test Compatibility | ✅ Passed | PR #80447 changes only ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh; no Ginkgo e2e tests were added/modified, so no MicroShift API compatibility... |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PR only updates hypershift-jira-agent-process-commands.sh (GitHub App token refresh and Claude --max-turns). No new Ginkgo e2e tests are involved, so SNO compatibility assumptions don’t apply. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR modifies only a CI/CD orchestration shell script (hypershift-jira-agent-process-commands.sh), not deployment manifests, operator code, or controllers. No scheduling constraints are introduced. |
| Ote Binary Stdout Contract | ✅ Passed | PR #80447 modifies only the hypershift-jira-agent bash script; no Go/OTE binary code changed, and the script contains no klog/fmt stdout patterns. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR only changes hypershift-jira-agent-process-commands.sh (token refresh, --max-turns); no new Ginkgo e2e tests or IPv4/external-internet compatibility assumptions. |
| No-Weak-Crypto | ✅ Passed | In the PR script, no MD5/SHA1/DES/RC4/3DES/Blowfish/ECB usage found; only openssl dgst -sha256 -sign is present, with no secret/token non-constant-time comparisons detected. |
| Container-Privileges | ✅ Passed | PR changes only hypershift-jira-agent-process-commands.sh; searches found no privileged/hostPID/hostNetwork/hostIPC/securityContext/allowPrivilegeEscalation/SYS_ADMIN keys. |
| No-Sensitive-Data-In-Logs | ✅ Passed | Reviewed hypershift-jira-agent-process-commands.sh: PR refresh/log changes print only status and numeric token-usage counts; no echo/printf outputs secret/token/JWT values (token used only in Autho... |

@openshift-ci

openshift-ci Bot commented Jun 11, 2026

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enxebre

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Jun 11, 2026
@enxebre

enxebre commented Jun 11, 2026

Member Author

/pj-rehearse periodic-ci-openshift-hypershift-main-periodic-jira-agent

@openshift-merge-bot

Contributor

[REHEARSALNOTIFIER]
@enxebre: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| periodic-ci-openshift-hypershift-main-periodic-jira-agent | N/A | periodic | Registry content changed |

Prior to this PR being merged, you will need to either run and acknowledge or opt to skip these rehearsals.

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh (1)

743-797: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Only mark the Jira issue processed after PR creation actually succeeds.

From Line 743 onward, the script adds agent-processed, transitions the issue, increments PROCESSED_COUNT, and records SUCCESS even when Lines 703-705 left PR_URL empty because Phase 4 failed. Since the search JQL excludes agent-processed, those failures will not be retried on the next run and the issue can get stuck without a PR. Gate the Jira mutations and success accounting on PR_URL being present, with a separate success path for the intentional “no code changes” case.
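
The guard described above could look roughly like the following. This is a sketch under stated assumptions, not the script's actual code: `finalize_issue`, `mark_issue_processed`, the `NO_CHANGES` flag argument, and the state-file line format are hypothetical. Only `PR_URL`, `PROCESSED_COUNT`, and `STATE_FILE` are names taken from the review.

```shell
# Sketch only: finalize_issue and mark_issue_processed are hypothetical names.
PROCESSED_COUNT=0
STATE_FILE="${STATE_FILE:-$(mktemp)}"

# Stand-in for the script's Jira mutations (label, transition, assignee).
mark_issue_processed() {
  echo "would add agent-processed and transition $1"
}

# finalize_issue ISSUE_KEY PR_URL [NO_CHANGES]
# Mutates Jira and counts the issue only when a PR exists, or when the run
# intentionally produced no code changes. Any other outcome leaves the issue
# unlabeled so the next run's search picks it up again.
finalize_issue() {
  local issue_key="$1" pr_url="$2" no_changes="${3:-false}"
  if [ -n "$pr_url" ]; then
    mark_issue_processed "$issue_key"
    PROCESSED_COUNT=$((PROCESSED_COUNT + 1))
    echo "$issue_key SUCCESS" >> "$STATE_FILE"
  elif [ "$no_changes" = "true" ]; then
    # Assumption: an intentional no-op still counts as processed.
    mark_issue_processed "$issue_key"
    PROCESSED_COUNT=$((PROCESSED_COUNT + 1))
    echo "$issue_key NO_CHANGES" >> "$STATE_FILE"
  else
    echo "$issue_key FAILED" >> "$STATE_FILE"
    return 1
  fi
}
```

Whether the "no code changes" case should also add the label is a design choice the review leaves open. The sketch labels it so the issue is not retried indefinitely.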

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh`
around lines 743 - 797, The script currently always adds the 'agent-processed'
label, transitions the issue, sets assignee, increments PROCESSED_COUNT and
writes "SUCCESS" to STATE_FILE even when PR creation failed (PR_URL is empty);
wrap the entire Jira mutation + success accounting block (the code using
LABEL_RESPONSE / transition_issue / set_assignee, PROCESSED_COUNT increment and
the echo to STATE_FILE) in a guard if [ -n "$PR_URL" ]; then ... fi so those
actions only run when PR_URL is present; add an explicit else branch that
handles the intentional "no code changes" case by writing a distinct state (e.g.
"NO_CHANGES") to STATE_FILE or skipping Jira mutations but still incrementing
PROCESSED_COUNT if appropriate, and ensure you reference the existing symbols
LABEL_RESPONSE, transition_issue, set_assignee, PROCESSED_COUNT, PR_URL and
STATE_FILE when making the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 28595b10-a32b-49b6-ad45-cb1c24d7657f

📥 Commits

Reviewing files that changed from the base of the PR and between fbbd18c and 0aadefb.

📒 Files selected for processing (1)
  • ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh

Comment on lines +540 to +549
# Refresh tokens before Phase 3 since it pushes code.
# Phases 1-2 can exceed the 1-hour GitHub App token lifetime.
echo "Refreshing GitHub App tokens before Phase 3..."
GITHUB_TOKEN_FORK=$(generate_github_token "$INSTALLATION_ID_FORK")
if [ -z "$GITHUB_TOKEN_FORK" ] || [ "$GITHUB_TOKEN_FORK" = "null" ]; then
    echo "ERROR: Failed to refresh GitHub App token for fork"
else
    git config --global credential.helper "!f() { echo username=x-access-token; echo password=${GITHUB_TOKEN_FORK}; }; f"
    echo "Fork token refreshed"
fi

Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't continue into Phase 3/4 with a failed token refresh.

Lines 544-549 and Lines 621-633 only log refresh failures, then fall through into the push/PR phases with the previous credentials still configured. In the same long-running case this change is addressing, that leaves Phase 3 pushing with an expired fork token and Phase 4 calling gh with an expired upstream token. Retry the refresh or fail the current issue before entering the dependent phase.

Also applies to: 616-633
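
One way to implement the bounded retry suggested above is sketched below. It is illustrative only: `refresh_token_with_retry`, `REFRESH_ATTEMPTS`, and `REFRESH_BACKOFF_SECONDS` are hypothetical names, and `generate_github_token` is replaced by a stand-in that fails twice before succeeding. The real helper is assumed, based on the diff, to print a token on success and an empty string or "null" on failure.

```shell
# Sketch only: names other than generate_github_token are hypothetical.
REFRESH_ATTEMPTS="${REFRESH_ATTEMPTS:-3}"
REFRESH_BACKOFF_SECONDS="${REFRESH_BACKOFF_SECONDS:-5}"

# Stand-in for the script's generate_github_token: returns "null" on the
# first two calls and a fake token afterwards. Calls are counted in a file
# because the function runs inside a command substitution (a subshell).
CALLS_FILE="$(mktemp)"
generate_github_token() {
  echo call >> "$CALLS_FILE"
  local calls=$(( $(wc -l < "$CALLS_FILE") ))
  if [ "$calls" -lt 3 ]; then
    echo "null"
  else
    echo "example-token-for-$1"
  fi
}

# refresh_token_with_retry INSTALLATION_ID
# Prints a fresh token on success. Returns 1 once REFRESH_ATTEMPTS attempts
# have all failed, so the caller can abort instead of pushing with stale
# credentials.
refresh_token_with_retry() {
  local installation_id="$1" attempt token
  for attempt in $(seq 1 "$REFRESH_ATTEMPTS"); do
    token="$(generate_github_token "$installation_id")"
    if [ -n "$token" ] && [ "$token" != "null" ]; then
      echo "$token"
      return 0
    fi
    echo "Token refresh attempt ${attempt}/${REFRESH_ATTEMPTS} failed" >&2
    # Linear backoff between attempts; no sleep after the last one.
    if [ "$attempt" -lt "$REFRESH_ATTEMPTS" ]; then
      sleep $((REFRESH_BACKOFF_SECONDS * attempt))
    fi
  done
  return 1
}
```

A caller might then write `GITHUB_TOKEN_FORK="$(refresh_token_with_retry "$INSTALLATION_ID_FORK")" || exit 1` before reconfiguring the credential helper. Whether to exit the whole run or only skip the current issue is left to the script's existing error-handling conventions.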

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/hypershift/jira-agent/process/hypershift-jira-agent-process-commands.sh`
around lines 540 - 549, The refresh block for the GitHub App token using
generate_github_token (GITHUB_TOKEN_FORK) only logs failures and continues,
which allows Phase 3/4 to run with expired credentials; update the logic in the
token refresh sections (the GITHUB_TOKEN_FORK and GITHUB_TOKEN_UPSTREAM refresh
blocks around generate_github_token) to either retry token generation a few
times with backoff or immediately fail the script when refresh returns
empty/"null" before entering the push/PR phases—i.e., on failure do a bounded
retry of generate_github_token and if still unsuccessful call exit 1 (or
otherwise abort the run) instead of merely echoing an error so Phase 3/4 never
proceed with stale credentials.

@enxebre

enxebre commented Jun 12, 2026

Member Author

/pj-rehearse periodic-ci-openshift-hypershift-main-periodic-jira-agent

@openshift-merge-bot

Contributor

@enxebre: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@enxebre

enxebre commented Jun 12, 2026

Member Author

/pj-rehearse periodic-ci-openshift-hypershift-main-periodic-jira-agent

@openshift-merge-bot

Contributor

@enxebre: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci

openshift-ci Bot commented Jun 12, 2026

Contributor

@enxebre: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/periodic-ci-openshift-hypershift-main-periodic-jira-agent | 0aadefb | link | unknown | /pj-rehearse periodic-ci-openshift-hypershift-main-periodic-jira-agent |

Full PR test history. Your PR dashboard.
