WEB-4812: codex hook falls back to ~/.unbound/config.json for API key#137
Open
AvutR wants to merge 1 commit into
Open
WEB-4812: codex hook falls back to ~/.unbound/config.json for API key#137AvutR wants to merge 1 commit into
AvutR wants to merge 1 commit into
Conversation
The codex hook resolved its API key ONLY from the UNBOUND_CODEX_API_KEY env var, which setup writes to ~/.zshrc/.bashrc. Codex launched from VS Code's extension host, GUI launchers, terminals opened before setup ran, or non-zsh/bash shells never sees that var, so every exchange was silently dropped: send_to_api no-ops, policy checks fail open, and report_error_to_gateway needs the key too — nothing reaches Sentry or the gateway. Only trace is 'No API key present in send_to_api function' in ~/.codex/hooks/error.log. Claude Code hit this exact bug and fixed it in WEB-4145 with a two-tier get_api_key(); cursor and copilot have the same fallback. Codex was the only tool never ported — this file already reads api_key from ~/.unbound/config.json for the discovery gate and MCP scan dispatch. Port get_api_key() to codex (env first, then ~/.unbound/config.json, which both user-level setup.py and MDM write) and use it in main(). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Comment on lines
+1143
to
+1146
| return None | ||
| except Exception as e: | ||
| log_error(f"Failed to read config file: {e}", 'config') | ||
| return None |
There was a problem hiding this comment.
log_error fires before _cached_api_key is populated
log_error internally calls report_error_to_gateway(message, category, _cached_api_key). When get_api_key() is invoked at the top of main(), the module-level _cached_api_key is still None (it's only updated on the next line after get_api_key() returns). So if the config file contains malformed JSON, the error is correctly written to error.log, but gateway reporting silently no-ops because the key is None. In this particular failure mode the key was unreadable anyway so there is no additional loss, but the ordering means gateway-side visibility for startup config errors will remain limited until the key is resolved.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Root cause (WEB-4812, Slack)
codex/hooks/unbound.pyresolved its API key only from theUNBOUND_CODEX_API_KEYenv var (main(), was line 1523). Setup writes that var to~/.zshrc/~/.bashrc(user-level) or per-user rc files (MDMset_env_var_system_wide), so any codex process that doesn't inherit a freshly-sourced zsh/bash environment runs keyless:With no key, the failure is triply silent by design:
send_to_apino-ops (exchange dropped), pretool policy checks fail open to allow, andreport_error_to_gatewayitself requires the key — so nothing reaches the gateway or Sentry. The only trace isNo API key present in send_to_api functionin~/.codex/hooks/error.log.This is sibling drift: Claude Code hit the identical bug and fixed it in WEB-4145 (#56) with a two-tier
get_api_key(); cursor (cursor/unbound.py:1182) and copilot (copilot/hooks/unbound.py:938) have the same fallback. Codex was the only tool never ported — even though this same file already readsapi_keyfrom~/.unbound/config.jsonfor the discovery gate and MCP scan dispatch, and both setup paths write that file.Fix
Port
get_api_key()(env →~/.unbound/config.json) to codex and use it inmain(). Two-line behavioral change + docstring.Evidence (sandboxed repro, fully offline)
Ran the real hook script with fake HOME, gateway at a closed port,
UserPromptSubmit+Stopevents on stdin:No API key present in send_to_api function— exchange droppedAPI request failed: curl (7)— send attempted ✓API request failed: curl (7)— send attempted with config key ✓Tests
codex/hooks/test_get_api_key.py(6 tests): env-wins, config fallback, no-source → None, invalid JSON → None + logged, missing field → None, and amain()-level Stop-event regression asserting the exchange is sent with the config key when the env var is absent.test_identity.py33/33 pass;binary/tests73 passed/31 skipped (vendored-contract unaffected);py_compile+ pyflakes undefined-name gate clean (CI parity).Rollout notes
SELF_UPDATE_URLpoints at rawmain, checked onSessionStartevery ~2h — no reinstall needed once merged.unbound-hookbinary vendors this file — included in the nextrelease-macos-runtimebuild.🤖 Generated with Claude Code
Greptile Summary
This PR ports the two-tier API key lookup (
env var → ~/.unbound/config.json) from Claude Code / cursor / copilot into the codex hook, fixing a silent-drop regression where any codex process launched outside a sourced shell (VS Code extension host, GUI launchers, pre-setup terminals) never received an API key and silently dropped every exchange.get_api_key()with env-var-first, config-file-fallback resolution; handlesFileNotFoundError,JSONDecodeError, and a generalExceptionguard, logging errors locally on decode failures.os.getenv(...)call inmain()withget_api_key(), and ships 6 tests (unit + amain()-level Stop-event regression) confirming the fallback path is exercised end-to-end.Confidence Score: 4/5
Safe to merge — the two-line behavioral change is isolated to key resolution, the fallback path is well-tested, and it correctly restores behaviour for the affected launch environments.
The core fix is a faithful port of an already-proven pattern. The one observation is that
log_errorinsideget_api_key()fires while_cached_api_keyis stillNone, so config-parse errors are written locally toerror.logbut won't be forwarded to the gateway. This is an edge-case observability gap rather than a correctness problem.No files require special attention; the change is self-contained in
codex/hooks/unbound.pyand fully covered by the new test file.Important Files Changed
get_api_key()with env-var-first, config-file-fallback lookup;main()swapped to call it. Logic is correct and matches the pattern used in claude/cursor/copilot. Minor:log_errorinsideget_api_key()fires while_cached_api_keyis stillNone, so gateway reporting of config-parse errors won't reach the gateway (local error.log is still written).main()with a Stop event to confirm the full send path uses the config key when env is absent.Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A["main() called"] --> B["get_api_key()"] B --> C{"UNBOUND_CODEX_API_KEY set?"} C -- yes --> D["return env key"] C -- no --> E{"config.json exists?"} E -- no --> F["return None"] E -- yes --> G{"Valid JSON with api_key?"} G -- yes --> H["return config key"] G -- "malformed JSON" --> I["log_error, return None"] G -- "missing field" --> J["return None"] G -- "other OSError" --> K["log_error, return None"] D --> L["_cached_api_key = api_key"] H --> L F --> L I --> L J --> L K --> L L --> M{"api_key present?"} M -- yes --> N["send_to_api succeeds"] M -- no --> O["send_to_api no-ops, silent drop"]Reviews (1): Last reviewed commit: "WEB-4812: codex hook falls back to ~/.un..." | Re-trigger Greptile