[codex] Add model multi-agent version selector#25032
Conversation
|
This change is part of the following stack: Change managed by git-spice. |
49285e0 to
142825e
Compare
142825e to
a56d2a5
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a56d2a596b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let multi_agent_runtime = session_configuration | ||
| .multi_agent_runtime | ||
| .with_model(&model_info, &per_turn_config.features); |
There was a problem hiding this comment.
Update session runtime when the model changes
When thread/settings switches models, this recomputes the selector only for the new TurnContext; SessionConfiguration::apply still stores the previous multi_agent_runtime. Session-level consumers such as Session::multi_agent_runtime() in AgentControl::spawn_forked_thread will then make fork-history filtering decisions from the old model, so switching between a V1/no-agent model and a cataloged V2 model can inject or retain the wrong multi-agent-v2 usage hints in forked subagents. Please recompute and persist the runtime alongside the collaboration mode update.
Useful? React with 👍 / 👎.
| pub(crate) fn with_model(self, model_info: &ModelInfo, features: &ManagedFeatures) -> Self { | ||
| Self { | ||
| collab_tools_disabled: self.collab_tools_disabled, | ||
| ..Self::resolve(model_info, features) |
There was a problem hiding this comment.
Re-enable collaboration tools after switching to V2
For a subagent started at agent_max_depth on a V1/Collab model, startup calls disable_collab_tools() to hide further V1 spawning. If that same thread later switches to a cataloged V2 model, with_model() updates version but preserves collab_tools_disabled, so collab_tools_enabled() remains false and the V2 agent tools are still omitted even though V2 intentionally bypasses the depth cap. Reset this flag when the selected model resolves to V2.
Useful? React with 👍 / 👎.
| version: model_info | ||
| .multi_agent_version | ||
| .or_else(|| multi_agent_version_from_features(features)), |
There was a problem hiding this comment.
Honor review-mode collaboration disables
start_review_conversation explicitly disables SpawnCsv, Collab, and MultiAgentV2 so the review delegate cannot re-enable blocked tools, but this selector now takes a catalog multi_agent_version before consulting those disabled features. If the review model is tagged v2, the review subagent resolves as multi-agent-v2 anyway and spec_plan exposes the agent tools, bypassing that review-only restriction. Add a runtime override for contexts that must suppress collaboration tools.
Useful? React with 👍 / 👎.
| self.send_input(new_thread.thread_id, initial_operation) | ||
| .await?; | ||
| if !new_thread.thread.enabled(Feature::MultiAgentV2) { | ||
| if !multi_agent_v2_enabled { |
There was a problem hiding this comment.
Keep completion watcher for V1-spawned V2 children
When a V1 parent uses spawn_agent with a requested model whose metadata selects V2, the V1 handler creates a thread-spawn source without an agent_path, but this branch skips the completion watcher solely because the child runtime is V2. Without an agent path the child cannot send the V2 inter-agent completion notification, and the V1 parent no longer receives the injected completion message it previously relied on. Base this decision on the parent protocol/path availability rather than only the child's runtime.
Useful? React with 👍 / 👎.
5685e1f to
2986281
Compare
2986281 to
6e5fc13
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6e5fc130cf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6e5fc130cf
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| (turn_context.multi_agent_version == Some(MultiAgentVersion::V2)).then_some( | ||
| turn_context | ||
| .config | ||
| .multi_agent_v2 |
There was a problem hiding this comment.
Enforce the advertised V2 spawn cap
When catalog metadata selects V2 while the MultiAgentV2 feature flag is disabled, this advertises multi_agent_v2.max_concurrent_threads_per_session to the model, but spawning is still enforced by AgentControl::spawn_agent_with_metadata via reserve_spawn_slot(config.agent_max_threads), and config only maps the V2 cap into agent_max_threads when the feature flag is enabled. In that override scenario the model can be told, for example, that 17 concurrent threads are allowed while spawn_agent starts failing at the legacy agents.max_threads/default cap; the effective runtime cap needs to be updated alongside the tool spec.
Useful? React with 👍 / 👎.
6e5fc13 to
2a1d049
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2a1d049450
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
24b2891 to
856b02f
Compare
856b02f to
6c07446
Compare
…i-agent-version # Conflicts: # codex-rs/app-server/tests/common/models_cache.rs # codex-rs/codex-api/tests/models_integration.rs # codex-rs/core/src/session/review.rs # codex-rs/core/src/session/turn_context.rs # codex-rs/core/src/tools/spec_plan.rs # codex-rs/core/src/tools/spec_plan_tests.rs # codex-rs/core/tests/suite/model_runtime_selectors.rs # codex-rs/core/tests/suite/model_switching.rs # codex-rs/core/tests/suite/models_cache_ttl.rs # codex-rs/core/tests/suite/personality.rs # codex-rs/core/tests/suite/remote_models.rs # codex-rs/core/tests/suite/rmcp_client.rs # codex-rs/core/tests/suite/spawn_agent_description.rs # codex-rs/core/tests/suite/view_image.rs # codex-rs/models-manager/src/model_info.rs # codex-rs/protocol/src/openai_models.rs # codex-rs/tools/src/tool_config_tests.rs
|
@codex review this |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b3655164a5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| msg: &EventMsg, | ||
| ) { | ||
| if !self.enabled(Feature::MultiAgentV2) { | ||
| if turn_context.multi_agent_version != Some(MultiAgentVersion::V2) { |
There was a problem hiding this comment.
Align completion notifications with spawn protocol
When the catalog selector disagrees with the MultiAgentV2 feature flag, this child-side guard no longer matches AgentControl's watcher gate at core/src/agent/control.rs:342: metadata-selected V2 with the flag off gets both the V2 mailbox notification and the legacy injected parent message, while a V2 parent that spawns a child model selecting V1 gets neither path because the watcher is suppressed by the inherited feature flag. Use one resolved source of truth for the spawned thread's completion protocol so parents neither miss nor duplicate child completion notifications.
Useful? React with 👍 / 👎.
|
|
||
| fn max_concurrent_threads_per_session(turn_context: &TurnContext) -> Option<usize> { | ||
| multi_agent_v2_enabled(turn_context).then_some( | ||
| (turn_context.multi_agent_version == Some(MultiAgentVersion::V2)).then_some( |
There was a problem hiding this comment.
Enforce the advertised V2 concurrency cap
When multi_agent_version = v2 comes from model metadata while the MultiAgentV2 feature flag is off, this advertises the multi_agent_v2.max_concurrent_threads_per_session limit in the V2 spawn tool, but actual spawn reservation still uses config.agent_max_threads, which is derived from the legacy agents.max_threads path unless the feature flag is enabled during config loading. In that selector-only V2 session the model can be told a higher V2 cap applies and then receive unexpected spawn failures at the lower legacy/default limit.
Useful? React with 👍 / 👎.
| session_source: &SessionSource, | ||
| ) -> Option<&'a str> { | ||
| if !turn_context.features.enabled(Feature::MultiAgentV2) { | ||
| if turn_context.multi_agent_version != Some(MultiAgentVersion::V2) { |
There was a problem hiding this comment.
Keep forked V2 usage hints in sync
When this selector enables the V2 root/subagent usage hints without the MultiAgentV2 feature flag, full-history fork spawning still filters and appends those hints using parent_thread.enabled(Feature::MultiAgentV2) and config.features.enabled(Feature::MultiAgentV2) in core/src/agent/control.rs:419-475. A selector-only V2 fork therefore carries the parent's root-agent guidance into the child and misses the subagent guidance, while a V1-selected child with the feature flag on can receive V2 fork guidance; that puts the wrong developer instructions in the child model context.
Useful? React with 👍 / 👎.
| ToolMode::Direct | ||
| } | ||
| }); | ||
| let multi_agent_version = model_info.multi_agent_version.or_else(|| { |
There was a problem hiding this comment.
Use selector for interrupt history markers
Resolving the selected multi-agent version only on TurnContext leaves interrupt handling on the stale feature flag: the live abort path calls InterruptedTurnHistoryMarker::from_config(task.turn_context.config.as_ref()), which chooses the V2 developer marker solely from config.features.enabled(Feature::MultiAgentV2). With a V1-selected model and the feature flag on, or a selector-only V2 model with the flag off, the next turn/fork records the wrong interrupt guidance format in model-visible history.
Useful? React with 👍 / 👎.
…sion' into codex/model-info-multi-agent-version
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e92ded1249
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let interrupted_marker = | ||
| InterruptedTurnHistoryMarker::from_config(&config, /*multi_agent_version*/ None); |
There was a problem hiding this comment.
Pass the resolved runtime when forking interrupted history
The live abort path now passes turn_context.multi_agent_version, but this fork helper still hard-codes None, so InterruptedTurnHistoryMarker::from_config falls back to the raw MultiAgentV2 feature flag. When a model selector chooses V2 with the flag off (or V1 with the flag on), interrupted fork snapshots record the wrong marker format in model-visible history; thread forks need the resolved selector just like the abort path.
Useful? React with 👍 / 👎.
| let child_multi_agent_version = | ||
| resolve_multi_agent_version_for_config(&session, &config).await?; | ||
| apply_spawn_agent_overrides(&mut config, child_depth, child_multi_agent_version); |
There was a problem hiding this comment.
Reject V1 children that exceed the depth cap
When a V2 parent requests a child model whose metadata resolves to V1 and child_depth > agent_max_depth, this path only disables that child’s future collab tools via apply_spawn_agent_overrides; unlike the V1 handler, it never rejects the over-depth spawn. That lets legacy/V1 agents be created beyond the configured depth limit through the V2 spawn_agent tool, so the depth-limit check needs to run after resolving the child runtime and before spawning non-V2 children.
Useful? React with 👍 / 👎.
| let child_multi_agent_version = | ||
| resolve_multi_agent_version_for_config(&session, &config).await?; |
There was a problem hiding this comment.
Send V1 child spawns as user input
If the requested child model resolves to V1 here, the later match still wraps the initial prompt in Op::InterAgentCommunication whenever the V2 task name creates an agent path. That means a V1-planned child receives an assistant commentary JSON mailbox item instead of a normal user task, without V2 usage hints or tools to make that protocol clear; keep the plain UserInput path when child_multi_agent_version is not V2.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7de051e1ec
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let multi_agent_version = resolve_multi_agent_version( | ||
| &model_info, | ||
| &per_turn_config, | ||
| &parent_turn_context.session_source, | ||
| ); |
There was a problem hiding this comment.
Prevent review turns from inheriting V2 tools
When the configured review model is cataloged with multi_agent_version = v2, this resolves the review turn as V2 even though the context below keeps the parent session_source rather than SubAgentSource::Review. The new guard in spec_plan only suppresses collaboration tools for review sources, so a normal CLI review using a V2-tagged review model will expose spawn_agent/wait_agent during review mode despite the review-only tool restrictions. Resolve this with a review source or otherwise force collaboration off for review turns.
Useful? React with 👍 / 👎.
…i-agent-version # Conflicts: # codex-rs/core/src/thread_manager.rs # codex-rs/thread-store/src/types.rs
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e0edbe022f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let multi_agent_version = match &conversation_history { | ||
| InitialHistory::Resumed(_) => conversation_history.get_multi_agent_version(), |
There was a problem hiding this comment.
Preserve collaboration tools for legacy resumed threads
When resuming a rollout written before this field existed, get_multi_agent_version() returns None because the old session_meta line has no multi_agent_version; this branch then stores None for the session instead of falling back to the current model/config selector. Since spec_plan now only adds collaboration tools when turn_context.multi_agent_version.is_some(), any legacy thread resumed after upgrading loses spawn_agent/wait_agent even when Collab or MultiAgentV2 is enabled. Please fall back to resolve_multi_agent_version(...) when resumed history has no persisted selector.
Useful? React with 👍 / 👎.
|
Superseded by the clean two-PR stack:
Closing this PR in favor of the smaller replacements. |
|
Superseded by the final three-PR stack:
The split keeps storage, runtime behavior, and catalog resolution independently reviewable. |
Why
Some models need to select which existing multi-agent tool family they receive through model catalog metadata. Models without that metadata must continue to follow the existing
CollabandMultiAgentV2feature flags, including when a newer server sends an enum value this client does not recognize.What changed
ModelInfo.multi_agent_versionmetadata withv1andv2NoneNonefrom the existing feature flagsv1select the current multi-agent tool family and explicitv2select the existingMultiAgentV2tool familyMultiAgentVersiondirectly onTurnContext, outsideConfigStack
Coverage
v1andv2overrides/v1/models, verifies the outbound/responsestool family for explicitv1andv2, and checks the configured v2 concurrency cap