
[codex] Add model multi-agent version selector#25032

Closed
aibrahim-oai wants to merge 37 commits into main from codex/model-info-multi-agent-version

Collaborator

@aibrahim-oai aibrahim-oai commented May 29, 2026

Why

Some models need to select which existing multi-agent tool family they receive through model catalog metadata. Models without that metadata must continue to follow the existing Collab and MultiAgentV2 feature flags, including when a newer server sends an enum value this client does not recognize.

What changed

  • add optional ModelInfo.multi_agent_version metadata with v1 and v2
  • treat omitted and unknown wire values as None
  • resolve None from the existing feature flags
  • let explicit v1 select the current multi-agent tool family and explicit v2 select the existing MultiAgentV2 tool family
  • carry the resolved MultiAgentVersion directly on TurnContext, outside Config
  • use the resolved value for turn creation, model switches, review turns, and tool planning
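
The resolution order in the list above can be sketched as follows. This is a minimal illustration, not the PR's code: `FeatureFlags`, `parse_wire_value`, `version_from_flags`, and `resolve` are hypothetical names, and the precedence of MultiAgentV2 over Collab in the flag fallback is an assumption.

```rust
/// Multi-agent tool family selected for a turn.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MultiAgentVersion {
    V1,
    V2,
}

/// Hypothetical stand-in for the existing Collab and MultiAgentV2 feature flags.
#[derive(Debug, Clone, Copy, Default)]
pub struct FeatureFlags {
    pub collab: bool,
    pub multi_agent_v2: bool,
}

/// Maps the optional wire value to the selector. Omitted and unrecognized
/// values both become `None`, so a newer server cannot break this client.
pub fn parse_wire_value(raw: Option<&str>) -> Option<MultiAgentVersion> {
    match raw {
        Some("v1") => Some(MultiAgentVersion::V1),
        Some("v2") => Some(MultiAgentVersion::V2),
        _ => None,
    }
}

/// Flag fallback. Assumption: MultiAgentV2 takes precedence over Collab.
pub fn version_from_flags(flags: FeatureFlags) -> Option<MultiAgentVersion> {
    if flags.multi_agent_v2 {
        Some(MultiAgentVersion::V2)
    } else if flags.collab {
        Some(MultiAgentVersion::V1)
    } else {
        None
    }
}

/// Explicit catalog metadata wins; `None` defers to the feature flags.
pub fn resolve(
    metadata: Option<MultiAgentVersion>,
    flags: FeatureFlags,
) -> Option<MultiAgentVersion> {
    metadata.or_else(|| version_from_flags(flags))
}
```

With this shape, an explicit `v1` from the catalog selects V1 even when the MultiAgentV2 flag is on, while an unknown value such as `v3` behaves exactly like an omitted field.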

Stack

Coverage

  • add protocol coverage for omitted, known, and unknown enum values
  • add focused coverage for flag fallback and explicit v1 and v2 overrides
  • add core integration coverage that fetches remote model metadata through /v1/models, verifies the outbound /responses tool family for explicit v1 and v2, and checks the configured v2 concurrency cap

Collaborator Author

aibrahim-oai commented May 29, 2026

This change is part of the following stack:

Change managed by git-spice.

@aibrahim-oai aibrahim-oai force-pushed the codex/model-info-multi-agent-version branch from 49285e0 to 142825e on May 29, 2026 07:12
@aibrahim-oai aibrahim-oai force-pushed the codex/model-info-multi-agent-version branch from 142825e to a56d2a5 on May 29, 2026 07:21
@aibrahim-oai aibrahim-oai marked this pull request as ready for review May 29, 2026 07:29
@aibrahim-oai aibrahim-oai requested a review from a team as a code owner May 29, 2026 07:29
@aibrahim-oai aibrahim-oai marked this pull request as draft May 29, 2026 07:33
Contributor

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a56d2a596b

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +506 to +508
let multi_agent_runtime = session_configuration
    .multi_agent_runtime
    .with_model(&model_info, &per_turn_config.features);

P2 Badge Update session runtime when the model changes

When thread/settings switches models, this recomputes the selector only for the new TurnContext; SessionConfiguration::apply still stores the previous multi_agent_runtime. Session-level consumers such as Session::multi_agent_runtime() in AgentControl::spawn_forked_thread will then make fork-history filtering decisions from the old model, so switching between a V1/no-agent model and a cataloged V2 model can inject or retain the wrong multi-agent-v2 usage hints in forked subagents. Please recompute and persist the runtime alongside the collaboration mode update.

Comment on lines +47 to +50
pub(crate) fn with_model(self, model_info: &ModelInfo, features: &ManagedFeatures) -> Self {
    Self {
        collab_tools_disabled: self.collab_tools_disabled,
        ..Self::resolve(model_info, features)

P2 Badge Re-enable collaboration tools after switching to V2

For a subagent started at agent_max_depth on a V1/Collab model, startup calls disable_collab_tools() to hide further V1 spawning. If that same thread later switches to a cataloged V2 model, with_model() updates version but preserves collab_tools_disabled, so collab_tools_enabled() remains false and the V2 agent tools are still omitted even though V2 intentionally bypasses the depth cap. Reset this flag when the selected model resolves to V2.
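
A minimal sketch of the suggested reset, under stated assumptions: `MultiAgentRuntime` here is a simplified stand-in with only two fields, `with_version` is a hypothetical replacement for `with_model` that takes the already-resolved selector, and the body of `collab_tools_enabled` is assumed rather than taken from the PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MultiAgentVersion {
    V1,
    V2,
}

/// Simplified stand-in for the runtime discussed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MultiAgentRuntime {
    pub version: Option<MultiAgentVersion>,
    pub collab_tools_disabled: bool,
}

impl MultiAgentRuntime {
    /// Applies a newly resolved selector. The depth-cap flag is carried over
    /// for V1 and unset models, but cleared when the model resolves to V2,
    /// because V2 bypasses the depth cap.
    pub fn with_version(self, version: Option<MultiAgentVersion>) -> Self {
        let collab_tools_disabled = match version {
            Some(MultiAgentVersion::V2) => false,
            _ => self.collab_tools_disabled,
        };
        Self {
            version,
            collab_tools_disabled,
        }
    }

    /// Assumed definition: tools are on when a family is selected and the
    /// depth-cap flag is not set.
    pub fn collab_tools_enabled(&self) -> bool {
        self.version.is_some() && !self.collab_tools_disabled
    }
}
```

A thread that started depth-capped on V1 regains its agent tools after a switch to a V2 model, and stays capped if it switches to another V1 model.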

Comment on lines +27 to +29
version: model_info
    .multi_agent_version
    .or_else(|| multi_agent_version_from_features(features)),

P2 Badge Honor review-mode collaboration disables

start_review_conversation explicitly disables SpawnCsv, Collab, and MultiAgentV2 so the review delegate cannot re-enable blocked tools, but this selector now takes a catalog multi_agent_version before consulting those disabled features. If the review model is tagged v2, the review subagent resolves as multi-agent-v2 anyway and spec_plan exposes the agent tools, bypassing that review-only restriction. Add a runtime override for contexts that must suppress collaboration tools.
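
One way to express such an override is a policy that is consulted before the catalog value. The sketch below is illustrative only: `CollaborationPolicy` and `resolve_with_policy` are hypothetical names, and the flag fallback is passed in as an already-computed `Option`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MultiAgentVersion {
    V1,
    V2,
}

/// Hypothetical per-context override for turns that must not collaborate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CollaborationPolicy {
    Allowed,
    Suppressed,
}

/// The policy is checked first, so catalog metadata cannot re-enable
/// collaboration tools in a context that suppressed them.
pub fn resolve_with_policy(
    policy: CollaborationPolicy,
    metadata: Option<MultiAgentVersion>,
    flag_fallback: Option<MultiAgentVersion>,
) -> Option<MultiAgentVersion> {
    match policy {
        CollaborationPolicy::Suppressed => None,
        CollaborationPolicy::Allowed => metadata.or(flag_fallback),
    }
}
```

A review turn would pass `Suppressed`, so a review model tagged `v2` still resolves to `None` and receives no agent tools.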

Comment thread codex-rs/core/src/agent/control.rs Outdated
self.send_input(new_thread.thread_id, initial_operation)
    .await?;
-if !new_thread.thread.enabled(Feature::MultiAgentV2) {
+if !multi_agent_v2_enabled {

P2 Badge Keep completion watcher for V1-spawned V2 children

When a V1 parent uses spawn_agent with a requested model whose metadata selects V2, the V1 handler creates a thread-spawn source without an agent_path, but this branch skips the completion watcher solely because the child runtime is V2. Without an agent path the child cannot send the V2 inter-agent completion notification, and the V1 parent no longer receives the injected completion message it previously relied on. Base this decision on the parent protocol/path availability rather than only the child's runtime.
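
A rough sketch of a watcher decision that also accounts for path availability. The names `CompletionProtocol`, `completion_protocol`, and `needs_completion_watcher` are hypothetical, and the rule that V2 notifications require an agent path is taken from the description above rather than from the code.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MultiAgentVersion {
    V1,
    V2,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompletionProtocol {
    /// A parent-side watcher injects the legacy completion message.
    LegacyWatcher,
    /// The child sends the V2 inter-agent completion notification itself.
    V2Mailbox,
}

/// The V2 mailbox is only usable when the child runs V2 and has an agent
/// path to address the notification; every other case keeps the watcher.
pub fn completion_protocol(
    child_version: Option<MultiAgentVersion>,
    agent_path: Option<&str>,
) -> CompletionProtocol {
    match (child_version, agent_path) {
        (Some(MultiAgentVersion::V2), Some(_)) => CompletionProtocol::V2Mailbox,
        _ => CompletionProtocol::LegacyWatcher,
    }
}

pub fn needs_completion_watcher(
    child_version: Option<MultiAgentVersion>,
    agent_path: Option<&str>,
) -> bool {
    completion_protocol(child_version, agent_path) == CompletionProtocol::LegacyWatcher
}
```

Under this rule a V2 child spawned by a V1 parent without an agent path keeps the watcher, so the parent still receives the injected completion message.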

@aibrahim-oai aibrahim-oai force-pushed the codex/model-info-multi-agent-version branch from 5685e1f to 2986281 on May 29, 2026 07:44
@aibrahim-oai aibrahim-oai marked this pull request as ready for review May 29, 2026 07:48
@aibrahim-oai aibrahim-oai force-pushed the codex/model-info-multi-agent-version branch from 2986281 to 6e5fc13 on May 29, 2026 07:51
Contributor

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6e5fc130cf

Comment thread codex-rs/core/src/tools/spec_plan.rs
Comment thread codex-rs/core/src/tools/spec_plan.rs
Contributor

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6e5fc130cf

Comment thread codex-rs/core/src/tools/spec_plan.rs
Comment on lines +363 to 366
(turn_context.multi_agent_version == Some(MultiAgentVersion::V2)).then_some(
    turn_context
        .config
        .multi_agent_v2

P2 Badge Enforce the advertised V2 spawn cap

When catalog metadata selects V2 while the MultiAgentV2 feature flag is disabled, this advertises multi_agent_v2.max_concurrent_threads_per_session to the model, but spawning is still enforced by AgentControl::spawn_agent_with_metadata via reserve_spawn_slot(config.agent_max_threads), and config only maps the V2 cap into agent_max_threads when the feature flag is enabled. In that override scenario the model can be told, for example, that 17 concurrent threads are allowed while spawn_agent starts failing at the legacy agents.max_threads/default cap; the effective runtime cap needs to be updated alongside the tool spec.
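
A minimal sketch of keeping the advertised and enforced limits in step by deriving both from the resolved selector. `SpawnLimits`, `effective_spawn_cap`, and `advertised_v2_cap` are hypothetical names, and the two fields are simplified stand-ins for the legacy and V2 config values.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MultiAgentVersion {
    V1,
    V2,
}

/// Simplified stand-in for the two configured limits.
#[derive(Debug, Clone, Copy)]
pub struct SpawnLimits {
    pub legacy_max_threads: usize,
    pub v2_max_concurrent_threads_per_session: usize,
}

/// Cap used when reserving a spawn slot, chosen from the resolved selector
/// rather than from the feature flag seen at config load.
pub fn effective_spawn_cap(version: Option<MultiAgentVersion>, limits: &SpawnLimits) -> usize {
    match version {
        Some(MultiAgentVersion::V2) => limits.v2_max_concurrent_threads_per_session,
        _ => limits.legacy_max_threads,
    }
}

/// Cap advertised in the V2 spawn tool spec. It reuses the enforcement
/// function, so the two values cannot drift apart.
pub fn advertised_v2_cap(version: Option<MultiAgentVersion>, limits: &SpawnLimits) -> Option<usize> {
    (version == Some(MultiAgentVersion::V2)).then_some(effective_spawn_cap(version, limits))
}
```

With a V2 limit of 17 and a legacy limit of 6 (the legacy figure is made up for illustration), a selector-only V2 session both advertises and enforces 17.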

@aibrahim-oai aibrahim-oai force-pushed the codex/model-info-multi-agent-version branch from 6e5fc13 to 2a1d049 on May 29, 2026 07:58
Contributor

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2a1d049450

Comment thread codex-rs/core/src/session/turn_context.rs Outdated
aibrahim-oai added a commit that referenced this pull request May 29, 2026
aibrahim-oai added a commit that referenced this pull request May 29, 2026
@aibrahim-oai aibrahim-oai force-pushed the codex/model-info-multi-agent-version branch from 24b2891 to 856b02f on May 29, 2026 08:11
aibrahim-oai added a commit that referenced this pull request May 29, 2026
@aibrahim-oai aibrahim-oai force-pushed the codex/model-info-multi-agent-version branch from 856b02f to 6c07446 on May 29, 2026 08:23
…i-agent-version

# Conflicts:
#	codex-rs/app-server/tests/common/models_cache.rs
#	codex-rs/codex-api/tests/models_integration.rs
#	codex-rs/core/src/session/review.rs
#	codex-rs/core/src/session/turn_context.rs
#	codex-rs/core/src/tools/spec_plan.rs
#	codex-rs/core/src/tools/spec_plan_tests.rs
#	codex-rs/core/tests/suite/model_runtime_selectors.rs
#	codex-rs/core/tests/suite/model_switching.rs
#	codex-rs/core/tests/suite/models_cache_ttl.rs
#	codex-rs/core/tests/suite/personality.rs
#	codex-rs/core/tests/suite/remote_models.rs
#	codex-rs/core/tests/suite/rmcp_client.rs
#	codex-rs/core/tests/suite/spawn_agent_description.rs
#	codex-rs/core/tests/suite/view_image.rs
#	codex-rs/models-manager/src/model_info.rs
#	codex-rs/protocol/src/openai_models.rs
#	codex-rs/tools/src/tool_config_tests.rs
@aibrahim-oai
Collaborator Author

@codex review this

Contributor

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b3655164a5

    msg: &EventMsg,
) {
-    if !self.enabled(Feature::MultiAgentV2) {
+    if turn_context.multi_agent_version != Some(MultiAgentVersion::V2) {

P2 Badge Align completion notifications with spawn protocol

When the catalog selector disagrees with the MultiAgentV2 feature flag, this child-side guard no longer matches AgentControl's watcher gate at core/src/agent/control.rs:342: metadata-selected V2 with the flag off gets both the V2 mailbox notification and the legacy injected parent message, while a V2 parent that spawns a child model selecting V1 gets neither path because the watcher is suppressed by the inherited feature flag. Use one resolved source of truth for the spawned thread's completion protocol so parents neither miss nor duplicate child completion notifications.


fn max_concurrent_threads_per_session(turn_context: &TurnContext) -> Option<usize> {
-    multi_agent_v2_enabled(turn_context).then_some(
+    (turn_context.multi_agent_version == Some(MultiAgentVersion::V2)).then_some(

P2 Badge Enforce the advertised V2 concurrency cap

When multi_agent_version = v2 comes from model metadata while the MultiAgentV2 feature flag is off, this advertises the multi_agent_v2.max_concurrent_threads_per_session limit in the V2 spawn tool, but actual spawn reservation still uses config.agent_max_threads, which is derived from the legacy agents.max_threads path unless the feature flag is enabled during config loading. In that selector-only V2 session the model can be told a higher V2 cap applies and then receive unexpected spawn failures at the lower legacy/default limit.

    session_source: &SessionSource,
) -> Option<&'a str> {
-    if !turn_context.features.enabled(Feature::MultiAgentV2) {
+    if turn_context.multi_agent_version != Some(MultiAgentVersion::V2) {

P2 Badge Keep forked V2 usage hints in sync

When this selector enables the V2 root/subagent usage hints without the MultiAgentV2 feature flag, full-history fork spawning still filters and appends those hints using parent_thread.enabled(Feature::MultiAgentV2) and config.features.enabled(Feature::MultiAgentV2) in core/src/agent/control.rs:419-475. A selector-only V2 fork therefore carries the parent's root-agent guidance into the child and misses the subagent guidance, while a V1-selected child with the feature flag on can receive V2 fork guidance; that puts the wrong developer instructions in the child model context.

ToolMode::Direct
}
});
let multi_agent_version = model_info.multi_agent_version.or_else(|| {

P2 Badge Use selector for interrupt history markers

Resolving the selected multi-agent version only on TurnContext leaves interrupt handling on the stale feature flag: the live abort path calls InterruptedTurnHistoryMarker::from_config(task.turn_context.config.as_ref()), which chooses the V2 developer marker solely from config.features.enabled(Feature::MultiAgentV2). With a V1-selected model and the feature flag on, or a selector-only V2 model with the flag off, the next turn/fork records the wrong interrupt guidance format in model-visible history.

Contributor

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e92ded1249

Comment thread codex-rs/core/src/thread_manager.rs Outdated
Comment on lines +901 to +902
let interrupted_marker =
    InterruptedTurnHistoryMarker::from_config(&config, /*multi_agent_version*/ None);

P2 Badge Pass the resolved runtime when forking interrupted history

The live abort path now passes turn_context.multi_agent_version, but this fork helper still hard-codes None, so InterruptedTurnHistoryMarker::from_config falls back to the raw MultiAgentV2 feature flag. When a model selector chooses V2 with the flag off (or V1 with the flag on), interrupted fork snapshots record the wrong marker format in model-visible history; thread forks need the resolved selector just like the abort path.
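
The fallback described here can be sketched as below. `MarkerFormat` and `marker_format` are hypothetical names that model only the choice between the two marker formats, not the real `InterruptedTurnHistoryMarker` type.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MultiAgentVersion {
    V1,
    V2,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MarkerFormat {
    Legacy,
    V2Developer,
}

/// A resolved selector decides the format outright. Only `None` falls back
/// to the raw MultiAgentV2 feature flag, which is why hard-coding `None`
/// in the fork helper reintroduces the flag-only behavior.
pub fn marker_format(resolved: Option<MultiAgentVersion>, v2_flag_enabled: bool) -> MarkerFormat {
    let use_v2 = match resolved {
        Some(MultiAgentVersion::V2) => true,
        Some(MultiAgentVersion::V1) => false,
        None => v2_flag_enabled,
    };
    if use_v2 {
        MarkerFormat::V2Developer
    } else {
        MarkerFormat::Legacy
    }
}
```

The two mismatch cases from the comment are the first two rows: selector V2 with the flag off, and selector V1 with the flag on. Passing the resolved value gives the selector's format in both.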

Comment on lines +111 to +113
let child_multi_agent_version =
    resolve_multi_agent_version_for_config(&session, &config).await?;
apply_spawn_agent_overrides(&mut config, child_depth, child_multi_agent_version);

P2 Badge Reject V1 children that exceed the depth cap

When a V2 parent requests a child model whose metadata resolves to V1 and child_depth > agent_max_depth, this path only disables that child’s future collab tools via apply_spawn_agent_overrides; unlike the V1 handler, it never rejects the over-depth spawn. That lets legacy/V1 agents be created beyond the configured depth limit through the V2 spawn_agent tool, so the depth-limit check needs to run after resolving the child runtime and before spawning non-V2 children.
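
A minimal sketch of a depth check that runs after the child runtime is resolved. `check_child_depth` is a hypothetical helper, the `u32` depth type and the error message are assumptions, and the V2 exemption follows the statement elsewhere in this review that V2 bypasses the depth cap.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MultiAgentVersion {
    V1,
    V2,
}

/// Rejects an over-depth spawn unless the child resolves to V2.
/// Intended to run before the child thread is created.
pub fn check_child_depth(
    child_version: Option<MultiAgentVersion>,
    child_depth: u32,
    agent_max_depth: u32,
) -> Result<(), String> {
    let bypasses_cap = child_version == Some(MultiAgentVersion::V2);
    if !bypasses_cap && child_depth > agent_max_depth {
        return Err(format!(
            "agent depth limit reached: {child_depth} > {agent_max_depth}"
        ));
    }
    Ok(())
}
```

Called between resolving `child_multi_agent_version` and spawning, this would reject a V1 child at depth 3 under a limit of 2 while still letting a V2 child through.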

Comment on lines +111 to +112
let child_multi_agent_version =
    resolve_multi_agent_version_for_config(&session, &config).await?;

P2 Badge Send V1 child spawns as user input

If the requested child model resolves to V1 here, the later match still wraps the initial prompt in Op::InterAgentCommunication whenever the V2 task name creates an agent path. That means a V1-planned child receives an assistant commentary JSON mailbox item instead of a normal user task, without V2 usage hints or tools to make that protocol clear; keep the plain UserInput path when child_multi_agent_version is not V2.
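
A sketch of choosing the initial operation from both the child runtime and the agent path. `InitialOp` is a simplified stand-in for the real `Op` type, and its field names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MultiAgentVersion {
    V1,
    V2,
}

/// Simplified stand-in for the operations mentioned above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum InitialOp {
    UserInput { text: String },
    InterAgentCommunication { agent_path: String, message: String },
}

/// Only a V2 child with an agent path receives the prompt as an inter-agent
/// message; every other child gets a plain user task.
pub fn initial_op(
    child_version: Option<MultiAgentVersion>,
    agent_path: Option<&str>,
    prompt: &str,
) -> InitialOp {
    match (child_version, agent_path) {
        (Some(MultiAgentVersion::V2), Some(path)) => InitialOp::InterAgentCommunication {
            agent_path: path.to_string(),
            message: prompt.to_string(),
        },
        _ => InitialOp::UserInput {
            text: prompt.to_string(),
        },
    }
}
```

Matching on the pair keeps a V1 child on the `UserInput` path even when the V2 task name produced an agent path.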

Contributor

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7de051e1ec

Comment thread codex-rs/core/src/session/review.rs Outdated
Comment on lines +60 to +64
let multi_agent_version = resolve_multi_agent_version(
    &model_info,
    &per_turn_config,
    &parent_turn_context.session_source,
);

P2 Badge Prevent review turns from inheriting V2 tools

When the configured review model is cataloged with multi_agent_version = v2, this resolves the review turn as V2 even though the context below keeps the parent session_source rather than SubAgentSource::Review. The new guard in spec_plan only suppresses collaboration tools for review sources, so a normal CLI review using a V2-tagged review model will expose spawn_agent/wait_agent during review mode despite the review-only tool restrictions. Resolve this with a review source or otherwise force collaboration off for review turns.

Contributor

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e0edbe022f

Comment on lines +542 to +543
let multi_agent_version = match &conversation_history {
    InitialHistory::Resumed(_) => conversation_history.get_multi_agent_version(),

P1 Badge Preserve collaboration tools for legacy resumed threads

When resuming a rollout written before this field existed, get_multi_agent_version() returns None because the old session_meta line has no multi_agent_version; this branch then stores None for the session instead of falling back to the current model/config selector. Since spec_plan now only adds collaboration tools when turn_context.multi_agent_version.is_some(), any legacy thread resumed after upgrading loses spawn_agent/wait_agent even when Collab or MultiAgentV2 is enabled. Please fall back to resolve_multi_agent_version(...) when resumed history has no persisted selector.
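
The requested fallback can be sketched as follows. `InitialHistory` here is a simplified stand-in that carries only the persisted selector, and `session_version` plus its `resolve_current` closure are hypothetical; the closure stands in for the call to `resolve_multi_agent_version(...)`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MultiAgentVersion {
    V1,
    V2,
}

/// Simplified stand-in: a resumed rollout may or may not have persisted a
/// selector, depending on when it was written.
pub enum InitialHistory {
    New,
    Resumed {
        persisted_version: Option<MultiAgentVersion>,
    },
}

/// A persisted selector is honored as-is. New sessions and legacy rollouts
/// without the field resolve from the current model and config instead.
pub fn session_version(
    history: &InitialHistory,
    resolve_current: impl FnOnce() -> Option<MultiAgentVersion>,
) -> Option<MultiAgentVersion> {
    match history {
        InitialHistory::Resumed {
            persisted_version: Some(version),
        } => Some(*version),
        _ => resolve_current(),
    }
}
```

A legacy rollout then resumes with whatever the current flags select, so its collaboration tools survive the upgrade.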

@aibrahim-oai
Collaborator Author

Superseded by the clean two-PR stack:

Closing this PR in favor of the smaller replacements.

@aibrahim-oai
Collaborator Author

Superseded by the final three-PR stack:

The split keeps storage, runtime behavior, and catalog resolution independently reviewable.
