feat: carry Foundry agent and resource definitions in azure.yaml#8779
feat: carry Foundry agent and resource definitions in azure.yaml#8779huimiu wants to merge 55 commits into
Conversation
…dry service target
jongio
left a comment
There was a problem hiding this comment.
Solid architectural change. The decomposition of Foundry resources into individual azure.yaml service entries with uses: ordering is well-designed, and the fallback collectors (sibling service wins, bundled agent config is the legacy fallback) are cleanly implemented with good test coverage. A few observations below.
jongio
left a comment
There was a problem hiding this comment.
Incremental review (22 commits since last pass at 01463ff). CI is green, overall architecture is clean. The new pkg/foundry package is well-designed: proper cycle detection, depth bounds, error typing, and good test coverage. The resource-service split across per-host extensions is consistent and well-factored.
Prior unresolved items (4 comments from my earlier review remain open with no replies):
sanitizeServiceNameonly strips spaces (resource_services.go:193)- Path traversal in
resolveSkillInstructions(skills service_target.go:180) - Silent yaml.Marshal skip (agent_definition.go:348)
- Silent empty connName skip (resource_services.go:87)
I won't re-post these, but they still apply to the current HEAD.
New observations on the pkg/foundry package:
-
ExpandEnvsentinel restoration correctness (templating.go:68-70): The sentinel is guaranteed absent from the input, but env var expansion runs after masking. If an env var resolves to a string containing the sentinel prefix (azdFoundryTemplateSpan_0_),strings.Replacewill match it and inject a Foundry span where it doesn't belong. Extremely unlikely in practice, but the correctness invariant ("Foundry expressions are always returned byte-for-byte") is violated in that case. A simple fix: after expansion, verify each placeholder appears exactly once before replacing, or use a sentinel that encodes a random token per call. -
looksLikeInstructionsPathheuristic (includes.go:262-268): Inline prose that ends with.mdor.txt(e.g., "Follow the steps in README.md") will be incorrectly classified as a file path and rebased. The current check (single-line + suffix) is a reasonable heuristic, but consider also requiring a path separator or./prefix, or at least documenting the false-positive risk so authors of$ref'd files know to avoid ambiguous instructions values. -
Nil map-value guard (resource_services.go:222-229, 255-261, 287-293): The
collectProjectDeployments/collectConnections/collectToolboxescollectors iterate the services map and dereference each value's Host field. If a map ever contains a nil value (unlikely from azd core, but possible from test code or future refactors), this panics. A nil guard insortedServiceswould be cheap insurance.
All three are low-to-medium severity. The overall approach is solid; the design of ResolveFileRefs + YAMLDocument as a read/write pair sharing path logic is thoughtful. Ship it once the prior items are addressed.
jongio
left a comment
There was a problem hiding this comment.
Incremental review (1 commit since last pass at e59e938). The fix commit d6fd4da is clean and well-scoped:
-
Legacy host restored:
microsoft.foundryadded toFoundryServiceHostsfor backward compat. Tests cover mixed-host ambiguity detection. Schemarefcorrectly points to/main/...microsoft.foundry.json. -
Whitespace endpoint trimming:
strings.TrimSpaceapplied consistently infoundryServiceEndpoint,warnNetworkIgnoredInBrownfield, and the synthesizer's brownfield check. Blank endpoints now correctly trigger greenfield mode. Test added. -
Missing brownfield warning:
warnNetworkIgnoredInBrownfieldwas only called on the synthesizer path. Now also called on the on-disk Bicep path whenendpoint:is set. Good catch.
No concerns. CI is green for the core pipelines; agents-public build is still pending (expected timing). LGTM on this increment.
Azure Dev CLI Install InstructionsInstall scriptsMacOS/Linux
bash: pwsh: WindowsPowerShell install MSI install Standalone Binary
MSI
Documentationlearn.microsoft.com documentationtitle: Azure Developer CLI reference
|
jongio
left a comment
There was a problem hiding this comment.
Incremental review (5 new commits since d6fd4da). All five address prior review feedback:
- Path traversal rejection in
resolveSkillInstructionsis correct: checks beforefilepath.Join, covers\and/separators, has test coverage for nested traversal (sub/../../escape.md). - Warnings for empty sanitized names are user-facing on stderr, consistent with
WarnLegacyAgentShape. - Schema docs now accurately reflect that remote URLs aren't supported.
azure.ai.*/schemas/*.jsonglob ensures cross-extension `` resolution works offline.- Marshal failure logging makes the validation skip visible during troubleshooting.
No new issues found in this increment.
| @@ -1,7 +1,7 @@ | |||
| // Copyright (c) Microsoft Corporation. All rights reserved. | |||
There was a problem hiding this comment.
Why are these files moved into the top level azd pkg?
| // instead of service-level properties. Tracks migration of older Foundry | ||
| // projects off the legacy shape. | ||
| FoundryAgentLegacyConfigKey = AttributeKey{ | ||
| Key: attribute.Key("foundry.agent.legacy_config_shape"), |
There was a problem hiding this comment.
What actually sets this? Is this something the azd folks are alright with being added to telemetry?
| @@ -0,0 +1,496 @@ | |||
| // Copyright (c) Microsoft Corporation. All rights reserved. | |||
There was a problem hiding this comment.
What is the plan to stop supporting old formats? Adding this file now gives at least a third place where we have agent objects defined, making it all that much more difficult to make sure things are being updated in the right places. Would be good if we have an in place plan to switch completely and remove all of the extra logic
| @@ -1,6 +1,6 @@ | |||
| { | |||
There was a problem hiding this comment.
Why do we need both these files, and the individual schemas in the other extensions?
Summary
azure.yamlbecomes the single source of truth for a Foundry agent and its companion resources. The agent definition moves off disk into the agent's service entry, and every Foundry resource — the project and its model deployments, connections, toolboxes, skills, and routines — becomes a first-class service with a realazd deploytarget. Existing projects keep working through legacy fallbacks that emit deprecation signals.What changed
azure.yaml. Carried inline on theazure.ai.agentservice entry instead of a separate on-diskagent.yaml.initstops writing that file; every agent command (deploy, run, listen, update, endpoint show) reads from the service entry. Projects on the old on-disk shape still load, with a deprecation warning and a telemetry signal.host: azure.ai.<kind>entries, each owned by its extension.azd deploynow performs a real upsert per resource (previously placeholders);initwrites the entries and wires their provision/deploy order.endpoint:). When a Foundry service setsendpoint:, provisioning is skipped and azd connects to that project instead of creating one.${VAR}from the azd environment at deploy time, while server-side${{...}}placeholders are preserved.azure.yamlJSON schema (stable + alpha) gains per-host validation for all six Foundry hosts; the legacy singlemicrosoft.foundryhost stays accepted so existing projects keep validating.$refresolver, theazure.yamledit helper, and the${VAR}expander move into core (pkg/foundry) so all Foundry extensions share one implementation.Before merge
The
connections,routines, andtoolboxesextensions temporarilyreplacethe azd core dependency with the in-tree checkout to build against the newpkg/foundryhelpers. Remove these directives once the core change lands and bump the published dependency.