Sync with Microsoft ONNX Runtime - 09062026 #1127
Merged
### Description

Limit CUDA compile concurrency for NodeJS build to 1.

### Motivation and Context

NodeJS is hanging on Nuget CUDA 13 pipeline. microsoft#28736 introduced a hack fix that explicitly limited concurrency for CUDA compilation for some tasks, but did not apply it to NodeJS. Apply the same hack to NodeJS.
### Summary

Two new ML Program op builders, both produced by transformer attention-mask graphs:

- **`Where` → ML Program `select`.** `WhereOpBuilder` gates the X/Y branches to float / float16 and requires `cond` to be bool.
- **`And` → ML Program `logical_and`,** via a new `LogicalOpBuilder`. Inputs must be bool.

Both are ML-Program-only; `IsOpSupportedImpl` rejects them on the NeuralNetwork format so such nodes fall back to CPU.

### Depends on the bool-Cast PR

`And`'s inputs and output are all bool, and a CoreML partition cannot have bool I/O, so a meaningful `And` test sandwiches it between `int ↔ bool` casts (the bool stays internal). This branch is therefore **stacked on `coreml-cast-bool`**: the `cb43b7c75f` commit in this PR is the bool-Cast PR and will drop from this diff once that one merges (via `git merge main`).

`Where` needs no such scaffolding: its `cond` can be a constant initializer and X/Y/output are float.

### Tests (`coreml_basic_test.cc`)

- `Where_MLProgram`: Where with a constant bool `cond` runs on CoreML, matches CPU.
- `WhereNeuralNetworkNotSupported`: Where falls back on the NeuralNetwork format.
- `WhereNonFloatBranchesNotSupported`: an int32 Where falls back to CPU.
- `And_MLProgram`: a `Cast → And → Cast` chain runs fully on CoreML, matches CPU.
- `AndNeuralNetworkNotSupported`: the chain falls back on the NeuralNetwork format.

Doc: `coreml_supported_mlprogram_ops.md` lists `And` and `Where`.

### Series: CoreML EP coverage for transformer / diffusion graphs

- microsoft#28595: Support bool Cast in ML Program *(prerequisite)*
- microsoft#28596: Add Sin and Cos unary ops *(independent)*
- **microsoft#28597: Add Where and And builders** *(this PR, depends on microsoft#28595)*
- microsoft#28598: Add GatherND builder *(depends on microsoft#28595)*

Together with microsoft#28278 (scalar-`Gather`), the series takes BERT / GPT-2 / ViT / diffusion-UNet graphs (tiny and full-size) from 2 CoreML partitions to 1, with zero graph breaks.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
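The gating rules and the `Cast → And → Cast` test shape described in this commit message can be modelled in a few lines of pure Python. This is only a sketch of the stated rules, not the CoreML EP's C++ implementation: `is_supported`, `where`, and `cast_and_cast` are hypothetical names, and element types are represented as plain strings for illustration.

```python
# Illustrative only: a pure-Python model of the support checks described in
# the PR text. All names here are hypothetical, not the EP's actual API.

FLOAT_TYPES = {"float", "float16"}


def is_supported(op_type, input_types, ml_program):
    """Return True if the node would be claimed by CoreML under the stated rules."""
    if not ml_program:
        # Both builders are ML-Program-only; on NeuralNetwork the node falls back to CPU.
        return False
    if op_type == "Where":
        cond, x, y = input_types
        # cond must be bool; the X/Y branches are gated to float / float16.
        return cond == "bool" and x in FLOAT_TYPES and y in FLOAT_TYPES
    if op_type == "And":
        # All inputs must be bool.
        return all(t == "bool" for t in input_types)
    return False


def where(cond, x, y):
    """Element-wise select: take x where cond is true, else y (no broadcasting)."""
    return [xv if c else yv for c, xv, yv in zip(cond, x, y)]


def cast_and_cast(a, b):
    """Model of the Cast -> And -> Cast chain: int in, int out, bool stays internal."""
    a_bool = [v != 0 for v in a]          # Cast int -> bool
    b_bool = [v != 0 for v in b]
    both = [p and q for p, q in zip(a_bool, b_bool)]  # And
    return [int(v) for v in both]         # Cast bool -> int
```

For example, `is_supported("Where", ["bool", "int32", "int32"], True)` is false under these rules, which corresponds to the `WhereNonFloatBranchesNotSupported` case listed above.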
…ft#28839)

### Description

> **NOTE: Replacement of microsoft#28781. The original PR was accidentally affected by an unexpected agentic AI commit.**

Fix a pytest collection error in the Windows GPU CUDA CI pipeline caused by `test_convert_tf_models_to_pytorch.py` failing to locate `convert_tf_models_to_pytorch.py` at module load time.

Two changes were made:

- **`onnxruntime/test/python/transformers/test_convert_tf_models_to_pytorch.py`**: Updated the path resolution logic to first check if `convert_tf_models_to_pytorch.py` is in the same directory as the test file (the case after CMake copies it into the build output), then fall back to the correct source-tree path using `parents[2]` (resolving to the `onnxruntime/` subdirectory, three levels up from `test/python/transformers/`).
- **`cmake/onnxruntime_python.cmake`**: Added a `cmake -E copy` command to deploy `convert_tf_models_to_pytorch.py` from `onnxruntime/python/tools/transformers/` into the `transformers/` build output directory alongside the existing test files, so the first-choice path resolution works in CI.

### Motivation and Context

The CI job `Windows GPU CUDA CI Pipeline Test Job` was failing during pytest collection with:

```
FileNotFoundError: 'D:\\a\\_work\\_temp\\onnxruntime\\python\\tools\\transformers\\convert_tf_models_to_pytorch.py'
```

The test used a hardcoded `parents[4]` offset assuming the test file resided at `onnxruntime/test/python/transformers/` in the source tree (where `parents[4]` = repo root). In CI, pytest test files are copied to the build output directory (e.g. `$runner_temp/build/RelWithDebInfo/RelWithDebInfo/transformers/`), where `parents[4]` resolves to the runner's temp directory rather than the workspace root, so the source file was never found and collection aborted with exit code 2.

---------

Co-authored-by: Copilot <copilot@github.com>
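The failure mode and the "same directory first, then fall back" lookup described in this commit message can be sketched with `pathlib`. This is a minimal illustration, not the code in the test file: `locate_script` and `demo` are hypothetical helpers, and `/repo` and `/runner_temp` are placeholder roots standing in for the workspace and the runner's temp directory.

```python
import tempfile
from pathlib import Path, PurePosixPath

SCRIPT_NAME = "convert_tf_models_to_pytorch.py"
TEST_NAME = "test_convert_tf_models_to_pytorch.py"

# In the source tree, parents[4] of the test file is the repo root ...
src = PurePosixPath("/repo/onnxruntime/test/python/transformers") / TEST_NAME
# ... but in the build output the same index lands in the runner temp dir.
ci = PurePosixPath("/runner_temp/build/RelWithDebInfo/RelWithDebInfo/transformers") / TEST_NAME


def locate_script(test_file, source_fallback):
    """Prefer a copy next to the test file; otherwise use the source-tree path."""
    sibling = Path(test_file).resolve().parent / SCRIPT_NAME
    if sibling.is_file():
        return sibling
    fallback = Path(source_fallback)
    if fallback.is_file():
        return fallback
    raise FileNotFoundError(SCRIPT_NAME)


def demo():
    """Simulate the build-output layout, where the script sits beside the test."""
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp) / "transformers"
        out_dir.mkdir()
        (out_dir / TEST_NAME).write_text("")
        (out_dir / SCRIPT_NAME).write_text("")
        missing = Path(tmp) / "missing" / SCRIPT_NAME  # fallback is absent here
        found = locate_script(out_dir / TEST_NAME, missing)
        return found.name, found.parent.name
```

Here `src.parents[4]` is `/repo` while `ci.parents[4]` is `/runner_temp`, which is why a fixed offset breaks once the test file is copied elsewhere. Checking the sibling location first makes the lookup independent of how deep the build output directory is.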
…soft#28800)

This pull request improves memory management and exception safety in the ONNX Runtime Model Editor C and C++ APIs, particularly around ownership transfer of graph/model components (inputs, outputs, initializers, nodes, and graphs). The changes ensure that ownership is only transferred on success, preventing double-free and dangling pointer issues, and update documentation and types to reflect the new ownership semantics.

**Key changes:**

### Memory Management and Ownership Semantics

- Added custom deleters (`OrtValueDeleter`, `OrtValueInfoDeleter`, `OrtNodeDeleter`, `OrtGraphDeleter`) for all major ONNX Runtime types, ensuring destruction always routes through the correct API release functions and preventing accidental double-free or memory leaks. (`onnxruntime/core/graph/model_editor_api_types.h`)
- Updated internal storage in `ModelEditorGraph` to use `unique_ptr` with the appropriate custom deleters for inputs, outputs, initializers, and nodes, enforcing correct ownership and destruction. (`onnxruntime/core/graph/model_editor_api_types.h`)

### API and Documentation Improvements

- Clarified and expanded documentation for ownership transfer and atomicity in the C API (`onnxruntime_c_api.h`), specifying all-or-nothing behavior: ownership is only transferred on success, and pointers are nulled out to make the transfer explicit. (`onnxruntime/core/session/onnxruntime_c_api.h`)
- Updated C++ API comments and signatures to reflect strong exception safety: ownership is only transferred if the operation succeeds, and on failure, input objects remain unchanged and owned by the caller. (`onnxruntime/core/session/onnxruntime_cxx_api.h`)

### Implementation Updates

- Modified C++ API implementations to transfer ownership only after a successful call, using `release()` only after the API call succeeds. This pattern is now used for adding initializers, nodes, and graphs. (`onnxruntime/core/session/onnxruntime_cxx_inline.h`)
- Updated model editor code to handle new pointer types and ownership semantics, including moving out of `unique_ptr` when consuming initializers. (`onnxruntime/core/graph/graph.cc`)

### Minor Cleanups

- Removed unused or redundant `owned_` flags from model editor types, as ownership is now tracked via smart pointers. (`onnxruntime/core/graph/model_editor_api_types.h`)
- Improved documentation consistency and removed unnecessary `OrtApi::` prefixes in comments. (`onnxruntime/core/session/onnxruntime_c_api.h`)

These changes collectively make the ONNX Runtime Model Editor API safer and more robust, especially in the face of errors or exceptions.
ankitm3k
approved these changes
Jun 9, 2026
Automated daily backmerge from ORT main to ovep-develop. No conflicts detected. Do NOT squash or rebase - use merge commit only.