[SYCL][UR][L0] Restrict USM residency to peers with enabled P2P access#21889
Conversation
7e50ca5 to
d0b4788
Compare
There was a problem hiding this comment.
Pull request overview
Implements peer-access–driven memory residency management for the Level Zero v2 adapter, wiring ext_oneapi_enable_peer_access/disable through UR to update USM pool residency, and adjusting SYCL’s peer-access API to avoid cross-platform usage.
Changes:
- Add L0 v2 peer-access implementation that toggles per-device peer state and propagates residency updates to all tracked contexts.
- Extend USM pool/provider plumbing to support runtime resident-device changes and add pool-manager iteration helpers.
- Update SYCL peer-access enable/disable to validate platforms; add initial (currently placeholder) UR tests.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 14 comments.
Show a summary per file
| File | Description |
|---|---|
| unified-runtime/test/adapters/level_zero/v2/memory_residency.cpp | Adds multi-device residency tests for peer-access (currently placeholders). |
| unified-runtime/source/common/ur_pool_manager.hpp | Adds descriptor helpers and pool-manager iteration with descriptor access. |
| unified-runtime/source/common/backtrace_lin.cpp | Introduces a constant for max backtrace frames. |
| unified-runtime/source/adapters/level_zero/v2/usm_p2p.cpp | New L0 v2 implementation for peer access enable/disable/info and context propagation. |
| unified-runtime/source/adapters/level_zero/v2/usm.hpp | Exposes USM pool API to change resident devices. |
| unified-runtime/source/adapters/level_zero/v2/usm.cpp | Updates provider creation to use peer-enabled residency model and adds residency-change plumbing. |
| unified-runtime/source/adapters/level_zero/v2/memory.cpp | Switches P2P eligibility check to the new “enabled peers” model. |
| unified-runtime/source/adapters/level_zero/v2/context.hpp | Adds APIs to query enabled peer relationships and to propagate residency changes. |
| unified-runtime/source/adapters/level_zero/v2/context.cpp | Removes precomputed P2P tables; tracks contexts; adds peer-access query helpers and residency propagation. |
| unified-runtime/source/adapters/level_zero/usm_p2p.cpp | Updates v1 behavior to log enable/disable as ignored (always enabled). |
| unified-runtime/source/adapters/level_zero/platform.hpp | Updates platform comment to reflect v2 peer-access usage of tracked contexts. |
| unified-runtime/source/adapters/level_zero/platform.cpp | Initializes per-device peer tables based on L0 P2P capability/properties. |
| unified-runtime/source/adapters/level_zero/device.hpp | Adds peer-status table to devices and stream operators. |
| unified-runtime/source/adapters/level_zero/device.cpp | Implements stream operators for device id and peer status. |
| unified-runtime/source/adapters/level_zero/context.cpp | Minor comment adjustment around context tracking in v1. |
| unified-runtime/source/adapters/level_zero/CMakeLists.txt | Moves usm_p2p.cpp into the v2 adapter build. |
| sycl/source/device.cpp | Adds same-platform validation for enable/disable peer access calls. |
| .github/copilot-instructions.md | Expands repository instructions/documentation for Copilot usage. |
d0b4788 to
24eaab3
Compare
174e3f1 to
e2e57bf
Compare
e2e57bf to
8346b35
Compare
8346b35 to
9e50ee8
Compare
9e50ee8 to
65c706f
Compare
65c706f to
a2b54f4
Compare
a2b54f4 to
08a08ab
Compare
|
@intel/llvm-gatekeepers please consider merging |
|
@ldorau, rebase please. |
@kswiecicki Done |
- Skip peers with disabled P2P in makeProvider (USM pool creation) - Add urUsmP2PEnablePeerAccessExp / urUsmP2PDisablePeerAccessExp - Track per-device peer status in ur_device_handle_t_::peers[] - Update existing USM pool residency on P2P enable/disable Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
- Fill in three placeholder multi-device tests in memory_residency.cpp - Tests verify P2P-driven residency: absent-on-peer without P2P, enable/disable state machine checks, end-to-end data transfer Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Extract common logic from ext_oneapi_enable_peer_access and ext_oneapi_disable_peer_access into a templated p2pAccessHelper function to avoid code duplication. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
The disablePeerAccessStateMachineAndSourceAllocationPersists test was failing intermittently because deferred frees from the preceding test complete asynchronously, causing UR_DEVICE_INFO_GLOBAL_MEM_FREE to report more free memory than the baseline captured at the start of the test. Remove the unreliable source-device free-memory assertion and the allocation it required, keeping only the state-machine checks (disable succeeds, double-disable returns UR_RESULT_ERROR_INVALID_OPERATION). The source-device allocation property is already covered by allocatingDeviceMemoryWillResultInOOM which runs first in isolation.
…oreEnablingPeerAccess Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
The peer table lives on peerDevice: peerDevice->peers[commandDevice->Id] tracks whether commandDevice is allowed to access peerDevice's allocations. Update urUsmP2PChangePeerAccessExp to lock peerDevice's mutex, read/write peerDevice's peer table, use peerDevice's platform for context iteration, and pass (peerDevice, commandDevice) to changeResidentDevice and validateP2PDevicePair. Also fix urUsmP2PPeerAccessGetInfoExp to query the peer table on peerDevice rather than commandDevice. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Add a P2P peer-status check in command_list_manager::appendUSMMemcpy. When both source and destination are device memory and the source resides on a different device, the adapter queries the source device's peer table to verify that access has been granted to the queue's device. Returns UR_RESULT_ERROR_INVALID_OPERATION if P2P access has not been enabled. Copies to host or shared memory are always allowed regardless of P2P state. Previously, zeCommandListAppendMemoryCopy would silently succeed for cross-device copies via the copy engine regardless of P2P state, making it impossible to test that ext_oneapi_disable_peer_access actually revokes access. Also adds negative-pair tests that verify urEnqueueUSMMemcpy fails when P2P is disabled: - enablePeerAccessStateMachineAndSourceAllocationFailsWithoutP2P - p2pReadFailsWithoutPeerAccessDisabled Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Adds sycl/test-e2e/USM/P2P/p2p_usm_residency.cpp to verify that the Level Zero v2 adapter correctly handles P2P access for USM device memory between peer devices. Phase 1: Allocates memory on dev0, fills it with a known pattern, enables P2P access from dev1 to dev0, then uses dev1's queue to copy the data to the host and verifies all values match. Phase 2 (opposite direction): Allocates memory on dev1, fills it with a different pattern, enables P2P access from dev0 to dev1, then uses dev0's queue to copy the data to the host and verifies correctness. Phase 3 (negative): Enables then disables P2P access from dev1 to dev0, then attempts a memcpy via dev1's queue. The test passes if the memcpy throws an exception or if the copied data does not match the original fill pattern, confirming that ext_oneapi_disable_peer_access actually revokes access. Also adds the 'two-or-more-gpu-devices' lit feature to lit.cfg.py, set when sycl-ls reports at least two GPU devices. The test uses this feature to skip on single-GPU machines. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
The source-device free-memory check in enablePeerAccessStateMachineAndSourceAllocation was failing intermittently because deferred frees from earlier tests complete asynchronously, causing UR_DEVICE_INFO_GLOBAL_MEM_FREE to report more free memory than the baseline. Remove the unreliable assertion. The test's actual value is in verifying the P2P state machine (double-enable returns error) and the end-to-end data transfer correctness. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Disable P2P access before freeing the allocation. The previous order freed the memory while P2P was still enabled, leaving a brief window where the peer device held access rights to a released allocation. The correct cleanup sequence is to revoke peer access first and then free the memory. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Replace the nested if-ladder in appendUSMMemcpy with a flat static helper function checkP2PAccess that uses early returns, as suggested in the review. The logic is identical; the refactoring makes the control flow easier to follow. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
…bled test The old comment claimed that P2P must be enabled before the allocation, which is incorrect. urUsmP2PEnablePeerAccessExp calls changeResidentDevice on all contexts after updating the peer-status table, which retroactively makes already-existing allocations resident on the peer device. Enabling P2P after allocation is therefore equally valid, as demonstrated by the allocBeforeEnablingPeerAccess test. Update the comment to reflect the actual adapter behaviour. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
The test exercises exactly the same code path as the existing enablePeerAccessStateMachineAndSourceAllocationFailsWithoutP2P: both allocate on devices[0] without P2P, attempt a copy via devices[1]'s queue, and assert UR_RESULT_ERROR_INVALID_OPERATION. Remove the duplicate to avoid maintaining two tests that provide no additional coverage. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
The test was calling Devs[0].ext_oneapi_enable_peer_access(Devs[1]), which sets Devs[1]->peers[Devs[0].Id] = ENABLED, meaning Devs[0] can access Devs[1]'s memory. However the actual P2P copy was Queues[1].copy(arr0, arr1, N) — Devs[1]'s queue reading arr0 which lives on Devs[0] — which requires the opposite direction: Devs[0]->peers[Devs[1].Id] = ENABLED, set by Devs[1].ext_oneapi_enable_peer_access(Devs[0]). The wrong direction was harmless with the L0v1 adapter (no peer-access check on memcpy), but the L0v2 adapter's checkP2PAccess enforces the correct direction and returned UR_RESULT_ERROR_INVALID_OPERATION. Also add ext_oneapi_disable_peer_access before freeing arr0, consistent with the rule that peer access should be revoked before the guarded allocation is released. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Verify that revoking peer access prevents subsequent USM copies. The test enables P2P (devices[1] -> devices[0]), confirms that a urEnqueueUSMMemcpy succeeds, then disables P2P and asserts that the same copy returns UR_RESULT_ERROR_INVALID_OPERATION. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Verify the transition from blocked to permitted: attempt a USM copy from devices[1]'s queue without P2P (expects UR_RESULT_ERROR_INVALID_OPERATION), then enable P2P on the same allocation and retry (expects success with correct data). Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Add testP2PReadFailsWithoutEnable: allocates on dev0, attempts a device-to-device memcpy via dev1's queue without ever calling ext_oneapi_enable_peer_access, and asserts a SYCL exception is thrown. This covers the case where P2P was never enabled (Phase 3 already covers the revoke case). Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
… P2P enable Add testP2PReadFailsThenSucceedsAfterEnable: attempts a device-to-device memcpy without P2P (expects a SYCL exception), then enables P2P on the same allocation and retries (expects success with correct data), then disables P2P and cleans up. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
|
@kswiecicki I will update the PR's description in a few minutes. |
Done |
|
@mateuszpn Have merge conflicts in |
|
@intel/llvm-gatekeepers please consider merging |
|
Need to fix one thing. |
|
Fixed issue found after rebase and incorrectly resolving the last merge conflict - thank you @mateuszpn ! |
When urContextCreate reuses the driver default context via zeDriverGetDefaultContext, the resulting ur_context_handle_t_ was not added to hPlatform->Contexts. urUsmP2PEnablePeerAccessExp / urUsmP2PDisablePeerAccessExp iterate Platform->Contexts to call changeResidentDevice() on every live context, so any context created through that fast path would silently miss all P2P residency updates — memory would not be made resident on newly enabled peers, and would not be evicted on disable. Fix by adding the context to hPlatform->Contexts in the fast path, the same way the regular zeContextCreate path and urContextCreateWithNativeHandle already do. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
… memory residency tests The tests were using 1 MB allocations, which fall within the disjoint pool's MaxPoolableSize of 4 MB for device USM. Pooled allocations are not immediately returned to the UMF memory provider on free; they stay in the pool's free-list cache with stale per-device residency from the preceding test. This cross-test interference caused allocationNotResidentOnPeerWithoutP2P to observe a free-memory decrease on the peer device that exceeded allocSize, failing the assertion. Define a single file-scope constant kAllocSize = 5 MB (> 4 MB) and use it in all tests. Allocations above MaxPoolableSize bypass the pool's free-list cache and go directly to the UMF level_zero memory provider, where residency is set at allocation time based on the current P2P state, and released when the allocation is freed back to the OS. Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
|
@intel/llvm-gatekeepers please merge |
Implement P2P access control for the Level Zero v2 adapter so that USM
device memory is made resident on a peer device only when peer access has
been explicitly enabled by the user.
platform.cpp: Populate a per-device peers[] table during device-cache
initialisation using zeDeviceGetP2PProperties + zeDeviceCanAccessPeer.
Each connectable pair starts as DISABLED; pairs without hardware
connectivity are NO_CONNECTION.
device.hpp/cpp: Introduce PeerStatus enum (NO_CONNECTION / DISABLED /
ENABLED) and a peers[] vector on ur_device_handle_t_.
usm_p2p.cpp (v2, new): Implement P2P enable/disable/query. Enable/disable
update peerDevice->peers[commandDevice->Id] and call
Context->changeResidentDevice() for every context on the platform.
context.cpp/hpp: Replace the static p2pAccessDevices table and
getP2PDevices() with two dynamic helpers that consult the live peers[]
table: getDevicesWhoseAllocationsCanBeAccessedFrom() and
getDevicesWhichCanAccessAllocationsPresentOn(). Add
changeResidentDevice() to propagate P2P changes to all USM pools.
usm.cpp/hpp: Update makeProvider() to use
getDevicesWhichCanAccessAllocationsPresentOn() so providers only pin
memory on peers with actively enabled access. Add changeResidentDevice()
to ur_usm_pool_handle_t_ which calls
umfLevelZeroMemoryProviderResidentDeviceChange() for the affected peer.
command_list_manager.cpp: Add checkP2PAccess() helper and call it from
appendUSMMemcpy() to reject copies between devices without enabled P2P,
returning UR_RESULT_ERROR_INVALID_OPERATION.
ur_pool_manager.hpp: Add supportsResidentDevices() and
forEachPoolWithDesc() to pool_descriptor / pool_manager.
sycl/source/device.cpp: Extract p2pAccessHelper<> template shared by
ext_oneapi_enable_peer_access and ext_oneapi_disable_peer_access to
avoid duplication and enforce the same-platform constraint.
Co-authored-by: Łukasz Ślusarczyk <lukasz.slusarczyk at intel.com>
Co-authored-by: Lukasz Dorau <lukasz.dorau at intel.com>