Fix data race in NetworkTransport send/receive continuations#207
Closed
1amageek wants to merge 3 commits intomodelcontextprotocol:mainfrom
Closed
Fix data race in NetworkTransport send/receive continuations#2071amageek wants to merge 3 commits intomodelcontextprotocol:mainfrom
1amageek wants to merge 3 commits intomodelcontextprotocol:mainfrom
Conversation
Remove Task { @mainactor in } wrapper and mutable continuation guard
flags that caused data races under Swift 6 strict concurrency.
NWConnection completion handlers are guaranteed to fire exactly once,
making the double-resume guard unnecessary.
- Extract reconnection logic into actor-isolated handleSendError()
- Resume continuations directly in NWConnection callbacks
- Simplify receiveData() by removing unnecessary guard pattern
- Concurrent sends: 100 parallel sends to surface continuation races - Send error with reconnection: verify continuation resumes and handleSendError triggers reconnection asynchronously - Send error without reconnection: verify no reconnection attempt - Receive error: verify continuation resumes without crash - Concurrent sends with intermittent errors: mix of success/failure to verify continuation is always resumed exactly once
- handleSendError: remove fixed 500ms sleep before reconnection. connect() already handles backoff via ReconnectionConfiguration. Properly catch connect() errors instead of swallowing with try?. - Tests: remove Task.sleep-based assertions. Verify send() throws immediately; reconnection behavior is covered by existing tests.
Collaborator
|
Thanks for the PR, I suppose the issue should be fixed here - https://github.com/modelcontextprotocol/swift-sdk/pull/206 but you PR definitely give some thoughts about general improvement of NetworkTransport. I will close this PR because the issue should be fixed already. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Task { @MainActor in }wrapper and mutablevarcontinuation guard flags (sendContinuationResumed/receiveContinuationResumed) that caused data races under Swift 6 strict concurrencyCheckedContinuationdirectly inNWConnectioncallbacks — Apple guarantees these fire exactly once, making double-resume guards unnecessaryhandleSendError(_:)method for safe access toisStopping,isConnected, andreconnectionConfigProblem
The
send()andreceiveData()methods use localvarflags captured by@Sendableclosures to guard against double continuation resume:These flags are neither actor-isolated nor otherwise synchronized, resulting in data race diagnostics under Swift 6 strict concurrency checking.
Fix
Since
NWConnection.send(completion: .contentProcessed)andNWConnection.receivecallbacks are each invoked exactly once per call, the double-resume guard is unnecessary. The fix:varflags andTask { @MainActor in }wrapperscontinuation.resume()directly in the NWConnection callbackhandleSendError(_:), an actor-isolated method that safely accesses actor stateTest plan
Concurrent sends do not cause data races— 100 parallel sends to verify no continuation raceSend error resumes continuation and triggers reconnection— error resumes immediately,handleSendErrorreconnects asynchronouslySend error without reconnection enabled does not reconnect— no reconnection when disabledReceive error resumes continuation without crash— receive error surfaces correctlyConcurrent sends with intermittent errors— mixed success/failure verifies continuation always resumes exactly once