-
Notifications
You must be signed in to change notification settings - Fork 83
Fix race condition causing crash when setting content filter during spin #1372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a race condition that was causing crashes when setting content filters on subscriptions while the node is spinning. The fix involves stopping the subscriber node before modifying the content filter, then restarting it afterwards.
Changes:
- Modified C++ error handling to capture error strings before calling cleanup operations
- Updated test cases to stop spinning before calling
setContentFilter()to prevent race conditions - Added
stop()calls at the end of async test promises to clean up properly
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/rcl_subscription_bindings.cpp | Fixed error handling by capturing error string before cleanup operations that may overwrite it |
| test/test-subscription-content-filter.js | Added stop/spin pattern around setContentFilter() calls and cleanup stop() calls in async promises |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/rcl_subscription_bindings.cpp
Outdated
| std::string error_string; | ||
| if (ret != RCL_RET_OK) { | ||
| error_string = rcl_get_error_string().str; | ||
| rcl_reset_error(); | ||
| } |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable error_string is declared without initialization and is only assigned when ret != RCL_RET_OK. If fini_ret != RCL_RET_OK but ret == RCL_RET_OK (line 300-305), the code will reuse this uninitialized error_string. Consider initializing it to an empty string or using a separate variable for the fini_ret error.
|
|
||
| const p2 = new Promise((resolve) => | ||
| setTimeout(() => { | ||
| this.subscriberNode.stop(); |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is stopping the node inside an async promise timeout without restarting it before the test ends. Unlike lines 299-301 and 372-374, this creates an inconsistent pattern where the node remains stopped. This could cause issues if the test framework expects proper cleanup. Consider whether the node should be left spinning or if this is intentional cleanup.
|
|
||
| const p2 = new Promise((resolve) => | ||
| setTimeout(() => { | ||
| this.subscriberNode.stop(); |
Copilot
AI
Jan 15, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the previous issue, this test stops the node inside an async promise timeout without restarting it before the test ends. This creates an inconsistent pattern with the stop/spin/stop pattern used around setContentFilter() calls (lines 299-301 and 372-374). Consider whether the node should be left spinning or if this is intentional cleanup.
Investigated and resolved an intermittent crash (
SIGABRT/ "pure virtual method called") occurring in the underlying FastDDS layer.Crash log
✔ setContentFilter (2003ms) pure virtual method called terminate called without an active exception Aborted (core dumped) Error: Process completed with exit code 134.Callstack
Root Cause Analysis:
The crash was caused by a race condition between the main JavaScript thread and the background worker thread:
node.spin()is called to run therclexecutor and wait for messages.subscription.setContentFilter()from the main thread invokesrcl_subscription_set_content_filter, which modifies the underlying DDS DataReader.rcl_waitor checking for new data), it leads to memory corruption or accessing an object in an invalid state, resulting in the "pure virtual method called" termination.Fix Implementation:
Updated tests, examples, and documentation to enforce a thread-safe pattern. The node must now explicitly stop the background executor before modifying the content filter, and restart it afterwards:
Fix: #1368