Skip to content

Make sleep in ros2_control node optional#3213

Draft
urfeex wants to merge 6 commits intoros-controls:masterfrom
urfeex:make_sleep_optional
Draft

Make sleep in ros2_control node optional#3213
urfeex wants to merge 6 commits intoros-controls:masterfrom
urfeex:make_sleep_optional

Conversation

@urfeex
Copy link
Copy Markdown
Contributor

@urfeex urfeex commented Apr 13, 2026

Description

The PR makes the sleep that paces the controller manager loop optional. This can be useful in situations where

  • Hardware sends data from an internal control loop at a fixed rate
  • The hardware interface has a blocking read on that data
  • The blocking hardware interface is the only one running in the controller_manager loop or the other ones aren't significantly affected by that.

I am aware, that this would only cover a subset of all applications facing this issue. I was holding this back because of the efforts made in ros-controls/realtime_tools#478 but since this will require more designing, I wanted to propose this simple fix for simple systems at least.

Did you use Generative AI?

no

Additional Information

I've implemented a PoC using this with the ur_robot_driver in UniversalRobots/Universal_Robots_ROS2_Driver#1760. There, I also describe the impact on joint control that this would have.

TODOs

This is a draft PR for now, as a couple of things aren't finalized, yet. However, I would like some input on the first two points on the list below.

  • Verify concept -- I would like to verify that this concept would be accepted before implementing further
  • Finalize parameter naming -- I just picked a parameter name as I thought it made sense from our use case, but it might not be the best one to describe the behavior.
  • Add tests

@urfeex
Copy link
Copy Markdown
Contributor Author

urfeex commented Apr 13, 2026

@destogl @saikishor @bmagyar @christophfroehlich as discussed in the last meeting. This is basically what I had in mind. Could we move forward with that approach?

Copy link
Copy Markdown
Member

@christophfroehlich christophfroehlich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this approach as it is a very simple solution for the setup you have described. I'll test this on my hardware soon.

Comment thread controller_manager/src/ros2_control_node.cpp Outdated
Comment thread controller_manager/src/controller_manager_parameters.yaml Outdated
urfeex and others added 2 commits April 13, 2026 15:47
Co-authored-by: Christoph Fröhlich <christophfroehlich@users.noreply.github.com>
@codecov
Copy link
Copy Markdown

codecov bot commented Apr 13, 2026

Codecov Report

❌ Patch coverage is 50.00000% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.02%. Comparing base (60a7244) to head (695a869).
⚠️ Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
controller_manager/src/ros2_control_node.cpp 50.00% 5 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3213      +/-   ##
==========================================
- Coverage   89.04%   89.02%   -0.03%     
==========================================
  Files         158      158              
  Lines       19587    19569      -18     
  Branches     1589     1586       -3     
==========================================
- Hits        17442    17422      -20     
- Misses       1486     1488       +2     
  Partials      659      659              
Flag Coverage Δ
unittests 89.02% <50.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
controller_manager/src/ros2_control_node.cpp 54.54% <50.00%> (-1.34%) ⬇️

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@christophfroehlich
Copy link
Copy Markdown
Member

I tested this with my hardware, and the startup does not work reliably. But I had to switch the communication library to blocking mode, so I don't know if your change or the library is the culprit.
The system hangs at startup, the rclcpp log is already stopped at loading the hardware component but maybe the logging system already hangs.

For discussion: Is this approach safe?

  • can it ever reach the update even if the system is not at least in inactive state and then burn the CPU because the library may be not connected and not blocking?
  • What happens if the library is buggy, and it sometime or suddenly does not block: Should there be some minimum wait time to avoid crashing the whole system with infinite update rate?

@urfeex
Copy link
Copy Markdown
Contributor Author

urfeex commented Apr 13, 2026

Thanks for testing, I will look into why that could happen.

@urfeex
Copy link
Copy Markdown
Contributor Author

urfeex commented Apr 14, 2026

I've tested around a bit and also implemented a quick testing hardware interface that triggers read() using a fixed timer. I didn't experience any startup issues or even deadlocks. I'll try to clean that up a bit to make it useful for testing edge cases.

Copy link
Copy Markdown
Member

@saikishor saikishor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a couple of thoughts I had in mind

Comment thread controller_manager/src/controller_manager_parameters.yaml Outdated
Comment on lines +171 to +175
else if (hw_sync_enable)
{
std::this_thread::sleep_for(
std::chrono::microseconds(static_cast<int>(hw_sync_min_sleep_time * 1e6)));
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to cause a drift in a long run, does it make sense to measure the time of read -> update -> write and if it doesn't sleep sufficient, them we sleep here with a thorrtle warning

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this will cause a drift in the fault case where read is not blocking. However, for the "designed" use case it shouldn't add any drift as long as hw_sync_min_sleep_time + update time + write time + read time < cycle_time.

However, measuring the cycle time might be the better approach, agreed.

Copy link
Copy Markdown
Member

@saikishor saikishor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me. @urfeex I'm wondering one thing here, if we should use the hw_sync_min_cycle_time to do the comparison or do something like atleast 50% update cycle time has passed or not?. It's just that, if you have multiple controllers, you might reach a microsecond easily, that's why.

What's your take on this? Or do you prefer to keep it this way?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants