Description
When a controller's update() returns ERROR, the controller manager deactivates the entire chain group and then collects fallback controllers from every deactivated controller — not just the one that actually failed. This causes unintended fallback activations and resource conflicts.
Reproduction
Two controllers (controller_a and controller_b) each read a state interface exported by a shared ChainableControllerInterface (controller_hub). This forms a single chain group. Each has its own fallback:
controller_manager:
  ros__parameters:
    controller_hub:
      type: my_package/ControllerHub
    controller_a:
      type: my_package/ControllerA
      fallback_controllers: ["controller_a_safe"]
    controller_b:
      type: my_package/ControllerB
      fallback_controllers: ["controller_b_safe"]
    controller_a_safe:
      type: my_package/ControllerASafe
    controller_b_safe:
      type: my_package/ControllerBSafe
Chain group topology:
controller_a ─┐
controller_b ─┤── read "controller_hub/some_state" ──► controller_hub (ChainableControllerInterface)
controller_a_safe ─┤
controller_b_safe ─┘
All 5 controllers end up in the same chain group because they share a chained interface through controller_hub.
When controller_a fails:
Expected:
Deactivating controllers : [ controller_a controller_b controller_a_safe controller_b_safe controller_hub ]
Activating fallback controllers : [ controller_a_safe ]
Only controller_a's fallback is activated.
Actual:
Deactivating controllers : [ controller_a controller_b controller_a_safe controller_b_safe controller_hub ]
Activating fallback controllers : [ controller_a_safe controller_b_safe ]
Both controller_a_safe and controller_b_safe are activated — even though controller_b did not fail. Since both claim the same command interfaces, this causes:
Resource conflict for controller 'controller_b_safe'. Command interface 'some_joint/position' is already claimed.
Root Cause
Step 1: When a controller's update() returns ERROR, the entire chain group is added to the deactivate list (controller_manager.cpp:3279-3292):
if (controller_ret != controller_interface::return_type::OK)
{
  const std::vector<std::string> & controller_chain =
    loaded_controller.controllers_chain_group;
  for (const auto & chained_controller : controller_chain)
  {
    ros2_control::add_item(rt_buffer_.deactivate_controllers_list, chained_controller);
  }
}
Step 2: Fallback controllers are then collected from every controller in the deactivate list — not just the one that failed (controller_manager.cpp:3301-3316):
for (const auto & failed_ctrl : rt_buffer_.deactivate_controllers_list)
{
  // finds ctrl_it for each deactivated controller ...
  for (const auto & fallback_controller : ctrl_it->info.fallback_controllers_names)
  {
    rt_buffer_.fallback_controllers_list.push_back(fallback_controller);
  }
}
Since deactivate_controllers_list contains the entire chain group, every member's fallback list gets merged — even for controllers that did not fail.
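The merge described in Step 2 can be reproduced outside the controller manager. The following is a minimal sketch, not the actual ros2_control code: `FallbackMap`, `collect_fallbacks`, and `fallbacks_when_controller_a_fails` are hypothetical names that stand in for the loaded-controller list and the collection loop, using the controller names from the reproduction config.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the loaded-controller list: name -> fallback names.
using FallbackMap = std::map<std::string, std::vector<std::string>>;

// Mirrors the Step 2 loop: every deactivated controller contributes its
// fallbacks, whether or not it was the one that returned ERROR.
std::vector<std::string> collect_fallbacks(
  const std::vector<std::string> & deactivate_list, const FallbackMap & controllers)
{
  std::vector<std::string> fallback_list;
  for (const auto & name : deactivate_list)
  {
    const auto it = controllers.find(name);
    assert(it != controllers.end() && "deactivated controller must be loaded");
    for (const auto & fallback : it->second)
    {
      fallback_list.push_back(fallback);
    }
  }
  return fallback_list;
}

// The reproduction scenario: controller_a fails, so the whole chain group
// ends up in the deactivate list.
std::vector<std::string> fallbacks_when_controller_a_fails()
{
  const FallbackMap controllers = {
    {"controller_hub", {}},
    {"controller_a", {"controller_a_safe"}},
    {"controller_b", {"controller_b_safe"}},
    {"controller_a_safe", {}},
    {"controller_b_safe", {}}};
  const std::vector<std::string> chain_group = {
    "controller_a", "controller_b", "controller_a_safe", "controller_b_safe", "controller_hub"};
  return collect_fallbacks(chain_group, controllers);
}
```

Calling `fallbacks_when_controller_a_fails()` yields both `controller_a_safe` and `controller_b_safe`, matching the "Actual" log line above, even though only `controller_a` is assumed to have failed.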
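Step 3: The merged list is activated in STRICT mode (controller_manager.cpp:3340-3345). Because controller_a_safe and controller_b_safe claim the same command interface, the activation request fails with the resource conflict shown above.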
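To illustrate why the merged list cannot be activated, here is a simplified model of exclusive command-interface claiming. It is a sketch under assumptions: `ActivationRequest` and `first_conflicting_controller` are hypothetical and do not correspond to the resource manager's real API. They only capture the rule that a command interface can be claimed by one controller at a time.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical activation request: a controller and the command interfaces it claims.
struct ActivationRequest
{
  std::string controller;
  std::vector<std::string> command_interfaces;
};

// Returns the first controller whose claim collides with an earlier one,
// or an empty string when every claim succeeds.
std::string first_conflicting_controller(const std::vector<ActivationRequest> & requests)
{
  std::set<std::string> claimed;
  for (const auto & request : requests)
  {
    assert(!request.controller.empty() && "empty name is reserved for 'no conflict'");
    for (const auto & iface : request.command_interfaces)
    {
      // insert().second is false when the interface was already claimed.
      if (!claimed.insert(iface).second)
      {
        return request.controller;
      }
    }
  }
  return "";
}
```

With both fallbacks requesting `some_joint/position`, the second request (`controller_b_safe`) is reported as conflicting. With only `controller_a_safe` in the list, no conflict is found.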
Proposed Fix
Only collect fallback controllers from the controller(s) that actually returned ERROR, not from every controller in the chain group:
// Track only the actually-failed controllers (before expanding to chain group)
if (controller_ret != controller_interface::return_type::OK)
{
  ros2_control::add_item(rt_buffer_.failed_controllers_list, loaded_controller.info.name);
  // ... existing chain group deactivation logic ...
}

// Later, collect fallbacks only from actually-failed controllers:
for (const auto & failed_ctrl : rt_buffer_.failed_controllers_list)  // <-- changed
{
  // ... existing fallback collection logic ...
}
Additionally, the fallback list should be deduplicated before activation.
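Both parts of the proposal (collecting only from failed controllers, and deduplicating) can be combined in one pass. The sketch below is standalone and illustrative: `FallbackMap`, `add_unique`, and `collect_fallbacks_of_failed` are hypothetical names, and the real change would operate on `rt_buffer_` inside the controller manager rather than on returned vectors.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the loaded-controller list: name -> fallback names.
using FallbackMap = std::map<std::string, std::vector<std::string>>;

// Appends item only if it is not already present, preserving first-seen order.
void add_unique(std::vector<std::string> & list, const std::string & item)
{
  if (std::find(list.begin(), list.end(), item) == list.end())
  {
    list.push_back(item);
  }
}

// Collects fallbacks only from controllers that actually returned ERROR,
// skipping names that were already collected.
std::vector<std::string> collect_fallbacks_of_failed(
  const std::vector<std::string> & failed_list, const FallbackMap & controllers)
{
  std::vector<std::string> fallback_list;
  for (const auto & name : failed_list)
  {
    const auto it = controllers.find(name);
    assert(it != controllers.end() && "failed controller must be loaded");
    for (const auto & fallback : it->second)
    {
      add_unique(fallback_list, fallback);
    }
  }
  return fallback_list;
}
```

Given the reproduction config and a failed list containing only `controller_a`, this returns just `controller_a_safe`. The linear `std::find` keeps the original ordering and avoids node allocations, which is one reasonable choice for short lists; whether it suits the real-time path is a judgment for the maintainers.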
Environment
- ros2_control: master branch (commit c569292)
- ROS 2 distro: Jazzy / Rolling