Description
Summary
Unconfiguring a hardware component through the ~/set_hardware_component_state service can crash the process. The on_cleanup() function releases resources while the read and write functions of the hardware component may still be accessing them. This happens because the primary state is set to UNCONFIGURED only after on_cleanup() has returned, rather than before it is called.
Problem Description
- After the ~/set_hardware_component_state service is called, the set_hardware_component_state_srv_cb() function is executed. This function calls set_component_state(), which in turn calls cleanup_hardware() when the target state is PRIMARY_STATE_UNCONFIGURED.
- The cleanup_hardware function uses the bind method to call the cleanup method of the SystemInterface, SensorInterface, or ActuatorInterface class, depending on the type of hardware component. The cleanup() function then calls the on_cleanup function of the hardware component class. In this case, on_cleanup is called in the URPositionHardwareInterface class, which is defined in the Universal_Robots_ROS2_Driver repository and inherits from hardware_interface::SystemInterface.
- The on_cleanup function resets pointers and shuts down threads while the read and write functions of the ControllerManager class continue running. These call the corresponding read and write functions of the ResourceManager class, which then invoke the read and write functions of the hardware components through the SystemInterface, SensorInterface, or ActuatorInterface classes. Those functions first check that the state is PRIMARY_STATE_INACTIVE or PRIMARY_STATE_ACTIVE before executing the read and write operations on the hardware component.
- If on_cleanup is called and releases some of the resources while the state of the hardware component has not yet been set to UNCONFIGURED, the read and write functions of the hardware component can still be called. Because they then access resources that have already been released, the process can crash.
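The ordering problem described above can be sketched with a small, deterministic mock. This is only an illustration under stated assumptions: MockComponent, its State enum, and read_hits_released_resource are hypothetical names, not part of the ros2_control API, and the concurrent update loop is simulated by a single read() call issued right after the resource is released.

```cpp
#include <memory>

// Hypothetical stand-ins for the lifecycle states discussed above.
enum class State { Unconfigured, Inactive, Active };

// Hypothetical mock of a hardware component; not the ros2_control API.
struct MockComponent
{
  State state = State::Inactive;
  std::unique_ptr<int> resource = std::make_unique<int>(42);
  bool touched_released_resource = false;

  // Mirrors the state gate described above: read only runs when the
  // component is INACTIVE or ACTIVE.
  void read()
  {
    if (state != State::Inactive && state != State::Active) {
      return;  // gated out, the resource is never touched
    }
    if (!resource) {
      // A real driver would dereference a dangling pointer at this point.
      touched_released_resource = true;
    }
  }

  // Releases the resource, then simulates one update-loop read that lands
  // while the cleanup is still in progress.
  void on_cleanup()
  {
    resource.reset();
    read();
  }

  // set_state_first == false: state changes after on_cleanup (reported behaviour).
  // set_state_first == true: state changes before on_cleanup (proposed behaviour).
  void cleanup(bool set_state_first)
  {
    if (state != State::Inactive) {
      return;
    }
    if (set_state_first) {
      state = State::Unconfigured;
      on_cleanup();
    } else {
      on_cleanup();
      state = State::Unconfigured;
    }
  }
};

// Returns true if the simulated read reached a released resource.
bool read_hits_released_resource(bool set_state_first)
{
  MockComponent component;
  component.cleanup(set_state_first);
  return component.touched_released_resource;
}
```

In this mock, read_hits_released_resource(false) returns true and read_hits_released_resource(true) returns false, because the state gate only protects the resource once the state has left INACTIVE. The mock does not model a read that is already in progress when the state changes.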
Environment:
- OS: Ubuntu 20.04
- Version: Humble
Proposed Solution
To prevent such crashes, the state should be set to UNCONFIGURED before any resources are cleaned up. The state check in the read and write functions then skips the hardware component once cleanup has started, so they no longer access invalid or dangling pointers. The suggestion is therefore to modify the cleanup() function in the SystemInterface, SensorInterface, and ActuatorInterface classes as follows:
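const rclcpp_lifecycle::State & System::cleanup()
{
  if (impl_->get_state().id() == lifecycle_msgs::msg::State::PRIMARY_STATE_INACTIVE)
  {
    // Copy the state being left so that on_cleanup still receives it as the previous state.
    const rclcpp_lifecycle::State previous_state = impl_->get_state();
    // Leave INACTIVE first so that read() and write() are gated out during cleanup.
    impl_->set_state(rclcpp_lifecycle::State(
      lifecycle_msgs::msg::State::PRIMARY_STATE_UNCONFIGURED,
      lifecycle_state_names::UNCONFIGURED));
    switch (impl_->on_cleanup(previous_state))
    {
      case CallbackReturn::SUCCESS:
        // State is already UNCONFIGURED; nothing more to do.
        break;
      case CallbackReturn::FAILURE:
      case CallbackReturn::ERROR:
        impl_->set_state(error());
        break;
    }
  }
  return impl_->get_state();
}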