Increased Failure Rate of Gazebo Service Calls During Long Simulations #564

Open
paoloelle opened this issue Jan 1, 2025 · 8 comments

@paoloelle

paoloelle commented Jan 1, 2025

Environment

  • OS Version: Ubuntu 22.04
  • Source installation tested with Gazebo Fortress and with Gazebo Harmonic (using Apptainer containerization)

Description

I'm running an evolutionary algorithm using ROS 2 Humble and Gazebo Fortress (I also tested it with ROS 2 Jazzy and Gazebo Harmonic via Apptainer containerization). To do this, I need to call Gazebo services to stop and start the simulation and to move various objects in the environment, including my robot. I noticed that after repeatedly calling these services, the overall run slows down, mainly because the service calls often fail. However, the real-time factor doesn't decrease.

To test this undesired behavior, I implemented a simple simulation scenario, without any evolutionary algorithm on top of it. This simulation consists of just stopping the simulation, moving the robot and the objects, and restarting the simulation for 10 seconds. An example is shown in the video below.

simple_sim-2024-12-20_09.56.47.mp4

I let this simple simulation run for 24 hours and plotted the related results.
These show, for each hour of simulation, the total number of service calls, the number of service calls that did not return a successful response, and the percentage of service calls that did. Please ignore hour 25.

Total Service Calls Per Hour

Failed Service Calls Per Hour

Success Percentage of Service Calls Per Hour
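
The three per-hour metrics plotted above (total calls, failed calls, success percentage) can be derived from a simple call log. The following is a minimal sketch, assuming a hypothetical CSV log with one `hour,status` record per line, where status is `ok` or `fail`. The log format and the function name `summarize_calls` are illustrative assumptions, not part of the original setup.

```bash
#!/bin/bash
# Summarize service-call outcomes per hour.
# ASSUMPTION: each input line looks like "<hour>,<ok|fail>"; this log format
# is hypothetical and is not produced by the script posted in this issue.
# Output columns: hour, total calls, failed calls, success percentage.
summarize_calls() {
    awk -F, '
        { total[$1]++; if ($2 == "fail") failed[$1]++ }
        END {
            for (h in total) {
                f = failed[h] + 0   # hours with no failures count as 0
                printf "%d %d %d %.1f\n", h, total[h], f, 100 * (total[h] - f) / total[h]
            }
        }
    ' "$1" | sort -n
}
```

Calling `summarize_calls calls.csv` prints one line per hour, sorted by hour, which can then be fed to any plotting tool.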

@paoloelle added the "bug" (Something isn't working) label Jan 1, 2025
@paoloelle changed the title from "Increased Failure Rate of Gazebo Service Calls During Long Simulations with ROS 2" to "Increased Failure Rate of Gazebo Service Calls During Long Simulations" Jan 1, 2025
@azeey
Contributor

azeey commented Jan 7, 2025

Thanks for the detailed investigation into this issue. We're taking a look at the problem. However, it looks like this is a duplicate of #562. If so, do you mind closing one of them?

@caguero
Collaborator

caguero commented Jan 9, 2025

Is there any chance you could share the code you use to reproduce this issue? Thanks!

@paoloelle
Author

paoloelle commented Jan 10, 2025

Yes, sure. This folder contains the files that I used for the environment:
sdf_files.zip

This is the bash script that I used to start/stop the simulation and move the robot and the objects:

#!/bin/bash

START_TIME=$(date +%s)

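log() {
    # Print a message prefixed with the wall-clock time and the elapsed run time.
    # Local names are used because SECONDS is a special Bash variable.
    local now elapsed hours minutes secs
    now=$(date +%s)
    elapsed=$((now - START_TIME))
    hours=$((elapsed / 3600))
    minutes=$(((elapsed % 3600) / 60))
    secs=$((elapsed % 60))
    printf "[%s][%02d:%02d:%02d] %s\n" "$(date +"%H:%M:%S")" "$hours" "$minutes" "$secs" "$1"
}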

# SIMPLE SCRIPT TO CONTINUOUSLY MOVE OBJECTS AND ROBOT IN SIMULATION

#simulation_id=0

COUNTER=0 # counter for the call to ign services

#export ROS_DOMAIN_ID=$simulation_id && export IGN_PARTITION=$simulation_id

# export IGN_VERBOSE=1 # print more debugging info

#export IGN_IP=127.0.0.1 # FIXME: not sure whether this fixes the error "Exception sending a multicast message:Network is unreachable";
                        # this export should not be necessary


fastdds shm clean # https://github.com/eProsima/Fast-DDS/issues/2003#issuecomment-1160245640
log ""

while true; do

    #ros2 param set /ann_controller genome_id test

    #ros2 param set /ann_controller update_genome_id True
    
    COUNTER=$((COUNTER + 1))
    log "call $COUNTER"
    log "[DEBUG] Pause simulation"
    ign service -s /world/arena/control --reqtype ignition.msgs.WorldControl --reptype ignition.msgs.Boolean --timeout 5000 --req 'pause: true'

    COUNTER=$((COUNTER + 1))
    log "call $COUNTER" 
    log "[DEBUG] Respawn turtlebot"
    # ign service -s /world/arena/set_pose --reqtype ignition.msgs.Pose --reptype ignition.msgs.Boolean --req "name: 'turtlebot4' position: { x: 1, y: 1, z: 1 } orientation: { x: 0, y: 0, z: 0, w: 1}" --timeout 2000
    ign service -s /world/arena/set_pose --reqtype ignition.msgs.Pose --reptype ignition.msgs.Boolean --req 'name: "turtlebot4" position: { x: 1.0, y: 1.0, z: 0.2 } orientation: { x: 0.0, y: 0.0, z: 0.0, w: 1.0 }' --timeout 5000
    sleep 1

    COUNTER=$((COUNTER + 1))
    log "call $COUNTER" 
    log "[DEBUG] Respawn object1"
    ign service -s /world/arena/set_pose --reqtype ignition.msgs.Pose --reptype ignition.msgs.Boolean --req "name: 'object1' position: { x: -1.5, y: 3, z: 2} orientation: { x: 0.0, y: 0.0, z: 0.0, w: 1.0 }" --timeout 5000
    sleep 1

    COUNTER=$((COUNTER + 1))
    log "call $COUNTER" 
    log "[DEBUG] Respawn object2"
    ign service -s /world/arena/set_pose --reqtype ignition.msgs.Pose --reptype ignition.msgs.Boolean --req "name: 'object2' position: { x: -1.2, y: 3, z: 2} orientation: { x: 0.0, y: 0.0, z: 0.0, w: 1.0 }" --timeout 5000   
    sleep 1
    
    COUNTER=$((COUNTER + 1))
    log "call $COUNTER"    
    log "[DEBUG] Respawn object3"
    ign service -s /world/arena/set_pose --reqtype ignition.msgs.Pose --reptype ignition.msgs.Boolean --req "name: 'object3' position: { x: -0.8, y: 3, z: 2} orientation: { x: 0.0, y: 0.0, z: 0.0, w: 1.0 }" --timeout 5000
    sleep 1

    COUNTER=$((COUNTER + 1))
    log "call $COUNTER" 
    log "[DEBUG] Respawn object4"
    ign service -s /world/arena/set_pose --reqtype ignition.msgs.Pose --reptype ignition.msgs.Boolean --req "name: 'object4' position: { x: 0.2, y: 3, z: 2} orientation: { x: 0.0, y: 0.0, z: 0.0, w: 1.0 }" --timeout 5000   
    sleep 1

    COUNTER=$((COUNTER + 1))
    log "call $COUNTER"   
    log "[DEBUG] Respawn object5"
    ign service -s /world/arena/set_pose --reqtype ignition.msgs.Pose --reptype ignition.msgs.Boolean --req "name: 'object5' position: { x: 0.8, y: 3, z: 2} orientation: { x: 0.0, y: 0.0, z: 0.0, w: 1.0 }" --timeout 5000 
    sleep 1

    COUNTER=$((COUNTER + 1))
    log "call $COUNTER"   
    log "[DEBUG] Respawn object6"
    ign service -s /world/arena/set_pose --reqtype ignition.msgs.Pose --reptype ignition.msgs.Boolean --req "name: 'object6' position: { x: 1.1, y: 3, z: 2} orientation: { x: 0.0, y: 0.0, z: 0.0, w: 1.0 }" --timeout 5000 
    sleep 1

    COUNTER=$((COUNTER + 1))
    log "call $COUNTER"
    log "[DEBUG] Respawn object7"
    ign service -s /world/arena/set_pose --reqtype ignition.msgs.Pose --reptype ignition.msgs.Boolean --req "name: 'object7' position: { x: 1.3, y: 3, z: 2} orientation: { x: 0.0, y: 0.0, z: 0.0, w: 1.0 }" --timeout 5000
    sleep 1

    COUNTER=$((COUNTER + 1))
    log "call $COUNTER"
    log "[DEBUG] Resume simulation"
    ign service -s /world/arena/control --reqtype ignition.msgs.WorldControl --reptype ignition.msgs.Boolean --timeout 5000 --req 'pause: false'


    sleep 10


done
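
To record which calls fail instead of reading the terminal output by hand, each call could go through a small wrapper. This is a minimal sketch under stated assumptions: the helper name `call_service` and the `CALL_LOG` file are hypothetical, hours are counted from 1, and a failure is detected by matching the "Service call timed out" message reported later in this thread, since the exit status of `ign service` on a timeout is not confirmed here.

```bash
#!/bin/bash
# Hypothetical helper: run a command, classify its outcome, append a CSV record.
CALL_LOG="${CALL_LOG:-service_calls.csv}"   # assumed log file name
START_TIME="${START_TIME:-$(date +%s)}"     # reuse the script's start time if set

call_service() {
    local label="$1"; shift
    local output status hour
    # Capture both stdout and stderr of the wrapped command.
    output=$("$@" 2>&1)
    status=ok
    # ASSUMPTION: a failed call prints "Service call timed out".
    if [[ "$output" == *"Service call timed out"* ]]; then
        status=fail
    fi
    # Elapsed hour of the run, starting at 1 (an illustrative convention).
    hour=$(( ($(date +%s) - START_TIME) / 3600 + 1 ))
    printf '%s,%s,%s\n' "$hour" "$status" "$label" >> "$CALL_LOG"
    # Return success only if the call was classified as ok.
    [[ "$status" == ok ]]
}
```

A call from the loop would then look like `call_service "pause" ign service -s /world/arena/control --reqtype ignition.msgs.WorldControl --reptype ignition.msgs.Boolean --timeout 5000 --req 'pause: true'`, leaving one `hour,status,label` record per call in the log file.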

I didn't post the code to simulate the TurtleBot 4 robot since there are various folders to download for the simulation.

@paoloelle
Author

@caguero just a follow-up. I also tried the setup where I move only the position of the objects without launching anything related to the Turtlebot 4, and the problem still arises. I also tried moving only the position of the Turtlebot 4 without moving the objects and, also in this case, the same problem occurs.

@caguero
Collaborator

caguero commented Jan 14, 2025

Thanks, that's useful information.

@azeey
Contributor

azeey commented Jan 14, 2025

@paoloelle
Author

paoloelle commented Jan 14, 2025

@azeey I didn't try with a simpler robot. Anyway, as I mentioned here:

@caguero just a follow-up. I also tried the setup where I move only the position of the objects without launching anything related to the Turtlebot 4, and the problem still arises. I also tried moving only the position of the Turtlebot 4 without moving the objects and, also in this case, the same problem occurs.

I didn't spawn any robot at all in the test where I move only the objects.

@paoloelle
Author

paoloelle commented Jan 16, 2025

@azeey I tried with the rrobot (I set it as a static entity), but the same thing happened. When I call a service, the terminal where I launched Gazebo prints "NodeShared::RecvSrvRequest() error sending response: Host unreachable", and the caller receives "Service call timed out".
