My question relates to the RL results in Table 3 of the paper. I’m trying to use the iclr19 branch to generate at least 10 such results for each level, to get a stable mean and variance. The train_rl.py script seems to do almost everything required. But at the bottom of that file, after the success rate is calculated over the test episodes (512 by default), the success rate is not actually logged. The mean return is logged instead.
Adding the following line (right after the calculation of success_rate) seems to log the missing number:
logger.info("Success rate {: .4f} reached after {} training episodes".format(success_rate, status['num_episodes']))
Also, the default save_interval of 1000 seems too large for some of the easier levels. For instance, to get sufficiently frequent tests on GoToRedBallGrey, I call the script like this:
python scripts/train_rl.py --env BabyAI-GoToRedBallGrey-v0 --save-interval 10
Then to obtain the sample efficiency, I just look in the log for the first success rate to exceed 0.99, and take the number of training episodes up to that point. For seed=1, it happens on this line:
main: 2019-06-17 01:27:36,671: Success rate 0.9922 reached after 30769 training episodes
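The log search described above could be scripted rather than done by eye. This is only a minimal sketch: it assumes the log contains the custom "Success rate ... reached after ... training episodes" line added above, and the function names `episodes_to_threshold` and `summarize` are hypothetical helpers, not part of the BabyAI code base.

```python
import re
import statistics

# Matches the custom log line added to train_rl.py above (an assumption:
# the stock script does not emit this line).
SUCCESS_LINE = re.compile(
    r"Success rate\s+([0-9.]+) reached after (\d+) training episodes"
)


def episodes_to_threshold(log_lines, threshold=0.99):
    """Return the training-episode count on the first line whose success
    rate exceeds `threshold`, or None if no line does."""
    for line in log_lines:
        match = SUCCESS_LINE.search(line)
        if match and float(match.group(1)) > threshold:
            return int(match.group(2))
    return None


def summarize(counts):
    """Mean and sample standard deviation of per-seed episode counts
    (needs at least two seeds)."""
    return statistics.mean(counts), statistics.stdev(counts)


# The seed=1 line quoted above.
sample = [
    "main: 2019-06-17 01:27:36,671: Success rate 0.9922 "
    "reached after 30769 training episodes",
]
print(episodes_to_threshold(sample))
```

Running `episodes_to_threshold` on the log of each seed and passing the resulting counts to `summarize` would give the mean and spread over the runs.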
Is this the right way to generate more RL results like those in Table 3? Or is there an easier way?
Thank you for this excellent environment!
Thanks for your question. In fact, we use the .csv logs to compute when a 99% success rate is reached. There is a PR underway that automates the process; you can try using the rl_dataeff.py script from it: