Logging RL results and tracking them with ModelCheckpoint(monitor=...) #5883
-
I am using PyTorch Lightning in an RL setting and want to save a model when it hits a new max average reward. I am using the TensorBoard logger, and I return my neural network loss in the training_step. I then log my RL environment rewards with self.log, and every 5 epochs I also write out another RL reward where I use the best actions rather than sampling from them. My question is: how can I set my ModelCheckpoint to monitor this evaluation reward? I know that the logging API changed in the new PL version. I have spent a lot of time looking through the docs and for examples of this, but I have found the logging docs on this to be quite sparse, and it was difficult to even get everything to log in the first place. I am using PyTorch Lightning 1.0.5 and PyTorch 1.7.0. Thank you for any help/guidance.
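For concreteness, a minimal sketch of the kind of setup described above (RLAgent, rollout, and the metric names train_loss / avg_reward / eval_reward are assumptions for illustration, not the actual code from the question):

import torch
import pytorch_lightning as pl

class RLAgent(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(4, 2)  # stand-in policy network

    def rollout(self, greedy=False):
        # stand-in for running episodes in the environment and averaging rewards
        return torch.rand(()).item()

    def training_step(self, batch, batch_idx):
        loss = self.net(batch).pow(2).mean()    # stand-in neural network loss
        self.log("train_loss", loss)
        self.log("avg_reward", self.rollout())  # reward from sampled actions
        return loss

    def training_epoch_end(self, outputs):
        # every 5 epochs, also log the reward obtained with the best (greedy) actions
        if self.current_epoch % 5 == 0:
            self.log("eval_reward", self.rollout(greedy=True))

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())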
Replies: 5 comments
-
Hi! Thanks for your contribution, great first issue!
-
I have multiple comments that I did not verify yet, but they might help:

1. self.log only works within a selection of hooks currently. I suggest you try to move the relevant code to training_epoch_end, where self.log should work correctly.
2. Set the quantity to track with ModelCheckpoint(monitor=...) explicitly.
3. Since you only update the eval metric every few epochs, you could either 1) use the period parameter to only run the checkpoint on the epochs where you update the monitored quantity, or 2) cache the last value and log it in the epochs between your regular interval, to make the ModelCheckpoint see it as unchanged. The second option may even be the default behavior.

So in summary, I imagine something like this:

# Model
def training_epoch_end(self, outputs):
    # ... compute reward losses

    # refresh the eval metric only every `eval_every` epochs,
    # but log the cached value every epoch so ModelCheckpoint always sees it
    if self.current_epoch % self.hparams['eval_every'] == 0:
        self.last_eval_mean = ...  # compute the new eval mean
    self.log("eval_mean", self.last_eval_mean)

# Trainer
trainer = Trainer(callbacks=[ModelCheckpoint(monitor="eval_mean")])
# or maybe also try
trainer = Trainer(callbacks=[ModelCheckpoint(monitor="eval_mean", period=self.hparams['eval_every'])])
-
Thanks for all of this. It sounds like the fundamental problem may be that I was not logging from training_epoch_end. I will try this and let you know if it works.
-
I had a very similar issue: in my reinforcement learning framework I wanted to measure the validation performance of my agent. Of course I would do so without a validation dataloader. Maybe pytorch_lightning could at least give a warning once one tries to use self.log from a hook where it has no effect.
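A hedged sketch of the failure mode described here (Agent and val_reward are invented names): with no validation dataloader, the validation hooks never run, so the monitored key is never logged and the checkpoint has nothing to track.

import torch
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

class Agent(pl.LightningModule):
    def validation_step(self, batch, batch_idx):
        # never executed: no val_dataloader is provided,
        # so "val_reward" is never logged
        self.log("val_reward", torch.rand(()))

# the callback waits for a metric that never appears
trainer = pl.Trainer(callbacks=[ModelCheckpoint(monitor="val_reward")])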
-
Regarding the …