Hi! You presented a paper on simulating industrial challenges with OBP at RecSys'22. I found it very interesting and would like to understand some of the details. I found the PR with the source code and the experiment notebooks, but could not find any documentation describing the idea and details of the new functionality. Could you help me with a few questions?
What kinds of reward functions do you have, and how are they trained? I found logistic_reward_function, linear_reward_function and others here. Unfortunately, I could not work out what the training data for them is, or whether they are retrained in every simulation round.
What is the functionality of BanditEnvironmentSimulator?
It would be great if you could share any papers (other than the one from RecSys'22), schemas, demos, or tutorials that explain the details of your simulation framework.