question on pendulum reward function #4

@Haichao-Zhang

Can you explain why this reward function (-cos(theta) - 0.1*sin(theta) ...) is used for the pendulum environment?

costs = y + .1 * x + .1 * (thetadot ** 2) + .001 * (torque ** 2)

And why does it need to differ from the original reward function in OpenAI Gym?
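
To make the comparison concrete, here is a minimal sketch that evaluates both costs side by side. It assumes that `y = cos(theta)` and `x = sin(theta)` in the line quoted above (as the expression in my question implies), and it uses the cost from Gym's Pendulum environment, `angle_normalize(theta)**2 + 0.1*thetadot**2 + 0.001*torque**2`. The function names `gym_cost` and `issue_cost` are made up for illustration and do not come from either codebase.

```python
import math


def angle_normalize(theta):
    """Wrap an angle to [-pi, pi), as Gym's Pendulum environment does."""
    return ((theta + math.pi) % (2 * math.pi)) - math.pi


def gym_cost(theta, thetadot, torque):
    """Cost from OpenAI Gym's Pendulum environment (reward = -cost)."""
    return angle_normalize(theta) ** 2 + 0.1 * thetadot ** 2 + 0.001 * torque ** 2


def issue_cost(theta, thetadot, torque):
    """Cost quoted in this issue (reward = -cost).

    ASSUMPTION: y = cos(theta) and x = sin(theta).
    """
    y, x = math.cos(theta), math.sin(theta)
    return y + 0.1 * x + 0.1 * thetadot ** 2 + 0.001 * torque ** 2


if __name__ == "__main__":
    # Compare the two costs at rest (thetadot = 0, torque = 0) for a few angles.
    for theta in (0.0, math.pi / 2, math.pi):
        print(f"theta={theta:.3f}  gym={gym_cost(theta, 0.0, 0.0):.3f}  "
              f"issue={issue_cost(theta, 0.0, 0.0):.3f}")
```

Under these assumptions the two costs differ only in the angle term, since the velocity and torque penalties are identical. Printing them for a few angles shows where the two disagree.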
