LLM4RD: LLM for Reward Design

Brief Introduction

This project originates from Eureka.

The test environments are mainly taken from SMPLOlympics.

/llm4rd:

llm4rd.py: the LLM reward design loop.
/envs: lightly modified copies of SMPLOlympics/phc/env/tasks (for now only the boxing task has been copied), plus the observation code of each task, which is used in the prompt for in-context learning.
/outputs: saved LLM outputs from previous runs (generated with LLaMA3).
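
The contents of llm4rd.py are not reproduced in this README. As a rough illustration only, a reward design loop in the spirit of Eureka might look like the sketch below: sample several candidate reward functions from the LLM, score each one, and feed the best candidate back into the next prompt. Every name here (`reward_design_loop`, `fake_llm`, `fake_evaluate`, `compute_reward`, the observation key `x`) is hypothetical and not taken from this repository; the LLM call and the RL training run are replaced by stubs so that the example runs on its own.

```python
import random
from typing import Callable, Tuple


def reward_design_loop(
    propose: Callable[[str], str],
    evaluate: Callable[[str], float],
    task_prompt: str,
    iterations: int = 3,
    samples_per_iter: int = 4,
) -> Tuple[str, float]:
    """Sample candidate reward code, score it, and feed the best one back."""
    best_code, best_score = "", float("-inf")
    prompt = task_prompt
    for _ in range(iterations):
        # Ask the (stubbed) LLM for several candidate reward functions.
        candidates = [propose(prompt) for _ in range(samples_per_iter)]
        # Score every candidate; a real setup would train a policy here.
        scored = [(evaluate(code), code) for code in candidates]
        score, code = max(scored, key=lambda pair: pair[0])
        if score > best_score:
            best_score, best_code = score, code
        # Feedback step: show the best candidate so far in the next prompt.
        prompt = (
            f"{task_prompt}\n\n"
            f"Best reward so far (score {best_score:.3f}):\n{best_code}"
        )
    return best_code, best_score


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call: emits reward code with a random weight."""
    weight = random.uniform(0.0, 2.0)
    return f"def compute_reward(obs):\n    return {weight:.3f} * obs['x']"


def fake_evaluate(code: str) -> float:
    """Hypothetical stand-in for RL training: runs the reward on one fixed observation."""
    namespace: dict = {}
    exec(code, namespace)  # acceptable for a toy example; sandbox real LLM output
    return float(namespace["compute_reward"]({"x": 1.0}))


if __name__ == "__main__":
    random.seed(0)
    code, score = reward_design_loop(fake_llm, fake_evaluate, "Design a boxing reward.")
    print(code)
    print("score:", score)
```

In a real run, `propose` would query the model with the task's observation code included in the prompt, and `evaluate` would train and score a policy in the corresponding environment.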

smpl_envs/tasks:

Mirrors the directory layout of SMPLOlympics. Once the other parts are complete, this folder can be replaced with the real SMPLOlympics path (SMPLOlympics/phc/env/tasks).
