Originates from Eureka.
- using LLM to design reward for different kinds of sports
- in-context learning
- see details.md to know more about the project.
The test env is mainly on SMPLOlympics
/llm4rd:
llm4rd.py: llm reward design loop/envs: change a little fromSMPLOlympics/phc/env/tasks(temporarily copy boxing sports), and the obs of tasks (used for in-context learning prompting)/outputs: history llm output result (using LLaMA3)
smpl_envs/tasks:
- imitate the file system of
SMPLOlympics. after other parts have been completed, this folder can be changed to the realSMPLOlympicsfile path (SMPLOlympics/phc/env/tasks)