Replies: 2 comments
- Yep! We have an example we'll be merging soon where we got OpenAI's Learning to Summarize reward model working with TRLX on a 20B language model. We also have a very minimal version of CodeRL working; it's included as an example here. We've also been discussing TRLX with plenty of RLHF industry folks and have gotten a few seals of approval at this point.
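As a rough sketch of how a reward model plugs into TRLX: the library's `reward_fn` contract is essentially "list of sampled strings in, list of scalar rewards out". The brevity-based reward below is a hypothetical toy stand-in (not the actual Learning to Summarize reward model), and the commented-out `trlx.train` call is an assumption about the high-level API, shown only to illustrate where the function slots in:

```python
# Toy sketch of the reward_fn contract TRLX expects:
# a list of sampled strings in, a list of scalar rewards out.
# This brevity-based reward is a hypothetical stand-in for a
# real learned reward model (e.g. Learning to Summarize's).
from typing import List


def reward_fn(samples: List[str], **kwargs) -> List[float]:
    # Reward shorter outputs, as a crude proxy for summary quality.
    return [1.0 / (1.0 + len(s.split())) for s in samples]


# With TRLX installed, training would look something like:
#
#   import trlx
#   trainer = trlx.train(
#       "gpt2",               # base model to fine-tune with PPO
#       reward_fn=reward_fn,  # scalar reward per sampled text
#       prompts=prompts,      # training prompts
#   )
```

In practice the toy function above would be replaced by a forward pass through a trained reward model scoring each sample.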
- What's the largest PPO model size that has been trained and tested with TRLX? Can you share some performance metrics, e.g. GPU count and training time?
- Has this been tested on anything?