dumb question but "why can't we just" 🙈 #70
mindplay-dk started this conversation in Off Topic · 1 comment · 4 replies
This suggestion is way too obvious to work, but humor me:
Why can't we just take the input/output from an existing CoT or ToT agent program and fine-tune on that?
Literally just run queries through the most successful CoT/ToT agent and train on its output?
We already have programs that produce the kind of reasoning workflow we'd like the LLM to learn, right?
Obviously someone has thought of this and it won't work; I'm just curious to learn why. 😌
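
For concreteness, here is a minimal sketch of what that proposal amounts to, assuming you already have some CoT/ToT agent to call: run prompts through it, capture the full reasoning trace plus the final answer, and write everything out as a chat-formatted JSONL file for supervised fine-tuning. `run_cot_agent`, the example prompt, and the OpenAI-style message format are all placeholders for illustration, not anything from this repo.

```python
import json


def run_cot_agent(prompt: str) -> tuple[str, str]:
    # Placeholder for whatever CoT/ToT agent you already have.
    # A real implementation would call that agent and return its full
    # reasoning trace plus the final answer it settled on.
    reasoning = f"Let's think step by step about: {prompt} ..."
    answer = "<the agent's final answer>"
    return reasoning, answer


def build_sft_dataset(prompts: list[str], out_path: str = "cot_traces.jsonl") -> None:
    # One chat-formatted training example per prompt: the user turn is the
    # original question, the assistant turn is the agent's reasoning followed
    # by its answer, so fine-tuning teaches the model to reason before answering.
    with open(out_path, "w", encoding="utf-8") as f:
        for prompt in prompts:
            reasoning, answer = run_cot_agent(prompt)
            example = {
                "messages": [
                    {"role": "user", "content": prompt},
                    {"role": "assistant", "content": f"{reasoning}\n\nFinal answer: {answer}"},
                ]
            }
            f.write(json.dumps(example) + "\n")


if __name__ == "__main__":
    build_sft_dataset(["If 3x + 2 = 11, what is x?"])
```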
Reply:

Defeats the purpose of the project, which is to demonstrate that the o1 reasoning advance is replicable. Swiping their CoTs will not accomplish that goal.