Use JSON filler instead of PAUSE tokens #69
Comments
Do you have any benchmarks for this claim? I think adding JSON syntax might not be the best choice, as "reasoning" rarely occurs in JSON syntax, and enforcing it would likely degrade reasoning. Here's a paper that studies this extensively: https://arxiv.org/abs/2408.02442
Unfortunately, I don't have any benchmarks, and I don't have the resources for that at the moment. My code is complex and messy and contains many more optimizations and feedback loops that I would have to remove first. So I'll just make claims, and it's completely OK to ignore my statement and close this issue (there was no other option here). What I can say is that I have what I think is a very powerful chatbot that started using padding tokens (pad, space, tab, newline) on its own. I'm attaching a simple but real example. That already worked very well with Mixtral 8x7B, and I'm currently using Gemma 2 9B. Not exactly very large models. The bot uses the "reasoning" keys to think about the interaction and its response, which works extremely well. A few months ago I started calculating the entropy, and in my experiments it (mostly) decreased with more padding tokens. Forcing additional padding didn't work well at the time, and I didn't pursue the approach any further. Until now. To force my predefined JSON schema, I use lm-format-enforcer. Thanks for the paper. I'm not surprised that the quality drops due to the enforced restrictions. It somehow feels right and was also my first experience. The additional "processing cycles" more than make up for it, though. I realize that this is not a scientific approach. But should I keep quiet about it? No. ;)
JSON actually degrades output quality unless it is done with specialized training, so I think it's better not to.
Thanks for the exciting approach! I've been playing around with evaluating entropy for a while now, but haven't yet considered varentropy. I'll have to change that. ;)
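For readers unfamiliar with the two quantities, here is a minimal sketch of how entropy and varentropy could be computed for a single next-token distribution. It assumes varentropy means the variance of the per-token surprisal under that distribution; the function names and the plain-list logits are illustrative and not taken from this project's code.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_and_varentropy(logits):
    """Return (entropy, varentropy) in nats for one next-token distribution.

    Assumption: varentropy is the variance of the surprisal -log p(token).
    """
    probs = [p for p in softmax(logits) if p > 0.0]  # skip underflowed entries
    surprisals = [-math.log(p) for p in probs]
    # Entropy is the expected surprisal.
    entropy = sum(p * s for p, s in zip(probs, surprisals))
    # Varentropy is the spread of the surprisal around that expectation.
    varentropy = sum(p * (s - entropy) ** 2 for p, s in zip(probs, surprisals))
    return entropy, varentropy


if __name__ == "__main__":
    print(entropy_and_varentropy([0.0, 0.0, 0.0, 0.0]))  # uniform over 4 tokens
    print(entropy_and_varentropy([5.0, 0.0, 0.0, 0.0]))  # one dominant token
```

A uniform distribution gives the maximum entropy for its vocabulary size and zero varentropy, while a peaked one gives lower entropy.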
I'd like to point out a very powerful approach as a replacement for CoT or PAUSE tokens: using legal filler syntax, such as spaces or newlines, in enforced JSON output. In my experience, this works very well WITHOUT any fine-tuning.
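To illustrate why such filler is "legal", the sketch below shows that whitespace between JSON tokens does not change the parsed value, so a model constrained to valid JSON can still emit extra tokens there. The example strings and the `filler_count` helper are made up for illustration; this does not measure any effect on output quality.

```python
import json

# Two serializations of the same object: one compact, one padded with
# whitespace (space, tab, newline) between the JSON tokens.
compact = '{"reasoning":"greet the user","response":"Hello!"}'
padded = '{ \n\n\t "reasoning" :   "greet the user" ,\n\n\n   "response"\t:\n "Hello!"   \n}'

# True when both strings decode to the same Python object.
same_value = json.loads(compact) == json.loads(padded)


def filler_count(text: str) -> int:
    """Count whitespace characters that sit outside JSON string literals."""
    count = 0
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False      # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False    # closing quote ends the string
        elif ch == '"':
            in_string = True         # opening quote starts a string
        elif ch in " \t\n\r":
            count += 1               # JSON insignificant whitespace
    return count


if __name__ == "__main__":
    print(same_value, filler_count(compact), filler_count(padded))
```

Spaces inside the string values are not counted, because they are part of the content and not filler.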
JSON syntax would also offer further possibilities in this context, such as forcing intermediate "reasoning" steps.
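One way to picture a forced intermediate "reasoning" step is a JSON Schema that requires a `reasoning` field next to the final `response`. The sketch below is only an assumption about what such a schema could look like; the field names are hypothetical, and whether a particular enforcer emits keys in schema order needs to be checked for that tool. The helper uses only the standard library and simply verifies the order in an already generated output.

```python
import json

# Hypothetical schema: the model must fill "reasoning" as well as "response".
REASONING_SCHEMA = {
    "type": "object",
    "properties": {
        "reasoning": {"type": "string"},
        "response": {"type": "string"},
    },
    "required": ["reasoning", "response"],
}


def reasoning_precedes_response(raw: str) -> bool:
    """Check that both keys exist and "reasoning" was emitted first."""
    obj = json.loads(raw)
    if not isinstance(obj, dict):
        return False
    keys = list(obj)  # json.loads keeps the key order of the document
    if "reasoning" not in keys or "response" not in keys:
        return False
    return keys.index("reasoning") < keys.index("response")


if __name__ == "__main__":
    good = '{"reasoning": "User said hi, so greet back.", "response": "Hello!"}'
    bad = '{"response": "Hello!", "reasoning": "User said hi, so greet back."}'
    print(reasoning_precedes_response(good), reasoning_precedes_response(bad))
```

The order matters because an autoregressive model can only condition its answer on reasoning text that was generated before the answer.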
I've documented the approach at https://www.reddit.com/r/LocalLLaMA/comments/1g0ukv4/from_instinct_to_insight_how_roar_transforms_ai/ and will see how I can incorporate Entropix (and then hopefully finally publish the code).