Use JSON filler instead of PAUSE tokens #69

Open
xonfour opened this issue Oct 10, 2024 · 3 comments
xonfour commented Oct 10, 2024

Thanks for the exciting approach! I've been playing around with evaluating entropy for a while now, but haven't yet considered varentropy. I'll have to change that. ;)

I'd like to point out a very powerful alternative to CoT or PAUSE tokens: using legal filler characters, such as spaces or newlines, inside enforced JSON output. In my experience, this works very well WITHOUT any fine-tuning.

JSON syntax would also offer further possibilities in this context, such as forcing intermediate "reasoning" steps.
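To make the idea concrete, here is a minimal sketch (field names are placeholders, not from my actual setup) of what an enforced output with a free-form "reasoning" key and legal filler whitespace can look like:

```python
import json

# Hypothetical example of an enforced JSON reply format. The free-form
# "reasoning" field lets the model think before the constrained "answer",
# and JSON permits arbitrary whitespace between tokens, so the extra spaces
# and blank lines below are still legal output; they act like PAUSE filler.
example_output = """\
{
    "reasoning":      "The user asked about X, so I should first consider Y ...",


    "answer":         "..."
}
"""

# Any amount of inter-token whitespace still parses, so a constrained decoder
# can leave these positions open for the model to fill (or not) on its own.
parsed = json.loads(example_output)
print(parsed["answer"])
```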

I've documented the approach at https://www.reddit.com/r/LocalLLaMA/comments/1g0ukv4/from_instinct_to_insight_how_roar_transforms_ai/ and will see how I can incorporate Entropix (and then hopefully finally publish the code).

@theblackcat102

Do you have any benchmarks for this claim? I think adding JSON syntax might not be the best choice, as "reasoning" rarely occurs in JSON syntax, and enforcing it would result in reasoning degradation. Here's a paper which studies this extensively: https://arxiv.org/abs/2408.02442


xonfour commented Oct 15, 2024

Unfortunately, I don't have any benchmarks; I don't have the resources for that at the moment. My code is complex and messy and contains many more optimizations and feedback loops that I would have to remove first. So I'll just make claims, and it's completely OK to ignore my statement and close this issue (there was no other option here than opening an issue).

What I can say is that I have what I think is a very powerful chatbot that started using padding tokens (pad, space, tab, newline) on its own. I'm attaching a simple but real example:

(attached image: paddings)

That already worked very well with Mixtral 8x7B, and I'm currently using Gemma 2 9B; not exactly very large models. The bot uses the "reasoning" keys to think about the interaction and its response, which works extremely well.

A few months ago I started calculating the entropy and in my experiments it decreased (mostly) with more padding tokens. Forcing additional padding didn't work well at the time, but I didn't pursue the approach any further. Until now.
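Roughly, this is the kind of per-token calculation I mean (a minimal sketch over Hugging Face-style logits, not my actual code; the varentropy part follows the Entropix definition):

```python
import torch
import torch.nn.functional as F

def token_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy (in nats) of the next-token distribution at each position.

    logits: (seq_len, vocab_size) tensor from a causal LM forward pass.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    return -(probs * log_probs).sum(dim=-1)

def token_varentropy(logits: torch.Tensor) -> torch.Tensor:
    """Variance of the surprisal -log p(x) under the same distribution."""
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum(dim=-1, keepdim=True)
    return (probs * (log_probs + entropy) ** 2).sum(dim=-1)
```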

For enforcing my predefined JSON schema, I use lm-format-enforcer. Thanks for the paper. I'm not surprised that the quality drops due to the enforced restrictions; that matches my intuition and my own first experience. The additional "processing cycles" more than make up for it, though.
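For reference, a minimal sketch of that setup using lm-format-enforcer's transformers integration (model name and fields are placeholders, and helper names may differ between library versions):

```python
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer
from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import build_transformers_prefix_allowed_tokens_fn

class Reply(BaseModel):
    reasoning: str  # free-form field the model can use to "think"
    answer: str

model_id = "google/gemma-2-9b-it"  # placeholder; any causal LM should work
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# lm-format-enforcer restricts each decoding step to tokens that keep the
# output a valid prefix of the schema; whitespace between JSON tokens stays
# legal, so the model can still emit "filler" there.
parser = JsonSchemaParser(Reply.model_json_schema())
prefix_fn = build_transformers_prefix_allowed_tokens_fn(tokenizer, parser)

prompt = "Answer in JSON.\nQuestion: What is the capital of France?\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, prefix_allowed_tokens_fn=prefix_fn)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```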

I realize that this is not a scientific approach. But should I keep quiet about it? No. ;)

@akarshghale

Enforced JSON actually degrades the output quality unless it's done with specialized training, so I think it's better not to use it.
