Use JSON filler instead of PAUSE tokens #69

Open
xonfour opened this issue Oct 10, 2024 · 3 comments
xonfour commented Oct 10, 2024

Thanks for the exciting approach! I've been playing around with evaluating entropy for a while now, but haven't yet considered varentropy. I'll have to change that. ;)

I'd like to point out a very powerful alternative to CoT or PAUSE tokens: using legal filler characters, such as spaces or newlines, inside enforced JSON output. In my experience, this works very well WITHOUT any fine-tuning.

JSON syntax would also offer further possibilities in this context, such as forcing intermediate "reasoning" steps.
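To make the idea concrete, here is a minimal sketch (field names are placeholders, not from my actual setup) of what an enforced output with a free-form "reasoning" key and legal filler whitespace can look like:

```python
import json

# Hypothetical example of an enforced JSON reply format. The free-form
# "reasoning" field lets the model think before the constrained "answer",
# and JSON permits arbitrary whitespace between tokens, so the extra spaces
# and blank lines below are still legal output; they act like PAUSE filler.
example_output = """\
{
    "reasoning":      "The user asked about X, so I should first consider Y ...",


    "answer":         "..."
}
"""

# Any amount of inter-token whitespace still parses, so a constrained decoder
# can leave these positions open for the model to fill (or not) on its own.
parsed = json.loads(example_output)
print(parsed["answer"])
```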

I've documented the approach at https://www.reddit.com/r/LocalLLaMA/comments/1g0ukv4/from_instinct_to_insight_how_roar_transforms_ai/ and will see how I can incorporate Entropix (and then hopefully finally publish the code).

@theblackcat102

Do you have any benchmarks for this claim? I think adding JSON syntax might not be the best choice, as "reasoning" rarely occurs in JSON syntax, and enforcing it would result in reasoning degradation. Here's a paper which studies this extensively: https://arxiv.org/abs/2408.02442


xonfour commented Oct 15, 2024

Unfortunately, I don't have any benchmarks; I don't have the resources for that at the moment. My code is complex and messy and contains many more optimizations and feedback loops that I would have to remove first. So I'll just make claims, and it's completely OK to ignore my statement and close this issue (there was no other option here than opening an issue).

What I can say is that I have what I think is a very powerful chatbot that started using padding tokens (pad, space, tab, newline) on its own. I'm attaching a simple but real example:

(attached image: paddings)

That already worked very well with Mixtral 8x7B, and I'm currently using Gemma 2 9B; not exactly very large models. The bot uses the "reasoning" keys to think about the interaction and its response, which works extremely well.

A few months ago I started calculating the entropy and in my experiments it decreased (mostly) with more padding tokens. Forcing additional padding didn't work well at the time, but I didn't pursue the approach any further. Until now.
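Roughly, this is the kind of per-token calculation I mean (a minimal sketch over Hugging Face-style logits, not my actual code; the varentropy part follows the Entropix definition):

```python
import torch
import torch.nn.functional as F

def token_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Shannon entropy (in nats) of the next-token distribution at each position.

    logits: (seq_len, vocab_size) tensor from a causal LM forward pass.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    return -(probs * log_probs).sum(dim=-1)

def token_varentropy(logits: torch.Tensor) -> torch.Tensor:
    """Variance of the surprisal -log p(x) under the same distribution."""
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum(dim=-1, keepdim=True)
    return (probs * (log_probs + entropy) ** 2).sum(dim=-1)
```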

For enforcing my predefined JSON schema, I use lm-format-enforcer. Thanks for the paper. I'm not surprised that the quality drops due to the enforced restrictions; that matches my intuition and my own first experience. The additional "processing cycles" more than make up for it, though.
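For reference, a minimal sketch of that setup using lm-format-enforcer's transformers integration (model name and fields are placeholders, and helper names may differ between library versions):

```python
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer
from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import build_transformers_prefix_allowed_tokens_fn

class Reply(BaseModel):
    reasoning: str  # free-form field the model can use to "think"
    answer: str

model_id = "google/gemma-2-9b-it"  # placeholder; any causal LM should work
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# lm-format-enforcer restricts each decoding step to tokens that keep the
# output a valid prefix of the schema; whitespace between JSON tokens stays
# legal, so the model can still emit "filler" there.
parser = JsonSchemaParser(Reply.model_json_schema())
prefix_fn = build_transformers_prefix_allowed_tokens_fn(tokenizer, parser)

prompt = "Answer in JSON.\nQuestion: What is the capital of France?\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, prefix_allowed_tokens_fn=prefix_fn)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```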

I realize that this is not a scientific approach. But should I keep quiet about it? No. ;)

@akarshghale

Enforced JSON actually degrades the output quality unless it's done with specialized training, so I think it's better not to use it.
