Add generation caching in TextEnvironment and fix bugs in TextEnvironment #2556
Conversation
I would be very grateful for a review by:
I was unable to execute the pre-commit hook, so I manually ran the linter.
Thanks for the PR!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Just to be sure, as I'm unfamiliar with their implementation: the trl trainers like PPO should not try to backpropagate through the generated tokens, right?
Co-authored-by: Quentin Gallouédec <[email protected]>
The CI failure for Python 3.9 seems unrelated to this PR.
Yes, that's correct. The backprop is done on the output of a forward pass.
@qgallouedec Could you run the pre-commit hooks to fix the linting issues? I haven't gotten them to work.
I'm still working on adding some more tests and cleaning up the code a bit.
Ok, ping me when it's ready and I'll run the pre-commit hooks and merge.
For future reference: setting position_ids for generation did not appear necessary, as it seems to be handled here: https://github.com/huggingface/transformers/blob/241c04d36867259cdf11dbb4e9d9a60f9cb65ebc/src/transformers/generation/utils.py#L409-L416
cache_position also appears to be handled automatically, here https://github.com/huggingface/transformers/blob/241c04d36867259cdf11dbb4e9d9a60f9cb65ebc/src/transformers/generation/utils.py#L799-L806 and here https://github.com/huggingface/transformers/blob/241c04d36867259cdf11dbb4e9d9a60f9cb65ebc/src/transformers/generation/utils.py#L1560. Looking at the transformers generation code, there currently seem to be issues with torch.compile in transformers generate (mentioned in a comment there), see https://github.com/huggingface/transformers/blob/2e752ead46a8845e8a160d2043c1336447895690/src/transformers/generation/utils.py#L1582, so I think I will include a warning in the docs.
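For illustration, the logic at those lines amounts to roughly the following (a paraphrase for readability, not code from this PR; details may differ between transformers versions):

```python
import torch

def infer_position_ids(attention_mask: torch.LongTensor) -> torch.LongTensor:
    # Paraphrase of the pattern in the linked generation/utils.py lines:
    # positions count attended tokens cumulatively; padded positions are
    # masked to a dummy value of 1.
    position_ids = attention_mask.long().cumsum(-1) - 1
    position_ids.masked_fill_(attention_mask == 0, 1)
    return position_ids

def infer_cache_position(past_length: int, seq_length: int) -> torch.LongTensor:
    # Paraphrase: cache positions simply continue where the cache left off.
    return torch.arange(past_length, seq_length)
```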
It seems the model used for testing was not in fact GPT-2, and the variable name was incorrect. The documentation will be updated accordingly.
In the previous commits, I also fixed what I believe to be an off-by-one error in StringStoppingCriteria: the number of generated tokens was computed inaccurately.
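As an illustration of where such an off-by-one can arise, here is a minimal sketch of a string-based stopping criterion (not the actual trl implementation; the class and parameter names are only illustrative):

```python
import torch
from transformers import StoppingCriteria

class StringStopSketch(StoppingCriteria):
    """Minimal sketch: stop once any stop string appears in the generated text."""

    def __init__(self, stop_strings, tokenizer, prompt_length: int):
        self.stop_strings = stop_strings
        self.tokenizer = tokenizer
        self.prompt_length = prompt_length  # number of prompt tokens

    def __call__(self, input_ids: torch.LongTensor, scores, **kwargs) -> bool:
        # The generated-token count is input_ids.shape[1] - self.prompt_length;
        # slicing one position too early or too late is the off-by-one trap.
        generated = input_ids[0, self.prompt_length:]
        text = self.tokenizer.decode(generated)
        return any(s in text for s in self.stop_strings)
```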
I noticed that GPT2 seems to support only the legacy cache format, so I am adding support for it.
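For context, recent transformers versions provide conversion helpers between the Cache API and the legacy tuple-of-tuples format, so a bridge along these lines is one possible way to support such models (a sketch; the helper names are hypothetical):

```python
from transformers import DynamicCache

def to_legacy(cache: DynamicCache):
    # DynamicCache -> legacy tuple of per-layer (key, value) tensors,
    # for models that only understand the old format.
    return cache.to_legacy_cache()

def from_legacy(past_key_values):
    # Legacy tuples -> DynamicCache, so newer code paths can keep using it.
    return DynamicCache.from_legacy_cache(past_key_values)
```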
It appears that it was still not fixed. I am working on a solution and on testing StringStoppingCriteria.
As the tests did not include an encoder-decoder architecture, I did not test for one either; I think that is out of scope for this pull request. Where this was a concern in _generate_batched, I mirrored the implementation already provided.
This PR mainly affects the TextEnvironment class and adds caching between generation calls, so that all previous activations do not have to be recomputed when generating the next segment. This is mainly intended for use cases where many tool calls are performed sequentially, and the activations for the (possibly quite large) system prompt would otherwise have to be recomputed at each step. For stability, caching is optional.
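A minimal standalone sketch of the caching idea, assuming a recent transformers version in which generate can return and accept past_key_values (the model name, prompts, and token counts here are placeholders; the actual TextEnvironment integration is more involved):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# First segment: prefill the (possibly large) system prompt once.
inputs = tokenizer("System prompt with tool descriptions...", return_tensors="pt")
out = model.generate(
    **inputs, max_new_tokens=20, return_dict_in_generate=True, use_cache=True
)

# Second segment: append the tool response and continue generation,
# passing the cache back in so the earlier activations are reused.
tool_ids = tokenizer("Tool result: 42.", return_tensors="pt").input_ids
next_ids = torch.cat([out.sequences, tool_ids], dim=-1)
out = model.generate(
    next_ids,
    max_new_tokens=20,
    past_key_values=out.past_key_values,
    return_dict_in_generate=True,
)
```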
Bug fixes:
This PR also addresses two bugs I encountered:
I fixed the bug and also added a check at generation time to ensure that the padded inputs do not exceed the maximum length.
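The generation-time check presumably amounts to something like this (a hypothetical sketch; the function name is illustrative):

```python
import torch

def ensure_fits(input_ids: torch.LongTensor, max_length: int) -> None:
    # After padding a batch, the longest sequence determines the width,
    # so the padded batch can exceed max_length even if each raw prompt fits.
    if input_ids.shape[-1] > max_length:
        raise ValueError(
            f"Padded input length {input_ids.shape[-1]} exceeds max_length {max_length}."
        )
```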
RE testing:
I only made sure that the tests in tests/test_environments.py were passing.
Using `make test`, some tests were failing and the suite took a long time to run. However, the only tests that call TextEnvironment seem to be in test_environments.py, so the rest should be unaffected as far as I know. Nevertheless, I would be grateful if somebody else could run all the tests before merging; I suspect that my environment may not be ideally configured. Is testing automated via CI?