Commit 4127e60

update CFGFSM documentation to reflect updated logic
1 parent f9574e9 commit 4127e60

1 file changed: +16 -12 lines changed

outlines/fsm/fsm.py

+16 -12
@@ -220,18 +220,22 @@ def allowed_token_ids(self, state: FSMState) -> List[int]:
         """Generate a list of allowed tokens for the next step.
 
         Upon initialization, the CFG incremental parser is used to determine the
-        first regex.
-
-        This regex is used for proposals until either:
-
-        - The regex is exhausted, and its only remaining option is the EOS
-          token, in which case we always transition to the next regex
-        - The regex can be exhausted, but the EOS token is not the only
-          remaining option, in which case we transition to the next regex with
-          probability P (TODO) or remove the possibility of generating the EOS
-          token and continue with the current regex
-
-        The CFG incremental parser is allowed to propose the EOS token from any final state,
+        first regex and construct the first FSM to generate the first terminal.
+
+        This FSM is used for proposals until either:
+
+        - The FSM is exhausted, and its only remaining option is the EOS
+          token, in which case we feed the generated terminal to the
+          CFG incremental parser and allow it to propose the next regex
+          corresponding to the next set of valid terminals.
+        - The current FSM can be exhausted, but the EOS token is not the only
+          remaining option. In this case we allow proposal of current terminal extensions,
+          store the current FSM and its state, then also use the CFG parser
+          to propose a new regex corresponding to terminating the current terminal
+          and starting the next one. The model can then sample from either of these sets
+          to determine whether to extend the current terminal or terminate it and start the next one.
+
+        The CFG incremental parser is allowed to propose the EOS token from any accepting state,
         and once it is generated, the FSM will continue to always generate the EOS token.
 
         Parameters
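
Note on the updated logic: the three cases the new docstring distinguishes can be illustrated with a small toy sketch. This is not the code in outlines/fsm/fsm.py. The `propose` function, the `EOS` string marker, and the use of string tokens in place of token ids are all assumptions made for illustration, and the stored FSM state mentioned in the docstring is left out.

```python
from typing import Callable, Set

EOS = "<eos>"  # hypothetical marker meaning "the current terminal may end here"


def propose(
    current_allowed: Set[str],
    next_allowed: Callable[[], Set[str]],
) -> Set[str]:
    """Return the tokens that may be proposed at this step (toy model).

    current_allowed: tokens the current terminal's FSM allows from its state.
    next_allowed: asks the (toy) parser which tokens can start the next
        terminal once the current one has been terminated.
    """
    if EOS not in current_allowed:
        # The current terminal cannot end here: only extensions are valid.
        return set(current_allowed)
    if current_allowed == {EOS}:
        # FSM exhausted: terminate the terminal and move to the next regex.
        return next_allowed()
    # The terminal can end or continue: offer both sets and let the model
    # choose by sampling from their union.
    return (current_allowed - {EOS}) | next_allowed()
```

In the third branch the result is the union of the current terminal's extensions and the tokens that start the next terminal, which is what lets the model decide whether to extend or terminate.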
