Contributor

@lapp0 lapp0 commented Jan 25, 2024

Overview:

  • Adds grammars to outlines.grammars and fixes some grammars
  • Adds benchmarks for CFGFSM (pytest --benchmark-cfg)
  • Fixes UnexpectedToken error resulting from a partially complete terminal being lexed as a full terminal (see fsm.py)

There are a lot of files changed in this PR, but they're mostly new lark grammars, lark grammar improvements, and many sample test files.

  • Python changes:
    • Add test cases
    • Update outlines/grammars.py with new grammars
    • Prevent UnexpectedToken if terminal is partially generated and misrecognized as being a different terminal in fsm.py
  • Lark changes:
    • Fix grammars
    • Add grammars
  • Documentation:
    • Add documentation on creating grammars (Rendered)
    • Add documentation on using outlines.grammars
  • Add test generation files to outlines/tests/benchmark/cfg_samples/
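
To picture the UnexpectedToken failure mode mentioned above, here is a toy sketch. It is not the code in fsm.py: the INT and FLOAT terminals, the hand-written prefix patterns, and both helper functions are hypothetical and exist only for illustration. It shows how partially generated text can fully match one terminal while still being a valid prefix of another.

```python
import re

# Hypothetical terminals, for illustration only (not taken from outlines.grammars).
TERMINALS = {
    "INT": re.compile(r"\d+"),
    "FLOAT": re.compile(r"\d+\.\d+"),
}

# Hand-written prefix patterns: every string that can still grow into the terminal.
PREFIXES = {
    "INT": re.compile(r"\d*"),
    "FLOAT": re.compile(r"(\d+(\.\d*)?)?"),
}


def full_matches(text):
    """Terminals that the text matches completely, as a naive lexer would see it."""
    return [name for name, pattern in TERMINALS.items() if pattern.fullmatch(text)]


def could_become(text):
    """Terminals that the text could still be extended into."""
    return [name for name, pattern in PREFIXES.items() if pattern.fullmatch(text)]


partial = "12"
print(full_matches(partial))
print(could_become(partial))
```

If the parser only accepts FLOAT at this point, lexing "12" as a finished INT would be rejected, even though the generation could still continue to "12.5". Treating the trailing text as possibly incomplete avoids that premature rejection.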

Fixes:

Prerequisites

Create / Fix Grammars

  • outlines.grammars.csv
  • Fix ESCAPED_STRING in outlines.grammars.json
  • outlines.grammars.lark
  • outlines.grammars.sql_select (maybe reach out to https://motherduck.com/blog/duckdb-text2sql-llm/)
  • remove outlines.grammars.lisp
  • fix outlines.grammars.arithmetic recursion issue

Other Work

  • End to end + benchmark tests for grammars
  • Enable some lookarounds by changing how CFGFSM.regex_fsm is constructed in outlines/fsm/fsm.py
  • Fix ESCAPED_STRING in common.lark so it's compatible with interegular
  • Rename all test sample files, e.g. to foo.py.test, so they don't confuse users searching the repo
  • test / bench samples for outlines.grammars.arithmetic
  • Document lookaround behavior for https://github.com/MegaIng/interegular/

Documentation Work

  • [ ] Documentation and examples of grammars in use
  • Documentation of how to create grammars

Out of scope because they require context-sensitive indentation handling (#592)

  • outlines.grammars.python3 (and outlines.grammars.python3_interactive)
  • outlines.grammars.yaml

Out of scope otherwise

  • outlines.grammars.bash

Questions:

  • These tests take more than five times as long as all the other tests combined. I suggest that by default we run only one of them when pytest is run, and run all of them when pytest --do-benchmark is run. Running a single test helps ensure a changeset doesn't break CFG generation in general, without needing to test every grammar.
  • Should these be run within CI, or should PR authors manually run them when they're potentially impacting performance?
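
One way the "single test by default, all tests behind a flag" idea could be wired up is sketched below as a conftest.py. This is an assumption about how it might look, not what the PR implements. The flag name is taken from the overview (the --do-benchmark spelling from the question would work the same way), and split_benchmarks and BENCH_PREFIX are hypothetical names. It relies only on the standard pytest hooks pytest_addoption and pytest_collection_modifyitems.

```python
# conftest.py sketch (hypothetical; not the PR's actual implementation)

BENCH_PREFIX = "test_benchmark_cfg_generation"  # test name from the benchmark output


def pytest_addoption(parser):
    # Register an opt-in flag for the full benchmark run.
    parser.addoption(
        "--benchmark-cfg",
        action="store_true",
        default=False,
        help="run every CFG benchmark case instead of a single smoke case",
    )


def split_benchmarks(items, run_all):
    """Return (kept, deselected).

    Non-benchmark tests are always kept. Unless run_all is set, only the
    first CFG benchmark case is kept and the rest are deselected.
    """
    kept, deselected, seen_one = [], [], False
    for item in items:
        is_bench = item.name.startswith(BENCH_PREFIX)
        if is_bench and not run_all and seen_one:
            deselected.append(item)
            continue
        seen_one = seen_one or is_bench
        kept.append(item)
    return kept, deselected


def pytest_collection_modifyitems(config, items):
    kept, deselected = split_benchmarks(items, config.getoption("--benchmark-cfg"))
    if deselected:
        # Report the dropped cases as deselected, then trim the list in place.
        config.hook.pytest_deselected(items=deselected)
        items[:] = kept
```

Deselecting rather than skipping keeps the default report short. Which benchmark case survives depends on collection order, so a real version would probably pin a specific small sample instead.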

Output

Additional Benchmark Details:
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-True]:
	Tokens / Second: 16.201
	(Num Tokens: 2771, Time: 171.035 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-False]:
	Tokens / Second: 15.405
	(Num Tokens: 2771, Time: 179.879 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_many_features.sql.test-True]:
	Tokens / Second: 18.946
	(Num Tokens: 1294, Time: 68.300 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_many_features.sql.test-False]:
	Tokens / Second: 11.907
	(Num Tokens: 1294, Time: 108.675 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_simple.sql.test-True]:
	Tokens / Second: 21.232
	(Num Tokens: 17, Time: 0.801 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_simple.sql.test-False]:
	Tokens / Second: 2.913
	(Num Tokens: 17, Time: 5.836 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_order.sql.test-True]:
	Tokens / Second: 14.272
	(Num Tokens: 47, Time: 3.293 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_order.sql.test-False]:
	Tokens / Second: 3.614
	(Num Tokens: 47, Time: 13.004 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_having.sql.test-True]:
	Tokens / Second: 19.313
	(Num Tokens: 219, Time: 11.339 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_having.sql.test-False]:
	Tokens / Second: 6.501
	(Num Tokens: 219, Time: 33.686 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_nested_subquery.sql.test-True]:
	Tokens / Second: 17.400
	(Num Tokens: 462, Time: 26.551 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_nested_subquery.sql.test-False]:
	Tokens / Second: 8.587
	(Num Tokens: 462, Time: 53.801 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_coalesce.sql.test-True]:
	Tokens / Second: 21.573
	(Num Tokens: 230, Time: 10.662 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_coalesce.sql.test-False]:
	Tokens / Second: 7.271
	(Num Tokens: 230, Time: 31.631 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_union.sql.test-True]:
	Tokens / Second: 20.007
	(Num Tokens: 195, Time: 9.747 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[sql_select_select_union.sql.test-False]:
	Tokens / Second: 8.050
	(Num Tokens: 195, Time: 24.224 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[arithmetic_lots_of_ops.arithmetic.test-True]:
	Tokens / Second: 21.246
	(Num Tokens: 515, Time: 24.240 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[arithmetic_lots_of_ops.arithmetic.test-False]:
	Tokens / Second: 20.819
	(Num Tokens: 515, Time: 24.737 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[arithmetic_simple_math.arithmetic.test-True]:
	Tokens / Second: 21.222
	(Num Tokens: 26, Time: 1.225 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[arithmetic_simple_math.arithmetic.test-False]:
	Tokens / Second: 15.303
	(Num Tokens: 26, Time: 1.699 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit.json.test-True]:
	Tokens / Second: 35.228
	(Num Tokens: 381, Time: 10.815 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit.json.test-False]:
	Tokens / Second: 28.008
	(Num Tokens: 381, Time: 13.603 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_outlines.generate.samplers.mypy.json.test-True]:
	Tokens / Second: 20.789
	(Num Tokens: 13620, Time: 655.139 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_outlines.generate.samplers.mypy.json.test-False]:
	Tokens / Second: 17.473
	(Num Tokens: 13620, Time: 779.487 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit_no_indent.json.test-True]:
	Tokens / Second: 30.341
	(Num Tokens: 197, Time: 6.493 seconds)
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit_no_indent.json.test-False]:
	Tokens / Second: 18.822
	(Num Tokens: 197, Time: 10.466 seconds)

-------------------------------------------------------------------------------------------------------------------------- benchmark: 26 tests ---------------------------------------------------------------------------------------------------------------------------
Name (time in ms)                                                                                Min                     Max                    Mean            StdDev                  Median               IQR            Outliers     OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_benchmark_cfg_generation[sql_select_select_simple.sql.test-True]                       800.6749 (1.0)          800.6749 (1.0)          800.6749 (1.0)      0.0000 (1.0)          800.6749 (1.0)      0.0000 (1.0)           0;0  1.2489 (1.0)           1           1
test_benchmark_cfg_generation[arithmetic_simple_math.arithmetic.test-True]                1,225.1380 (1.53)       1,225.1380 (1.53)       1,225.1380 (1.53)     0.0000 (1.0)        1,225.1380 (1.53)     0.0000 (1.0)           0;0  0.8162 (0.65)          1           1
test_benchmark_cfg_generation[arithmetic_simple_math.arithmetic.test-False]               1,699.0278 (2.12)       1,699.0278 (2.12)       1,699.0278 (2.12)     0.0000 (1.0)        1,699.0278 (2.12)     0.0000 (1.0)           0;0  0.5886 (0.47)          1           1
test_benchmark_cfg_generation[sql_select_select_order.sql.test-True]                      3,293.2230 (4.11)       3,293.2230 (4.11)       3,293.2230 (4.11)     0.0000 (1.0)        3,293.2230 (4.11)     0.0000 (1.0)           0;0  0.3037 (0.24)          1           1
test_benchmark_cfg_generation[sql_select_select_simple.sql.test-False]                    5,836.3914 (7.29)       5,836.3914 (7.29)       5,836.3914 (7.29)     0.0000 (1.0)        5,836.3914 (7.29)     0.0000 (1.0)           0;0  0.1713 (0.14)          1           1
test_benchmark_cfg_generation[json_simple_fruit_no_indent.json.test-True]                 6,492.7985 (8.11)       6,492.7985 (8.11)       6,492.7985 (8.11)     0.0000 (1.0)        6,492.7985 (8.11)     0.0000 (1.0)           0;0  0.1540 (0.12)          1           1
test_benchmark_cfg_generation[sql_select_select_union.sql.test-True]                      9,746.7852 (12.17)      9,746.7852 (12.17)      9,746.7852 (12.17)    0.0000 (1.0)        9,746.7852 (12.17)    0.0000 (1.0)           0;0  0.1026 (0.08)          1           1
test_benchmark_cfg_generation[json_simple_fruit_no_indent.json.test-False]               10,466.2372 (13.07)     10,466.2372 (13.07)     10,466.2372 (13.07)    0.0000 (1.0)       10,466.2372 (13.07)    0.0000 (1.0)           0;0  0.0955 (0.08)          1           1
test_benchmark_cfg_generation[sql_select_select_coalesce.sql.test-True]                  10,661.6278 (13.32)     10,661.6278 (13.32)     10,661.6278 (13.32)    0.0000 (1.0)       10,661.6278 (13.32)    0.0000 (1.0)           0;0  0.0938 (0.08)          1           1
test_benchmark_cfg_generation[json_simple_fruit.json.test-True]                          10,815.3759 (13.51)     10,815.3759 (13.51)     10,815.3759 (13.51)    0.0000 (1.0)       10,815.3759 (13.51)    0.0000 (1.0)           0;0  0.0925 (0.07)          1           1
test_benchmark_cfg_generation[sql_select_select_having.sql.test-True]                    11,339.4663 (14.16)     11,339.4663 (14.16)     11,339.4663 (14.16)    0.0000 (1.0)       11,339.4663 (14.16)    0.0000 (1.0)           0;0  0.0882 (0.07)          1           1
test_benchmark_cfg_generation[sql_select_select_order.sql.test-False]                    13,004.0021 (16.24)     13,004.0021 (16.24)     13,004.0021 (16.24)    0.0000 (1.0)       13,004.0021 (16.24)    0.0000 (1.0)           0;0  0.0769 (0.06)          1           1
test_benchmark_cfg_generation[json_simple_fruit.json.test-False]                         13,603.4424 (16.99)     13,603.4424 (16.99)     13,603.4424 (16.99)    0.0000 (1.0)       13,603.4424 (16.99)    0.0000 (1.0)           0;0  0.0735 (0.06)          1           1
test_benchmark_cfg_generation[sql_select_select_union.sql.test-False]                    24,224.4266 (30.26)     24,224.4266 (30.26)     24,224.4266 (30.26)    0.0000 (1.0)       24,224.4266 (30.26)    0.0000 (1.0)           0;0  0.0413 (0.03)          1           1
test_benchmark_cfg_generation[arithmetic_lots_of_ops.arithmetic.test-True]               24,239.8884 (30.27)     24,239.8884 (30.27)     24,239.8884 (30.27)    0.0000 (1.0)       24,239.8884 (30.27)    0.0000 (1.0)           0;0  0.0413 (0.03)          1           1
test_benchmark_cfg_generation[arithmetic_lots_of_ops.arithmetic.test-False]              24,736.5881 (30.89)     24,736.5881 (30.89)     24,736.5881 (30.89)    0.0000 (1.0)       24,736.5881 (30.89)    0.0000 (1.0)           0;0  0.0404 (0.03)          1           1
test_benchmark_cfg_generation[sql_select_select_nested_subquery.sql.test-True]           26,551.4368 (33.16)     26,551.4368 (33.16)     26,551.4368 (33.16)    0.0000 (1.0)       26,551.4368 (33.16)    0.0000 (1.0)           0;0  0.0377 (0.03)          1           1
test_benchmark_cfg_generation[sql_select_select_coalesce.sql.test-False]                 31,631.2438 (39.51)     31,631.2438 (39.51)     31,631.2438 (39.51)    0.0000 (1.0)       31,631.2438 (39.51)    0.0000 (1.0)           0;0  0.0316 (0.03)          1           1
test_benchmark_cfg_generation[sql_select_select_having.sql.test-False]                   33,686.4307 (42.07)     33,686.4307 (42.07)     33,686.4307 (42.07)    0.0000 (1.0)       33,686.4307 (42.07)    0.0000 (1.0)           0;0  0.0297 (0.02)          1           1
test_benchmark_cfg_generation[sql_select_select_nested_subquery.sql.test-False]          53,801.4904 (67.20)     53,801.4904 (67.20)     53,801.4904 (67.20)    0.0000 (1.0)       53,801.4904 (67.20)    0.0000 (1.0)           0;0  0.0186 (0.01)          1           1
test_benchmark_cfg_generation[sql_select_select_many_features.sql.test-True]             68,299.6835 (85.30)     68,299.6835 (85.30)     68,299.6835 (85.30)    0.0000 (1.0)       68,299.6835 (85.30)    0.0000 (1.0)           0;0  0.0146 (0.01)          1           1
test_benchmark_cfg_generation[sql_select_select_many_features.sql.test-False]           108,675.0438 (135.73)   108,675.0438 (135.73)   108,675.0438 (135.73)   0.0000 (1.0)      108,675.0438 (135.73)   0.0000 (1.0)           0;0  0.0092 (0.01)          1           1
test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-True]                    171,034.5827 (213.61)   171,034.5827 (213.61)   171,034.5827 (213.61)   0.0000 (1.0)      171,034.5827 (213.61)   0.0000 (1.0)           0;0  0.0058 (0.00)          1           1
test_benchmark_cfg_generation[lark_lark_self_grammar.lark.test-False]                   179,879.0275 (224.66)   179,879.0275 (224.66)   179,879.0275 (224.66)   0.0000 (1.0)      179,879.0275 (224.66)   0.0000 (1.0)           0;0  0.0056 (0.00)          1           1
test_benchmark_cfg_generation[json_outlines.generate.samplers.mypy.json.test-True]      655,138.7894 (818.23)   655,138.7894 (818.23)   655,138.7894 (818.23)   0.0000 (1.0)      655,138.7894 (818.23)   0.0000 (1.0)           0;0  0.0015 (0.00)          1           1
test_benchmark_cfg_generation[json_outlines.generate.samplers.mypy.json.test-False]     779,487.4373 (973.54)   779,487.4373 (973.54)   779,487.4373 (973.54)   0.0000 (1.0)      779,487.4373 (973.54)   0.0000 (1.0)           0;0  0.0013 (0.00)          1           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
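
For anyone cross-checking the numbers, the "Tokens / Second" figures are simply token count divided by wall time. The sketch below recomputes them for a few samples and compares the two parametrizations of each sample. The values are copied from the output above; the helper names are hypothetical. Because the listed times are rounded to milliseconds, results can differ from the reported figures in the last digit.

```python
# Figures copied from the benchmark output above: (num_tokens, seconds),
# keyed by the boolean parameter shown in each test id.
RUNS = {
    "sql_select_select_simple.sql.test": {True: (17, 0.801), False: (17, 5.836)},
    "sql_select_select_many_features.sql.test": {True: (1294, 68.300), False: (1294, 108.675)},
    "json_simple_fruit.json.test": {True: (381, 10.815), False: (381, 13.603)},
    "lark_lark_self_grammar.lark.test": {True: (2771, 171.035), False: (2771, 179.879)},
}


def tokens_per_second(num_tokens, seconds):
    """Throughput as reported in the 'Tokens / Second' lines."""
    return num_tokens / seconds


def time_ratio(sample):
    """How many times longer the False variant took than the True variant."""
    (_, t_true), (_, t_false) = RUNS[sample][True], RUNS[sample][False]
    return t_false / t_true


for name in RUNS:
    fast = tokens_per_second(*RUNS[name][True])
    slow = tokens_per_second(*RUNS[name][False])
    print(f"{name}: {fast:.2f} vs {slow:.2f} tok/s, ratio {time_ratio(name):.2f}x")
```

The gap between the two variants is largest on the short SQL samples and nearly disappears on the lark self-grammar sample.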

Profile

$ pytest tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit.json-False] --benchmark-cprofile=cumtime

<removed noise>

--------------------------------------------------------------------------------- cProfile (time in s) ---------------------------------------------------------------------------------
tests/benchmark/test_benchmark_cfg_generation.py::test_benchmark_cfg_generation[json_simple_fruit.json-False]
ncalls	tottime	percall	cumtime	percall	filename:lineno(function)
1	0.0000	0.0000	14.3733	14.3733	outlines/tests/benchmark/test_benchmark_cfg_generation.py:130(<lambda>)
1	0.1738	0.1738	14.3732	14.3732	outlines/tests/benchmark/test_benchmark_cfg_generation.py:51(run_until_eos)
327	0.1854	0.0006	14.1685	0.0433	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/fsm.py:222(allowed_token_ids)
208	0.7643	0.0037	13.7202	0.0660	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/fsm.py:94(__init__)
208	0.0036	0.0000	7.3470	0.0353	outlines/.myenv/lib/python3.11/site-packages/outlines/caching.py:64(wrapper)
989/860	5.6061	0.0065	5.6063	0.0065	~:0(<built-in method builtins.sorted>)
208	0.2597	0.0012	3.2571	0.0157	outlines/.myenv/lib/python3.11/site-packages/outlines/caching.py:39(hash_arguments)
8	0.0003	0.0000	3.1981	0.3998	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/fsm.py:95(create_states_mapping)
8	0.0158	0.0020	3.0903	0.3863	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/regex.py:558(create_fsm_index_tokenizer)
8	0.3168	0.0396	2.9276	0.3660	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/regex.py:494(create_fsm_index_end_to_end)
416	0.0037	0.0000	2.7863	0.0067	outlines/.myenv/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1464(dumps)
416	0.0007	0.0000	2.7790	0.0067	outlines/.myenv/lib/python3.11/site-packages/cloudpickle/cloudpickle.py:1243(dump)
416	2.7783	0.0067	2.7783	0.0067	~:0(<function Pickler.dump at 0x7f49efb03ba0>)
84	2.4870	0.0296	2.4873	0.0296	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/regex.py:461(state_scan_tokens)
200	0.0008	0.0000	0.8564	0.0043	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:1224(__getitem__)
200	0.0018	0.0000	0.8556	0.0043	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:1123(get)
200	0.0018	0.0000	0.8514	0.0043	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:254(fetch)
200	0.8406	0.0042	0.8406	0.0042	~:0(<built-in method _pickle.load>)
416	0.2078	0.0005	0.2078	0.0005	~:0(<method 'update' of '_hashlib.HASH' objects>)
8	0.0001	0.0000	0.1214	0.0152	outlines/.myenv/lib/python3.11/site-packages/outlines/models/transformers.py:177(__hash__)
8	0.0000	0.0000	0.1213	0.0152	outlines/.myenv/lib/python3.11/site-packages/datasets/fingerprint.py:226(hash)
8	0.0000	0.0000	0.1203	0.0150	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/_dill.py:105(dumps)
8	0.0001	0.0000	0.1203	0.0150	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/_dill.py:100(dump)
8	0.0000	0.0000	0.1201	0.0150	outlines/.myenv/lib/python3.11/site-packages/dill/_dill.py:416(dump)
8	0.0001	0.0000	0.1200	0.0150	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:476(dump)
825	0.1125	0.0001	0.1125	0.0001	~:0(<method '__reduce_ex__' of 'object' objects>)
1050404	0.0843	0.0000	0.0843	0.0000	~:0(<method 'add' of 'set' objects>)
825601	0.0397	0.0000	0.0397	0.0000	~:0(<method 'setdefault' of 'dict' objects>)
534	0.0320	0.0001	0.0324	0.0001	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/fsm.py:128(allowed_token_ids)
17970/601	0.0269	0.0000	0.0740	0.0001	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:128(deepcopy)
8	0.0237	0.0030	0.0237	0.0030	outlines/.myenv/lib/python3.11/site-packages/outlines/fsm/regex.py:584(<dictcomp>)
6204	0.0214	0.0000	0.0333	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/parsers/lalr_parser_state.py:67(feed_token)
5811	0.0188	0.0000	0.0707	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:590(next_token)
730	0.0162	0.0000	0.0463	0.0001	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:972(crawl)
8838	0.0125	0.0000	0.0125	0.0000	~:0(<method 'match' of '_regex.Pattern' objects>)
440	0.0109	0.0000	0.0109	0.0000	~:0(<method 'execute' of 'sqlite3.Connection' objects>)
8	0.0108	0.0013	0.0108	0.0013	~:0(<built-in method _pickle.dumps>)
15053	0.0106	0.0000	0.0142	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:213(_future_new)
11867	0.0081	0.0000	0.0090	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:350(follow)
76602	0.0052	0.0000	0.0052	0.0000	~:0(<method 'append' of 'list' objects>)
48965	0.0043	0.0000	0.0043	0.0000	~:0(<method 'get' of 'dict' objects>)
48586	0.0039	0.0000	0.0039	0.0000	~:0(<built-in method builtins.id>)
45129	0.0035	0.0000	0.0035	0.0000	~:0(<built-in method builtins.len>)
27369	0.0046	0.0000	0.0094	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:117(<genexpr>)
23214	0.0038	0.0000	0.0055	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:103(__getitem__)
19879	0.0037	0.0000	0.0048	0.0000	~:0(<built-in method builtins.isinstance>)
19423	0.0037	0.0000	0.0037	0.0000	~:0(<built-in method builtins.getattr>)
17970	0.0076	0.0000	0.0101	0.0000	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/copy.py:243(_keep_alive)
16402	0.0016	0.0000	0.0016	0.0000	~:0(<built-in method builtins.issubclass>)
15894	0.0037	0.0000	0.0037	0.0000	~:0(<built-in method __new__ of type object at 0x7f4a8f1b1ba0>)
15053	0.0079	0.0000	0.0221	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:202(__new__)
8849	0.0051	0.0000	0.0164	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:262(__deepcopy__)
8799	0.0059	0.0000	0.0059	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:928(follow)
8784	0.0080	0.0000	0.0220	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:387(match)
8784	0.0070	0.0000	0.0095	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:292(feed)
8784	0.0055	0.0000	0.0313	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:587(match)
8784	0.0018	0.0000	0.0039	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:581(scanner)
8784	0.0016	0.0000	0.0016	0.0000	~:0(<method 'group' of '_regex.Match' objects>)
99	0.0002	0.0000	0.0010	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:116(_get_alphabet)
99	0.0002	0.0000	0.0002	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:133(<dictcomp>)
99	0.0001	0.0000	0.0006	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:131(from_groups)
99	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:126(_get_lengths)
99	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:168(simplify)
983	0.0002	0.0000	0.0002	0.0000	~:0(<method 'isupper' of 'str' objects>)
981	0.0012	0.0000	0.0013	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/fsm.py:191(__init__)
98/32	0.0001	0.0000	0.0004	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:361(_get_lengths)
95	0.0001	0.0000	0.0001	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:157(<dictcomp>)
936	0.0008	0.0000	0.0008	0.0000	~:0(<method 'write' of '_io.BytesIO' objects>)
92/8	0.0001	0.0000	0.0143	0.0018	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:517(group)
90	0.0000	0.0000	0.0000	0.0000	outlines/.myenv/lib/python3.11/site-packages/lark/lexer.py:70(_get_flags)
888	0.0003	0.0000	0.0004	0.0000	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:241(write)
88	0.0027	0.0000	0.0032	0.0000	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:706(chargroup_inner)
88	0.0000	0.0000	0.0000	0.0000	~:0(<method 'end' of 're.Match' objects>)
86/8	0.0003	0.0000	0.0852	0.0106	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:447(to_fsm)
4	0.0015	0.0004	0.0042	0.0010	outlines/.myenv/lib/python3.11/site-packages/diskcache/core.py:230(_write)
1434/8	0.0019	0.0002	0.0147	0.0018	outlines/.myenv/lib/python3.11/site-packages/interegular/utils/simple_parser.py:34(w)
696/8	0.0016	0.0002	0.1196	0.0150	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:535(save)
144/8	0.0010	0.0001	0.0145	0.0018	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:503(conc)
8	0.0008	0.0001	0.0008	0.0001	~:0(<method 'update' of 'xxhash.xxh64' objects>)
355/8	0.0007	0.0001	0.0106	0.0013	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:69(get_alphabet)
696/8	0.0007	0.0001	0.1198	0.0150	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/_dill.py:30(save)
12	0.0008	0.0001	0.0008	0.0001	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:172(_combine_char_groups)
696/8	0.0005	0.0001	0.1198	0.0150	outlines/.myenv/lib/python3.11/site-packages/dill/_dill.py:365(save)
339/16	0.0009	0.0001	0.0145	0.0009	outlines/.myenv/lib/python3.11/site-packages/interegular/patterns.py:512(obj)
8	0.0001	0.0000	0.1195	0.0149	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/_dill.py:197(_save_transformersPreTrainedTokenizerBase)
24/8	0.0002	0.0000	0.1190	0.0149	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:621(save_reduce)
24/8	0.0001	0.0000	0.1182	0.0148	outlines/.myenv/lib/python3.11/site-packages/dill/_dill.py:1190(save_module_dict)
24/8	0.0001	0.0000	0.1181	0.0148	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:965(save_dict)
24/8	0.0000	0.0000	0.1180	0.0148	outlines/.myenv/lib/python3.11/site-packages/datasets/utils/_dill.py:71(_batch_setitems)
24/8	0.0002	0.0000	0.1180	0.0147	/nix/store/qp5zys77biz7imbk6yy85q5pdv7qk84j-python3-3.11.6/lib/python3.11/pickle.py:978(_batch_setitems)
3814	0.0018	0.0000	0.0018	0.0000	~:0(<method 'write' of '_io.BufferedWriter' objects>)
231	0.0002	0.0000	0.0002	0.0000	~:0(<method 'values' of 'dict' objects>)
8162	0.0006	0.0000	0.0006	0.0000	~:0(<method 'update' of 'set' objects>)
24	0.0000	0.0000	0.0000	0.0000	~:0(<method 'union' of 'set' objects>)
468	0.0006	0.0000	0.0011	0.0000	~:0(<method 'union' of 'frozenset' objects>)
21	0.0000	0.0000	0.0000	0.0000	~:0(<method 'translate' of 'str' objects>)
720	0.0001	0.0000	0.0001	0.0000	~:0(<method 'tell' of '_io.BytesIO' objects>)
1825	0.0010	0.0000	0.0010	0.0000	~:0(<method 'startswith' of 'str' objects>)
64	0.0000	0.0000	0.0000	0.0000	~:0(<method 'split' of 'str' objects>)
12	0.0000	0.0000	0.0000	0.0000	~:0(<method 'rstrip' of 'str' objects>)
24	0.0000	0.0000	0.0000	0.0000	~:0(<method 'rpartition' of 'str' objects>)
1981	0.0005	0.0000	0.0005	0.0000	~:0(<method 'rindex' of 'str' objects>)
12	0.0000	0.0000	0.0000	0.0000	~:0(<method 'rfind' of 'str' objects>)
3260	0.0003	0.0000	0.0003	0.0000	~:0(<method 'replace' of 'str' objects>)
2	0.0000	0.0000	0.0000	0.0000	~:0(<method 'remove' of 'set' objects>)
84	0.0000	0.0000	0.0000	0.0000	~:0(<method 'pop' of 'set' objects>)
62	0.0000	0.0000	0.0000	0.0000	~:0(<method 'pop' of 'list' objects>)
350	0.0001	0.0000	0.0001	0.0000	~:0(<method 'pop' of 'dict' objects>)
120	0.0001	0.0000	0.0001	0.0000	~:0(<method 'match' of 're.Pattern' objects>)
1750	0.0003	0.0000	0.0003	0.0000	~:0(<method 'keys' of 'dict' objects>)
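
A similar cumulative-time view can be produced outside pytest with the standard library's cProfile and pstats modules. The sketch below is only an illustration: profile_top and toy_workload are hypothetical, and the toy workload stands in for the real generation loop (it just sorts repeatedly, echoing the builtins.sorted row above).

```python
import cProfile
import io
import pstats


def profile_top(func, limit=10):
    """Run func under cProfile and return the top `limit` rows by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


def toy_workload():
    # Stand-in for the generation loop: repeated sorting of a reversed list.
    data = list(range(2000, 0, -1))
    for _ in range(50):
        sorted(data)


report = profile_top(toy_workload)
print(report)
```

Sorting by cumulative time, as in the table above, surfaces the callers that dominate the run; sorting by "tottime" instead highlights the functions doing the work themselves.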

@rlouf rlouf added the enhancement, structured generation, and grammar labels Jan 26, 2024
@lapp0 lapp0 mentioned this pull request Feb 11, 2024
Member

rlouf commented Mar 18, 2024

Closing due to inactivity. Feel free to re-open.
