-
Notifications
You must be signed in to change notification settings - Fork 18
Fix unpredictable not
and untyped simplification
#29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
When `not` is simplified to a `anyOf` of all other types, the order of the `anyOf` alternative depends on the dictionary keys iteration order. If one of `anyOf` alternative implementation has a bug, then tests will randomly fail and succeed on consecutive invocations of python. For instance in the stack trace below, the regular expression escaping does not match the needs of greenery, but the bug is only triggered if `_isAnyofSubtype` does not return early because another alternative rejected the type before the `string` check. `` Traceback (most recent call last): File "/home/runner/work/jsonsubschema/jsonsubschema/test/test_mix.py", line 162, in test_not_number self.assertFalse(isSubschema(s2, s1)) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/api.py", line 57, in isSubschema return s1.isSubtype(s2) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_checkers.py", line 163, in isSubtype return self.subtype_enum(s) and self._isSubtype(s) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_checkers.py", line 1557, in _isSubtype return _isAnyofSubtype(self, s) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_checkers.py", line 1552, in _isAnyofSubtype if not s.isSubtype(s2): File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_checkers.py", line 163, in isSubtype return self.subtype_enum(s) and self._isSubtype(s) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_checkers.py", line 402, in _isSubtype return super().isSubtype_handle_rhs(s, _isStringSubtype) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_checkers.py", line 198, in isSubtype_handle_rhs return isSubtype_cb(self, s) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_checkers.py", line 395, in _isStringSubtype if utils.regex_isSubset(pattern1, pattern2): File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_utils.py", line 202, in regex_isSubset return parse(s2).equivalent(parse(".*")) File "/opt/hostedtoolcache/Python/3.10.14/x64/lib/python3.10/site-packages/greenery/parse.py", line 390, in parse raise NoMatch(f"Could not parse {string!r} beyond index {i}") greenery.parse.NoMatch: Could not parse '<0|0<=X<200|>=200|no\\ checking' beyond index 20 ``
When the type is not set, a dict is canonicalized by adding all types, then trying to join them to remove the unnecessary values. Because `_join` is not commutative or associative, the order of types matters and cause variations in the canonicalized output. If there are issues in some `_join` that cause errors, then the tests become flaky, for instance the exception below only happens when enumerating Jtypes start with `number` and is followed by `integer`. (Any other order will first create a `JSONanyOf` that will only attempt a real join on the `anyOf` alternatives if the types are identical which won't call the failing code because `number` != `integer`). ``` Traceback (most recent call last): File "/home/runner/work/jsonsubschema/jsonsubschema/test/test_numeric.py", line 649, in test_all_all_3 self.assertFalse(isSubschema(s2, s1)) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/api.py", line 56, in isSubschema s1, s2 = prepare_operands(s1, s2) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/api.py", line 46, in prepare_operands canonicalize_schema(s2)) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_canonicalization.py", line 35, in canonicalize_schema canonical_schema = canonicalize_dict(obj) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_canonicalization.py", line 84, in canonicalize_dict return canonicalize_connectors(d) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_canonicalization.py", line 218, in canonicalize_connectors allofs.append(canonicalize_dict({c: d[c]})) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_canonicalization.py", line 84, in canonicalize_dict return canonicalize_connectors(d) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_canonicalization.py", line 211, in canonicalize_connectors simplified = simplify_schema_and_embed_checkers(d) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_canonicalization.py", line 369, in simplify_schema_and_embed_checkers allofs = [simplify_schema_and_embed_checkers(i) for i in s["allOf"]] File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_canonicalization.py", line 369, in <listcomp> allofs = [simplify_schema_and_embed_checkers(i) for i in s["allOf"]] File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_canonicalization.py", line 366, in simplify_schema_and_embed_checkers return boolToConstructor.get("anyOf")({"anyOf": anyofs}) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_checkers.py", line 1481, in JSONanyOfFactory ret = ret.join(i) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_checkers.py", line 133, in join ret = self._join(s) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_checkers.py", line 686, in _join return _joinNumber(self, s) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_checkers.py", line 672, in _joinNumber gcd = utils.gcd(s1.multipleOf, s2.multipleOf) File "/home/runner/work/jsonsubschema/jsonsubschema/jsonsubschema/_utils.py", line 269, in gcd return fractions.gcd(x, y) AttributeError: module 'fractions' has no attribute 'gcd' ```
not
simplificationnot
and untyped simplification
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
When
not
is simplified to aanyOf
of all other types, the order of theanyOf
alternative depends on the dictionary keys iteration order.If one of
anyOf
alternative implementation has a bug, then tests will randomly fail and succeed on consecutive invocations of python.For instance in the stack trace below, the regular expression escaping does not match the needs of greenery, but the bug is only triggered if
_isAnyofSubtype
does not return early because another alternative rejected the type before thestring
check.Similarly, when the type is not set, a dict is canonicalized by adding all types, then trying to join them to remove the unnecessary values.
Because
_join
is not commutative or associative, the order of types matters and cause variations in the canonicalized output.If there are issues in some
_join
that cause errors, then the tests become flaky, for instance the exception below only happens when enumeratingJtypes
starts withnumber
and is followed byinteger
.(Any other order will first create a
JSONanyOf
that will only attempt a real join on theanyOf
alternatives if the types are identical which won't call the failing code becausenumber
!=integer
).The gcd error itself is handled here: #30