True diff'ing function for missing and extra elements. #127

Open
wants to merge 12 commits into
base: develop
1 change: 1 addition & 0 deletions changes/130.added
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Add the ability to reconstruct JSON blobs to perform JSON data compliance.
1 change: 1 addition & 0 deletions docs/code-reference/jdiff/__init__.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
::: jdiff
1 change: 1 addition & 0 deletions docs/code-reference/jdiff/check_types.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
::: jdiff.check_types
1 change: 1 addition & 0 deletions docs/code-reference/jdiff/evaluators.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
::: jdiff.evaluators
1 change: 1 addition & 0 deletions docs/code-reference/jdiff/extract_data.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
::: jdiff.extract_data
1 change: 1 addition & 0 deletions docs/code-reference/jdiff/operator.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
::: jdiff.operator
1 change: 1 addition & 0 deletions docs/code-reference/jdiff/utils/__init__.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
::: jdiff.utils
1 change: 1 addition & 0 deletions docs/code-reference/jdiff/utils/data_normalization.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
::: jdiff.utils.data_normalization
1 change: 1 addition & 0 deletions docs/code-reference/jdiff/utils/diff_helpers.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
::: jdiff.utils.diff_helpers
1 change: 1 addition & 0 deletions docs/code-reference/jdiff/utils/jmespath_parsers.md
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
::: jdiff.utils.jmespath_parsers
2 changes: 1 addition & 1 deletion docs/generate_code_reference_pages.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@

 import mkdocs_gen_files

-for file_path in Path("pyntc").rglob("*.py"):
+for file_path in Path("jdiff").rglob("*.py"):
     module_path = file_path.with_suffix("")
     doc_path = file_path.with_suffix(".md")
     full_doc_path = Path("code-reference", doc_path)
Expand Down
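The visible part of this hunk maps each Python source file to a Markdown page path under `code-reference/`, which lines up with the `docs/code-reference/jdiff/...` stubs added in this PR. A minimal sketch of that path arithmetic follows. It uses `PurePosixPath` instead of `Path` (an assumption made here so the output is the same on every OS and nothing touches the filesystem), and the sample file name is only an illustration; the rest of the script is elided in the diff and is not reproduced.

```python
from pathlib import PurePosixPath

# One sample source file, as rglob("*.py") might yield it (illustrative only).
file_path = PurePosixPath("jdiff/utils/diff_helpers.py")

module_path = file_path.with_suffix("")      # drop the .py suffix
doc_path = file_path.with_suffix(".md")      # swap .py for .md
full_doc_path = PurePosixPath("code-reference", doc_path)

print(module_path)    # jdiff/utils/diff_helpers
print(full_doc_path)  # code-reference/jdiff/utils/diff_helpers.md
```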
Binary file added docs/images/jdiff_logo.png
117 changes: 117 additions & 0 deletions docs/user/usage.md
Original file line number Diff line number Diff line change
Expand Up @@ -610,3 +610,120 @@ Can you guess what would be the outcome for an `int`, `float` operator?
```

See `tests` folder in the repo for more examples.

## Putting a Result Back Together

`jdiff` results are very helpful in determining what is wrong with the outputs. But what if you want to reconstruct the results in order to fix the problem? The `parse_diff` helper does just that. Imagine you have a `jdiff` result such as:

Examples of jdiff evaluated results:

Are you sure you want this with the above?


```python
from collections import defaultdict

ex1 = {'bar-2': 'missing', 'bar-1': 'new'}
ex2 = {
    'hostname': {'new_value': 'veos-actual', 'old_value': 'veos-intended'},
    'domain-name': 'new'
}
ex3 = {
    'hostname': {'new_value': 'veos-0', 'old_value': 'veos'},
    "index_element['ip name']": 'missing',
    'domain-name': 'new'
}
# Printed by Python as: defaultdict(<class 'list'>, {...})
ex4 = {
    'servers': {
        'server': defaultdict(
            list,
            {
                'missing': [
                    {
                        'address': '1.us.pool.ntp.org',
                        'config': {'address': '1.us.pool.ntp.org'},
                        'state': {'address': '1.us.pool.ntp.org'}
                    }
                ]
            }
        )
    }
}
```

Now suppose you need to understand what is extra and what is missing in that result (think configuration compliance on a JSON or JSON-RPC system).

Running `parse_diff` gives you both directions: what is extra, meaning present in the comparison data but absent from the reference data, and what is missing, meaning present in the reference data but absent from the comparison data.

An example will help visualize the results.

```python
In [1]: from jdiff import extract_data_from_json
   ...: from jdiff.check_types import CheckType
   ...: from jdiff.utils.diff_helpers import parse_diff

In [2]: reference_data = {"foo": {"bar-2": "baz2"}}
   ...: comparison_data = {"foo": {"bar-1": "baz1"}}
   ...: match_key = "foo"

In [3]: extracted_comparison_data = extract_data_from_json(comparison_data, match_key)

In [4]: extracted_comparison_data
Out[4]: {'bar-1': 'baz1'}

In [5]: extracted_reference_data = extract_data_from_json(reference_data, match_key)

In [6]: extracted_reference_data
Out[6]: {'bar-2': 'baz2'}

In [7]: jdiff_exact_match = CheckType.create("exact_match")
   ...: jdiff_evaluate_response, _ = jdiff_exact_match.evaluate(extracted_reference_data, extracted_comparison_data)

In [8]: jdiff_evaluate_response
Out[8]: {'bar-2': 'missing', 'bar-1': 'new'}

In [9]: parsed_extra, parsed_missing = parse_diff(
   ...:     jdiff_evaluate_response,
   ...:     comparison_data,
   ...:     reference_data,
   ...:     match_key,
   ...: )
   ...:

In [10]: parsed_extra
Out[10]: {'bar-1': 'baz1'}

In [11]: parsed_missing
Out[11]: {'bar-2': 'baz2'}
```
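
To make the two halves concrete, here is a minimal standard-library sketch of the same split for flat dictionaries. It is not how `parse_diff` is implemented: the `split_flat_diff` name is hypothetical, and it assumes both inputs are flat mappings that have already been extracted with the match key.

```python
_ABSENT = object()  # sentinel so a stored None is not mistaken for an absent key


def split_flat_diff(reference, comparison):
    """Return (extra, missing) for two flat dictionaries.

    extra:   items in comparison that are absent from, or differ in, reference.
    missing: items in reference that are absent from, or differ in, comparison.
    """
    extra = {
        key: value
        for key, value in comparison.items()
        if reference.get(key, _ABSENT) != value
    }
    missing = {
        key: value
        for key, value in reference.items()
        if comparison.get(key, _ABSENT) != value
    }
    return extra, missing


extra, missing = split_flat_diff({"bar-2": "baz2"}, {"bar-1": "baz1"})
print(extra)    # {'bar-1': 'baz1'}
print(missing)  # {'bar-2': 'baz2'}
```

Note that a key whose value changed shows up in both dictionaries, once with the comparison value and once with the reference value.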

What about a more realistic JSON data structure, such as this RESTCONF YANG response?

```python
from jdiff import extract_data_from_json
from jdiff.check_types import CheckType
from jdiff.utils.diff_helpers import parse_diff

reference_data = {"openconfig-system:config": {"hostname": "veos", "ip name": "ntc.com"}}
comparison_data = {"openconfig-system:config": {"domain-name": "ntc.com", "hostname": "veos-0"}}
match_key = '"openconfig-system:config"'
extracted_comparison_data = extract_data_from_json(comparison_data, match_key)
extracted_reference_data = extract_data_from_json(reference_data, match_key)
jdiff_exact_match = CheckType.create("exact_match")
jdiff_evaluate_response, _ = jdiff_exact_match.evaluate(extracted_reference_data, extracted_comparison_data)

parsed_extra, parsed_missing = parse_diff(
    jdiff_evaluate_response,
    comparison_data,
    reference_data,
    match_key,
)
```
Which results in:

```python
In [24]: parsed_extra
Out[24]: {'hostname': 'veos-0', 'domain-name': 'ntc.com'}

In [25]: parsed_missing
Out[25]: {'hostname': 'veos', 'ip name': 'ntc.com'}
```

Now you can see how valuable this data can be for reconstructing or remediating an out-of-compliance JSON object.
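
One possible way to act on the two dictionaries is sketched below using only the standard library: restore everything in `parsed_missing`, and drop any key that appears only in `parsed_extra`. The `remediate` helper is hypothetical (it is not part of `jdiff`), and it assumes flat dictionaries under the matched key.

```python
import copy


def remediate(actual_config, parsed_extra, parsed_missing):
    """Return a copy of actual_config with extras removed and missing values restored."""
    fixed = copy.deepcopy(actual_config)
    for key in parsed_extra:
        if key not in parsed_missing:
            # Present in the actual data but not intended: remove it.
            fixed.pop(key, None)
    # Put back intended values, overwriting any that drifted.
    fixed.update(copy.deepcopy(parsed_missing))
    return fixed


actual_config = {"domain-name": "ntc.com", "hostname": "veos-0"}
parsed_extra = {"hostname": "veos-0", "domain-name": "ntc.com"}
parsed_missing = {"hostname": "veos", "ip name": "ntc.com"}

print(remediate(actual_config, parsed_extra, parsed_missing))
# {'hostname': 'veos', 'ip name': 'ntc.com'}
```

The input is copied rather than modified in place, so the original actual data stays available for auditing.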

For more detailed examples see the `test_diff_helpers.py` file.
4 changes: 2 additions & 2 deletions jdiff/check_types.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,8 +37,8 @@ def evaluate(self, *args, **kwargs) -> Tuple[Dict, bool]:
         This method is the one that each CheckType has to implement.

         Args:
-            *args: arguments specific to child class implementation
-            **kwargs: named arguments
+            *args (tuple): arguments specific to child class implementation
+            **kwargs (dict): named arguments

         Returns:
             tuple: Dictionary representing check result, bool indicating if differences are found.
Expand Down
5 changes: 4 additions & 1 deletion jdiff/extract_data.py
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,10 @@ def extract_data_from_json(data: Union[Mapping, List], path: str = "*", exclude:
     if len(re.findall(r"\$.*?\$", path)) > 1:
         clean_path = path.replace("$", "")
         values = jmespath.search(f"{clean_path}{' | []' * (path.count('*') - 1)}", data)
-        return keys_values_zipper(multi_reference_keys(path, data), associate_key_of_my_value(clean_path, values))
+        return keys_values_zipper(
+            multi_reference_keys(path, data),
+            associate_key_of_my_value(clean_path, values),
+        )

     values = jmespath.search(jmespath_value_parser(path), data)

Expand Down
109 changes: 105 additions & 4 deletions jdiff/utils/diff_helpers.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,8 @@

import re
from collections import defaultdict
-from functools import partial
+from functools import partial, reduce
+from operator import getitem
from typing import DefaultDict, Dict, List, Mapping

REGEX_PATTERN_RELEVANT_KEYS = r"'([A-Za-z0-9_\./\\-]*)'"
Expand All @@ -12,10 +13,10 @@ def get_diff_iterables_items(diff_result: Mapping) -> DefaultDict:
"""Helper function for diff_generator to postprocess changes reported by DeepDiff for iterables.

     DeepDiff iterable_items are returned when the source data is a list
-    and provided in the format: "root['Ethernet3'][1]"
-    or more generically: root['KEY']['KEY']['KEY']...[numeric_index]
+    and provided in the format: `"root['Ethernet3'][1]"`
+    or more generically: `root['KEY']['KEY']['KEY']...[numeric_index]`
     where the KEYs are dict keys within the original object
-    and the "[index]" is appended to indicate the position within the list.
+    and the `"[index]"` is appended to indicate the position within the list.

     Args:
         diff_result: iterable comparison result from DeepDiff
Expand Down Expand Up @@ -51,10 +52,12 @@ def fix_deepdiff_key_names(obj: Mapping) -> Dict:

Args:
obj (Mapping): Mapping to be fixed. For example:
```
{
"root[3]['7.7.7.7']['is_enabled']": {'new_value': False, 'old_value': True},
"root[3]['7.7.7.7']['is_up']": {'new_value': False, 'old_value': True}
}
```

Returns:
Dict: aggregated output, for example: {'7.7.7.7': {'is_enabled': {'new_value': False, 'old_value': True},
Expand Down Expand Up @@ -86,3 +89,101 @@ def dict_merger(original_dict: Dict, dict_to_merge: Dict):
original_dict[key + "_dup!"] = dict_to_merge[key] # avoid overwriting existing keys.
else:
original_dict[key] = dict_to_merge[key]


def _parse_index_element_string(index_element_string):
    """Build out dictionary from the index element string."""
    result = {}
    pattern = r"\[\'(.*?)\'\]"
    match = re.findall(pattern, index_element_string)
    if match:
        for inner_key in match[1::]:
            result[inner_key] = ""
    return match, result


def set_nested_value(data, keys, value):
    """
    Recursively sets a value in a nested dictionary, given a list of keys.

    Args:
        data (dict): The nested dictionary to modify.
        keys (list): A list of keys to access the target value.
        value (str): The value to set.

    Returns:
        None (None): The function modifies the dictionary in place. Returns None.
    """
    if not keys:
        return  # Should not happen, but good to have.
    if len(keys) == 1:
        data[keys[0]] = value
    else:
        if keys[0] not in data:
            data[keys[0]] = {}  # Create the nested dictionary if it doesn't exist
        set_nested_value(data[keys[0]], keys[1:], value)


def parse_diff(jdiff_evaluate_response, actual, intended, match_config):
    """Parse jdiff evaluate result into missing and extra dictionaries.

    Dict value in jdiff_evaluate_response can be:
    - 'missing' -> In the intended but missing from actual.
    - 'new' -> In the actual missing from intended.

    Examples of jdiff_evaluate_response:
    - {'bar-2': 'missing', 'bar-1': 'new'}
    - {'hostname': {'new_value': 'veos-actual', 'old_value': 'veos-intended'}, 'domain-name': 'new'}
    - {'hostname': {'new_value': 'veos-0', 'old_value': 'veos'}, "index_element['ip name']": 'missing', 'domain-name': 'new'}
    - {'servers': {'server': defaultdict(<class 'list'>, {'missing': [{'address': '1.us.pool.ntp.org', 'config': {'address': '1.us.pool.ntp.org'}, 'state': {'address': '1.us.pool.ntp.org'}}]})}}
    """
    # Remove surrounding double quotes if present from jmespath/config-to-match match with - in the string.
    match_config = match_config.strip('"')
    extra = {}  # In the actual missing from intended.
    missing = {}  # In the intended but missing from actual.

    def process_diff(_map, extra_map, missing_map, previous_key=None):
        """Process the diff recursively."""
        for key, value in _map.items():
            if isinstance(value, dict) and all(nested_key in value for nested_key in ("new_value", "old_value")):
                extra_map[key] = value["new_value"]
                missing_map[key] = value["old_value"]
            elif isinstance(value, str):
                if "missing" in value and "index_element" in key:
                    key_chain, _ = _parse_index_element_string(key)
                    if len(key_chain) == 1:
                        missing_map[key_chain[0]] = intended.get(match_config, {}).get(key_chain[0])
                    else:
                        new_value = reduce(getitem, key_chain, intended)
                        set_nested_value(extra_map, key_chain[1::], new_value)
                elif "missing" in value:
                    missing_map[key] = intended.get(match_config, {}).get(key)
                else:
                    if "new" in value:
                        extra_map[key] = actual.get(match_config, {}).get(key)
            elif isinstance(value, defaultdict):
                value_dict = dict(value)
                if "new" in value_dict:
                    extra_map[previous_key][key] = value_dict.get("new", {})
                if "missing" in value_dict:
                    missing_map[previous_key][key] = value_dict.get("missing", {})
            elif isinstance(value, dict):
                extra_map[key] = {}
                missing_map[key] = {}
                process_diff(value, extra_map, missing_map, previous_key=key)
        return extra_map, missing_map

    extras, missing = process_diff(jdiff_evaluate_response, extra, missing)
    # Not ideal, and not the most performant approach, but for now this works to clear out
    # any empty dicts that are left over from the diff.
    final_extras = extras.copy()
    final_missing = missing.copy()
    for key, value in extras.items():
        if isinstance(value, dict):
            if not value:
                del final_extras[key]
    for key, value in missing.items():
        if isinstance(value, dict):
            if not value:
                del final_missing[key]
    return final_extras, final_missing
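
The nested `index_element[...]` branch relies on two small techniques: pulling the key chain out of the string with a regular expression, and walking or building nested dictionaries along that chain. A minimal standalone sketch of both follows. The sample key and the `intended` data are made up for illustration, and `set_nested` is a simplified iterative stand-in for `set_nested_value`, not the function from this diff.

```python
import re
from functools import reduce
from operator import getitem

KEY_PATTERN = r"\[\'(.*?)\'\]"  # same pattern string as _parse_index_element_string

# Hypothetical diff key with more than one level.
sample_key = "index_element['servers']['server']['address']"
key_chain = re.findall(KEY_PATTERN, sample_key)
print(key_chain)  # ['servers', 'server', 'address']

# Made-up intended data; reduce(getitem, ...) walks it one key at a time.
intended = {"servers": {"server": {"address": "1.us.pool.ntp.org"}}}
value = reduce(getitem, key_chain, intended)
print(value)  # 1.us.pool.ntp.org


def set_nested(data, keys, value):
    """Set value at the nested path given by keys, creating dictionaries as needed."""
    for key in keys[:-1]:
        data = data.setdefault(key, {})
    data[keys[-1]] = value


rebuilt = {}
set_nested(rebuilt, key_chain, value)
print(rebuilt)  # {'servers': {'server': {'address': '1.us.pool.ntp.org'}}}
```

`reduce(getitem, ...)` raises `KeyError` when any key in the chain is absent, so callers that cannot guarantee the path exists may want to guard the lookup.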
6 changes: 3 additions & 3 deletions jdiff/utils/jmespath_parsers.py
Original file line number Diff line number Diff line change
Expand Up @@ -135,10 +135,10 @@ def multi_reference_keys(jmspath: str, data):
"""Build a list of concatenated reference keys.

     Args:
-        jmspath: "$*$.peers.$*$.*.ipv4.[accepted_prefixes]"
-        data: tests/mock/napalm_get_bgp_neighbors/multi_vrf.json
+        jmspath (str): "$*$.peers.$*$.*.ipv4.[accepted_prefixes]"
+        data (dict): tests/mock/napalm_get_bgp_neighbors/multi_vrf.json

-    Returns:
+    Returns (str):
         ["global.10.1.0.0", "global.10.2.0.0", "global.10.64.207.255", "global.7.7.7.7", "vpn.10.1.0.0", "vpn.10.2.0.0"]
     """
     ref_key_regex = re.compile(r"\$.*?\$")
Expand Down
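For readers unfamiliar with the `$...$` anchor syntax, the sketch below runs the same regular expression and the same string handling seen in the `extract_data_from_json` hunk against the example path from this docstring. It uses only the standard library and deliberately stops short of calling `jmespath`; it shows the string manipulation, not the full behaviour of either function.

```python
import re

path = "$*$.peers.$*$.*.ipv4.[accepted_prefixes]"
ref_key_regex = re.compile(r"\$.*?\$")

# Each $...$ pair marks a level whose key is used as a reference key.
anchors = ref_key_regex.findall(path)
print(anchors)  # ['$*$', '$*$']

# Stripping the dollar signs leaves an ordinary JMESPath expression.
clean_path = path.replace("$", "")
print(clean_path)  # *.peers.*.*.ipv4.[accepted_prefixes]

# One flatten step (" | []") is appended per wildcard beyond the first.
flatten_suffix = " | []" * (path.count("*") - 1)
query = f"{clean_path}{flatten_suffix}"
print(query)  # *.peers.*.*.ipv4.[accepted_prefixes] | [] | []
```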
10 changes: 10 additions & 0 deletions mkdocs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -139,3 +139,13 @@ nav:
- Contributing to the Library: "dev/contributing.md"
- Development Environment: "dev/dev_environment.md"
- Architecture Decisions: "dev/arch_decision.md"
- Code Reference:
- Jdiff: "code-reference/jdiff/__init__.md"
- check_types: "code-reference/jdiff/check_types.md"
- evaluators: "code-reference/jdiff/evaluators.md"
- extract_data: "code-reference/jdiff/extract_data.md"
- operator: "code-reference/jdiff/operator.md"
- jdiff_utils: "code-reference/jdiff/utils/__init__.md"
- data_normalization: "code-reference/jdiff/utils/data_normalization.md"
- diff_helpers: "code-reference/jdiff/utils/diff_helpers.md"
- jmespath_parsers: "code-reference/jdiff/utils/jmespath_parsers.md"