Return change sequence from html_render_diff #197

Open
Mr0grog opened this issue Feb 3, 2025 · 0 comments


Mr0grog commented Feb 3, 2025

It would be really useful for some users if html_render_diff returned a sequence of changes (tuples of (change_type, text)), kind of like html_text_diff, html_source_diff, difflib, etc. do. Like html_links_diff, we could probably do this via multiple entry points for different return formats.
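For reference, here is a rough sketch of the shape of result being described, built on difflib over plain words rather than HTML. The function name change_sequence and the numeric codes (-1 for deletion, 0 for unchanged, 1 for insertion) are assumptions for illustration only, not an existing API.

```python
from difflib import SequenceMatcher


def change_sequence(old, new):
    """Return a flat list of (change_type, text) tuples for two strings.

    Hypothetical helper: compares word by word and uses assumed codes
    -1 (deleted), 0 (unchanged), 1 (inserted).
    """
    a, b = old.split(), new.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            changes.append((0, " ".join(a[i1:i2])))
        # A "replace" opcode is reported as a deletion followed by an insertion.
        if tag in ("delete", "replace"):
            changes.append((-1, " ".join(a[i1:i2])))
        if tag in ("insert", "replace"):
            changes.append((1, " ".join(b[j1:j2])))
    return changes


result = change_sequence("the quick brown fox", "the quick red fox")
```

A caller can then loop over the tuples and handle each change type however they like, without needing to parse any markup.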

We actually have a sequence of changes like this internally (kind of; our list can be nested and is not always flat), but putting it back together is really complicated. Most of that is handling the mess of understanding how to safely reconstruct the HTML tree around the textual changes, where the tree structure may be different on either side of the diff, or the changes may start in the middle of an element and end in the middle of a different one, and need to be broken up into smaller changes. I don’t think we want to push that complexity onto users, so the return format we want is probably something simpler:

  • A flat list of (change_type, text) tuples where the text part is an HTML string.

  • An xml.etree representation of the diff. People can find the diff tags in the tree and work with them.

Both of these are complicated because the tags or structure around (or inside) the changes are different depending on whether you are putting them back together in a “combined” view (one page with the deletions and insertions shown inline) or separate “deletions” and “insertions” views (usually meant for side-by-side viewing, which lets you see more accurately when the structure/layout of the page has changed a lot). Callers would probably still need to indicate which type of diff they want, as they currently do with the include argument.
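As a rough illustration of the two formats listed above, here is a sketch that parses a small, well-formed "combined" fragment with xml.etree, finds the diff tags, and flattens the tree into (change_type, text) tuples. The sample markup, the use of bare ins/del elements as the diff tags, the flat_changes helper, and the -1/0/1 codes are all assumptions for illustration. It also yields plain text rather than HTML strings, which sidesteps the hard part of keeping the surrounding structure intact.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a "combined" diff; real output would be a full page.
SAMPLE = "<p>The quick <del>brown</del><ins>red</ins> fox <ins>jumps</ins></p>"

# Assumed codes: -1 for deletions, 1 for insertions, 0 for unchanged text.
CHANGE_TYPES = {"del": -1, "ins": 1}


def flat_changes(element, inherited=0):
    """Yield (change_type, text) tuples in document order."""
    change = CHANGE_TYPES.get(element.tag, inherited)
    if element.text:
        yield (change, element.text)
    for child in element:
        yield from flat_changes(child, change)
        # Tail text follows the child's closing tag, so it belongs to this element.
        if child.tail:
            yield (change, child.tail)


root = ET.fromstring(SAMPLE)

# Tree-style usage: find the diff tags and work with them directly.
inserted = [el.text for el in root.iter("ins")]

# Flat-list usage: walk the tree once and collect the tuples.
changes = list(flat_changes(root))
```

Note that real pages are often not well-formed XML, so an actual implementation would need an HTML-aware parser rather than ET.fromstring.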

/cc @lesleyodu

@Mr0grog Mr0grog moved this to Inbox in Web Monitoring Feb 17, 2025
@Mr0grog Mr0grog moved this from Inbox to Backlog in Web Monitoring Feb 17, 2025