feat(find_reference_citations_from_markup) #203
Solves #198
Implements a function to get name-only ReferenceCitations, taking advantage of style i/em tags on HTML sources
- Refactors ReferenceCitation.is_valid_name to utils.is_valid_name
- add tests for the new function
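To illustrate the idea of using i/em tags to locate name-only references, here is a minimal sketch. It is not eyecite's implementation: `find_emphasized_spans`, `EMPHASIS_RE`, and the `known_names` set are hypothetical names introduced only for this example, and a regex is used instead of a real HTML parser to keep it short.

```python
import re

# Hypothetical pattern for this sketch: matches <i>...</i> or <em>...</em>.
EMPHASIS_RE = re.compile(r"<(em|i)\b[^>]*>(.*?)</\1>", re.IGNORECASE | re.DOTALL)


def find_emphasized_spans(markup: str) -> list[tuple[str, int, int]]:
    """Return (text, start, end) for each non-blank <i>/<em> element.

    The offsets index into the markup string itself, not the plain text.
    """
    spans = []
    for match in EMPHASIS_RE.finditer(markup):
        text = match.group(2)
        if text.strip():
            spans.append((text, match.start(2), match.end(2)))
    return spans


markup = "See <em>Roe v. Wade</em>, 410 U.S. 113. Later, in <i>Roe</i>, the court held..."
spans = find_emphasized_spans(markup)

# Assumption: party names already known from a previously found full citation.
known_names = {"Roe", "Wade"}
name_only = [span for span in spans if span[0] in known_names]
```

Here `name_only` keeps only the emphasized text that exactly matches a known party name, which is the kind of candidate a name-only reference would be built from.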
…w uses find_reference_citations_from_markup Adds logic to use freelawproject/eyecite#203
apply refactor from code review #206
…ent pincites
This will help disambiguate adjacent ReferenceCitations
- add `helpers.add_pre_citation`
- add regex needed
- add test_FindTest where this is used
- resolved a bug in match_on_tokens where MAX_MATCH_CHARS was used incorrectly
- updated tests that were invalidated, where what was identified as a Reference was actually a part of the FullCaseCitation
This is passed to `extract_reference_citations`, which allows us to use `find_reference_citations_from_markup` inside that function, simplifying the calls
Solves #209
- add test cases for full case citation with antecedent and no pincite
- fix span calculation on add_pre_citation
Bill noticed during testing that HTML extraction on real data was slow: we were creating a SpanUpdater for each full citation. The code is now refactored to create the SpanUpdaters once for each Opinion.
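The refactor described above follows a common pattern: build the expensive offset mapping once per document, then reuse it for every citation. A minimal sketch of that pattern is below. `OffsetMapper` and `map_spans` are hypothetical stand-ins written for this example, not eyecite's SpanUpdater, and the tag-stripping regex is a simplification.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")


class OffsetMapper:
    """Hypothetical stand-in for an expensive per-document span mapper."""

    built = 0  # counts constructions so the cost is visible

    def __init__(self, markup: str):
        OffsetMapper.built += 1
        # plain_to_markup[i] is the markup index of plain-text character i.
        self.plain_to_markup: list[int] = []
        pos = 0
        for tag in TAG_RE.finditer(markup):
            self.plain_to_markup.extend(range(pos, tag.start()))
            pos = tag.end()
        self.plain_to_markup.extend(range(pos, len(markup)))

    def to_markup(self, plain_offset: int) -> int:
        return self.plain_to_markup[plain_offset]


def map_spans(markup: str, plain_spans: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Map (start, end) plain-text spans to markup spans with one shared mapper."""
    mapper = OffsetMapper(markup)  # built once per document, not once per span
    return [
        (mapper.to_markup(start), mapper.to_markup(end - 1) + 1)
        for start, end in plain_spans
    ]


markup = "See <em>Roe</em> at 120 and <em>Doe</em> at 5."
plain = TAG_RE.sub("", markup)
plain_spans = [
    (plain.index("Roe"), plain.index("Roe") + 3),
    (plain.index("Doe"), plain.index("Doe") + 3),
]
markup_spans = map_spans(markup, plain_spans)
```

Constructing the mapper inside the loop over spans would give the same result but repeat the full pass over the markup for every citation, which is the kind of cost the commit removes.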
…to exclude whitespace and punctuation
- updates test_AnnotateTest for a single space shift
- added xml cleanup to benchmark
…no valid names are found
It took me some time to understand the changes and the existing code, but I think everything looks good. The code is well commented and structured. I ran the tests and there were no problems.
I think it's ready
great job @grossir
The Eyecite Report 👁️
Gains and Losses
There were 0 gains and 236 losses.
This looks great. Can't wait to rerun the tests.
Solves #198
Solves problems made evident by the first iteration of this PR and described in #209
find.get_citations
regexes.PRE_FULL_CITATION_REGEX
to account for single-name full case citations and for single-name-and-pincite-full-case-citations