Encoding problem #21

@ehenneken

During the execution of python3 run.py RESOLVE -p /proj/ads/references/sources/PNAS -e *.xml the following exception was thrown:

Traceback (most recent call last):
  File "/app/adsrefpipe/refparsers/unicode.py", line 222, in __sub_hexnumasc_entity
    if self.unicode[entno]:
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run.py", line 323, in <module>
    process_files(source_filenames)
  File "run.py", line 107, in process_files
    parsed_references = toREFs.process_and_dispatch()
  File "/app/adsrefpipe/refparsers/JATSxml.py", line 348, in process_and_dispatch
    jats_reference = JATSreference(reference)
  File "/app/adsrefpipe/refparsers/reference.py", line 390, in __init__
    Reference.__init__(self, reference_str, unicode)
  File "/app/adsrefpipe/refparsers/reference.py", line 118, in __init__
    self.parse()
  File "/app/adsrefpipe/refparsers/JATSxml.py", line 37, in parse
    refstr = self.dexml(self.reference_str.toxml())
  File "/app/adsrefpipe/refparsers/reference.py", line 721, in dexml
    return self.unicode.ent2asc(self.strip_tags(refstr)).strip()
  File "/app/adsrefpipe/refparsers/unicode.py", line 171, in ent2asc
    result = self.re_hexnumentity.sub(self.__sub_hexnumasc_entity, result)
  File "/app/adsrefpipe/refparsers/unicode.py", line 227, in __sub_hexnumasc_entity
    raise UnicodeHandlerError('Unknown hexadecimal entity: %s' % match.group(0))
adsrefpipe.refparsers.unicode.UnicodeHandlerError: Unknown hexadecimal entity: &#x1d463;

Apparently this happened while processing PNAS volume 112, possibly issue 43.
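
A note on the likely cause, with a hedged sketch of one possible fallback. The entity &#x1d463; is U+1D463 (MATHEMATICAL ITALIC SMALL V), which lies outside the Basic Multilingual Plane. The traceback shows self.unicode[entno] raising IndexError, which suggests the lookup list is shorter than that code point. The snippet below is not the adsrefpipe implementation: RE_HEXNUMENTITY, hex_entity_to_ascii, and ent2asc_fallback are hypothetical names, and the approach (NFKD folding via the standard unicodedata module) is an assumption about what an acceptable ASCII rendering would be.

```python
import re
import unicodedata

# Hypothetical pattern for hexadecimal numeric entities such as &#x1d463;
RE_HEXNUMENTITY = re.compile(r"&#x([0-9a-fA-F]+);")


def hex_entity_to_ascii(match):
    """Fold a hex entity to ASCII, leaving it untouched if no folding exists."""
    codepoint = int(match.group(1), 16)
    try:
        char = chr(codepoint)
    except (ValueError, OverflowError):
        # Not a valid Unicode code point: keep the entity as-is.
        return match.group(0)
    # NFKD splits compatibility characters into base characters plus marks;
    # encoding with "ignore" then drops whatever is not ASCII.
    folded = unicodedata.normalize("NFKD", char)
    ascii_text = folded.encode("ascii", "ignore").decode("ascii")
    return ascii_text if ascii_text else match.group(0)


def ent2asc_fallback(text):
    """Replace every hex entity in text using hex_entity_to_ascii."""
    return RE_HEXNUMENTITY.sub(hex_entity_to_ascii, text)


print(ent2asc_fallback("velocity &#x1d463; of the flow"))
```

A fallback like this could be tried only after the table lookup fails, so existing mappings keep priority. Entities with no ASCII folding are left unchanged here instead of raising, which may or may not be the desired behavior for the pipeline.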

Labels: bug