Commit

Add ogl-pdl-annotations conversion JSON
jacobwegner committed Sep 21, 2023
1 parent 4eb9c94 commit d38f63d
Showing 1 changed file with 10,520 additions and 0 deletions.
1 comment on commit d38f63d

@jacobwegner
Contributor Author


amedsaid1831.dw042.perseus-eng1.conllu was run through a processing pipeline from https://github.com/scaife-viewer/ogl-pdl-annotations.

The pipeline has no knowledge of the input document's original structure, so it creates a "virtual" exemplar in which each chunk is a sentence.

e.g. https://beyond-translation.perseus.org/reader/urn:cts:greekLit:tlg0012.tlg001.parrish-eng1-trees:1
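The sentence-per-chunk behaviour can be sketched roughly as follows. This is not the code from ogl-pdl-annotations; it is a minimal illustration that assumes standard CoNLL-U layout (blank-line-separated sentence blocks with `# sent_id` and `# text` comments). The function name `sentence_chunks`, the `ref` field, and the sample input are made up for the example.

```python
import re

# Invented sample input: two tiny CoNLL-U sentence blocks (not from the real file).
SAMPLE = (
    "# sent_id = 1\n"
    "# text = Hello world.\n"
    "1\tHello\thello\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "2\tworld\tworld\tNOUN\t_\t_\t1\tvocative\t_\t_\n"
    "3\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
    "\n"
    "# sent_id = 2\n"
    "# text = Goodbye.\n"
    "1\tGoodbye\tgoodbye\tINTJ\t_\t_\t0\troot\t_\t_\n"
    "2\t.\t.\tPUNCT\t_\t_\t1\tpunct\t_\t_\n"
)


def sentence_chunks(conllu_text):
    """Return one chunk per sentence block, numbered by position (hypothetical helper)."""
    chunks = []
    # Sentence blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", conllu_text.strip()):
        comments = {}
        for line in block.splitlines():
            # Sentence-level metadata lines look like "# key = value".
            if line.startswith("#") and "=" in line:
                key, _, value = line[1:].partition("=")
                comments[key.strip()] = value.strip()
        if "text" in comments:
            chunks.append(
                {
                    # The "virtual" reference is just the sentence's position.
                    "ref": str(len(chunks) + 1),
                    "sent_id": comments.get("sent_id"),
                    "text": comments["text"],
                }
            )
    return chunks


chunks = sentence_chunks(SAMPLE)
```

Because the reference is derived only from sentence order, nothing in the output reflects the source document's own line or section numbering.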

This is not ideal for our purposes.

There is nearly a 1:1 relationship between sentences and lines, except for this sentence:

# text = He was called General Jim Owen With their brother called Colonel John Owen.
# sent_id = 92

This is where a MISC annotation or comment in the UD file may be helpful. We'll explore this in the next commit.
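One way a sentence-level comment could carry this information is sketched below. The comment does not say what form the annotation will take, so everything here is an assumption: the comment key `line_refs`, the helper name `lines_to_sentences`, and the line numbers in the sample are all invented for illustration.

```python
import re
from collections import defaultdict

# Invented sample: comment lines only; the "line_refs" key and its values are hypothetical.
SAMPLE = (
    "# sent_id = 91\n"
    "# line_refs = 91\n"
    "# text = A sentence that sits on a single line.\n"
    "\n"
    "# sent_id = 92\n"
    "# line_refs = 92 93\n"
    "# text = He was called General Jim Owen With their brother called Colonel John Owen.\n"
)


def lines_to_sentences(conllu_text, key="line_refs"):
    """Map each line reference to the sent_ids that touch it (hypothetical helper)."""
    mapping = defaultdict(list)
    for block in re.split(r"\n\s*\n", conllu_text.strip()):
        comments = {}
        for line in block.splitlines():
            if line.startswith("#") and "=" in line:
                k, _, v = line[1:].partition("=")
                comments[k.strip()] = v.strip()
        # A sentence spanning several lines lists every line it covers.
        for ref in comments.get(key, "").split():
            mapping[ref].append(comments.get("sent_id"))
    return dict(mapping)


line_map = lines_to_sentences(SAMPLE)
```

With a mapping like this, chunks could be keyed by line rather than by sentence, and a sentence that crosses a line boundary would simply appear under more than one line.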
