title Analysis of the Penn Korean Universal Dependency Treebank (PKT-UD): Manual Revision to Build Robust Parsing Model in Korean
authors
Tae Hwan Oh
Ji Yoon Han
Hyonsu Choe
Seokwon Park
Han He
Jinho D. Choi
Na-Rae Han
Jena D. Hwang
Hansaem Kim
venue International Conference on Parsing Technologies (IWPT)
year 2020
published 2020-07-09
publicationType conference
venueUrl https://iwpt20.sigparse.org/
paperUrl https://aclanthology.org/2020.iwpt-1.13/
abstract In this paper, we first raise important issues regarding the Penn Korean Universal Treebank (PKT-UD) and address these issues by revising the entire corpus manually with the aim of producing cleaner UD annotations that are more faithful to Korean grammar. For compatibility with the rest of the UD corpora, we follow the UDv2 guidelines and extensively revise the part-of-speech tags and the dependency relations to reflect morphological features and flexible word-order aspects in Korean. The original and the revised versions of PKT-UD are evaluated with transformer-based parsing models using biaffine attention. The parsing model trained on the revised corpus shows a significant improvement of 3.0% in labeled attachment score over the model trained on the previous corpus. Our error analysis demonstrates that this revision allows the parsing model to learn relations more robustly, reducing several critical errors that the previous model used to make.