This project is a toolset for performing optical character recognition (OCR) on Byzantine chant notation. It can process both PDFs and image files. Built in Python, the toolset uses OpenCV for image processing, PyTorch for deep learning, and a MobileNetV2 convolutional neural network (CNN) model for feature extraction and classification.
Download the latest app version from the Releases page. Unzip the files to any directory.
Run the app and select a file to perform OCR on. For PDF files, input a page range. The table below explains the available settings. Select the options you want, then press Go and choose a location to save the file.
Note
Even small issues in your scan—like slight skew, tiny speckles, or small gaps in neumes—can have a surprisingly large impact on OCR accuracy. Applying the right settings can make a big difference, so don’t underestimate their effect.
The resulting .byzocr file can be imported into Neanes.
For most common cases, the OCR result is expected to be at least 90% accurate. You can use the martyriae as a guide for finding and correcting errors: if a martyria does not have the expected value, the error lies somewhere between that incorrect martyria and the last correct one.
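The martyria check can be sketched in code. The example below is only an illustration under simplifying assumptions: the `Token` class, the `find_suspect_spans` function, and the integer pitch model are hypothetical and do not reflect the actual .byzocr format or the Neanes data model. Each neume moves a running pitch by a signed number of steps, and each martyria states the pitch that should have been reached.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    """Hypothetical simplified token (not the real .byzocr structure)."""
    kind: str   # "neume" or "martyria"
    value: int  # neume: signed step count; martyria: expected pitch


def find_suspect_spans(tokens: List[Token], start_pitch: int) -> List[Tuple[int, int]]:
    """Return (first, last) index ranges of neumes that disagree with the next martyria."""
    spans = []
    pitch = start_pitch
    last_checkpoint = -1  # index of the previous martyria (-1 means start of score)
    for i, tok in enumerate(tokens):
        if tok.kind == "neume":
            pitch += tok.value
        elif tok.kind == "martyria":
            if pitch != tok.value:
                # The error is somewhere after the previous martyria.
                spans.append((last_checkpoint + 1, i - 1))
                pitch = tok.value  # resynchronize on the martyria
            last_checkpoint = i
    return spans


# Made-up example: the neume at index 4 was misread, so the second
# martyria (index 5) does not match the running pitch.
tokens = [
    Token("neume", 1), Token("neume", 1), Token("martyria", 2),
    Token("neume", 1), Token("neume", -2), Token("martyria", 4),
    Token("neume", -1), Token("martyria", 3),
]
suspects = find_suspect_spans(tokens, start_pitch=0)
print(suspects)
```

Here the only reported span covers indices 3 to 4, the neumes between the last correct martyria and the incorrect one. The sketch assumes the martyriae themselves were recognized correctly, which may not always hold.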
If the resulting file is not sufficiently accurate, there are three possible causes.
- The image or PDF file is not clear, or contains a severe tilt or non-linear distortions (e.g. a picture of a curved page).
- There is a bug or limitation in the Neanes importer.
- The model needs more training.
If the font used for the neumes is significantly different from the fonts that the model was trained with, then the results will be less accurate. If you want to help make the model better, see the contribution guide. The maintainers of this repository will likely prioritize training the model on the most common fonts, since those will impact the most users. But if you want to train on a more obscure font or on handwritten works, pull requests are welcome.
If the image contains a lot of extraneous text that is not part of the lyrics, but that is also close to the neumes, the text removal process may fail to remove this text before performing OCR. In this case, the text may be misinterpreted as neumes belonging to the closest baseline. You can manually remove the text yourself from the image and try again.
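If you prefer to script the manual cleanup instead of using an image editor, one possible approach is to paint the offending region white before running OCR. This is a minimal sketch, not part of the toolset: the `blank_region` helper and the coordinates are hypothetical, and you would need to find the right rectangle for your own scan.

```python
import numpy as np


def blank_region(image, top, left, height, width, fill=255):
    """Return a copy of the image with a rectangle overwritten by `fill` (white)."""
    cleaned = image.copy()
    cleaned[top:top + height, left:left + width] = fill
    return cleaned


# Tiny synthetic "page" (all black) so the example runs without a file.
page = np.zeros((4, 6), dtype=np.uint8)
cleaned_page = blank_region(page, top=1, left=2, height=2, width=3)

# With a real scan, the array could come from OpenCV, for example:
#   import cv2
#   page = cv2.imread("scan.png")          # hypothetical file name
#   cleaned_page = blank_region(page, 120, 40, 60, 900)
#   cv2.imwrite("scan_cleaned.png", cleaned_page)
```

The helper copies the image rather than modifying it in place, so the original scan stays available if the rectangle turns out to cover part of a neume.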
If the YAML output is accurate but the Neanes file is not, then this is possibly an issue with the Neanes importer and should be reported as such.
This project is licensed under the GNU General Public License, version 3.
This project was inspired by and builds upon concepts from the following paper:
C. Dalitz, G.K. Michalakis, C. Pranzas: Optical Recognition of Psaltic Byzantine Chant Notation. International Journal of Document Analysis and Recognition, vol. 11, no. 3, pp. 143-158 (2008)



