It's hard to list all the dependencies here. Currently, the main ones are:
- Python 3.4
- Windows SDK 7.1: see the Python wiki
- Windows DDK (for Pyinsane; required only if no Python wheel is available)
- Pillow
- Python-levenshtein
- GObject Introspection, Gtk, Gdk, Libpoppler, & friends
- Msys2 or Mingw shell
- GNU Make
They must be installed before the rest of Paperwork. Once everything is installed:
- Clone https://github.com/jflesch/paperwork.git
- Run make install (requires GNU Make) to fetch all the Python dependencies not listed here. However, it won't create any shortcuts. The Paperwork startup script is installed, but it isn't of much help.
Go to where you checked out the Paperwork frontend and run:
python paperwork\src\launcher.py
Tesseract must be in your PATH.
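Before starting the launcher, it can help to confirm that Tesseract is actually reachable through PATH. The following is a minimal sketch that uses only the Python standard library; the helper name find_tesseract is hypothetical and is not part of Paperwork or PyOCR.

```python
import shutil


def find_tesseract(path=None):
    """Return the full path to the tesseract executable, or None.

    With path=None, shutil.which() searches the directories listed in
    the PATH environment variable.
    """
    return shutil.which("tesseract", path=path)


if __name__ == "__main__":
    location = find_tesseract()
    if location is None:
        print("tesseract was not found in PATH")
    else:
        print("tesseract found at", location)
```

If the function returns None, add the Tesseract installation directory to PATH and open a new shell before running the launcher.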
cd git\paperwork
make windows_exe
It should create a directory 'paperwork' containing all the required files except Tesseract. This directory can be stored in a .zip file and deployed wherever you wish.
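Creating the .zip file can be scripted. This is a minimal sketch using the standard library's shutil.make_archive; the function name zip_build and the archive name used in the usage note are hypothetical, and the build directory is assumed to be the 'paperwork' directory mentioned above.

```python
import os
import shutil


def zip_build(build_dir, archive_base):
    """Pack build_dir into <archive_base>.zip and return the archive path.

    The archive keeps the top-level directory name, so extracting it
    recreates the 'paperwork' directory at the destination.
    """
    build_dir = os.path.abspath(build_dir)
    return shutil.make_archive(
        archive_base,
        "zip",
        root_dir=os.path.dirname(build_dir),
        base_dir=os.path.basename(build_dir),
    )
```

For example, zip_build("paperwork", "paperwork-windows") would write paperwork-windows.zip to the current directory. Tesseract is not part of the build output, so it still has to be added separately.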
PyOCR has two ways to call Tesseract: either by running its executable (module pyocr.tesseract) or by using its library (module pyocr.libtesseract). Currently, for convenience, the packaged version of Paperwork uses only pyocr.tesseract.
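To see which of the two modules is usable on a given machine, something like the following sketch may help. It assumes that each PyOCR tool module exposes an is_available() function; the helper name available_backends is hypothetical. Modules that cannot be imported are skipped.

```python
import importlib

# Module names taken from the text above.
BACKENDS = ("pyocr.tesseract", "pyocr.libtesseract")


def available_backends(names=BACKENDS):
    """Return the names of the modules that import and report availability."""
    found = []
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # The module (or PyOCR itself) is not installed.
            continue
        # Assumption: the module provides is_available().
        if module.is_available():
            found.append(name)
    return found
```

Calling available_backends() on the build machine returns an empty list when neither the executable nor the library can be used.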
By default, this module looks for tesseract in the PATH only, and lets Tesseract look for its data files in the default location. However, when packaged with PyInstaller, PyOCR also looks for tesseract in the subdirectory tesseract of the current directory (os.path.join(os.getcwd(), 'tesseract')). It also sets an environment variable so that Tesseract looks for its data files in the subdirectory data\tessdata.
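The lookup described above can be pictured roughly as follows. This is an illustrative sketch and not PyOCR's actual code: the function name bundled_tesseract_env is hypothetical, the search order is an assumption, and the use of TESSDATA_PREFIX (Tesseract's environment variable for locating its data) pointing at the parent of tessdata follows the Tesseract 3.x convention, which is also an assumption here.

```python
import os


def bundled_tesseract_env(base_dir, environ):
    """Return a copy of environ adjusted for a Tesseract bundled in base_dir."""
    env = dict(environ)
    tesseract_dir = os.path.join(base_dir, "tesseract")
    data_dir = os.path.join(base_dir, "data")

    # Assumption: the bundled directory is searched before the rest of PATH.
    existing = env.get("PATH", "")
    env["PATH"] = tesseract_dir + (os.pathsep + existing if existing else "")

    # Assumption: Tesseract 3.x convention, where TESSDATA_PREFIX is the
    # parent of the tessdata directory (here: data\tessdata).
    env["TESSDATA_PREFIX"] = data_dir
    return env
```

A caller could pass bundled_tesseract_env(os.getcwd(), os.environ) as the env argument of subprocess.Popen when launching tesseract. The input mapping is left unchanged.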
So in the end, you can put Paperwork in a directory with the following hierarchy:
C:\Program Files (x86)\OpenPaper\ (for example)
|-- Paperwork\ (for example)
    |
    |-- Paperwork.exe
    |-- (...).dll
    |
    |-- Tesseract\
    |   |-- Tesseract.exe
    |   |-- (...).dll
    |
    |-- Data\
        |
        |-- paperwork.svg
        |-- (...)
        |
        |-- Tessdata\
            |-- eng.traineddata
            |-- fra.traineddata
Note that this layout only works if Paperwork is packaged with PyInstaller.
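A deployed directory can be checked against the example hierarchy with a short script. This is a sketch only: the helper name missing_entries is hypothetical, and the list of expected entries is limited to the files and directories named in the hierarchy above.

```python
import os

# Entries from the example hierarchy, relative to the Paperwork directory.
EXPECTED_ENTRIES = (
    "Paperwork.exe",
    os.path.join("Tesseract", "Tesseract.exe"),
    os.path.join("Data", "paperwork.svg"),
    os.path.join("Data", "Tessdata"),
)


def missing_entries(paperwork_dir, expected=EXPECTED_ENTRIES):
    """Return the expected relative paths that do not exist in paperwork_dir."""
    return [
        rel for rel in expected
        if not os.path.exists(os.path.join(paperwork_dir, rel))
    ]
```

An empty result means every listed entry is present. It does not check that the language files inside Tessdata match the languages you intend to use.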