A Python web application for converting PDF forms into PDF-filling APIs.
You post an interactive PDF form, and the app returns a URL. If you post JSON to that URL, the app uses the JSON to fill the form fields and returns the filled PDF.
This app is a prototype. It is less than a month old. Don't use it in production.
- If you set up the project on your computer, you should be able to use it to fill PDF forms.
- It only supports a couple of types of PDF form fields.
- The code base is only covered by a few integration tests.
- The app does not check for or properly handle a wide variety of errors that are likely to occur.
- It may be difficult to integrate this into an existing Flask project.
Here are the current priorities for development.
- Improve the documentation.
- Support the filling of most PDF field types.
- Increase test coverage to 100% with greater emphasis on unit tests.
- Make sure that tests cover all reasonable use cases.
This app idea originated from a common need to automatically fill PDF forms in various Code for America projects. Filling PDFs is an all-too-common need for many government and institutional services, and automatically populating forms can be a useful step in redesigning those services to better serve clients.
After an initial proof-of-concept in another project, I decided to spin this off into a separate project.
This was created by @bengolder at @CodeForAmerica with contributions from @gauravmk, @zhoux10, @bhoeting, @samgensburg, @debrasol, and @joffemd.
You need to download, install, and run the web app locally. Once the local server is running, you can upload PDFs to create APIs, and then fill those PDFs.
The following instructions assume:
- You are using a Unix-based operating system, such as OS X or Linux
- You are comfortable on the command line
- Your computer has essential build tools installed
- You have git installed and configured on your command line
This is a Python 3 app written using Flask. It has only been tested on Ubuntu 14.04 (via Travis CI) and OS X 10.11.
This application depends on a command line utility called pdftk server, by PDF Labs, offered under a GPL License.
If you are running OS X El Capitan 10.11, download the pdftk server installer here (requires your password for installation): https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk_server-2.02-mac_osx-10.11-setup.pkg
If your computer does not have Python 3, you can install it on OS X using Homebrew or by downloading it from python.org. Make sure that python3 is available on your PATH.
brew install python3
# check the version
python3 --version
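Both prerequisites can also be confirmed from Python before installing. This is a minimal sketch using only the standard library. The helper name `find_missing_tools` is made up for illustration, and it assumes the pdftk server installer provides an executable named `pdftk`.

```python
import shutil


def find_missing_tools(names):
    """Return the subset of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumption: pdftk server installs an executable called `pdftk`.
    missing = find_missing_tools(["python3", "pdftk"])
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("python3 and pdftk are both available")
```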
git clone https://github.com/codeforamerica/pdfhook.git
cd pdfhook
python3 -m venv . # create the virtual environment
source bin/activate
make install
You do not need to set up a database, but you can create a custom one if you like. By default, the application will create and use an SQLite database. It only uses one database table. Upon the first request it checks if that table has been created. If not, it will create a table before processing the request.
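The check-then-create behavior described above can be sketched with the standard `sqlite3` module. This is an illustration of the pattern only, not the app's actual code: the table name `pdf_form`, its columns, and the helper names are hypothetical.

```python
import sqlite3

# Hypothetical table and column names, chosen only for this example.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS pdf_form (
    id INTEGER PRIMARY KEY,
    filename TEXT NOT NULL
)
"""


def table_exists(conn, table_name):
    """Check SQLite's catalog for a table with the given name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


def ensure_table(conn):
    """Create the table if it is missing; return True if it was created."""
    if table_exists(conn, "pdf_form"):
        return False
    conn.execute(CREATE_TABLE_SQL)
    conn.commit()
    return True


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
print(ensure_table(conn))  # first "request": the table gets created
print(ensure_table(conn))  # later "requests": nothing left to do
```

Calling `ensure_table` at the start of request handling makes the first request pay the one-time setup cost, while later requests only run the cheap existence check.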
make run
Visit http://localhost:5000/ to see the demo page. Use the API to post a PDF form to the same URL.
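Once the app has returned a URL for an uploaded form, filling it comes down to posting JSON to that URL. The sketch below uses only the standard library. The function names are hypothetical, the `Content-Type` header is an assumption about what the app expects, and the fill URL and field names must come from your own upload and PDF.

```python
import json
import urllib.request


def build_fill_request(fill_url, field_values):
    """Build (but do not send) a POST request carrying field values as JSON."""
    body = json.dumps(field_values).encode("utf-8")
    return urllib.request.Request(
        fill_url,
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


def fill_pdf(fill_url, field_values, output_path):
    """Send the request and save the response body, expected to be a PDF."""
    request = build_fill_request(fill_url, field_values)
    with urllib.request.urlopen(request) as response:
        with open(output_path, "wb") as output_file:
            output_file.write(response.read())
```

With the local server running, a call might look like `fill_pdf(url_returned_by_the_app, {"first_name": "Ada"}, "filled.pdf")`, where `first_name` stands in for a field that exists in your form.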
make test
# or if you'd like to run specific tests
make test TEST_SCOPE=tests.integration.test_sample_pdfhook:TestPDFHook.test_fill_pdf
To redeploy on Heroku:
heroku apps:create
heroku addons:create heroku-postgresql:hobby-dev
heroku buildpacks:set heroku/python
heroku buildpacks:add --index 1 https://github.com/fxtentacle/heroku-pdftk-buildpack
heroku config:set LD_LIBRARY_PATH=/app/bin
heroku config:set PDFTK_PATH=/app/bin/pdftk
heroku config:set CONFIG=src.settings.ProdConfig
git push heroku master
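The `CONFIG` value above has the shape of a dotted Python import path (`package.module.ClassName`). As an illustration of how such a string can be resolved into an object, here is a minimal standard-library sketch. `load_object` is a hypothetical helper, and whether pdfhook resolves its setting this way is an assumption.

```python
import importlib


def load_object(dotted_path):
    """Import `package.module.Name` and return the `Name` attribute."""
    module_path, _, attribute = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, attribute)


# Demonstrated with a standard-library class so the example runs anywhere;
# inside the project the argument would be the CONFIG value instead.
print(load_object("collections.OrderedDict"))
```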