Commit 63a9b77

Code cleanup (#113)
1 parent db30e51 commit 63a9b77

24 files changed

+160019 -317270 lines changed

.flake8

+3
@@ -0,0 +1,3 @@
+[flake8]
+max-line-length=160
+extend-ignore = E203
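The commit does not say why E203 is ignored. One plausible reason, assumed here, is that black (also configured in this commit) formats complex slice expressions with spaces around the colon, which pycodestyle reports as E203 (whitespace before ':'). A minimal sketch of the kind of line involved, using hypothetical names:

```python
# Illustrative only: `ham`, `lower`, `upper`, and `offset` are hypothetical names,
# not taken from the probablepeople code base.
ham = list(range(10))
lower, upper, offset = 1, 4, 2

# black keeps spaces around ":" when the slice bounds are complex expressions.
# pycodestyle flags the space before ":" as E203 unless that code is ignored.
window = ham[lower + offset : upper + offset]

print(window)
```

With `extend-ignore = E203` in `.flake8`, a line like this should pass both `black --check` and `flake8`.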

.github/workflows/python-publish.yml

+78
@@ -0,0 +1,78 @@
+name: Test and Publish Python Package
+
+on: [push, pull_request]
+
+permissions:
+  contents: read
+
+jobs:
+  lint:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v2
+      - uses: actions/setup-python@v2
+        with:
+          python-version: "3.12"
+      - name: Install dependencies
+        run: |
+          pip install --upgrade pip
+          pip install .[dev]
+      - name: flake8
+        run: flake8 probablepeople tests
+      - name: isort
+        if: always()
+        run: isort --check-only .
+      - name: black
+        if: always()
+        run: black . --check
+      - name: mypy
+        if: always()
+        run: mypy
+  test:
+    timeout-minutes: 40
+    runs-on: ${{ matrix.os }}
+    strategy:
+      fail-fast: false
+      matrix:
+        os: [windows-latest, macos-latest, ubuntu-latest]
+        python-version: [3.9, "3.10", "3.11", "3.12", "3.13-dev"]
+
+    steps:
+      - uses: actions/checkout@v2
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install dependencies
+        run: |
+          pip install --upgrade pip
+          pip install -e .[dev]
+      - name: pytest
+        run: pytest
+
+  deploy:
+    if: github.event_name == 'push' && startsWith(github.event.ref, 'refs/tags')
+    needs: [test, lint]
+
+    runs-on: ubuntu-latest
+
+    name: Upload release to PyPI
+    environment:
+      name: pypi
+      url: https://pypi.org/p/probablepeople
+    permissions:
+      id-token: write
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Python
+        uses: actions/setup-python@v3
+        with:
+          python-version: '3.x'
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install build
+      - name: Build package
+        run: python -m build
+      - name: Publish package distributions to PyPI
+        uses: pypa/gh-action-pypi-publish@release/v1
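To make the workflow's behavior concrete, here is a rough Python sketch of two things it declares: how the `test` job's matrix expands into individual jobs, and when the `deploy` job's `if:` condition would hold. The function names `expand_matrix` and `should_deploy` are hypothetical, and the sketch only approximates GitHub's expression semantics; it is not how Actions evaluates the file.

```python
from itertools import product

# Values copied from the workflow's `matrix` block (versions written as strings here).
OS_LIST = ["windows-latest", "macos-latest", "ubuntu-latest"]
PYTHON_VERSIONS = ["3.9", "3.10", "3.11", "3.12", "3.13-dev"]


def expand_matrix(os_list, python_versions):
    """Return one (os, python-version) pair per test job."""
    return list(product(os_list, python_versions))


def should_deploy(event_name, ref):
    """Hypothetical helper approximating the deploy job's `if:` expression.

    GitHub's startsWith() ignores case, so the ref is lowercased first.
    """
    return event_name == "push" and ref.lower().startswith("refs/tags")


jobs = expand_matrix(OS_LIST, PYTHON_VERSIONS)
print(len(jobs))  # 3 operating systems x 5 Python versions = 15 jobs
print(should_deploy("push", "refs/tags/v1.0"))
print(should_deploy("pull_request", "refs/heads/main"))
```

Because `fail-fast: false` is set, a failure in one of these combinations does not cancel the others. The `deploy` job additionally lists `needs: [test, lint]`, so by default it runs only after those jobs succeed.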

.pre-commit-config.yaml

+15
@@ -0,0 +1,15 @@
+repos:
+  - repo: https://github.com/psf/black
+    rev: 24.8.0
+    hooks:
+      - id: black
+  - repo: https://github.com/pycqa/isort
+    rev: 5.13.2
+    hooks:
+      - id: isort
+        name: isort (python)
+  - repo: https://github.com/pycqa/flake8
+    rev: "7.1.1"
+    hooks:
+      - id: flake8
+        args: [--config=.flake8]

.travis.yml

-24
This file was deleted.

MANIFEST.in

+1
@@ -1 +1,2 @@
 include LICENSE
+include name_data/labeled/*

Makefile

-13
This file was deleted.

README.md

+8-9
@@ -52,14 +52,13 @@ probablepeople learns how to parse names/companies through a body of training da
 Probablepeople uses [parserator](https://github.com/datamade/parserator), a library for making and improving probabilistic parsers - specifically, parsers that use [python-crfsuite](https://github.com/tpeng/python-crfsuite)'s implementation of conditional random fields. Parserator allows you to train probablepeople's model (a .crfsuite settings file) on labeled training data, and provides tools for easily adding new labeled training data.
 #### Building & testing development code
 
-```
-git clone https://github.com/datamade/probablepeople.git
-cd probablepeople
-pip install -r requirements.txt
-python setup.py develop
-make all
-nosetests .
-```
+```console
+git clone https://github.com/datamade/probablepeople.git
+cd probablepeople
+pip install -e .
+pytest
+```
+
 #### Creating/adding labeled training data (.xml outfile) from unlabeled raw data (.csv infile)
 
 If there are name/company formats that the parser isn't performing well on, you can add them to training data. As probablepeople continually learns about new cases, it will continually become smarter and more robust.
@@ -93,7 +92,7 @@ The parserator `label` command will start a console labeling task, where you wil
 parserator train name_data/labeled/person_labeled.xml,name_data/labeled/company_labeled.xml probablepeople --modelfile=generic
 parserator train name_data/labeled/person_labeled.xml probablepeople --modelfile=person
 parserator train name_data/labeled/company_labeled.xml probablepeople --modelfile=company
-```
+```
 
 ## Errors and Bugs
 