Minor typo and import fixes #8

Open: wants to merge 3 commits into base: main
30 changes: 15 additions & 15 deletions README.md
@@ -40,7 +40,7 @@ It's the recommended way to explore this tool. It provides notebooks for playing

## Unit Tests

All the codes in `src` are covered.
All the code in `src` is covered.

```
cd tests
@@ -59,7 +59,7 @@ The loader reads the file and creates a mask.

The mask is a numpy array. The bright parts are set to 255, the rest is set to 0. It contains ONLY these 2 numbers.

#### Atrributes
#### Attributes

- low_threshold = (0, 0, 250)

@@ -70,7 +70,7 @@ They control the creation of the mask, used in the function `cv.inRange`.

#### Result

Here, yellow is `255`, purple is `0`.
Here, yellow is `255` and purple is `0`.

![mask](./data/output/mask.jpeg)
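
As a reading aid for the hunk above, the mask rule can be sketched in plain Python. This is only an illustration of an inclusive range test in the spirit of `cv.inRange`, not the loader's code: the helper name `in_range_mask` and the upper bound `(255, 255, 255)` are assumptions, since the high threshold is not shown in this diff.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def in_range_mask(
    pixels: Sequence[Sequence[Pixel]],
    low: Pixel,
    high: Pixel,
) -> List[List[int]]:
    """Return 255 where every channel lies in [low, high], else 0."""
    mask = []
    for row in pixels:
        mask_row = []
        for px in row:
            # Both bounds are inclusive, channel by channel.
            inside = all(lo <= c <= hi for c, lo, hi in zip(px, low, high))
            mask_row.append(255 if inside else 0)
        mask.append(mask_row)
    return mask


# low_threshold comes from the README; the high bound is hypothetical.
image = [
    [(0, 0, 255), (10, 20, 100)],
    [(0, 0, 250), (0, 0, 249)],
]
mask = in_range_mask(image, low=(0, 0, 250), high=(255, 255, 255))
```

The result contains only the two values the README describes, `255` for pixels inside the range and `0` for everything else.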

@@ -80,19 +80,19 @@ The extractor, first, generates the regions from the mask.

Then, it removes the small and the big regions because the signature is neither too big nor too small.

The process is as followed.
The process is as follows:

1. label the image

`skimage.measure.label` labels the connected regions of an integer array. It returns a labeled array, where all connected regions are assigned the same integer value.

2. calculate the average size of regions
2. calculate the average size of the regions

Here, the size means **the number of the pixels in a region**.
Here, the size means **the number of pixels in a region**.

We accumulate the number of the pixels in all the regions, `total_pixels`. The average size is `total_pixels / nb_regions`.
We accumulate the number of pixels in all the regions, `total_pixels`. The average size is `total_pixels / nb_regions`.

If the size of a region is smaller `min_area_size`, this region is ignored. `min_area_size` is given by the user.
If the size of a region is smaller than `min_area_size`, this region is ignored. `min_area_size` is given by the user.

3. calculate the size of the small outlier

@@ -105,10 +105,10 @@ The process is as followed.
4. calculate the size of the big outlier

```
big_size_outlier = small_size_outlier * amplfier
big_size_outlier = small_size_outlier * amplifier
```

`amplfier` is given by the user.
`amplifier` is given by the user.

5. remove the small and big outliers

@@ -118,7 +118,7 @@ The process is as followed.

- outlier_bias = 100

- amplfier = 10
- amplifier = 10

> `15` is used in the demo.

@@ -131,7 +131,7 @@ The process is as followed.

### Cropper

The cropper finds the **contours** of regions in the **labeled masks** and crop them.
The cropper finds the **contours** of regions in the **labeled masks** and crops them.

#### Attributes

@@ -166,9 +166,9 @@ Suppose `(h, w) = cropped_mask.shape`.

- max_pixel_ratio: [low, high]

low < the number of 0 / the number of 255 < high.
low < the number of 0s / the number of 255s < high.

The mask should only have 2 value, 0 and 255.
The mask should only have 2 values, 0 and 255.

By default:

@@ -180,6 +180,6 @@ By default:

- `max(h, w) / min(h, w)` = 3.48

- number of `0` / number of `255` = 0.44
- number of `0s` / number of `255s` = 0.44

So, this image is signed.
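
For readers of this diff, the two ratio checks described above can be sketched as follows. This is a minimal illustration, not the `Judger` implementation: the function name `judge_mask` and the default ranges `(1.0, 4.0)` and `(0.01, 1.0)` are assumptions, because the project's default values are not visible in this hunk.

```python
from typing import Sequence, Tuple


def judge_mask(
    mask: Sequence[Sequence[int]],
    size_ratio: Tuple[float, float] = (1.0, 4.0),    # hypothetical range
    pixel_ratio: Tuple[float, float] = (0.01, 1.0),  # hypothetical range
) -> bool:
    """Apply the README's two ratio tests to a 0/255 mask."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    if h == 0 or w == 0:
        return False

    flat = [v for row in mask for v in row]
    zeros = flat.count(0)
    brights = flat.count(255)
    # The mask should only hold 0 and 255; without any 255 the pixel
    # ratio is undefined, so such a mask is rejected here (assumption).
    if zeros + brights != len(flat) or brights == 0:
        return False

    shape = max(h, w) / min(h, w)
    if not (size_ratio[0] < shape < size_ratio[1]):
        return False

    pixels = zeros / brights
    return pixel_ratio[0] < pixels < pixel_ratio[1]


signed = judge_mask([[255, 255, 0, 255], [255, 0, 255, 255]])
empty = judge_mask([[0, 0, 0, 0]])
```

With these assumed ranges, the README example (`3.48` and `0.44`) would also pass both tests.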
297 changes: 180 additions & 117 deletions demo.ipynb

Large diffs are not rendered by default.

3 changes: 1 addition & 2 deletions demo.py
@@ -1,5 +1,4 @@
import sys

from signature_detect.cropper import Cropper
from signature_detect.extractor import Extractor
from signature_detect.loader import Loader
@@ -8,7 +7,7 @@

def main(file_path: str) -> None:
loader = Loader()
extractor = Extractor(amplfier=15)
extractor = Extractor(amplifier=15)
cropper = Cropper()
judger = Judger()

4 changes: 2 additions & 2 deletions setup.py
@@ -24,5 +24,5 @@
"opencv-python",
],
extras_require={"dev": ["coverage>=5.5"]},
license = "MIT",
)
license="MIT",
)
13 changes: 6 additions & 7 deletions src/signature_detect/cropper.py
@@ -1,8 +1,8 @@
import math
from typing import Any
import cv2
import math
import numpy as np
from PIL import Image
from typing import Any


class Cropper:
@@ -71,7 +71,7 @@ def find_contours(self, img):
img: numpy array
Return:
boxes: A numpy array of contours.
each items in the array is a contour (x, y, w, h)
each item in the array is a contour (x, y, w, h)
"""
cnts = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnt = cnts[0] if len(cnts) == 2 else cnts[1]
@@ -86,7 +86,6 @@ def find_contours(self, img):
and h < copy_img.shape[0]
and w < copy_img.shape[1]
):

# cv2.rectangle(copy_img, (x, y), (x + w, y + h), (155, 155, 0), 1)
boxes.append([x, y, w, h])

@@ -99,9 +98,9 @@ def find_contours(self, img):

return sorted_boxes

def is_intersected(self, new_box, orignal_box) -> bool:
def is_intersected(self, new_box, original_box) -> bool:
[x_a, y_a, w_a, h_a] = new_box
[x_b, y_b, w_b, h_b] = orignal_box
[x_b, y_b, w_b, h_b] = original_box

if y_a > y_b + h_b:
return False
@@ -188,7 +187,7 @@ def merge_regions_and_masks(self, mask, regions) -> dict:

def run(self, np_image):
"""
read the signature extracted by Extractor, and crop it.
read the signature extracted by Extractor and crop it.
"""

# find contours
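
The `is_intersected` hunk above shows only the first early return. A common way to complete an axis-aligned box overlap test of this shape is sketched below; the three remaining comparisons are an assumption, not the body of `Cropper.is_intersected`, and this version treats boxes that merely touch as intersecting.

```python
from typing import Sequence


def boxes_intersect(new_box: Sequence[int], original_box: Sequence[int]) -> bool:
    """Overlap test for two (x, y, w, h) boxes."""
    x_a, y_a, w_a, h_a = new_box
    x_b, y_b, w_b, h_b = original_box

    # A starts below the bottom edge of B (the check visible in the diff).
    if y_a > y_b + h_b:
        return False
    # A ends above the top edge of B.
    if y_a + h_a < y_b:
        return False
    # A starts right of the right edge of B.
    if x_a > x_b + w_b:
        return False
    # A ends left of the left edge of B.
    if x_a + w_a < x_b:
        return False
    return True


overlapping = boxes_intersect([0, 0, 10, 10], [5, 5, 10, 10])
separate = boxes_intersect([0, 0, 10, 10], [20, 20, 5, 5])
```

Two boxes are reported as separate only when one lies entirely beyond an edge of the other on at least one axis.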
21 changes: 10 additions & 11 deletions src/signature_detect/extractor.py
@@ -1,16 +1,16 @@
import numpy as np
from typing import Any
from skimage import measure, morphology
from skimage.measure import regionprops
import numpy as np


class Extractor:
"""
Extract the signature from a mask. The process is as followed.
Extract the signature from a mask. The process is as follows.

1. It finds the regions in an image mask. Each region has a label (unique number).
2. It removes the small regions. The small region is defined by attributes.
3. It remove the big regions. The big region is defined by attributes.
3. It removes the big regions. The big region is defined by attributes.
4. It returns a labeled image. The numbers in the image are the region labels, NOT pixels.

Attributes
@@ -19,8 +19,8 @@ class Extractor:
The weight of small outlier size
outlier_bias: int
The bias of small outlier size
amplfier: int
The amplfier calculates the big outlier size from the small one
amplifier: int
The amplifier calculates the big outlier size from the small one
min_area_size: int
The min region area size in the labeled image.

@@ -31,22 +31,22 @@ class Extractor:
"""

def __init__(
self, outlier_weight=3, outlier_bias=100, amplfier=10, min_area_size=10
self, outlier_weight=3, outlier_bias=100, amplifier=10, min_area_size=10
):
# the parameters are used to remove small size connected pixels outlier
self.outlier_weight = outlier_weight
self.outlier_bias = outlier_bias
# the parameter is used to remove big size connected pixels outlier
self.amplfier = amplfier
self.amplifier = amplifier
self.min_area_size = min_area_size

def __str__(self) -> str:
s = "\nExtractor\n==========\n"
s += "outlier_weight = {}\n".format(self.outlier_weight)
s += "outlier_bias = {}\n".format(self.outlier_bias)
s += "> small_outlier_size = outlier_weight * average_region_size + outlier_bias\n"
s += "amplfier = {}\n".format(self.amplfier)
s += "> large_outlier_size = amplfier * small_outlier_size\n"
s += "amplifier = {}\n".format(self.amplifier)
s += "> large_outlier_size = amplifier * small_outlier_size\n"
s += "min_area_size = {} (pixels)\n".format(self.min_area_size)
s += "> min_area_size is used to calculate average_region_size.\n"
return s
@@ -69,7 +69,6 @@ def extract(self, mask) -> Any:

total_pixels = 0
nb_region = 0
average = 0.0
for region in regionprops(labels):
if region.area > self.min_area_size:
total_pixels += region.area
@@ -83,7 +82,7 @@ def extract(self, mask) -> Any:

# big_size_outlier is used as a threshold value to remove pixels
# are bigger than big_size_outlier
big_size_outlier = small_size_outlier * self.amplfier
big_size_outlier = small_size_outlier * self.amplifier

# remove small pixels
labeled_image = morphology.remove_small_objects(labels, small_size_outlier)
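
To make the renamed `amplifier` parameter easier to follow, the threshold arithmetic can be sketched on a plain list of region areas. This is a simplified stand-in for `Extractor.extract`, which works on labeled images: the helper names are hypothetical, and both the fallback to an average of `0.0` when no region qualifies and the inclusive boundaries in `keep_regions` are assumptions.

```python
from typing import List, Sequence, Tuple


def size_thresholds(
    areas: Sequence[int],
    outlier_weight: int = 3,
    outlier_bias: int = 100,
    amplifier: int = 10,
    min_area_size: int = 10,
) -> Tuple[float, float]:
    """Return (small_size_outlier, big_size_outlier) for region areas."""
    # Only regions larger than min_area_size count toward the average.
    counted = [a for a in areas if a > min_area_size]
    average = sum(counted) / len(counted) if counted else 0.0

    small_size_outlier = average * outlier_weight + outlier_bias
    big_size_outlier = small_size_outlier * amplifier
    return small_size_outlier, big_size_outlier


def keep_regions(areas: Sequence[int], **params: int) -> List[int]:
    """Keep the areas that fall between the two outlier sizes."""
    small, big = size_thresholds(areas, **params)
    return [a for a in areas if small <= a <= big]


areas = [5, 20, 40, 400, 5000]
small, big = size_thresholds(areas)
kept = keep_regions(areas)
```

Here the average over the four counted regions is `1365`, so the small outlier size is `4195` and the big one is `41950`; only the region of `5000` pixels survives.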
4 changes: 2 additions & 2 deletions src/signature_detect/judger.py
@@ -1,5 +1,5 @@
from typing import Any
import numpy as np
from typing import Any


class Judger:
@@ -12,7 +12,7 @@ class Judger:

low < max(h, w) / min(h, w) < high.

h, w are the heigth and width of the input mask.
h, w are the height and width of the input mask.

- max_pixel_ratio: [low, high]

3 changes: 1 addition & 2 deletions src/signature_detect/loader.py
@@ -1,8 +1,7 @@
from typing import Any

import cv2
import numpy as np
import os
from typing import Any
from wand.image import Image


9 changes: 3 additions & 6 deletions tests/test_cropper.py
@@ -1,16 +1,13 @@
import numpy as np
import sys
import unittest

import numpy as np

sys.path.append("..")

from signature_detect.cropper import Cropper
from signature_detect.extractor import Extractor
from signature_detect.loader import Loader

from tests.data.dummy import TEST_IMAGE_PATH

sys.path.append("..")


class TestCropper(unittest.TestCase):
def test_init(self):
15 changes: 6 additions & 9 deletions tests/test_extractor.py
@@ -1,22 +1,19 @@
import numpy as np
import sys
import unittest

import numpy as np

sys.path.append("..")

from signature_detect.extractor import Extractor
from signature_detect.loader import Loader

from tests.data.dummy import TEST_IMAGE_PATH

sys.path.append("..")


class TestExtractor(unittest.TestCase):
def test_init(self):
extractor = Extractor()
self.assertEqual(extractor.outlier_weight, 3)
self.assertEqual(extractor.outlier_bias, 100)
self.assertEqual(extractor.amplfier, 10)
self.assertEqual(extractor.amplifier, 10)
self.assertEqual(extractor.min_area_size, 10)

def test_str(self):
@@ -25,8 +22,8 @@ def test_str(self):
s += "outlier_weight = 3\n"
s += "outlier_bias = 100\n"
s += "> small_outlier_size = outlier_weight * average_region_size + outlier_bias\n"
s += "amplfier = 10\n"
s += "> large_outlier_size = amplfier * small_outlier_size\n"
s += "amplifier = 10\n"
s += "> large_outlier_size = amplifier * small_outlier_size\n"
s += "min_area_size = 10 (pixels)\n"
s += "> min_area_size is used to calculate average_region_size.\n"
self.assertEqual(str(extractor), s)
14 changes: 6 additions & 8 deletions tests/test_judger.py
@@ -1,17 +1,15 @@
import numpy as np
import sys
import unittest

import numpy as np

sys.path.append("..")

from signature_detect.cropper import Cropper
from signature_detect.extractor import Extractor
from signature_detect.loader import Loader
from signature_detect.judger import Judger

from tests.data.dummy import TEST_IMAGE_PATH

sys.path.append("..")


class TestJudger(unittest.TestCase):
def test_init(self):
judger = Judger()
@@ -30,7 +28,7 @@ def test_str(self):
def test_is_valid_mask(self):
judger = Judger()

mask = np.array([[0,0,0,0]])
mask = np.array([[0, 0, 0, 0]])
res = judger.judge(mask)
self.assertFalse(res)

@@ -47,7 +45,7 @@ def test_is_valid_mask(self):
def test_judge(self):
judger = Judger()

mask = np.array([[255,0,0,0,0]])
mask = np.array([[255, 0, 0, 0, 0]])
res = judger.judge(mask)
self.assertFalse(res)

9 changes: 3 additions & 6 deletions tests/test_loader.py
@@ -1,14 +1,11 @@
import numpy as np
import sys
import unittest

import numpy as np

sys.path.append("..")

from signature_detect.loader import Loader

from tests.data.dummy import TEST_IMAGE_PATH, TEST_PDF_PATH, TEST_TIF_PATH

sys.path.append("..")


class TestLoader(unittest.TestCase):
def test_loader_init(self):