Vision Classification

Stack

Tasks

Depending on the model you choose, the following tasks are available:

- Face detection
- Face landmark detection
- Age detection
- Sensitive content detection

A public list of usable models can be found here.

Installation

For ease of use, the provided compose.yml is the recommended way to run the service.

services:
  vision_classification:
    image: ghcr.io/doppeltilde/vision_classification:latest
    ports:
      - "8000:8000"
    volumes:
      - ./cropped_faces:/app/cropped_faces:rw
      - ./models:/root/.cache/huggingface/hub:rw
      - ./mediapipe_models:/app/mediapipe_models:rw
    env_file:
      - .env
    restart: unless-stopped

Caution

When using Docker Swarm, ensure that all necessary volumes are created and accessible before deployment.

Tip

You can find code examples in the examples folder.

Environment Variables

Create a .env file and set the preferred values.

# The default model used when no other is set.
DEFAULT_MODEL_NAME=
# Hugging Face access token used to access private models.
ACCESS_TOKEN=
DEFAULT_FACE_DETECTION_MODEL_URL=

# False == Public Access
# True == Access Only with API Key
USE_API_KEY="False"
API_KEY_HASH="<YOUR_GENERATED_KEY_HASH_HERE>"
API_KEY_SALT="<YOUR_GENERATED_SALT_HERE>"

LOG_LEVEL=INFO
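
Since USE_API_KEY, API_KEY_HASH, and API_KEY_SALT depend on each other, a quick check before deployment can catch a template placeholder that was never filled in. The following is a minimal sketch, not part of the project: the helper names `parse_env` and `check_api_key_settings` are hypothetical, and the parsing is a simplified approximation of how .env files are usually read.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes such as "False" or 'False'.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_api_key_settings(env: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the settings look consistent."""
    problems = []
    if env.get("USE_API_KEY", "False").lower() == "true":
        for name in ("API_KEY_HASH", "API_KEY_SALT"):
            value = env.get(name, "")
            # Template placeholders look like <YOUR_..._HERE>.
            if not value or value.startswith("<"):
                problems.append(f"{name} must be set when USE_API_KEY is True")
    return problems


# Illustrative input only; the salt value is made up.
sample = '''
# True == Access Only with API Key
USE_API_KEY="True"
API_KEY_HASH="<YOUR_GENERATED_KEY_HASH_HERE>"
API_KEY_SALT="abc123"
LOG_LEVEL=INFO
'''
print(check_api_key_settings(parse_env(sample)))
```

With the sample above, the check reports the unfilled API_KEY_HASH placeholder and accepts the salt.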

Usage

Tip

Interactive API documentation can be found at: http://localhost:8000/docs

The API is divided into two distinct categories: Classify Images and MediaPipe Tasks.
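
The exact routes are not listed in this document, so one way to discover them programmatically is to read the machine-readable schema behind the interactive docs. The sketch below assumes the service exposes an OpenAPI schema at /openapi.json (the default location in frameworks such as FastAPI); that location is an assumption, not something confirmed here. The function names and the route in the offline demonstration are hypothetical.

```python
import json
from urllib.request import urlopen

HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def list_operations(schema: dict) -> list[tuple[str, str]]:
    """Return sorted (METHOD, path) pairs from an OpenAPI schema dict."""
    operations = []
    for path, item in schema.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-method keys such as "parameters".
            if method.lower() in HTTP_METHODS:
                operations.append((method.upper(), path))
    return sorted(operations)


def fetch_schema(base_url: str = "http://localhost:8000") -> dict:
    """Download the schema; /openapi.json is an assumed location."""
    with urlopen(f"{base_url}/openapi.json", timeout=10) as response:
        return json.load(response)


# Offline demonstration with a made-up schema; the route is hypothetical.
example_schema = {"paths": {"/hypothetical/route": {"post": {}, "parameters": []}}}
print(list_operations(example_schema))
```

Against a running container, `list_operations(fetch_schema())` would print the real routes, provided the schema is served where assumed.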

Classify Images

The Classify Images endpoint leverages state-of-the-art models from Hugging Face to perform image classification. It processes input images and returns classification results, including predicted labels and associated confidence scores, based on the selected pre-trained model.
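
To illustrate how the returned labels and confidence scores might be consumed, here is a small sketch. It assumes the results arrive as a list of objects with `label` and `score` fields, which is the shape produced by the Hugging Face image-classification pipeline; the actual response of this service may be wrapped differently, so check the interactive API documentation. The helper name `top_predictions` and the sample labels are made up.

```python
def top_predictions(
    results: list[dict], k: int = 3, min_score: float = 0.0
) -> list[tuple[str, float]]:
    """Return up to k (label, score) pairs, highest score first."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    # Filter by the threshold first, then keep the k best.
    return [(r["label"], r["score"]) for r in ranked if r["score"] >= min_score][:k]


# Illustrative data only; real labels depend on the selected model.
sample = [
    {"label": "sensitive", "score": 0.07},
    {"label": "safe", "score": 0.91},
    {"label": "other", "score": 0.02},
]
print(top_predictions(sample, k=2, min_score=0.05))
```

A threshold such as `min_score` is a client-side design choice: it lets the caller ignore low-confidence labels without changing the model.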

MediaPipe Tasks

The MediaPipe Tasks endpoint utilizes Google's MediaPipe framework to perform various computer vision tasks. It currently exposes the following features:

Image Classification
Identifies what an image represents among a set of categories defined at training time.
Face Detection
Detects one or more human faces in an image. In addition to returning bounding boxes and detection confidence scores, this task supports:
- Automatic cropping and saving of detected face regions
- Extraction and saving of facial landmark coordinates for further processing or analysis
Pose Landmark Detection
Identifies and tracks the human body pose by detecting key anatomical landmarks (such as shoulders, elbows, wrists, hips, knees, and ankles). The task returns the coordinates of each landmark along with visibility and presence scores.
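
MediaPipe reports landmark coordinates normalized to the image width and height, so a client usually converts them to pixels before drawing or measuring. The sketch below shows one way to do that while dropping landmarks with a low visibility score. The `Landmark` class and `to_pixels` function are hypothetical, and the field names are an assumption about how the response is shaped; adapt them to what the API actually returns.

```python
from dataclasses import dataclass


@dataclass
class Landmark:
    x: float  # normalized horizontal position, nominally in [0, 1]
    y: float  # normalized vertical position, nominally in [0, 1]
    visibility: float  # likelihood that the landmark is visible


def to_pixels(
    landmarks: list[Landmark], width: int, height: int, min_visibility: float = 0.5
) -> list[tuple[int, int]]:
    """Convert sufficiently visible landmarks to pixel coordinates clamped to the image."""
    points = []
    for lm in landmarks:
        if lm.visibility < min_visibility:
            continue
        # Landmarks can fall slightly outside the frame, so clamp to valid pixels.
        px = min(max(round(lm.x * width), 0), width - 1)
        py = min(max(round(lm.y * height), 0), height - 1)
        points.append((px, py))
    return points


# Illustrative values only.
pose = [
    Landmark(0.5, 0.25, 0.9),
    Landmark(0.1, 0.1, 0.2),  # dropped: visibility below the threshold
    Landmark(1.2, -0.1, 0.8),  # clamped: lies outside the frame
]
print(to_pixels(pose, width=640, height=480))
```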

Important

To have an API key, hash, and salt generated for you, set the log level to DEBUG. Remember to set it back to INFO afterwards.

Note

The first classification request may take some time because the model has to be downloaded.

Notice: This project was initially created for in-house use, so its development is aligned first and foremost with internal requirements.
