Classification Workflow Step #668

FeezyHendrix · 2025-11-24T13:22:13Z

What type of PR is this? (check all applicable)

Related Issue

Related to image classification workflow enhancement

Describe this PR

This PR implements several critical fixes and enhancements to the drone image classification workflow:

Key Changes:

Fixed Async S3 Operations: Implemented proper async handling for MinIO S3 operations using run_in_threadpool to prevent blocking the event loop during image downloads in the classification process.
ARQ Worker Connectivity: Added extra_hosts configuration to the ARQ worker in docker-compose.yaml to enable MinIO connectivity via localhost, which is required for generating presigned URLs accessible from the browser.
Database Schema Fixes: Updated create_project_image function to include batch_id and rejection_reason fields in both INSERT statements and RETURNING clauses, resolving Pydantic validation errors.
Classification Status Flow: Changed the classification workflow to work with STAGED status instead of UPLOADED status, aligning with the actual upload flow, where images are initially staged in the user-uploads directory.
Frontend API Corrections: Updated frontend to send batch_id as query parameter instead of request body to match backend endpoint expectations.
Button State Management:
- Disabled "Next" button on Image Upload step until files are successfully uploaded
- Disabled "Start Classification" button when no staged images are available for classification
Toast Notification Fix: Prevented duplicate success notifications by implementing a ref-based flag to ensure only one notification shows per upload batch.
EXIF Processing: Temporarily disabled frontend EXIF extraction (commented out) as the backend handles this during the process_uploaded_image ARQ task.
Consistent Return Values: Fixed classify_batch function to always return complete result dictionaries with all expected keys, preventing KeyError exceptions.
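
To illustrate the async S3 fix above, here is a minimal sketch of offloading a blocking download to a worker thread. It uses asyncio.to_thread from the standard library as a stand-in for Starlette's run_in_threadpool (which the PR names), so it runs without extra dependencies. The blocking_download function and the object keys are hypothetical and not taken from s3.py.

```python
import asyncio
import time


def blocking_download(key: str) -> bytes:
    """Hypothetical stand-in for a synchronous S3/MinIO client call."""
    time.sleep(0.05)  # simulate network latency
    return f"bytes-of-{key}".encode()


async def download_image(key: str) -> bytes:
    # Run the blocking call in a worker thread so the event loop stays free
    # to serve other requests while the download is in flight.
    return await asyncio.to_thread(blocking_download, key)


async def download_batch(keys: list[str]) -> list[bytes]:
    # The downloads overlap instead of running one after another.
    return await asyncio.gather(*(download_image(k) for k in keys))


results = asyncio.run(download_batch(["a.jpg", "b.jpg", "c.jpg"]))
```

Calling blocking_download directly inside an async function would stall every other coroutine for the duration of the download, which is the behaviour this change is meant to avoid.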

Technical Details:

Backend Changes:

src/backend/app/s3.py: Added run_in_threadpool wrapper for async S3 operations
src/backend/app/images/image_logic.py: Updated INSERT/RETURNING clauses to include batch_id and rejection_reason
src/backend/app/projects/classification_routes.py: Changed to query for staged status, added staged count to status response, added pre-flight image count check
src/backend/app/projects/image_classification.py: Updated classify_batch to query STAGED images and return consistent result structure
compose.yaml: Added extra_hosts for ARQ worker
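
One way to get the consistent result structure described for classify_batch is to start every code path from the same template dictionary. The sketch below is an assumption about the shape, not the PR's code: the key names, the "sharp" flag, and the messages are all hypothetical.

```python
from typing import Any


def empty_result(batch_id: str) -> dict[str, Any]:
    # Hypothetical result shape: every return path starts from this template,
    # so callers can index any key without risking a KeyError.
    return {
        "batch_id": batch_id,
        "total": 0,
        "assigned": 0,
        "rejected": 0,
        "message": "",
    }


def classify_batch(batch_id: str, staged_images: list[dict[str, Any]]) -> dict[str, Any]:
    result = empty_result(batch_id)
    if not staged_images:
        # Early exit still returns the complete structure.
        result["message"] = "No staged images to classify"
        return result
    for image in staged_images:
        result["total"] += 1
        # "sharp" is a made-up quality flag used only for this illustration.
        if image.get("sharp", True):
            result["assigned"] += 1
        else:
            result["rejected"] += 1
    result["message"] = "Classification complete"
    return result


empty = classify_batch("batch-1", [])
full = classify_batch("batch-1", [{"sharp": True}, {"sharp": False}])
```

Both the early-exit path and the normal path return the same set of keys, which is the property the fix is after.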

Frontend Changes:

src/frontend/src/services/classification.ts: Changed to send batch_id as query parameter
src/frontend/src/api/projects.ts: Added proper TypeScript typing for BatchStatusSummary
src/frontend/src/components/.../ImageClassification.tsx: Changed button disable logic to check for staged count
src/frontend/src/components/.../DroneImageProcessingWorkflow/index.tsx: Added Next button disable logic based on batch_id presence
src/frontend/src/components/.../UppyFileUploader/index.tsx: Fixed duplicate toast notifications, commented out EXIF extraction

Screenshots

Screenshots would show:

Disabled "Next" button before file upload
Enabled "Next" button after successful upload
Disabled "Start Classification" button when no staged images
Single success toast notification after batch upload completes

Alternative Approaches Considered

Status Flow: Considered using UPLOADED status for classification but determined that STAGED is more appropriate for the current workflow, where images remain in the staging directory.
Batch ID Persistence: Explored adding an endpoint to retrieve the latest batch_id from backend vs. maintaining state in frontend component.
S3 Connectivity: Initially attempted to change S3_ENDPOINT configuration, but used extra_hosts instead to maintain backward compatibility with presigned URLs.

Review Guide

Testing the Changes:

Upload Flow:
- Open Drone Image Processing Workflow modal
- Verify "Next" button is disabled on step 1
- Upload drone images
- Verify single success notification appears
- Verify "Next" button becomes enabled
- Verify automatic progression to Classification step
Classification Flow:
- On Classification step, verify status summary shows correct counts (Staged, Total, etc.)
- Verify "Start Classification" button is disabled if no staged images
- Click "Start Classification" button
- Verify ARQ worker processes images successfully (check logs)
- Verify images are classified and status updates in real-time
ARQ Worker:
- Check ARQ worker logs for successful MinIO connections
- Verify no connection refused errors to localhost:9000
- Verify images are processed and saved with correct batch_id
Backend Endpoints:
- Test /api/projects/{project_id}/classify-batch/?batch_id={batch_id} endpoint
- Test /api/projects/{project_id}/batch/{batch_id}/status/ returns staged count
- Verify batch classification only starts when staged images exist
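
For reviewers scripting the endpoint checks above, the following sketch shows batch_id travelling as a query parameter and a pre-flight check on the staged count. The path comes from the list above; the base URL, the "staged" key name, and both helper functions are assumptions for illustration only.

```python
from urllib.parse import urlencode


def classify_batch_url(base_url: str, project_id: str, batch_id: str) -> str:
    # batch_id goes in the query string, not in the request body.
    query = urlencode({"batch_id": batch_id})
    return f"{base_url}/api/projects/{project_id}/classify-batch/?{query}"


def can_start_classification(status: dict) -> bool:
    # Pre-flight check: only start when at least one staged image exists.
    # "staged" is an assumed key name for the staged count in the status response.
    return status.get("staged", 0) > 0


url = classify_batch_url("http://localhost:8000", "p1", "b 42")
```

urlencode takes care of escaping, so a batch_id containing spaces or reserved characters still produces a valid URL.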

Checklist before requesting a review

📖 Read the HOT Code of Conduct: https://docs.hotosm.org/code-of-conduct
👷‍♀️ Create small PRs. In most cases, this will be possible.
✅ Provide tests for your changes.
📝 Use descriptive commit messages.
📗 Update any related documentation and include any relevant screenshots.

[optional] What gif best describes this PR or how it makes you feel?

Note: This PR includes multiple related fixes that were discovered and resolved during the implementation of the image classification workflow. Each change addresses a specific issue that was blocking the classification functionality.


* feat: added multipart initate upload route
* feat: added all multipart upload functions
* feat(UploadBox): Swapped upload from single to multipart upload using uppy
* feat(frontend): add @uppy/drag-drop and @uppy/progress-bar dependencies; update moduleResolution to bundler
* feat(frontend): feat: integrate Uppy for image uploads and enhance file handling
  - Added Uppy instance to the global App component for shared file upload functionality.
  - Implemented UppyImageUploader component to utilize Uppy for image uploads with AWS S3 integration.
  - Refactored file handling to extract EXIF data upon file addition and display upload progress.
  - Updated Vite configuration to target 'esnext' for improved compatibility
* feat(image-upload): Refactor S3 integration to use boto3 instead of MinIO
  - Replaced MinIO client with boto3 for S3 operations in s3.py.
  - Updated methods for file upload, download, and presigned URL generation to align with boto3 API.
  - Adjusted error handling to use ClientError from botocore.
  - Modified user_schemas.py to generate presigned URLs using boto3.
  - Updated dependencies in pyproject.toml to include boto3 and remove MinIO.
  - Ensured compatibility with existing code by stripping leading slashes from S3 paths.
  - Added content type headers for multipart upload requests in the frontend UppyImageUploader component
* feat(image-upload-workflow): Refactor code structure for improved readability and maintainability
* feat(image-upload-workflow): Implement drone image processing workflow UI with upload, classification, review, and processing steps
* feat(image-upload-workflow): Refactor image upload components and enhance upload functionality
  - Replaced UppyImageUploader with UppyFileUploader for consistency across components.
  - Improved ImageUpload component to track upload progress and display upload queue.
  - Added support for staging uploads in UppyFileUploader.
  - Updated DroneImageProcessingWorkflow to adjust modal dimensions for better usability.
  - Removed unused UppyImageUploader component and its related code
  BREAKING CHANGE: Switch from minio to boto3
* feat(image-upload-workflow): Refactor image upload components and enhance upload functionality
  - Replaced UppyImageUploader with UppyFileUploader for consistency across components.
  - Improved ImageUpload component to track upload progress and display upload queue.
  - Added support for staging uploads in UppyFileUploader.
  - Updated DroneImageProcessingWorkflow to adjust modal dimensions for better usability.
  - Removed unused UppyImageUploader component and its related code
* fixed with minio
* feat(image-upload-workflow): Add project_id and filename to the upload request in UppyFileUploader
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* feat(image-upload-workflow): feat(migrations): Enhance project_images table migration with conditional enum creation and improved index handling
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci
* fix(image-upload-workflow): feat(step-switcher): Implement generic StepSwitcher component for project steps navigation
* feat(image-upload-workflow): feat(db-models): Add DbProjectImage model for managing project images with relationships and indexing
* feat(image-upload-workflow): Enhance image processing with new database save function and EXIF handling
* refactor(image-upload-workflow): Replace deprecated get_presigned_url with generate_presigned_download_url across the codebase; add new migration script for images from S3
* fix(image-upload-workflow): Implement upsert functionality for image migration with conflict handling and improved logging
* [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci

---------

Co-authored-by: AbdulHafeez AbdulRaheem <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

- Added new hooks for starting classification, fetching batch status, and retrieving batch images in the projects API.
- Enhanced the ImageClassification component to handle classification processes, including polling for updates and displaying status.
- Updated the DroneImageProcessingWorkflow to manage batch ID and transition between upload and classification steps.
- Modified UppyFileUploader to support batch ID generation and pass it upon upload completion.
- Created a new migration to add classification-related fields to the project_images table.
- Developed new classification routes for starting batch classification and retrieving batch status and images.
- Implemented the ImageClassifier class to handle image classification logic, including quality checks and task matching.
- Added service functions for interacting with the classification API, including starting classification and fetching batch details


spwoodcock · 2025-11-24T14:30:30Z

Nice! I'm sure this is going to be great 😄

I'll try my best to review this as soon as I can - got quite a few things on.

Couple pieces of feedback though:

Please try to open a draft PR as early as possible, so this can be reviewed incrementally in future. It's a huge task to do an in-depth review of thousands of lines of code, and maintainer availability often rises and falls.
If possible, it's great to rebase regularly on the target branch, so conflicts are addressed bit by bit as they happen and the final code is always in sync.

Thank you! 🙏

spwoodcock · 2025-11-24T14:31:39Z

Oh also, thanks so much for the great detailed description - really helps!

spwoodcock · 2025-11-24T18:12:23Z

This is looking great! But perhaps we should be targeting a merge to dev instead of the image-workflow-integration branch? Or perhaps we can just update image-workflow-integration to be in sync with dev, to ensure we don't have merge conflicts on the final merge.

I'll let you take a look at the merge conflict resolution & come back for a second review 👍


- Introduced a new MapContext to manage map state and loading status across components.
- Updated BaseLayerSwitcher, COGOrthophotoViewer, LocateUser, and VectorLayer components to consume map context instead of passing map instance as props.
- Simplified the MapContainer component to provide context to its children.
- Enhanced error handling and user feedback in ForgotPassword and Login components.
- Improved navigation handling in Navbar and ProtectedRoute components.
- Updated drone model options in constants and adjusted related state management.
- Added utility function for safe URL redirection.
- Cleaned up unused imports and optimized component structures for better readability and maintainability.
- Introduced Justfile for build and configuration management


spwoodcock · 2025-11-27T17:13:09Z

src/backend/pyproject.toml

    "asgi-lifespan==2.1.0",
    "arq==0.26.3",
    "redis==5.2.1",
+    "opencv-python-headless==4.10.0.84",


Does this install correctly and run well in docker by the way? I haven't tested, but my assumption is that OpenCV may require some additional dev packages / headers to install and run, but I'm not certain on that. Please confirm it's working well in docker 🙏

FeezyHendrix · 2025-12-01

It's working correctly as of now, because I run the application through Docker, but I can't guarantee the behaviour in prod.

spwoodcock

This looks pretty clean and nice - good job 😄

We should give it a thorough test. Can we deploy the branch to dev, instead of merging into image-workflow-integration and see all of these new workflows in action? 🙏

spwoodcock · 2025-11-27T17:15:19Z

Small thing: there is a pre-commit error about a bare except.

Please update to the actual exception type at best, else except Exception as e at worst
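
As a sketch of what the requested change could look like, the function below catches only the exceptions the conversion can actually raise instead of using a bare except. The convert_rational name and its behaviour are hypothetical; it is not the PR's _convert_exif_value, whose body is not shown here.

```python
from fractions import Fraction
from typing import Optional


def convert_rational(value: object) -> Optional[float]:
    """Convert an EXIF-style rational such as "1/200" to a float.

    Hypothetical helper for illustration only.
    """
    try:
        return float(Fraction(value))
    except (ValueError, ZeroDivisionError, TypeError):
        # ValueError: not a number ("abc"); ZeroDivisionError: "1/0";
        # TypeError: unsupported input such as None.
        return None


exposure = convert_rational("1/200")
```

Naming the exception types keeps unrelated failures, such as KeyboardInterrupt or a genuine bug elsewhere, from being silently swallowed.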


… status display
- Introduced a modal to display detailed information about selected images, including status, filename, GPS data, and upload time.
- Enhanced status badge styles for better visual distinction.
- Updated image list to allow selection and display of the modal.
- Improved loading indicators and button states during classification process.
- Refactored status color and label functions for better readability and maintainability


spwoodcock · 2025-12-02T15:29:45Z

Minor feedback:

It's good that the polling interval isn't too aggressive, but should we consider some kind of user feedback while they are waiting for the numbers to change? I think the most obvious option is to add a loading spinner in place of the 0 values, to show those stages haven't started yet (but to indicate that they will). Open to other options 👍
We probably have a few too many stages to display to the user. It's perfect for us, and perhaps even to power users via an 'advanced' toggle or something, but to most general users I think we should remove a few stages:
- Staged and Ready could be combined into simply 'Uploaded'?
- The total on the left is a bit confusing, as it's not distinct from other stages - it seems like a separate stage. I'm thinking we could possibly remove the total from this part entirely (the user can get a rough idea by counting the numbers across stages)
- Assigned should be worded as 'Complete'
- Rejected, Unmatched, Invalid EXIF can all be bundled into an 'Issues' category. This could be computed on the frontend, rather than changing the API.
- Keep 'Duplicates' as this is useful.
- The API response can remain the same, as the more detailed categories are useful to developers and other users - they could be accessed in other ways in future
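
The grouping suggested above could be computed from the unchanged API response roughly as follows. This is a Python sketch of the mapping logic only; on the frontend it would be written in TypeScript, and the snake_case status keys are assumptions about the API's field names.

```python
# User-facing label -> detailed API statuses that roll up into it.
# The status keys are assumed names, not confirmed API fields.
DISPLAY_GROUPS = {
    "Uploaded": ("staged", "ready"),
    "Complete": ("assigned",),
    "Issues": ("rejected", "unmatched", "invalid_exif"),
    "Duplicates": ("duplicate",),
}


def summarize(counts: dict[str, int]) -> dict[str, int]:
    # Missing statuses count as zero, so a partial response still works.
    return {
        label: sum(counts.get(status, 0) for status in statuses)
        for label, statuses in DISPLAY_GROUPS.items()
    }


summary = summarize(
    {"staged": 4, "ready": 1, "assigned": 10, "rejected": 2, "invalid_exif": 1, "duplicate": 3}
)
```

Keeping the mapping in one table means the detailed categories stay available for an 'advanced' view without any change to the API.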

spwoodcock · 2025-12-02T15:37:08Z

Now that #678 has been merged, this PR probably needs to be rebased on dev.

I think the commits in the merged PR were squashed when the branch was made, but personally I would have kept the history.
This branch may have updated without conflict if the commit hashes were matching.

Either way, we need to make sure to include only the relevant file changes for this PR here 😄

…tionality
- Implemented thumbnail generation for uploaded images using PIL.
- Added thumbnail URL to image records in the database.
- Updated image processing workflow to handle thumbnail uploads to S3.
- Enhanced image classification to utilize thumbnail URLs for display.
- Modified frontend components to support thumbnail rendering in the image grid


AbdulHafeez AbdulRaheem and others added 28 commits October 7, 2025 09:07

feat: added multipart initate upload route

00e331a

feat: added all multipart upload functions

855f2ec

Merge branch 'develop' into feat/resumable-uploads

d0f2f18

Merge branch 'develop' into feat/resumable-uploads

480636f

feat(UploadBox): Swapped upload from single to multipart upload using…

bf0bb88

… uppy

feat(frontend): add @uppy/drag-drop and @uppy/progress-bar dependenci…

519e0e2

…es; update moduleResolution to bundler

feat(image-upload-workflow): Refactor code structure for improved rea…

b4f22b3

…dability and maintainability

feat(image-upload-workflow): Implement drone image processing workflo…

312279d

…w UI with upload, classification, review, and processing steps

fixed with minio

00724e5

feat(image-upload-workflow): Add project_id and filename to the uploa…

66dbc58

…d request in UppyFileUploader

[pre-commit.ci] auto fixes from pre-commit.com hooks

59ea669

for more information, see https://pre-commit.ci

feat(image-upload-workflow): feat(migrations): Enhance project_images…

0621bec

… table migration with conditional enum creation and improved index handling

Merge branch 'feat/resumable-uploads' of https://github.com/hotosm/dr…

0ab2c78

…one-tm into feat/resumable-uploads

[pre-commit.ci] auto fixes from pre-commit.com hooks

17099fa

for more information, see https://pre-commit.ci

fix(image-upload-workflow): feat(step-switcher): Implement generic St…

e6046b8

…epSwitcher component for project steps navigation

feat(image-upload-workflow): feat(db-models): Add DbProjectImage mode…

3103d8d

…l for managing project images with relationships and indexing

feat(image-upload-workflow): Enhance image processing with new databa…

82940c8

…se save function and EXIF handling

fix(image-upload-workflow): refactor(image-upload-workflow): Clean up…

6c26e6c

… code formatting in image logic and migration files

refactor(image-upload-workflow): Replace deprecated get_presigned_url…

c226707

… with generate_presigned_download_url across the codebase; add new migration script for images from S3

fix(image-upload-workflow): Implement upsert functionality for image …

dc926ea

…migration with conflict handling and improved logging

feat(image-upload-workflow): Add sharpness_score field to project_ima…

dffa82f

…ges and implement sharpness calculation in image classification

feat(image-upload-workflow): Enhance image classification with batch …

28b5725

…processing and status checks

github-actions bot added enhancement New feature or request backend Related to backend code labels Nov 24, 2025

github-actions bot added the frontend label Nov 24, 2025

FeezyHendrix requested a review from spwoodcock November 24, 2025 13:23

feat(image-upload-workflow): Update image status references from 'upl…

f565c95

…oaded' to 'staged' and enhance upload notifications

FeezyHendrix changed the title ~~Feat/resumable uploads~~ Classification Workflow Step Nov 24, 2025

AbdulHafeez AbdulRaheem and others added 5 commits November 26, 2025 12:42

feat(image-upload-workflow): Add logging for classification requests …

779d207

…and image count checks

feat(image-upload-workflow): feat(image-processing-workflow): Impleme…

8e29f4f

…nt image processing workflow state management and update components for classification handling

fix(image-upload-workflow): Merged conflicts

a34e7fa

[pre-commit.ci] auto fixes from pre-commit.com hooks

84256a6

for more information, see https://pre-commit.ci

spwoodcock reviewed Nov 27, 2025


spwoodcock approved these changes Nov 27, 2025


AbdulHafeez AbdulRaheem and others added 7 commits December 1, 2025 13:56

feat(image-upload-workflow): Update migration and enhance image class…

33849fb

…ification component with presigned URL handling

feat(image-upload-workflow): Improve exception handling in _convert_e…

3512444

…xif_value function

refactor(image-upload-workflow): Improve code formatting and readabil…

9aa5446

…ity across multiple files

[pre-commit.ci] auto fixes from pre-commit.com hooks

27b1e52

for more information, see https://pre-commit.ci

fix(image-upload-workflow): enhance EXIF value handling and improve i…

230c219

…mage ID processing

fix(image-upload-workflow): enhance loading and empty states in image…

71deb4f

… processing workflow

FeezyHendrix changed the base branch from image-workflow-integration to dev December 2, 2025 07:34

AbdulHafeez AbdulRaheem and others added 2 commits December 2, 2025 08:44

fix(image-upload-workflow): implement duplicate image handling and st…

d49bf41

…atus updates

[pre-commit.ci] auto fixes from pre-commit.com hooks

8ba45ca

for more information, see https://pre-commit.ci

AbdulHafeez AbdulRaheem and others added 3 commits December 4, 2025 15:54

fix(image-upload-workflow): improve code readability with consistent …

9c36779

…formatting

feat(image-upload-workflow): enhance image review process with batch …

1dc704d

…ID handling and add task group interfaces
