Skip to content

Conversation

@FeezyHendrix
Copy link
Collaborator

What type of PR is this? (check all applicable)

  • πŸ• Feature
  • πŸ› Bug Fix
  • πŸ“ Documentation
  • πŸ§‘β€πŸ’» Refactor
  • βœ… Test
  • πŸ€– Build or CI
  • ❓ Other (please specify)

Related Issue

Related to image classification workflow enhancement

Describe this PR

This PR implements several critical fixes and enhancements to the drone image classification workflow:

Key Changes:

  1. Fixed Async S3 Operations: Implemented proper async handling for MinIO S3 operations using run_in_threadpool to prevent blocking the event loop during image downloads in the classification process.

  2. ARQ Worker Connectivity: Added extra_hosts configuration to the ARQ worker in docker-compose.yaml to enable MinIO connectivity via localhost, which is required for generating presigned URLs accessible from the browser.

  3. Database Schema Fixes: Updated create_project_image function to include batch_id and rejection_reason fields in both INSERT statements and RETURNING clauses, resolving Pydantic validation errors.

  4. Classification Status Flow: Changed classification workflow to work with STAGED status instead of UPLOADED status, aligning with the actual upload flow where images are initially staged in user-uploads directory.

  5. Frontend API Corrections: Updated frontend to send batch_id as query parameter instead of request body to match backend endpoint expectations.

  6. Button State Management:

    • Disabled "Next" button on Image Upload step until files are successfully uploaded
    • Disabled "Start Classification" button when no staged images are available for classification
  7. Toast Notification Fix: Prevented duplicate success notifications by implementing a ref-based flag to ensure only one notification shows per upload batch.

  8. EXIF Processing: Temporarily disabled frontend EXIF extraction (commented out) as the backend handles this during the process_uploaded_image ARQ task.

  9. Consistent Return Values: Fixed classify_batch function to always return complete result dictionaries with all expected keys, preventing KeyError exceptions.

Technical Details:

Backend Changes:

  • src/backend/app/s3.py: Added run_in_threadpool wrapper for async S3 operations
  • src/backend/app/images/image_logic.py: Updated INSERT/RETURNING clauses to include batch_id and rejection_reason
  • src/backend/app/projects/classification_routes.py: Changed to query for staged status, added staged count to status response, added pre-flight image count check
  • src/backend/app/projects/image_classification.py: Updated classify_batch to query STAGED images and return consistent result structure
  • compose.yaml: Added extra_hosts for ARQ worker

Frontend Changes:

  • src/frontend/src/services/classification.ts: Changed to send batch_id as query parameter
  • src/frontend/src/api/projects.ts: Added proper TypeScript typing for BatchStatusSummary
  • src/frontend/src/components/.../ImageClassification.tsx: Changed button disable logic to check for staged count
  • src/frontend/src/components/.../DroneImageProcessingWorkflow/index.tsx: Added Next button disable logic based on batch_id presence
  • src/frontend/src/components/.../UppyFileUploader/index.tsx: Fixed duplicate toast notifications, commented out EXIF extraction

Screenshots

Screenshots would show:

  • Disabled "Next" button before file upload
  • Enabled "Next" button after successful upload
  • Disabled "Start Classification" button when no staged images
  • Single success toast notification after batch upload completes

Alternative Approaches Considered

  1. Status Flow: Considered using UPLOADED status for classification but determined STAGED is more appropriate for the current workflow where images remain in staging directory.

  2. Batch ID Persistence: Explored adding an endpoint to retrieve the latest batch_id from backend vs. maintaining state in frontend component.

  3. S3 Connectivity: Initially attempted to change S3_ENDPOINT configuration, but used extra_hosts instead to maintain backward compatibility with presigned URLs.

Review Guide

Testing the Changes:

  1. Upload Flow:

    • Open Drone Image Processing Workflow modal
    • Verify "Next" button is disabled on step 1
    • Upload drone images
    • Verify single success notification appears
    • Verify "Next" button becomes enabled
    • Verify automatic progression to Classification step
  2. Classification Flow:

    • On Classification step, verify status summary shows correct counts (Staged, Total, etc.)
    • Verify "Start Classification" button is disabled if no staged images
    • Click "Start Classification" button
    • Verify ARQ worker processes images successfully (check logs)
    • Verify images are classified and status updates in real-time
  3. ARQ Worker:

    • Check ARQ worker logs for successful MinIO connections
    • Verify no connection refused errors to localhost:9000
    • Verify images are processed and saved with correct batch_id
  4. Backend Endpoints:

    • Test /api/projects/{project_id}/classify-batch/?batch_id={batch_id} endpoint
    • Test /api/projects/{project_id}/batch/{batch_id}/status/ returns staged count
    • Verify batch classification only starts when staged images exist

Checklist before requesting a review

  • πŸ“– Read the HOT Code of Conduct: https://docs.hotosm.org/code-of-conduct
  • πŸ‘·β€β™€οΈ Create small PRs. In most cases, this will be possible.
  • βœ… Provide tests for your changes.
  • πŸ“ Use descriptive commit messages.
  • πŸ“— Update any related documentation and include any relevant screenshots.

[optional] What gif best describes this PR or how it makes you feel?

Finally Working


Note: This PR includes multiple related fixes that were discovered and resolved during the implementation of the image classification workflow. Each change addresses a specific issue that was blocking the classification functionality.

AbdulHafeez AbdulRaheem and others added 28 commits October 7, 2025 09:07
…le handling

- Added Uppy instance to the global App component for shared file upload functionality.
- Implemented UppyImageUploader component to utilize Uppy for image uploads with AWS S3 integration.
- Refactored file handling to extract EXIF data upon file addition and display upload progress.
- Updated Vite configuration to target 'esnext' for improved compatibility
…inIO

- Replaced MinIO client with boto3 for S3 operations in s3.py.
- Updated methods for file upload, download, and presigned URL generation to align with boto3 API.
- Adjusted error handling to use ClientError from botocore.
- Modified user_schemas.py to generate presigned URLs using boto3.
- Updated dependencies in pyproject.toml to include boto3 and remove MinIO.
- Ensured compatibility with existing code by stripping leading slashes from S3 paths.
- Added content type headers for multipart upload requests in the frontend UppyImageUploader component
…w UI with upload, classification, review, and processing steps
…ance upload functionality

- Replaced UppyImageUploader with UppyFileUploader for consistency across components.
- Improved ImageUpload component to track upload progress and display upload queue.
- Added support for staging uploads in UppyFileUploader.
- Updated DroneImageProcessingWorkflow to adjust modal dimensions for better usability.
- Removed unused UppyImageUploader component and its related code

BREAKING CHANGE: Switch from minio to boto3
…ance upload functionality

- Replaced UppyImageUploader with UppyFileUploader for consistency across components.
- Improved ImageUpload component to track upload progress and display upload queue.
- Added support for staging uploads in UppyFileUploader.
- Updated DroneImageProcessingWorkflow to adjust modal dimensions for better usability.
- Removed unused UppyImageUploader component and its related code
… table migration with conditional enum creation and improved index handling
…epSwitcher component for project steps navigation
…l for managing project images with relationships and indexing
… code formatting in image logic and migration files
… with generate_presigned_download_url across the codebase; add new migration script for images from S3
…migration with conflict handling and improved logging
* feat: added multipart initate upload route

* feat: added all multipart upload functions

* feat(UploadBox): Swapped upload from single to multipart upload using uppy

* feat(frontend): add @uppy/drag-drop and @uppy/progress-bar dependencies; update moduleResolution to bundler

* feat(frontend): feat: integrate Uppy for image uploads and enhance file handling

- Added Uppy instance to the global App component for shared file upload functionality.
- Implemented UppyImageUploader component to utilize Uppy for image uploads with AWS S3 integration.
- Refactored file handling to extract EXIF data upon file addition and display upload progress.
- Updated Vite configuration to target 'esnext' for improved compatibility

* feat(image-upload): Refactor S3 integration to use boto3 instead of MinIO

- Replaced MinIO client with boto3 for S3 operations in s3.py.
- Updated methods for file upload, download, and presigned URL generation to align with boto3 API.
- Adjusted error handling to use ClientError from botocore.
- Modified user_schemas.py to generate presigned URLs using boto3.
- Updated dependencies in pyproject.toml to include boto3 and remove MinIO.
- Ensured compatibility with existing code by stripping leading slashes from S3 paths.
- Added content type headers for multipart upload requests in the frontend UppyImageUploader component

* feat(image-upload-workflow): Refactor code structure for improved readability and maintainability

* feat(image-upload-workflow): Implement drone image processing workflow UI with upload, classification, review, and processing steps

* feat(image-upload-workflow): Refactor image upload components and enhance upload functionality

- Replaced UppyImageUploader with UppyFileUploader for consistency across components.
- Improved ImageUpload component to track upload progress and display upload queue.
- Added support for staging uploads in UppyFileUploader.
- Updated DroneImageProcessingWorkflow to adjust modal dimensions for better usability.
- Removed unused UppyImageUploader component and its related code

BREAKING CHANGE: Switch from minio to boto3

* feat(image-upload-workflow): Refactor image upload components and enhance upload functionality

- Replaced UppyImageUploader with UppyFileUploader for consistency across components.
- Improved ImageUpload component to track upload progress and display upload queue.
- Added support for staging uploads in UppyFileUploader.
- Updated DroneImageProcessingWorkflow to adjust modal dimensions for better usability.
- Removed unused UppyImageUploader component and its related code

* fixed with minio

* feat(image-upload-workflow): Add project_id and filename to the upload request in UppyFileUploader

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* feat(image-upload-workflow): feat(migrations): Enhance project_images table migration with conditional enum creation and improved index handling

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix(image-upload-workflow): feat(step-switcher): Implement generic StepSwitcher component for project steps navigation

* feat(image-upload-workflow): feat(db-models): Add DbProjectImage model for managing project images with relationships and indexing

* feat(image-upload-workflow): Enhance image processing with new database save function and EXIF handling

* refactor(image-upload-workflow): Replace deprecated get_presigned_url with generate_presigned_download_url across the codebase; add new migration script for images from S3

* fix(image-upload-workflow): Implement upsert functionality for image migration with conflict handling and improved logging

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: AbdulHafeez AbdulRaheem <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
- Added new hooks for starting classification, fetching batch status, and retrieving batch images in the projects API.
- Enhanced the ImageClassification component to handle classification processes, including polling for updates and displaying status.
- Updated the DroneImageProcessingWorkflow to manage batch ID and transition between upload and classification steps.
- Modified UppyFileUploader to support batch ID generation and pass it upon upload completion.
- Created a new migration to add classification-related fields to the project_images table.
- Developed new classification routes for starting batch classification and retrieving batch status and images.
- Implemented the ImageClassifier class to handle image classification logic, including quality checks and task matching.
- Added service functions for interacting with the classification API, including starting classification and fetching batch details
…ges and implement sharpness calculation in image classification
@github-actions github-actions bot added enhancement New feature or request backend Related to backend code labels Nov 24, 2025
…oaded' to 'staged' and enhance upload notifications
@FeezyHendrix FeezyHendrix changed the title Feat/resumable uploads Classification Workflow Step Nov 24, 2025
@spwoodcock
Copy link
Member

Nice! I'm sure this is going to be great πŸ˜„

I'll try my best to review this as soon as I can - got quite a few things on.

Couple pieces of feedback though:

  • Please try and make a draft PR as early as possible, so this can be reviewed incrementally in future. It's a huge task to do an in-depth review of thousands of lines of code & maintainer availability often rises and falls.
  • If possible, it's great regularly rebasing on the target branch, meaning conflicts are addressed bit-by-bit, as they happen & the final code is always in sync.

Thank you! πŸ™

@spwoodcock
Copy link
Member

Oh also, thanks so much for the great detailed description - really helps!

@spwoodcock
Copy link
Member

This is looking great! But perhaps we should be targeting a merge to dev instead of the image-workflow-integration branch? Or perhaps we can just update image-workflow-integration to be in sync to dev to ensure we don't have merge conflicts on the final merge.

I'll let you take a look at the merge conflict resolution & come back for a second review πŸ‘

AbdulHafeez AbdulRaheem and others added 5 commits November 26, 2025 12:42
…nt image processing workflow state management and update components for classification handling
- Introduced a new MapContext to manage map state and loading status across components.
- Updated BaseLayerSwitcher, COGOrthophotoViewer, LocateUser, and VectorLayer components to consume map context instead of passing map instance as props.
- Simplified the MapContainer component to provide context to its children.
- Enhanced error handling and user feedback in ForgotPassword and Login components.
- Improved navigation handling in Navbar and ProtectedRoute components.
- Updated drone model options in constants and adjusted related state management.
- Added utility function for safe URL redirection.
- Cleaned up unused imports and optimized component structures for better readability and maintainability.
- Introduced Justfile for build and configuration management
"asgi-lifespan==2.1.0",
"arq==0.26.3",
"redis==5.2.1",
"opencv-python-headless==4.10.0.84",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this install correctly and run well in docker by the way? I haven't tested, but my assumption is that OpenCV may require some additional dev packages / headers to install and run, but I'm not certain on that. Please confirm it's working well in docker πŸ™

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's working correctly as of now, cause I run the application through docker, I can't guarantee the behaviour is prod.

Copy link
Member

@spwoodcock spwoodcock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty clean and nice - good job πŸ˜„

We should give it a thorough test. Can we deploy the branch to dev, instead of merging into image-workflow-integration and see all of these new workflows in action? πŸ™

@spwoodcock
Copy link
Member

Small thing, there is a pre-commit error about an bare except.

Please update to the actual exception type at best, else except Exception as e at worst

AbdulHafeez AbdulRaheem and others added 7 commits December 1, 2025 13:56
…ification component with presigned URL handling
… status display

- Introduced a modal to display detailed information about selected images, including status, filename, GPS data, and upload time.
- Enhanced status badge styles for better visual distinction.
- Updated image list to allow selection and display of the modal.
- Improved loading indicators and button states during classification process.
- Refactored status color and label functions for better readability and maintainability
@FeezyHendrix FeezyHendrix changed the base branch from image-workflow-integration to dev December 2, 2025 07:34
@spwoodcock
Copy link
Member

Minor feedback:

  • It's good that the polling interval isn't too aggressive, but should we consider some kind of user feedback while they are waiting for number to change? I think the most obvious is to add a loading spinner in space of the 0 values, to show those stages haven't started yet (but to indicate that they will). Open to other options πŸ‘
  • We probably have a few too many stages to display to the user. It's perfect for us, and perhaps even to power users via an 'advanced' toggle or something, but to most general users I think we should remove a few stages:
    • Staged and Ready could be combined into simply 'Uploaded'?
    • The total on the left is a bit confusing, as it's not distinct from other stages - it seems like a separate stage. I'm thinking we could possibly remove the total from this part entirely (the user can get a rough idea by counting the numbers across stages)
    • Assigned should be worded as 'Complete'
    • Rejected, Unmatched, Invalid EXIF can all be bundled into an 'Issues' category. This could be computed on the frontend, rather than changing the API.
    • Keep 'Duplicates' as this is useful.
    • The API response can remaining the same, as the more details categories are useful to developers and other users - they could be accessed in other ways in future

@spwoodcock
Copy link
Member

Now #678 has been merged, this PR needs to probably be rebased on dev.

I think the commits in the merged PR were squashed when the branch was made, but personally I would have kept the history.
This branch may have updated without conflict if the commit hashes were matching.

Either way, we need to make sure only to include the relevant files changes for this PR here πŸ˜„

AbdulHafeez AbdulRaheem and others added 3 commits December 4, 2025 15:54
…tionality

- Implemented thumbnail generation for uploaded images using PIL.
- Added thumbnail URL to image records in the database.
- Updated image processing workflow to handle thumbnail uploads to S3.
- Enhanced image classification to utilize thumbnail URLs for display.
- Modified frontend components to support thumbnail rendering in the image grid
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend Related to backend code enhancement New feature or request frontend migration

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants