feat: ML alt text on upload + object-label search (Azure Vision) #17

Open
sinagkh wants to merge 8 commits into greyli:main from sinagkh:feat/ml-alttext-and-search

sinagkh commented Sep 19, 2025

Summary

Add ML-powered accessibility and search to Moments:

  • Auto alt text generated on upload (Azure AI Vision).
  • Object search via a new Object category that queries ML-detected labels.
  • Ensure every user photo renders with an alt="..." attribute for accessibility.

What changed

  • Service: moments/services/vision.py
    • Calls Image Analysis 4.0 (caption,tags). If caption isn’t supported in the region, falls back to:
      • 4.0 tags (for search), and
      • Vision 3.2 describe (for captions).
  • Upload flow: moments/blueprints/main.py
    • On upload, analyzes bytes and stores Photo.alt_text + Photo.detected_labels.
  • Search: main.search supports category=object (label-based LIKE query).
  • Model: Photo gets alt_text and detected_labels columns (+ migrations).
  • Templates:
    • Add alt="{{ photo.alt_text or photo.description or 'Image' }}" to all user photos.
    • Photo page shows a "Detected:" list of labels and a small "Generated" badge when the alt text is ML-derived.
  • CLI: flask reanalyze to backfill captions/labels for older photos.
  • Docs: Updated README.md and .env.example.
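
For reviewers, the fallback order described in the Service bullet can be sketched roughly as follows. This is a minimal illustration, not the code in moments/services/vision.py: `analyze_with_fallback`, `Analysis`, `CaptionUnsupported`, and the two injected callables are hypothetical names standing in for the real HTTP calls, and the dict shape returned by `analyze_v4` is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


class CaptionUnsupported(Exception):
    """Hypothetical error: the region rejected the 4.0 caption feature."""


@dataclass
class Analysis:
    alt_text: Optional[str]
    labels: list[str] = field(default_factory=list)


def analyze_with_fallback(
    image: bytes,
    analyze_v4: Callable[[bytes, tuple[str, ...]], dict],
    describe_v32: Callable[[bytes], Optional[str]],
) -> Analysis:
    """Try 4.0 caption+tags first; otherwise use 4.0 tags plus a 3.2 caption."""
    try:
        result = analyze_v4(image, ("caption", "tags"))
        return Analysis(result.get("caption"), list(result.get("tags", [])))
    except CaptionUnsupported:
        # Tags still come from 4.0 so search keeps working.
        tags = list(analyze_v4(image, ("tags",)).get("tags", []))
        # The caption comes from the older describe call instead.
        return Analysis(describe_v32(image), tags)


# Usage with fake transports (no network), simulating an unsupported region.
def fake_v4(image: bytes, features: tuple[str, ...]) -> dict:
    if "caption" in features:
        raise CaptionUnsupported
    return {"tags": ["dog", "grass"]}


result = analyze_with_fallback(b"raw-bytes", fake_v4, lambda image: "a dog on grass")
print(result.alt_text, result.labels)
```

Passing the transports in as callables is only a convenience for this sketch: it keeps the ordering logic testable without Azure credentials.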

How to run (quick)

uv python pin 3.11
uv sync
cp .env.example .env
python3 -c 'import secrets; print("SECRET_KEY=" + secrets.token_hex(32))' >> .env
# set AZURE_VISION_ENDPOINT / AZURE_VISION_KEY in .env

uv run flask --app app init-app
# optional demo:
# uv run flask --app app lorem

uv run flask --app app run
# http://127.0.0.1:5000/
# upload at /upload

sinagkh (author) commented Sep 19, 2025

Verification steps (for TA)

  1. Setup
    • Python 3.11 with uv
    • cp .env.example .env
    • set AZURE_VISION_ENDPOINT / AZURE_VISION_KEY in .env
    • uv sync && uv run flask --app app init-app
  2. Run
    • uv run flask -e .env --app app run
    • Upload a new photo at /upload
  3. Expected behavior

    • The uploaded photo page renders <img ... alt="..."> with a natural-language caption.
    • /search?category=object&q=<one of the detected labels> (e.g., dog, tree) returns the photo.
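
The label search in the expected behavior above can be approximated with a self-contained sketch. It assumes detected_labels is stored as a comma-separated string and uses the standard-library sqlite3 module rather than the app's ORM; `search_by_label`, `escape_like`, and the `photo` table layout here are hypothetical, not the PR's actual code.

```python
import sqlite3


def escape_like(term: str, escape: str = "\\") -> str:
    """Escape LIKE wildcards so user input is matched literally."""
    return (
        term.replace(escape, escape * 2)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def search_by_label(conn: sqlite3.Connection, q: str) -> list[int]:
    """Return ids of photos whose stored labels contain q (case-insensitive)."""
    q = q.strip().lower()
    if not q:
        return []
    pattern = f"%{escape_like(q)}%"
    rows = conn.execute(
        "SELECT id FROM photo "
        "WHERE lower(detected_labels) LIKE ? ESCAPE '\\' "
        "ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in rows]


# Demo data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photo (id INTEGER PRIMARY KEY, detected_labels TEXT)")
conn.executemany(
    "INSERT INTO photo VALUES (?, ?)",
    [(1, "dog,tree"), (2, "cat,sofa"), (3, "Dog,beach")],
)
print(search_by_label(conn, "dog"))
```

One caveat of a substring LIKE match is that a query such as "dog" would also hit a label like "hotdog"; whether that matters depends on how the labels are stored.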

Notes

  • ML runs on upload. To populate older photos: uv run flask -e .env --app app reanalyze.
  • No credentials are committed; secrets loaded from .env.
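
As a rough picture of what a backfill like flask reanalyze does, here is a minimal sketch of the loop only. It is an assumption-laden illustration, not the command in this PR: `PhotoRow`, `reanalyze`, `load_bytes`, and `analyze` are hypothetical stand-ins for the Photo model, the CLI body, file loading, and the vision service, and skipping photos that already have alt text is an assumed policy.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class PhotoRow:
    """Hypothetical stand-in for the Photo model's ML-related columns."""
    id: int
    alt_text: Optional[str] = None
    detected_labels: str = ""


def reanalyze(
    photos: Iterable[PhotoRow],
    load_bytes: Callable[[PhotoRow], bytes],
    analyze: Callable[[bytes], tuple[Optional[str], list[str]]],
    force: bool = False,
) -> int:
    """Fill alt_text/detected_labels for photos that lack them; return the count updated."""
    updated = 0
    for photo in photos:
        if photo.alt_text and not force:
            continue  # already analyzed (assumed skip policy)
        try:
            caption, labels = analyze(load_bytes(photo))
        except Exception:
            # In this sketch, one failing image does not abort the whole backfill.
            continue
        photo.alt_text = caption
        photo.detected_labels = ",".join(labels)
        updated += 1
    return updated


# Usage with fakes: only the photo without alt text is updated.
photos = [PhotoRow(1), PhotoRow(2, alt_text="existing caption")]
count = reanalyze(photos, lambda p: b"", lambda data: ("a cat on a sofa", ["cat", "sofa"]))
print(count, photos[0].alt_text, photos[0].detected_labels)
```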
