A comprehensive AI platform leveraging Hailo-8 acceleration for real-time computer vision, gaze tracking, and IoT integration.
This repository provides a complete ecosystem for deploying AI models on edge devices using Hailo-8 acceleration. It combines DINOv2 vision transformers, GazeLLE gaze estimation, object detection, face recognition, and smart device control through MQTT messaging.
- Multi-Modal AI: DINOv2, GazeLLE gaze tracking, DETR object detection, SCRFD face recognition
- Edge Acceleration: Optimized for Hailo-8 with hybrid CPU/accelerator processing
- Real-time Processing: GStreamer pipelines for live camera feeds
- IoT Integration: MQTT-driven workflows with smart bulb control
- Multi-Device Support: Scalable deployment across Raspberry Pi devices
embeded-ai/
├── isense_scripts/ # Production-ready GazeLLE system with MQTT
│ ├── src/apps/ # Main applications and workflow coordinators
│ ├── src/mqtt/ # MQTT messaging and group management
│ ├── src/chiptools/ # Smart device control (Matter/Thread)
│ └── docs/ # Comprehensive system documentation
├── gazelle/ # Original GazeLLE implementation
├── dinov2/ # DINOv2 vision transformer framework
├── hailo-rpi5-examples/ # Hailo community examples and utilities
├── models/ # Compiled HEF models and checkpoints
├── scripts/ # Model conversion and optimization tools
└── tools/ # Testing and validation utilities
# Clone repository
git clone <repository-url>
cd embeded-ai
# Install Hailo SDK (Raspberry Pi 5)
./scripts/install_env.sh
# Setup Python environment
python3 -m venv venv_hailo
source venv_hailo/bin/activate
pip install -r requirements.txt
# Start coordinated system with initialization
python -m isense_scripts.src.apps.gazelle_with_init \
--hef models/dinov2_vits14_224.hef \
--pth models/epoch_14.pt \
--scrfd-hef models/scrfd_10g.hef \
--detr-hef models/detr_resnet_v1_18_bn.hef \
-i ./isense_scripts/examples/study_room.mp4 \
--headless \
--enable-bulb \
--node 0x70 \
--ep 1
# Initialize system (separate terminal)
python -m isense_scripts.src.mqtt.mqtt_init_groups --groups 1 2 3 4
# Download calibration data
python scripts/download_calibration_data.py --dataset imagenet_val --num_images 1024
# Convert PyTorch models to Hailo HEF
python scripts/dinov2_hailo_parsing.py
python scripts/gazelle_hailo_parsing.py
Production-ready system with MQTT coordination:
- Workflow Management: Initialization-based startup coordination
- Multi-Model Integration: DINOv2 + GazeLLE + SCRFD + DETR
- Smart Device Control: Matter-compatible bulb control
- Group Messaging: Multi-device MQTT coordination
Original gaze estimation framework:
- Real-time Processing: Camera-to-prediction pipeline
- Hailo Integration: Hybrid acceleration architecture
- Visualization: Gaze heatmaps and attention visualization
Vision transformer backbone:
- Self-Supervised Learning: Meta's DINOv2 implementation
- Feature Extraction: Rich visual representations
- Model Zoo: Pre-trained variants (ViT-S/B/L/G)
Community examples and pipelines:
- Basic Pipelines: Detection, segmentation, pose estimation
- Community Projects: Navigation, chess, music generation
- Performance Optimization: Efficient processing patterns
External Controller → [MQTT Init] → Edge Device (Pi + Hailo-8)
↓
[Multi-Model Inference]
↓
[state/group1, state/group2, ...]
↓
[Smart Device Control]
↓
[RGB Lighting + IoT Actions]
The system coordinates startup, inference, and device control through the following steps (a minimal sketch of the MQTT handshake follows the list):
- Device Initialization: Wait for external start signal
- Model Loading: Load all AI models (DINOv2, GazeLLE, SCRFD, DETR)
- Group Assignment: Receive target group configuration
- Inference Pipeline: Real-time processing with multi-model fusion
- Result Publishing: Send detection results to group topics
- Device Control: Update smart bulbs based on group activity
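This handshake can be exercised with a few lines of paho-mqtt. The following is a minimal sketch, assuming a local broker and a hypothetical `init/start` topic for the external start signal; the `state/groupN` topics match the diagram above.

```python
# Minimal sketch of the init handshake (paho-mqtt 1.x client API assumed).
# Broker address and the "init/start" topic name are assumptions.
import json
import time
import paho.mqtt.client as mqtt

state = {"started": False, "groups": []}

def on_message(client, userdata, msg):
    """External controller sends the start signal with the group assignment."""
    payload = json.loads(msg.payload)
    state["groups"] = payload.get("groups", [])
    state["started"] = True

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost")        # assumption: local broker
client.subscribe("init/start")     # hypothetical init topic
client.loop_start()

# Steps 1-3: wait for the external start signal and group assignment.
while not state["started"]:
    time.sleep(0.1)

# Step 4: ... run the DINOv2 + GazeLLE + SCRFD + DETR pipeline on each frame ...

# Steps 5-6: publish results to the assigned group topics; device control follows.
for gid in state["groups"]:
    client.publish(f"state/group{gid}", json.dumps({"groupId": gid, "active": True}))
```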
{
"groupId": [1, 2, 3, 4],
"targetId": 73,
"timestamp": "2025-01-06T15:30:45.123456",
"userId": "caps1",
"detections": {
"faces": [...],
"objects": [...],
"gaze_points": [...]
}
}
- DINOv2 ViT-S: Feature extraction backbone
- SCRFD: Face detection and recognition
- DETR: Object detection (92 COCO classes)
- GazeLLE Head: Gaze prediction from DINOv2 features
- Custom Fusion: Multi-modal result integration
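The detection fields in the group state message are filled from the models above. Below is a minimal sketch of assembling and publishing a payload in the documented format, assuming paho-mqtt; the detection values and helper name are illustrative.

```python
# Sketch: assemble and publish a group state message in the documented format.
# Assumptions: paho-mqtt (1.x API), local broker, illustrative detection values.
import json
from datetime import datetime
import paho.mqtt.client as mqtt

def build_group_message(group_ids, target_id, user_id, faces, objects, gaze_points):
    """Build a payload matching the group state schema shown above."""
    return {
        "groupId": group_ids,
        "targetId": target_id,
        "timestamp": datetime.now().isoformat(),
        "userId": user_id,
        "detections": {
            "faces": faces,
            "objects": objects,
            "gaze_points": gaze_points,
        },
    }

client = mqtt.Client()
client.connect("localhost")  # assumption: local broker

msg = build_group_message(
    group_ids=[1, 2, 3, 4],
    target_id=73,
    user_id="caps1",
    faces=[{"bbox": [120, 80, 210, 190], "score": 0.97}],  # illustrative
    objects=[{"label": "laptop", "score": 0.88}],          # illustrative
    gaze_points=[{"x": 0.42, "y": 0.57}],                  # illustrative
)
for gid in msg["groupId"]:
    client.publish(f"state/group{gid}", json.dumps(msg))
```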
Integration with the Matter/Thread ecosystem (a sketch of the color/brightness mapping follows this list):
- Priority Colors: Group-based color assignment
- Activity Scaling: Brightness based on user count
- Real-time Updates: Immediate response to detection changes
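A sketch of the priority-color and activity-scaling logic is shown below. The color table, brightness curve, and `set_bulb_state()` wrapper are illustrative stand-ins for the Matter control path in `src/chiptools/`; the node and endpoint defaults mirror the quick-start flags (`--node 0x70 --ep 1`).

```python
# Sketch of group-based color and activity-based brightness mapping.
# GROUP_COLORS, brightness_for_activity(), and set_bulb_state() are illustrative.

# Illustrative priority colors per group: (hue in degrees, saturation in %).
GROUP_COLORS = {1: (0, 100), 2: (120, 100), 3: (240, 100), 4: (60, 100)}

def brightness_for_activity(user_count: int, max_users: int = 4) -> int:
    """Scale brightness (1-100) with the number of active users in the group."""
    if user_count <= 0:
        return 1
    return min(100, max(10, round(100 * user_count / max_users)))

def set_bulb_state(node_id: int, endpoint: int, hue: int, saturation: int, level: int):
    """Hypothetical wrapper around the Matter/Thread controller in src/chiptools/."""
    print(f"node=0x{node_id:02x} ep={endpoint} -> hue={hue} sat={saturation} level={level}")

def update_bulb_for_group(group_id: int, user_count: int, node_id=0x70, endpoint=1):
    hue, sat = GROUP_COLORS.get(group_id, (0, 0))
    set_bulb_state(node_id, endpoint, hue, sat, brightness_for_activity(user_count))

update_bulb_for_group(group_id=2, user_count=3)
```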
- Latency: 50-200 ms end-to-end, detection to action (see the measurement sketch after this list)
- Throughput: 10-30 FPS depending on model complexity
- Memory Usage: ~1-2GB RAM per inference process
- Network Overhead: ~200 bytes per MQTT message
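To check these figures on your own hardware, a simple timing loop is enough; `process_frame()` below is a hypothetical stand-in for the inference-plus-publish step.

```python
# Sketch: measure end-to-end latency and throughput for a frame-processing loop.
# process_frame() is a hypothetical stand-in for detection + MQTT publishing.
import time

def process_frame(frame):
    time.sleep(0.05)  # placeholder for real inference + publish work

def benchmark(frames, warmup=5):
    latencies = []
    for i, frame in enumerate(frames):
        start = time.perf_counter()
        process_frame(frame)
        if i >= warmup:  # skip warm-up iterations
            latencies.append(time.perf_counter() - start)
    avg = sum(latencies) / len(latencies)
    print(f"avg latency: {avg * 1000:.1f} ms, throughput: {1 / avg:.1f} FPS")

benchmark([None] * 100)
```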
- Raspberry Pi 5
- Hailo-8 M.2 accelerator
- Camera (Pi Camera or USB)
- 8GB+ RAM recommended
- Matter-compatible smart bulbs
- Multiple Pi devices for distributed deployment
- External MQTT broker for production use
# Quick initialization demo
./isense_scripts/examples/example_init_groups.sh
# Full workflow with smart bulbs
./isense_scripts/examples/smart_bulb_demo.sh
# Performance testing
./isense_scripts/examples/gazelle_start_with_init.sh
- Train/fine-tune models using framework directories
- Export to ONNX using the provided scripts (a minimal export sketch follows this list)
- Compile to HEF using Hailo SDK tools
- Integrate into iSense pipeline
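The ONNX export step can look like the sketch below, assuming the DINOv2 ViT-S backbone from torch.hub; the input size matches the 224x224 HEF in `models/`, and the project's own scripts in `scripts/` remain the reference for the full conversion flow.

```python
# Sketch of the "export to ONNX" step (checkpoint source, input size, and opset
# are assumptions; see scripts/ for the project's own export/parsing tools).
import torch

model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")  # DINOv2 ViT-S backbone
model.eval()

dummy = torch.randn(1, 3, 224, 224)  # matches the 224x224 HEF in models/
torch.onnx.export(
    model,
    dummy,
    "dinov2_vits14_224.onnx",
    input_names=["input"],
    output_names=["features"],
    opset_version=17,
)
# The resulting ONNX file is then compiled to a .hef with the Hailo SDK tools.
```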
- Develop in component directories (gazelle/, dinov2/)
- Test with hailo-rpi5-examples/ patterns
- Integrate into the isense_scripts/ production system
- Deploy with MQTT coordination
- iSense System Guide: Complete production system
- MQTT Workflow: Detailed coordination docs
- Smart Device Control: IoT integration guide
- Enhanced smart bulb integration with priority-based color control
- Multi-user MQTT messaging with user ID support
- Improved workflow coordination and initialization
- Comprehensive documentation and examples
- Performance optimization for real-time deployment
- Follow existing code organization by component
- Test changes with both single-device and multi-device setups
- Update relevant documentation in component directories
- Ensure MQTT workflow compatibility
- Validate performance on target hardware
[Specify license here]
[Specify contact information here]