🤖 Pepper Robot Management System

A Cost-Effective Cloud-Offloading Architecture for Human-Robot Interaction
Integrating LLMs and Computer Vision on Resource-Constrained Humanoid Robots

Author: Gilang Hidayatullah

Overview

Pepper robots are resource-constrained humanoid platforms with limited onboard compute. This project addresses that constraint through a cloud-offloading architecture — offloading heavy AI workloads (speech recognition, language understanding, face recognition) to Google Cloud services, while the robot handles physical interaction and I/O. The result is a full-stack system enabling real-time conversation, face-based identity recognition, and programmable movement — at a fraction of the cost of onboard processing.

The system is organized into three independently deployable microservices:

Service	Folder	Purpose
Backend Management	`backend/`	REST API, auth, robot control, admin UI
AI Conversation	`ai-chat/`	Voice-based dialogue via Gemini + GCP STT/TTS
Face Recognition	`face-recognition/`	Real-time face detection with cloud-synced database

Architecture

                          ┌─────────────────────────────────────┐
                          │           Google Cloud Platform      │
                          │  ┌──────────┐  ┌──────────────────┐ │
                          │  │  Vertex  │  │  Cloud Storage   │ │
                          │  │  AI      │  │  (face database) │ │
                          │  │ (Gemini) │  └──────────────────┘ │
                          │  └──────────┘  ┌──────────────────┐ │
                          │  ┌──────────┐  │  STT / TTS APIs  │ │
                          │  │Cloud Run │  └──────────────────┘ │
                          │  └──────────┘                       │
                          └────────────┬────────────────────────┘
                                       │ REST / gRPC
               ┌───────────────────────┼───────────────────────┐
               │                       │                       │
      ┌────────▼──────┐    ┌───────────▼──────┐    ┌──────────▼──────┐
      │   backend     │    │     ai-chat        │   │face-recognition │
      │  Flask API    │    │  Flask + Docker   │    │  Flask + DeepFace│
      └───────┬───────┘    └─────────┬─────────┘    └────────┬────────┘
              │                      │                        │
              └──────────────────────┴────────────────────────┘
                                     │ SSH / NAOqi SDK
                                ┌────▼─────┐
                                │  Pepper  │
                                │  Robot   │
                                └──────────┘

Services

1. Backend Management System (`backend/`)

A Flask-based REST API that serves as the central control plane for the robot:

JWT-authenticated user management and role-based access control
Robot movement and choreography sequence management (walk, dance patterns)
SSH-based command dispatch to the Pepper robot
Face identity database management (linked to GCS)
AI conversation session orchestration
Web-based admin dashboard (Bootstrap, HTML/CSS/JS)
Multi-language support (Indonesian / English)

Stack: Python 3.11, Flask, SQLAlchemy (SQLite), JWT, Google Cloud Storage

2. AI Conversation Service (`ai-chat/`)

A voice-first dialogue service that gives Pepper natural language capabilities without onboard ML:

Captures audio input → Google Cloud Speech-to-Text for transcription
Sends transcript to Vertex AI (Gemini 2.0 Flash) for response generation
Synthesizes response audio via Google Cloud Text-to-Speech
Maintains per-session conversation history for multi-turn context
Containerized and deployable to Cloud Run

Stack: Flask, Vertex AI (Gemini 2.0 Flash), Google Cloud STT/TTS, Docker

3. Face Recognition Service (`face-recognition/`)

A real-time face recognition service backed by cloud-synced identity storage:

Detects and identifies faces from live camera frames using DeepFace + VGGFace
Stores and retrieves identity embeddings via Google Cloud Storage (no local DB required)
Auto-syncs the local face cache with GCS on startup
Exposes a REST API for robot integration

Stack: Flask, DeepFace, OpenCV, VGGFace, Google Cloud Storage

Project Structure

pepper-robots/
├── backend/                         # Backend management system
│   ├── app/
│   │   ├── controller/              # API route handlers
│   │   ├── model/                   # SQLAlchemy database models
│   │   ├── services/                # Business logic
│   │   ├── templates/               # Jinja2 web UI templates
│   │   ├── static/                  # CSS, JS, images
│   │   └── utils/                   # Shared utilities
│   ├── tests/                       # Unit and integration tests
│   └── docs/                        # API documentation
│
├── ai-chat/                         # AI conversation microservice
│   ├── app.py                       # Flask application entry point
│   ├── Dockerfile                   # Container build config
│   └── requirements.txt
│
├── face-recognition/                # Face recognition microservice
│   ├── app.py                       # Flask application entry point
│   ├── gcs_handler.py               # Google Cloud Storage integration
│   └── pepper_client.py             # Pepper robot client library
│
├── assets/                          # Architecture & flow diagrams
│   ├── cloud-logic.png
│   ├── face-processing.png
│   └── voice-processing.png
│
├── .gitignore
├── AUTHORS
├── CITATION.cff
├── CONTRIBUTING.md
├── LICENSE                          # Apache 2.0
└── SECURITY.md

Getting Started

Prerequisites

Python 3.11+
A Google Cloud Platform project with these APIs enabled:
- Vertex AI API
- Cloud Speech-to-Text API
- Cloud Text-to-Speech API
- Cloud Storage API
A GCP service account key (JSON) with appropriate permissions
Docker (for the AI conversation service)
Access to a Pepper robot (optional — core services run without it)

1. Backend Management System

cd backend
python -m venv venv
source venv/bin/activate          # Linux/macOS
# venv\Scripts\activate           # Windows

pip install -r requirements.txt
cp .env.example .env              # Fill in your credentials
python run.py

Key .env variables:

SECRET_KEY=your_flask_secret
GCS_BUCKET_NAME=your_bucket
GOOGLE_APPLICATION_CREDENTIALS=path/to/service-account.json
PEPPER_HOST=192.168.x.x          # Robot IP (optional)

2. AI Conversation Service

With Docker (recommended):

cd ai-chat
docker build -t pepper-ai .
docker run -p 5000:5000 \
  -e GOOGLE_APPLICATION_CREDENTIALS=/app/service-account.json \
  -v /path/to/service-account.json:/app/service-account.json \
  pepper-ai

Without Docker:

cd ai-chat
pip install -r requirements.txt
python app.py

3. Face Recognition Service

cd face-recognition
pip install -r requirements.txt
export GOOGLE_APPLICATION_CREDENTIALS=path/to/service-account.json
export GCS_BUCKET_NAME=your_bucket
python app.py

On startup, the service will sync the face database from GCS to a local cache automatically.

API Reference

Each service exposes its own REST API. Refer to the individual README files in each subdirectory for full endpoint documentation:

backend/docs/ — Backend management API
ai-chat/ — Conversation service API
face-recognition/ — Face recognition API

License

Licensed under the Apache License, Version 2.0.

Citation

If you use this work in academic research, please cite:

@software{hidayatullah2026pepper,
  author    = {Hidayatullah, Gilang},
  title     = {Pepper Robot Management System: A Cost-Effective Cloud-Offloading
               Architecture for Human-Robot Interaction},
  year      = {2026},
  url       = {https://github.com/hidatara-ds/pepper-robots}
}

Contributing

Contributions are welcome. Please read CONTRIBUTING.md before submitting a pull request.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🤖 Pepper Robot Management System

Overview

Architecture

Services

1. Backend Management System (`backend/`)

2. AI Conversation Service (`ai-chat/`)

3. Face Recognition Service (`face-recognition/`)

Project Structure

Getting Started

Prerequisites

1. Backend Management System

2. AI Conversation Service

3. Face Recognition Service

API Reference

License

Citation

Contributing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
ai-chat		ai-chat
assets		assets
backend		backend
face-recognition		face-recognition
.gitignore		.gitignore
AUTHORS		AUTHORS
CITATION.cff		CITATION.cff
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md

Folders and files

Latest commit

History

Repository files navigation

🤖 Pepper Robot Management System

Overview

Architecture

Services

1. Backend Management System (backend/)

2. AI Conversation Service (ai-chat/)

3. Face Recognition Service (face-recognition/)

Project Structure

Getting Started

Prerequisites

1. Backend Management System

2. AI Conversation Service

3. Face Recognition Service

API Reference

License

Citation

Contributing

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

1. Backend Management System (`backend/`)

2. AI Conversation Service (`ai-chat/`)

3. Face Recognition Service (`face-recognition/`)

Packages