A self-hosted OpenClaw bot running in Docker with Telegram integration and PDF extraction capabilities.
- OpenClaw Gateway: Self-hosted AI agent gateway
- Telegram Integration: Native Telegram bot support
- PDF Extraction: Process Brazilian credit card invoice PDFs
- Multi-Provider LLM Support: GLM, DeepSeek, Ollama Cloud, and OpenAI-compatible APIs
- Brazilian Portuguese: Native language interaction
- Docker Deployment: Containerized with persistent workspace storage
- Git-Backed Agent State: Workspace files versioned in a private git repo
- Obsidian Vault Sync: Syncthing sync over private network (Tailscale-ready)
- Google Drive Backups: Daily rotating Obsidian vault backups via rclone
- Auxiliary ML Batch Container: Optional llama.cpp service with FIFO queue for long-running OCR/transcription tasks
- Docker and Docker Compose
- Z.AI API key (for GLM models - e.g., GLM-5, GLM-4.7)
- Telegram Bot Token (from @BotFather)
- DeepSeek API key (optional, for alternative LLM)
- Ollama Cloud API key (optional, for Ollama Cloud models)
- A private GitHub repo for agent state versioning
Create a private GitHub repo for agent state (this stores personality, skills, memory):
# Use the template in templates/agent-state-template/
# See templates/agent-state-template/README.md for instructions

cd repos/josemar-assistente
git clone <your-private-repo-url> agent-state
cp .env.example .env
# Edit .env with your API keys and agent state repo URL

If you do not have a private state repo yet, initialize from template:
cp -r templates/agent-state-template/ agent-state
cd agent-state && git init && git add -A && git commit -m "Initial state"

docker compose build
docker compose up -d

Note: Use docker compose (with space) for Docker Compose V2, or docker-compose (with hyphen) for V1.
docker compose logs -f

- Start a conversation with your Telegram bot
- Send a PDF credit card invoice for processing
- Ask questions in Brazilian Portuguese
Create a .env file with:
# LLM Provider
ZAI_API_KEY=your_zai_api_key_here
DEEPSEEK_API_KEY=your_deepseek_api_key_here
OLLAMA_API_KEY=your_ollama_api_key_here
# Telegram
TELEGRAM_BOT_TOKEN=your_telegram_bot_token_here
TELEGRAM_ENABLED=true
PRIMARY_TELEGRAM_ID=123456789
# Web UI
GATEWAY_AUTH_PASSWORD=your-secure-password-here
CONTROL_UI_ALLOWED_ORIGIN_1=https://your-domain.example
CONTROL_UI_ALLOWED_ORIGIN_2=http://your-server-ip:18789
# Agent State Repo
WORKSPACE_STATE_REPO=https://github.com/username/josemar-agent-state.git
WORKSPACE_REPO_TOKEN=your_github_pat_here
# Sync Configuration
WORKSPACE_SYNC_ON_START=true
WORKSPACE_SYNC_INTERVAL=60
WORKSPACE_MEMORY_DAYS=30
# Auxiliary ML service (optional)
AUX_ML_ENABLED=false
COMPOSE_PROFILES=
AUX_ML_GLM_OCR_URL=
AUX_ML_GLM_OCR_SHA256=
AUX_ML_GLM_OCR_MMPROJ_URL=
AUX_ML_GLM_OCR_MMPROJ_SHA256=
AUX_ML_URL=http://aux-ml:8091
AUX_ML_MEMORY_LIMIT=8192m
AUX_ML_MEMORY_LIMIT_MB=8192
AUX_ML_MAX_QUEUE=50
AUX_ML_JOB_TIMEOUT_SECONDS=1800
AUX_ML_POLL_INTERVAL_SECONDS=2
AUX_ML_LLAMACPP_TIMEOUT_SECONDS=1800
AUX_ML_ALLOWED_INPUT_DIRS=/root/.openclaw/workspace
AUX_ML_ENFORCE_MEMORY_LIMIT=true
AUX_ML_OCR_MAX_PAGES=50
# Obsidian Sync/Backup
TS_AUTHKEY=tskey-xxxxx
TAILSCALE_HOSTNAME=josemar-server
TS_EXTRA_ARGS=
SYNCTHING_GUI_BIND_IP=127.0.0.1
TZ=America/Sao_Paulo

See .env.example for the complete list.
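Before starting the stack, it can help to check that .env has no empty required values. The following is a minimal Python sketch, not part of this repository: the helper names are hypothetical, and the choice of required keys is an assumption based on the prerequisites above.

```python
from typing import Dict, List

# Assumption: these keys are treated as required because the prerequisites
# list them without "optional"; adjust to match .env.example.
REQUIRED_KEYS = ("ZAI_API_KEY", "TELEGRAM_BOT_TOKEN", "WORKSPACE_STATE_REPO")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(values: Dict[str, str], required=REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

For example, missing_keys(parse_env(open(".env").read())) returns an empty list when all three keys are set. The parser does not handle quoted values or export prefixes.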
To sync Obsidian outside your home network, use Tailscale on server and laptop:
- Install Tailscale on both devices.
- Join the same tailnet (tailscale up).
- Configure TS_AUTHKEY so the server tailscale sidecar joins your tailnet automatically.
- In Syncthing, set each device address to tcp://<peer-tailscale-ip>:22000.
Laptop persistence check (after reboot):
systemctl is-enabled tailscaled
If needed:
sudo systemctl enable --now tailscaled

Detailed runbook: docs/obsidian-operations.md.
The main configuration is in config/openclaw.json (JSON5 format). See config/AGENTS.md for complete reference.
This project uses explicit memory persistence preferences to reduce context loss between sessions:
- Long idle session window: session.reset.idleMinutes is set to 1440 (24h) so conversations are less likely to reset before memory safeguards can run.
- Pre-compaction memory flush enabled: agents.defaults.compaction.memoryFlush.enabled: true with softThresholdTokens: 4000 and reserveTokensFloor: 40000.
- Memory checkpoint cron: state repos should include a recurring checkpoint job that updates the memory daily log at memory/YYYY-MM-DD.md incrementally.
- Dedup cursor file: memory/flush-state.json tracks checkpoint progress to reduce repeated entries in the same day.
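To illustrate how a dedup cursor can keep a recurring checkpoint from repeating entries, here is a minimal Python sketch. The cursor shape ({"date", "flushed"}) and the function names are hypothetical; the real layout of memory/flush-state.json is not described here and may differ.

```python
from typing import Dict, List, Tuple


def daily_log_path(day: str) -> str:
    """Path of the memory daily log for an ISO date such as 2025-01-31."""
    return f"memory/{day}.md"


def pending_entries(entries: List[str], state: Dict, day: str) -> Tuple[List[str], Dict]:
    """Return entries not yet checkpointed today, plus the updated cursor.

    Hypothetical cursor shape: {"date": "YYYY-MM-DD", "flushed": <int>}.
    """
    # A cursor left over from a previous day does not apply to today's log.
    flushed = state.get("flushed", 0) if state.get("date") == day else 0
    fresh = entries[flushed:]
    return fresh, {"date": day, "flushed": flushed + len(fresh)}
```

Each checkpoint run would append only the returned entries to the daily log and then write the new cursor back, so a second run on the same day skips what was already recorded.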
Terminology note:
- Use memory daily log for memory/YYYY-MM-DD.md.
- Use Obsidian daily note only for vault files under 07-Daily/.
The auxiliary ML service runs in a dedicated container and is designed for queue-based, long-running jobs (minutes are acceptable). It currently starts with OCR (glm-ocr) and keeps a modular model registry for future additions.
- Single worker: exactly one job runs at a time
- FIFO queue: requests are processed in order
- Model lifecycle: load on demand, unload when the next queued job is a different model (or queue is empty)
- Internal only: no host port exposure by default (http://aux-ml:8091)
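The queue behaviour listed above can be sketched in a few lines of Python. This is an illustration only, not the aux-ml implementation: the class name and the event trace are hypothetical, and the default of 50 simply mirrors AUX_ML_MAX_QUEUE from the .env example.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class BatchQueue:
    """Single-worker FIFO queue that keeps at most one model loaded."""

    def __init__(self, max_queue: int = 50) -> None:
        self.max_queue = max_queue
        self.jobs: Deque[Tuple[str, str]] = deque()  # (model, payload)
        self.loaded: Optional[str] = None
        self.events: List[str] = []  # trace of lifecycle actions

    def submit(self, model: str, payload: str) -> bool:
        """Enqueue a job; reject it when the queue is full."""
        if len(self.jobs) >= self.max_queue:
            return False
        self.jobs.append((model, payload))
        return True

    def run_next(self) -> Optional[str]:
        """Run the oldest job, then unload unless the next job reuses the model."""
        if not self.jobs:
            return None
        model, payload = self.jobs.popleft()
        if self.loaded != model:
            self.loaded = model
            self.events.append(f"load {model}")
        self.events.append(f"run {model}:{payload}")
        next_model = self.jobs[0][0] if self.jobs else None
        if next_model != model:
            # Next job needs a different model, or the queue is empty.
            self.events.append(f"unload {model}")
            self.loaded = None
        return payload
```

Two consecutive glm-ocr jobs share a single load, and the model is unloaded only when the following job needs something else or nothing is waiting.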
To enable locally:
# In .env
AUX_ML_ENABLED=true
COMPOSE_PROFILES=aux-ml
docker compose up -d --build

Place model files in aux-ml/models/ before building (see aux-ml/models/README.md).
If the files are not present locally, the build auto-downloads the default Q8 model and mmproj from Hugging Face.
You can override URLs/checksums with AUX_ML_GLM_OCR_URL, AUX_ML_GLM_OCR_SHA256, AUX_ML_GLM_OCR_MMPROJ_URL, and AUX_ML_GLM_OCR_MMPROJ_SHA256.
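If you place model files manually, you may want to confirm they match the checksum you intend to set in AUX_ML_GLM_OCR_SHA256. The sketch below uses Python's standard hashlib; the function names are hypothetical, and how the build itself verifies checksums is not shown here.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_ok(path: str, expected: str) -> bool:
    """Compare against an expected hex digest, ignoring case and whitespace."""
    return sha256_of(path) == expected.strip().lower()
```

Calling checksum_ok with a model file path and the published digest returns True only when the file is intact.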
Skills are split by ownership:
- Repo-shipped core skills: skills-factory/ (copied into image at /opt/josemar/skills)
- User-owned state skills: agent-state/skills/ (private state repo, different per user)
Current core repo-shipped skills:
- vault-gateway: Single entrypoint for vault routing and operations
- aux-ml: Skill interface for queue-based auxiliary ML jobs
- workspace-sync: Skill interface for workspace git sync/status/commit/push flows
- Keep platform functionality in skills-factory/
- Keep user-specific workflows only in each user's private state repo
- Do not commit user-specific skills to this main repository
For core repo-shipped skills:
- Create or update files under skills-factory/<skill-name>/
- Rebuild/redeploy so the image ships the new version
For user-owned skills:
- Create the skill in agent-state/skills/<skill-name>/
- Add SKILL.md with YAML frontmatter and an executable script
- No main-repo config change is needed
- Changes sync via the state repo workflow
See agent-state/skills/AGENTS.md for skill authoring details.
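As a quick check that a new SKILL.md actually opens with a frontmatter block, a small Python helper like the following can split the file. It is a sketch under assumptions: the function name is hypothetical, it only looks for the --- delimiters, and it does not parse or validate the YAML fields (those are defined in agent-state/skills/AGENTS.md, not here).

```python
from typing import Optional, Tuple


def split_frontmatter(text: str) -> Tuple[Optional[str], str]:
    """Split a document into (frontmatter, body); frontmatter is None if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None, text
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            front = "\n".join(lines[1:index])
            body = "\n".join(lines[index + 1:])
            return front, body
    return None, text  # opening delimiter without a closing one
```

A None result for the first element means the file would need a frontmatter block added before the skill is committed to the state repo.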
Credentials are stored in credentials/<service>/ and mounted into the container:
credentials/
├── README.md
└── gogcli/
    ├── README.md
    └── josemar-assistente-openclaw-credentials.json
See credentials/README.md for setup instructions.
josemar-assistente/
├── agent-state/ # Nested git repo: agent workspace (private repo)
│ ├── .sync-manifest # Files to version
│ ├── .gitignore # Security ignore list
│ ├── skills/ # User-owned state skills
│ ├── cron/jobs.json # Cron definitions synced from state repo
│ └── memory/flush-state.json # Checkpoint cursor for memory daily log dedup
├── config/ # OpenClaw configuration
│ ├── AGENTS.md # Config reference
│ └── openclaw.json # Main config
├── credentials/ # Service credentials (NOT versioned)
│ └── README.md # Setup guide
├── scripts/
│ ├── workspace-sync.sh # Git sync logic
│ ├── obsidian-backup.sh # Obsidian backup and slot rotation
│ └── obsidian-backup-daemon.sh # Daily backup scheduler
├── aux-ml/ # Auxiliary llama.cpp batch processing service
├── skills-factory/ # Repo-owned core skills shipped in image
│ ├── vault-gateway/
│ ├── aux-ml/
│ └── workspace-sync/
├── docs/
│ ├── obsidian-operations.md # Syncthing/backup setup and operations runbook
│ └── aux-ml.md # Auxiliary ML operations runbook
├── templates/
│ └── agent-state-template/ # Template for new agent state repos
├── .github/workflows/ # CI/CD
│ └── deploy-to-home-server.yml # Deployment workflow
├── Dockerfile # Custom OpenClaw image
├── docker-compose.yml # Deployment config
├── docker-entrypoint.sh # Container startup
└── .env.example # Environment variables template
Deployment is handled via GitHub Actions:
- Set required secrets (see .github/workflows/AGENTS.md)
- For unattended remote sync setup, add optional secret TS_AUTHKEY (Tailscale auth key)
- Set required variables: WORKSPACE_STATE_REPO (plus optional TAILSCALE_HOSTNAME, TS_EXTRA_ARGS, and optional AUX_ML_* variables if enabling aux-ml)
- Run the deploy-to-home-server workflow
Fresh Start: The workflow has a fresh_start option that erases ALL data (with a safety countdown).
docker compose build

python3 -m unittest discover -s tests -v
# Scoped run for vault-gateway contract tests
python3 -m unittest tests.vault_gateway.test_gateway_contract -v

docker compose logs -f openclaw

Disable Telegram to avoid conflicts with production:
# In .env
TELEGRAM_ENABLED=false
docker compose up -d

If testing the auxiliary ML service, also set COMPOSE_PROFILES=aux-ml in .env.
Access the Web UI at http://operator:YOUR_PASSWORD@localhost:18789/
- Image: Based on ghcr.io/openclaw/openclaw:latest with Python, pymupdf, git, gogcli
- Auxiliary ML: Optional dedicated aux-ml container based on llama.cpp server for queued batch inference
- Volumes:
  - openclaw-workspace for OpenClaw runtime state
  - obsidian-vault for Obsidian notes and attachments
  - syncthing-config for Syncthing identity and config
  - tailscale-state for Tailscale sidecar identity and login state
  - obsidian-backup-state for backup slot pointer
- Entrypoint: Copies config, mounts credentials, runs git sync, starts OpenClaw
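The backup slot pointer kept in obsidian-backup-state can be pictured as a counter that wraps around a fixed number of slots. The Python sketch below is illustrative only: the function name and the default of 7 slots are assumptions, and scripts/obsidian-backup.sh may rotate differently.

```python
def next_slot(current: int, slot_count: int = 7) -> int:
    """Advance a rotating backup slot pointer, wrapping after the last slot."""
    if slot_count < 1:
        raise ValueError("slot_count must be at least 1")
    return (current + 1) % slot_count
```

With this scheme each daily run overwrites the oldest slot, so the number of retained backups stays constant.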
- On start: Commits local changes, fetches remote, merges (remote wins conflicts), pushes
- Periodic: Auto-commits and pushes at configurable interval
- Security: Only files in .sync-manifest are versioned
- Cron source of truth: OpenClaw cron jobs are loaded from agent-state/cron/jobs.json
agent-state/cron/jobs.json - Memory rotation: Logs older than N days are automatically removed
Two-scope skill system:
- Core scope (skills-factory/): repo-owned, image-shipped platform capabilities
- State scope (agent-state/skills/): user-owned capabilities kept in private state repos
- If a skill name exists in both scopes, treat the core repo-shipped version as the canonical platform source
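The precedence rule for name clashes can be expressed as a small lookup. This Python sketch is an assumption about how one might model the rule, not how OpenClaw loads skills; the function name and the example state skill are hypothetical.

```python
from typing import Dict, Iterable


def resolve_skills(core: Iterable[str], state: Iterable[str]) -> Dict[str, str]:
    """Map each skill name to the scope that provides it; core wins on clashes."""
    resolved = {name: "state" for name in state}
    # Core entries are applied last so they override any same-named state skill.
    resolved.update({name: "core" for name in core})
    return resolved
```

For instance, resolve_skills(["vault-gateway", "aux-ml"], ["aux-ml", "my-notes"]) reports aux-ml as coming from the core scope and my-notes from the state scope.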
- AGENTS.md: Root project documentation
- config/AGENTS.md: Configuration reference
- agent-state/skills/AGENTS.md: Skills development guide
- credentials/README.md: Credential management
- .github/workflows/AGENTS.md: CI/CD documentation
- docs/obsidian-operations.md: Obsidian sync/backup operations runbook
- docs/aux-ml.md: Auxiliary ML queue/model lifecycle operations
- templates/agent-state-template/README.md: Agent state setup
MIT