Josemar Assistente - OpenClaw Bot

A self-hosted OpenClaw bot running in Docker with Telegram integration and PDF extraction capabilities.

Features

OpenClaw Gateway: Self-hosted AI agent gateway
Telegram Integration: Native Telegram bot support
PDF Extraction: Process Brazilian credit card invoice PDFs
Multi-Provider LLM Support: GLM, DeepSeek, Ollama Cloud, and OpenAI-compatible APIs
Brazilian Portuguese: Native language interaction
Docker Deployment: Containerized with persistent workspace storage
Git-Backed Agent State: Workspace files versioned in a private git repo
Obsidian Vault Sync: Syncthing sync over private network (Tailscale-ready)
Google Drive Backups: Daily rotating Obsidian vault backups via rclone
Auxiliary ML Batch Container: Optional llama.cpp service with FIFO queue for long-running OCR/transcription tasks

Prerequisites

Docker and Docker Compose
Z.AI API key (for GLM models - e.g., GLM-5, GLM-4.7)
Telegram Bot Token (from @BotFather)
DeepSeek API key (optional, for alternative LLM)
Ollama Cloud API key (optional, for Ollama Cloud models)
A private GitHub repo for agent state versioning

Quick Start

1. Create Agent State Repo

Create a private GitHub repo for agent state (this stores personality, skills, memory):

# Use the template in templates/agent-state-template/
# See templates/agent-state-template/README.md for instructions

2. Clone and Configure

cd repos/josemar-assistente
git clone <your-private-repo-url> agent-state
cp .env.example .env
# Edit .env with your API keys and agent state repo URL

If you do not have a private state repo yet, initialize from template:

cp -r templates/agent-state-template/ agent-state
cd agent-state && git init && git add -A && git commit -m "Initial state"

3. Build and Run

docker compose build
docker compose up -d

Note: Use docker compose (with space) for Docker Compose V2, or docker-compose (with hyphen) for V1.

4. Check Logs

docker compose logs -f

5. Interact with Bot

Start a conversation with your Telegram bot
Send a PDF credit card invoice for processing
Ask questions in Brazilian Portuguese

Configuration

Environment Variables

Create a .env file with:

# LLM Provider
ZAI_API_KEY=your_zai_api_key_here
DEEPSEEK_API_KEY=your_deepseek_api_key_here
OLLAMA_API_KEY=your_ollama_api_key_here

# Telegram
TELEGRAM_BOT_TOKEN=your_telegram_bot_token_here
TELEGRAM_ENABLED=true
PRIMARY_TELEGRAM_ID=123456789

# Web UI
GATEWAY_AUTH_PASSWORD=your-secure-password-here
CONTROL_UI_ALLOWED_ORIGIN_1=https://your-domain.example
CONTROL_UI_ALLOWED_ORIGIN_2=http://your-server-ip:18789

# Agent State Repo
WORKSPACE_STATE_REPO=https://github.com/username/josemar-agent-state.git
WORKSPACE_REPO_TOKEN=your_github_pat_here

# Sync Configuration
WORKSPACE_SYNC_ON_START=true
WORKSPACE_SYNC_INTERVAL=60
WORKSPACE_MEMORY_DAYS=30

# Auxiliary ML service (optional)
AUX_ML_ENABLED=false
COMPOSE_PROFILES=
AUX_ML_GLM_OCR_URL=
AUX_ML_GLM_OCR_SHA256=
AUX_ML_GLM_OCR_MMPROJ_URL=
AUX_ML_GLM_OCR_MMPROJ_SHA256=
AUX_ML_URL=http://aux-ml:8091
AUX_ML_MEMORY_LIMIT=8192m
AUX_ML_MEMORY_LIMIT_MB=8192
AUX_ML_MAX_QUEUE=50
AUX_ML_JOB_TIMEOUT_SECONDS=1800
AUX_ML_POLL_INTERVAL_SECONDS=2
AUX_ML_LLAMACPP_TIMEOUT_SECONDS=1800
AUX_ML_ALLOWED_INPUT_DIRS=/root/.openclaw/workspace
AUX_ML_ENFORCE_MEMORY_LIMIT=true
AUX_ML_OCR_MAX_PAGES=50

# Obsidian Sync/Backup
TS_AUTHKEY=tskey-xxxxx
TAILSCALE_HOSTNAME=josemar-server
TS_EXTRA_ARGS=
SYNCTHING_GUI_BIND_IP=127.0.0.1
TZ=America/Sao_Paulo

See .env.example for the complete list.

Remote Vault Sync (Tailscale)

To sync Obsidian outside your home network, use Tailscale on server and laptop:

Install Tailscale on both devices.
Join the same tailnet (tailscale up).
Configure TS_AUTHKEY so the server tailscale sidecar joins your tailnet automatically.
In Syncthing, set each device address to tcp://<peer-tailscale-ip>:22000.

Laptop persistence check (after reboot):

systemctl is-enabled tailscaled

If needed:

sudo systemctl enable --now tailscaled

Detailed runbook: docs/obsidian-operations.md.

OpenClaw Configuration

The main configuration is in config/openclaw.json (JSON5 format). See config/AGENTS.md for complete reference.

Memory Persistence Preferences

This project uses explicit memory persistence preferences to reduce context loss between sessions:

Long idle session window: session.reset.idleMinutes is set to 1440 (24h) so conversations are less likely to reset before memory safeguards can run.
Pre-compaction memory flush enabled: agents.defaults.compaction.memoryFlush.enabled: true with softThresholdTokens: 4000 and reserveTokensFloor: 40000.
Memory checkpoint cron: state repos should include a recurring checkpoint job that updates the memory daily log at memory/YYYY-MM-DD.md incrementally.
Dedup cursor file: memory/flush-state.json tracks checkpoint progress to reduce repeated entries in the same day.

Terminology note:

Use memory daily log for memory/YYYY-MM-DD.md.
Use Obsidian daily note only for vault files under 07-Daily/.

Auxiliary ML Service (Optional)

The auxiliary ML service runs in a dedicated container and is designed for queue-based, long-running jobs (minutes are acceptable). It currently starts with OCR (glm-ocr) and keeps a modular model registry for future additions.

Single worker: exactly one job runs at a time
FIFO queue: requests are processed in order
Model lifecycle: load on demand, unload when the next queued job is a different model (or queue is empty)
Internal only: no host port exposure by default (http://aux-ml:8091)

To enable locally:

# In .env
AUX_ML_ENABLED=true
COMPOSE_PROFILES=aux-ml

docker compose up -d --build

Place model files in aux-ml/models/ before building (see aux-ml/models/README.md). If files are not present locally, build auto-downloads default Q8 model + mmproj from Hugging Face. You can override URLs/checksums with AUX_ML_GLM_OCR_URL, AUX_ML_GLM_OCR_SHA256, AUX_ML_GLM_OCR_MMPROJ_URL, and AUX_ML_GLM_OCR_MMPROJ_SHA256.

Skills

Skills are split by ownership:

Repo-shipped core skills: skills-factory/ (copied into image at /opt/josemar/skills)
User-owned state skills: agent-state/skills/ (private state repo, different per user)

Current core repo-shipped skills:

vault-gateway: Single entrypoint for vault routing and operations
aux-ml: Skill interface for queue-based auxiliary ML jobs
workspace-sync: Skill interface for workspace git sync/status/commit/push flows

Skill Ownership Policy

Keep platform functionality in skills-factory/
Keep user-specific workflows only in each user's private state repo
Do not commit user-specific skills to this main repository

Adding Skills

For core repo-shipped skills:

Create or update files under skills-factory/<skill-name>/
Rebuild/redeploy so the image ships the new version

For user-owned skills:

Create skill in agent-state/skills/<skill-name>/
Add SKILL.md with YAML frontmatter and executable script
No main-repo config change is needed
Changes sync via the state repo workflow

See agent-state/skills/AGENTS.md for skill authoring details.

Credential Management

Credentials are stored in credentials/<service>/ and mounted into the container:

credentials/
├── README.md
└── gogcli/
    ├── README.md
    └── josemar-assistente-openclaw-credentials.json

See credentials/README.md for setup instructions.

Project Structure

josemar-assistente/
├── agent-state/                    # Nested git repo: agent workspace (private repo)
│   ├── .sync-manifest              # Files to version
│   ├── .gitignore                  # Security ignore list
│   ├── skills/                     # User-owned state skills
│   ├── cron/jobs.json              # Cron definitions synced from state repo
│   └── memory/flush-state.json     # Checkpoint cursor for memory daily log dedup
├── config/                         # OpenClaw configuration
│   ├── AGENTS.md                   # Config reference
│   └── openclaw.json               # Main config
├── credentials/                    # Service credentials (NOT versioned)
│   └── README.md                   # Setup guide
├── scripts/
│   ├── workspace-sync.sh           # Git sync logic
│   ├── obsidian-backup.sh          # Obsidian backup and slot rotation
│   └── obsidian-backup-daemon.sh   # Daily backup scheduler
├── aux-ml/                         # Auxiliary llama.cpp batch processing service
├── skills-factory/                 # Repo-owned core skills shipped in image
│   ├── vault-gateway/
│   ├── aux-ml/
│   └── workspace-sync/
├── docs/
│   ├── obsidian-operations.md      # Syncthing/backup setup and operations runbook
│   └── aux-ml.md                   # Auxiliary ML operations runbook
├── templates/
│   └── agent-state-template/       # Template for new agent state repos
├── .github/workflows/              # CI/CD
│   └── deploy-to-home-server.yml   # Deployment workflow
├── Dockerfile                      # Custom OpenClaw image
├── docker-compose.yml              # Deployment config
├── docker-entrypoint.sh            # Container startup
└── .env.example                    # Environment variables template

Deployment

Deployment is handled via GitHub Actions:

Set required secrets (see .github/workflows/AGENTS.md)
For unattended remote sync setup, add optional secret TS_AUTHKEY (Tailscale auth key)
Set required variables: WORKSPACE_STATE_REPO (plus optional TAILSCALE_HOSTNAME, TS_EXTRA_ARGS, and optional AUX_ML_* variables if enabling aux-ml)
Run the deploy-to-home-server workflow

Fresh Start: The workflow has a fresh_start option that erases ALL data (with a safety countdown).

Development

Building the Image

docker compose build

Running Vault Gateway Tests

python3 -m unittest discover -s tests -v

# Scoped run for vault-gateway contract tests
python3 -m unittest tests.vault_gateway.test_gateway_contract -v

Viewing Logs

docker compose logs -f openclaw

Local Testing

Disable Telegram to avoid conflicts with production:

# In .env
TELEGRAM_ENABLED=false

docker compose up -d

If testing the auxiliary ML service, also set COMPOSE_PROFILES=aux-ml in .env.

Access Web UI at http://operator:YOUR_PASSWORD@localhost:18789/

Architecture

Docker Deployment

Image: Based on ghcr.io/openclaw/openclaw:latest with Python, pymupdf, git, gogcli
Auxiliary ML: Optional dedicated aux-ml container based on llama.cpp server for queued batch inference
Volumes:
- openclaw-workspace for OpenClaw runtime state
- obsidian-vault for Obsidian notes and attachments
- syncthing-config for Syncthing identity and config
- tailscale-state for Tailscale sidecar identity and login state
- obsidian-backup-state for backup slot pointer
Entrypoint: Copies config, mounts credentials, runs git sync, starts OpenClaw

Agent State Sync

On start: Commits local changes, fetches remote, merges (remote wins conflicts), pushes
Periodic: Auto-commits and pushes at configurable interval
Security: Only files in .sync-manifest are versioned
Cron source of truth: OpenClaw cron jobs are loaded from agent-state/cron/jobs.json
Memory rotation: Logs older than N days are automatically removed

Skills System

Two-scope skill system:

Core scope (skills-factory/): repo-owned, image-shipped platform capabilities
State scope (agent-state/skills/): user-owned capabilities kept in private state repos
If a skill name exists in both scopes, treat the core repo-shipped version as the canonical platform source

Documentation

AGENTS.md: Root project documentation
config/AGENTS.md: Configuration reference
agent-state/skills/AGENTS.md: Skills development guide
credentials/README.md: Credential management
.github/workflows/AGENTS.md: CI/CD documentation
docs/obsidian-operations.md: Obsidian sync/backup operations runbook
docs/aux-ml.md: Auxiliary ML queue/model lifecycle operations
templates/agent-state-template/README.md: Agent state setup

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 152 Commits
.github/workflows		.github/workflows
.opencode		.opencode
aux-ml		aux-ml
config		config
docs		docs
scripts		scripts
skills-factory		skills-factory
templates/agent-state-template		templates/agent-state-template
tests		tests
.env.example		.env.example
.gitignore		.gitignore
.pii-allowlist		.pii-allowlist
.pre-commit-config.yaml		.pre-commit-config.yaml
AGENTS.md		AGENTS.md
Dockerfile		Dockerfile
README.md		README.md
docker-compose.yml		docker-compose.yml
docker-entrypoint.sh		docker-entrypoint.sh
opencode.json		opencode.json

Folders and files

Latest commit

History

Repository files navigation

Josemar Assistente - OpenClaw Bot

Features

Prerequisites

Quick Start

1. Create Agent State Repo

2. Clone and Configure

3. Build and Run

4. Check Logs

5. Interact with Bot

Configuration

Environment Variables

Remote Vault Sync (Tailscale)

OpenClaw Configuration

Memory Persistence Preferences

Auxiliary ML Service (Optional)

Skills

Skill Ownership Policy

Adding Skills

Credential Management

Project Structure

Deployment

Development

Building the Image

Running Vault Gateway Tests

Viewing Logs

Local Testing

Architecture

Docker Deployment

Agent State Sync

Skills System

Documentation

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages