feat: add openviking-server init interactive setup wizard for local Ollama model deployment#1353

Open
t0saki wants to merge 7 commits into volcengine:main from t0saki:feat/local-ollama-wizard

Conversation

@t0saki t0saki commented Apr 10, 2026

Description

Add an openviking-server init interactive setup wizard that guides users (especially macOS/Apple Silicon beginners) through configuring local embedding and VLM models via Ollama. Additionally implements proper Ollama lifecycle management following an "ensure running, never stop" pattern — Ollama is a shared service, so OpenViking starts it if needed but never tears it down on exit.

Also moves init and doctor from ov (client CLI) to openviking-server subcommands, since they are server-side configuration commands that operate on ov.conf.

Related Issue

#1334
#601

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update

Changes Made

  • openviking_cli/setup_wizard.py: Interactive wizard with three paths — local Ollama, cloud API, custom config. Includes auto-install, auto-start, model pulling, RAM-based recommendations.
  • openviking_cli/utils/ollama.py (new): Shared Ollama utilities extracted from setup_wizard. Adds detect_ollama_in_config(), ensure_ollama_for_server(), parse_ollama_url(), improved start_ollama() with stderr capture.
  • openviking_cli/server_bootstrap.py: Intercept init and doctor subcommands before server startup (they don't need a running server).
  • openviking_cli/rust_cli.py: Remove init intercept (moved to openviking-server).
  • openviking/server/bootstrap.py: Server startup detects Ollama from config and ensures it's running before app creation. Never blocks startup.
  • openviking/server/routers/system.py: /ready health check includes Ollama connectivity, returns 503 when configured but unreachable.
  • openviking_cli/doctor.py: Adds Ollama connectivity check. Fixes Ollama provider recognition without API key.
  • openviking_cli/utils/config/embedding_config.py: Add qwen3-embedding and embeddinggemma dimensions.
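The "ensure running, never stop" pattern behind `ensure_ollama_for_server()` can be sketched as follows. The function names come from this PR's summary; the bodies are illustrative assumptions, not the PR's actual code.

```python
# Sketch of the "ensure running, never stop" Ollama lifecycle pattern.
# parse_ollama_url / ensure_ollama_for_server are names taken from the PR
# summary; the implementations below are illustrative assumptions.
import subprocess
import time
import urllib.request
from urllib.parse import urlparse

DEFAULT_OLLAMA_URL = "http://localhost:11434"


def parse_ollama_url(url: str) -> tuple[str, int]:
    """Return (host, port) for an Ollama base URL; port defaults to 11434."""
    parsed = urlparse(url if "//" in url else f"http://{url}")
    return parsed.hostname or "localhost", parsed.port or 11434


def is_ollama_running(url: str = DEFAULT_OLLAMA_URL, timeout: float = 2.0) -> bool:
    """Probe the Ollama HTTP API root; a live server answers GET /."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except OSError:
        return False


def ensure_ollama_for_server(url: str = DEFAULT_OLLAMA_URL) -> bool:
    """Start Ollama if unreachable; never stop it (it is a shared service)."""
    if is_ollama_running(url):
        return True
    try:
        # Detach so Ollama outlives the OpenViking server process.
        subprocess.Popen(
            ["ollama", "serve"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            start_new_session=True,
        )
    except FileNotFoundError:
        return False  # caller warns, but server startup is never blocked
    for _ in range(10):  # wait up to ~5 s for the API to come up
        if is_ollama_running(url):
            return True
        time.sleep(0.5)
    return False
```

Note the deliberate asymmetry: startup is ensured, but there is no matching teardown anywhere, because other tools on the machine may be using the same Ollama daemon.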

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested this on the following platforms:
    • macOS
    • Linux
    • Windows

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Screenshots (if applicable)

❯ openviking-server init

  OpenViking Setup
  ================

  Existing config found: /Users/zhengxiao.wu/.openviking/ov.conf
  Overwrite? (current config will be backed up as .bak) [Y/n]: y

  Choose setup mode:

    [1] Local models via Ollama  (recommended for macOS / Apple Silicon)
    [2] Cloud API  (OpenAI, Volcengine, etc.)
    [3] Custom  (manual editing)

  Select [1]: 1

  Checking Ollama... not installed
  Install Ollama now? [Y/n]: y

✔︎ JSON API cask.jws.json             Downloaded  15.4MB/15.4MB
✔︎ JSON API formula.jws.json          Downloaded  32.0MB/32.0MB
==> Fetching downloads for: ollama
✔︎ Bottle Manifest ollama (0.20.6)    Downloaded  13.4KB/13.4KB
✔︎ Bottle Manifest mlx (0.31.1)       Downloaded   8.3KB/8.3KB
✔︎ Bottle Manifest mlx-c (0.6.0)      Downloaded   8.4KB/8.4KB
✔︎ Bottle mlx-c (0.6.0)               Downloaded 174.4KB/174.4KB
✔︎ Bottle Manifest sqlite (3.53.0)    Downloaded  11.8KB/11.8KB
✔︎ Bottle mlx (0.31.1)                Downloaded  46.5MB/46.5MB
✔︎ Bottle sqlite (3.53.0)             Downloaded   2.4MB/2.4MB
✔︎ Bottle ollama (0.20.6)             Downloaded  13.3MB/13.3MB
==> Installing dependencies for ollama: sqlite, mlx and mlx-c
==> Installing ollama dependency: sqlite
==> Pouring sqlite--3.53.0.arm64_tahoe.bottle.tar.gz
🍺  /opt/homebrew/Cellar/sqlite/3.53.0: 13 files, 5.3MB
==> Installing ollama dependency: mlx
==> Pouring mlx--0.31.1.arm64_tahoe.bottle.tar.gz
🍺  /opt/homebrew/Cellar/mlx/0.31.1: 417 files, 152.9MB
==> Installing ollama dependency: mlx-c
==> Pouring mlx-c--0.6.0.arm64_tahoe.bottle.tar.gz
🍺  /opt/homebrew/Cellar/mlx-c/0.6.0: 39 files, 816.5KB
==> Installing ollama
==> Pouring ollama--0.20.6.arm64_tahoe.bottle.tar.gz
==> Caveats
To start ollama now and restart at login:
  brew services start ollama
Or, if you don't want/need a background service you can just run:
  OLLAMA_FLASH_ATTENTION="1" OLLAMA_KV_CACHE_TYPE="q8_0" /opt/homebrew/opt/ollama/bin/ollama serve
==> Summary
🍺  /opt/homebrew/Cellar/ollama/0.20.6: 8 files, 36.6MB
==> Running `brew cleanup ollama`...
Disable this behaviour by setting `HOMEBREW_NO_INSTALL_CLEANUP=1`.
Hide these hints with `HOMEBREW_NO_ENV_HINTS=1` (see `man brew`).
Removing: /Users/zhengxiao.wu/Library/Caches/Homebrew/ollama_bottle_manifest--0.20.5... (13.4KB)
Removing: /Users/zhengxiao.wu/Library/Caches/Homebrew/ollama--0.20.5... (13.3MB)
==> Caveats
==> ollama
To start ollama now and restart at login:
  brew services start ollama
Or, if you don't want/need a background service you can just run:
  OLLAMA_FLASH_ATTENTION="1" OLLAMA_KV_CACHE_TYPE="q8_0" /opt/homebrew/opt/ollama/bin/ollama serve
  OK Ollama installed
  Starting Ollama... ready

  Detected 48 GB RAM

  Embedding model:

    [1] Qwen3-Embedding 0.6B  (1024d, ~639 MB)
    [2] Qwen3-Embedding 4B  (1024d, ~2.5 GB)
    [3] Qwen3-Embedding 8B  (1024d, ~4.7 GB) *
    [4] EmbeddingGemma 300M  (768d, ~622 MB) [downloaded]

  Select [3]: 4

  Language model (VLM):

    [1] Qwen 3.5 2B  (~2.7 GB)
    [2] Qwen 3.5 4B  (~3.4 GB)
    [3] Qwen 3.5 9B  (~6.6 GB)
    [4] Qwen 3.5 27B  (~17 GB)
    [5] Qwen 3.5 35B  (~24 GB)
    [6] Qwen 3.5 122B  (~81 GB)
    [7] Gemma 4 E2B  (~7.2 GB)
    [8] Gemma 4 E4B  (~9.6 GB) [downloaded] *
    [9] Gemma 4 26B  (~18 GB)
    [10] Gemma 4 31B  (~20 GB)

  Select [8]: 8
  Workspace [/Users/zhengxiao.wu/.openviking/data]: 

  Summary:
    Embedding:  ollama / embeddinggemma:300m (768d)
    VLM:        litellm / ollama/gemma4:e4b
    Workspace:  /Users/zhengxiao.wu/.openviking/data
    Config:     /Users/zhengxiao.wu/.openviking/ov.conf
  
  Save configuration? [Y/n]: Y
  Existing config backed up to /Users/zhengxiao.wu/.openviking/ov.conf.bak
  OK Configuration written to /Users/zhengxiao.wu/.openviking/ov.conf

  Next steps:
    Start the server:  openviking-server
    Validate setup:    openviking-server doctor
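The RAM-based default seen in the transcript (48 GB detected, 8B embedding model preselected) could be driven by a simple threshold table. The 48 GB → 8B mapping matches the transcript; the other cutoffs and the helper name are illustrative guesses, not the wizard's actual values.

```python
# Hypothetical RAM thresholds for the wizard's default embedding model.
# Only the 48 GB -> Qwen3-Embedding 8B default is confirmed by the
# transcript above; the remaining cutoffs are illustrative assumptions.
EMBEDDING_DEFAULTS = [
    (32, "qwen3-embedding:8b"),   # plenty of headroom: largest model
    (16, "qwen3-embedding:4b"),
    (8,  "qwen3-embedding:0.6b"),
    (0,  "embeddinggemma:300m"),  # low-RAM fallback
]


def recommend_embedding(ram_gb: float) -> str:
    """Pick the first model whose minimum RAM requirement is satisfied."""
    for min_gb, model in EMBEDDING_DEFAULTS:
        if ram_gb >= min_gb:
            return model
    return "embeddinggemma:300m"
```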

Additional Notes

Command structure: init and doctor are server-side configuration commands (they operate on ov.conf, not ovcli.conf), so they live under openviking-server rather than ov (the client CLI).

Ollama Lifecycle Design:

Phase                        Behavior
openviking-server init       Install + pull models (interactive)
openviking-server startup    Detect from config → ensure running → warn on failure
openviking-server runtime    /ready reports Ollama status
openviking-server exit       Leave Ollama alone (shared service)
openviking-server doctor     Report Ollama connectivity
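The /ready semantics in the table reduce to one rule: 503 only when Ollama is configured but unreachable. Sketched here as a framework-agnostic function; the actual router code in system.py may differ.

```python
# Illustrative sketch of the /ready health-check rule described above:
# a deployment without Ollama stays healthy; a configured-but-unreachable
# Ollama degrades readiness to 503.
def ready_status(ollama_configured: bool, ollama_reachable: bool) -> tuple[int, dict]:
    """Compute the /ready response as (HTTP status, payload)."""
    if ollama_configured and not ollama_reachable:
        return 503, {"status": "degraded", "ollama": "unreachable"}
    return 200, {
        "status": "ready",
        "ollama": "ok" if ollama_configured else "not configured",
    }
```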

38 tests total (18 setup wizard + 20 shared Ollama module), all passing.
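The doctor fix (recognizing Ollama providers as valid without an API key) comes down to a provider/model predicate along these lines; the function name and exact matching are assumptions, not the PR's code.

```python
# Hypothetical sketch of the doctor's API-key rule: local Ollama backends
# need no key, whether used directly or routed through LiteLLM
# (e.g. the "litellm / ollama/gemma4:e4b" pairing from the wizard summary).
def needs_api_key(provider: str, model: str = "") -> bool:
    """Return False for local Ollama backends, True for cloud providers."""
    if provider == "ollama":
        return False
    if provider == "litellm" and model.startswith("ollama/"):
        return False
    return True
```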

Add an interactive CLI wizard that guides users through configuring
OpenViking with local Ollama models, especially targeting macOS/Apple
Silicon beginners. The wizard auto-detects and installs Ollama,
recommends models based on system RAM, pulls selected models, and
generates a valid ov.conf.

Supported models:
- Embedding: qwen3-embedding (0.6b/4b/8b), embeddinggemma:300m
- VLM: qwen3.5 (2b-122b), gemma4 (e2b/e4b/26b/31b)

Also fixes `ov doctor` to recognize Ollama providers as valid without
requiring an API key.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 10, 2026 10:30

CLAassistant commented Apr 10, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🏅 Score: 92
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

Missing License Header

New Python file in openviking_cli/ lacks required AGPL-3.0 copyright header.

"""ov init - interactive setup wizard for OpenViking.

Guides users through model selection and configuration, with a focus on
local deployment via Ollama for macOS / Apple Silicon beginners.
"""

@github-actions

PR Code Suggestions ✨

No code suggestions found for the PR.


Copilot AI left a comment


Pull request overview

Adds an ov init interactive wizard to help users generate an ov.conf for local Ollama or cloud providers, and updates related config/diagnostics so Ollama-based setups validate cleanly.

Changes:

  • Introduces openviking_cli/setup_wizard.py implementing the interactive ov init flow (Ollama install/start, model selection/pull, config generation, config write/backup).
  • Wires ov init into the Python wrapper CLI and updates ov doctor checks to treat Ollama configurations as valid without “real” API keys.
  • Extends Ollama embedding model dimension mapping and adds unit tests covering wizard helpers.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

File                                              Description
openviking_cli/setup_wizard.py                    New interactive setup wizard and config writer for ov init.
openviking_cli/rust_cli.py                        Routes ov init to the Python wizard (like ov doctor).
openviking_cli/doctor.py                          Allows Ollama embedding + LiteLLM(Ollama) VLM configs without API-key failures.
openviking_cli/utils/config/embedding_config.py   Adds dimension presets for qwen3-embedding and embeddinggemma Ollama models.
tests/cli/test_setup_wizard.py                    New unit tests for wizard helper functions and config I/O.


t0saki and others added 4 commits April 10, 2026 19:41
…hecks

Extract Ollama utilities into shared module (openviking_cli/utils/ollama.py)
so both `ov init` and `openviking-server` can reuse them. Server now
auto-detects Ollama from config and ensures it's running at startup
("ensure running, never stop" pattern). Adds Ollama connectivity to
`/ready` health check and `ov doctor` diagnostics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
These are server-side configuration commands (generate/validate ov.conf),
not client operations. Having them under `ov` (the client CLI) was
confusing. Now:

  openviking-server init     # setup wizard
  openviking-server doctor   # diagnostics
  ov <subcommand>            # client operations only
@t0saki t0saki changed the title feat: add ov init interactive setup wizard for local Ollama model deployment feat: add openviking-server init interactive setup wizard for local Ollama model deployment Apr 13, 2026