Onde Inference CLI

Command-line interface for Onde Inference.


Swift · Flutter · React Native · Rust · Website


Manage your Onde Inference account, fine-tune local models, and export them to GGUF, all from the terminal.

Install

Install onde-cli with your favorite tool. For package docs and the full install matrix, see https://ondeinference.com/cli.

npm

npm install -g @ondeinference/cli

Homebrew

brew tap ondeinference/homebrew-tap
brew install onde

pip / uv / uvx

pip install onde-cli
# or
uv tool install onde-cli
uv run onde
# or run without installing
uvx --from onde-cli onde

.NET tool

dotnet tool install --global Onde.Cli

Dart pub global

dart pub global activate onde_cli

Pre-built binary

Download a release from GitHub Releases:

# macOS Apple Silicon
curl -Lo onde https://github.com/ondeinference/onde-cli/releases/latest/download/onde-macos-arm64
chmod +x onde && mv onde /usr/local/bin/onde
| Platform | File |
| --- | --- |
| macOS Apple Silicon | `onde-macos-arm64` |
| macOS Intel | `onde-macos-amd64` |
| Linux x64 | `onde-linux-amd64` |
| Linux arm64 | `onde-linux-arm64` |
| Windows x64 | `onde-win-amd64.exe` |
| Windows arm64 | `onde-win-arm64.exe` |

Usage

onde

This opens the TUI. You can sign up or sign in right there.

| Key | What it does |
| --- | --- |
| Tab | Move between fields |
| Enter | Submit or sign out |
| Ctrl+L | Go to the sign-in screen |
| Ctrl+N | Go to the new account screen |
| Ctrl+C | Quit |

Fine-tuning

onde includes a LoRA fine-tuning pipeline for Qwen2, Qwen2.5, and Qwen3 models. It runs locally: Metal on Apple Silicon, CPU elsewhere. No cloud setup. No Python environment.

The flow is straightforward: download a safetensors base model, fine-tune it with LoRA, merge the adapter back into the base weights, then export to GGUF for use in the Onde SDK.

If you want a quick refresher on what the model is actually doing at inference time, Onde has a short note on the forward pass.
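The adapter-then-merge flow above can be sketched in a few lines. This is a generic illustration of the LoRA idea, not Onde's implementation: a frozen weight matrix `W` gets a low-rank update `B @ A` scaled by `alpha / rank`, and merging simply folds that product into the base weights, so merged inference costs no extra matmuls.

```python
import numpy as np

# Generic LoRA sketch (not Onde's code). W is frozen; only A and B train.
d_in, d_out, rank, alpha = 64, 64, 8, 16
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))        # frozen base weight
A = rng.standard_normal((rank, d_in)) * 0.01  # trainable "down" projection
B = np.zeros((d_out, rank))                   # trainable "up" projection (zero init)

x = rng.standard_normal(d_in)

# During training, the adapter runs alongside the frozen layer:
h_adapted = W @ x + (alpha / rank) * (B @ (A @ x))

# Merging folds the update into the base weights:
W_merged = W + (alpha / rank) * (B @ A)
h_merged = W_merged @ x

assert np.allclose(h_adapted, h_merged)
```

Because the merged weights have the same shape as the originals, the merged model can be exported like any other safetensors checkpoint.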

Training data format

Each line should be one complete conversation in Qwen's chat template:

{"text": "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n<|im_start|>user\nWhat is LoRA?<|im_end|>\n<|im_start|>assistant\nLoRA adds small trainable matrices to frozen layers, letting you fine-tune large models without updating all the weights.<|im_end|>"}

Save the file wherever you want. The TUI lets you point to it directly.
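If you already have conversations in a structured form, a small script can render them into this format. The helper below is hypothetical (it does not ship with onde); it just produces JSONL lines matching the Qwen chat template shown above.

```python
import json

# Hypothetical helper: format a (system, user, assistant) triple into one
# JSONL line using Qwen's <|im_start|>/<|im_end|> chat template.
def to_qwen_jsonl(system: str, user: str, assistant: str) -> str:
    text = (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n{assistant}<|im_end|>"
    )
    return json.dumps({"text": text})

with open("train.jsonl", "w") as f:
    f.write(to_qwen_jsonl(
        "You are a helpful assistant.",
        "What is LoRA?",
        "LoRA adds small trainable matrices to frozen layers.",
    ) + "\n")
```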

Running it

onde
  → Models tab (Tab from Apps)
  → Select a safetensors model (↑↓, Enter)
  → Press f

Only safetensors models can be fine-tuned. GGUF models are already quantized, so their weights are not differentiable.
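To see why quantized weights make a poor fine-tuning target, consider a generic symmetric 4-bit round-trip (a simplified sketch, unrelated to GGUF's actual block formats): any weight update smaller than the quantization step vanishes when the weights are re-quantized.

```python
import numpy as np

# Generic symmetric int4 quantization sketch (not GGUF's real format).
def quantize_int4(w):
    scale = np.abs(w).max() / 7.0                     # map to -7..7
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.linspace(-1.0, 1.0, 16, dtype=np.float32)
q, scale = quantize_int4(w)
w_hat = dequantize(q, scale)

# A gradient step much smaller than the quantization step is lost entirely:
step = scale / 10
q2, _ = quantize_int4(w_hat - step)
assert np.array_equal(q, q2)  # same quantized values: the update disappeared
```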

Configure the run:

| Field | Default | Notes |
| --- | --- | --- |
| Training data | ~/.onde/finetune/train.jsonl | Path to your JSONL file |
| LoRA rank | 8 | Higher means more capacity and more memory use |
| Epochs | 3 | Full passes over the dataset |
| Learning rate | 0.0001 | AdamW default |

Press Enter to start. In a healthy run, loss usually starts dropping by epoch 2. If it stays flat, try raising the learning rate to 0.0003.

After training

For rank 8 on a 0.6B model, the adapter is about 1.5 MB. From the fine-tune complete screen:

  • m to merge the adapter into the base model
  • g to export the merged model to GGUF

The resulting GGUF loads directly in the Onde SDK for on-device AI inference.
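The ~1.5 MB figure is easy to sanity-check: a rank-r adapter on a d_out × d_in projection stores r·(d_in + d_out) parameters. A back-of-the-envelope estimate, using assumed shapes (hidden size, layer count, and adapted projections are guesses for a ~0.6B model, not the real config):

```python
# Back-of-the-envelope LoRA adapter size. All shapes are assumptions
# for illustration, not the actual Qwen3-0.6B configuration.
hidden = 1024          # assumed hidden size
layers = 28            # assumed transformer layer count
rank = 8
adapted_per_layer = 2  # e.g. two square projections per layer

# Each adapter stores A (rank x d_in) plus B (d_out x rank) parameters.
params = layers * adapted_per_layer * rank * (hidden + hidden)
size_mb = params * 2 / 1024**2  # 2 bytes per fp16 parameter

print(f"{params:,} params, ~{size_mb:.1f} MB")
```

Under these assumptions the adapter comes out just under 2 MB, the same ballpark as the ~1.5 MB reported above; the exact figure depends on which projections are adapted.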

Supported base models

| Model | Size | Notes |
| --- | --- | --- |
| Qwen/Qwen3-0.6B | ~1.2 GB | Smallest and quickest to train |
| Qwen/Qwen2.5-1.5B-Instruct | ~3.0 GB | Good default for instruction tuning |
| Qwen/Qwen3-1.7B | ~3.4 GB | Newer small Qwen3 model |
| Qwen/Qwen3-4B | ~8.0 GB | Best quality, better suited to macOS |

You can search for any of these from the Models tab with /.


Debug

Logs are written to ~/.cache/onde/debug.log.


License

Dual-licensed under MIT and Apache 2.0.

Copyright

© 2026 Onde Inference (Splitfire AB).