Installs the full prerequisite stack for NICo Core and NICo REST on a bare-metal Kubernetes cluster. Everything is orchestrated by a single script:

```bash
export NICO_IMAGE_REGISTRY=<nico-image-registry>   # unless using --skip-core --skip-rest
export NICO_CORE_IMAGE_TAG=<nico-core-image-tag>   # unless using --skip-core
export NICO_REST_IMAGE_TAG=<nico-rest-image-tag>   # unless using --skip-rest
# export REGISTRY_PULL_SECRET=<registry-pull-secret>   # optional; authenticated registries only
./setup.sh       # interactive - prompts before deploying Core and REST
./setup.sh -y    # non-interactive - deploys everything
```

For complete step-by-step deployment instructions, see the Quick Start Guide in the NICo documentation site. The Quick Start Guide covers:
- Building NICo containers
- Preparing the Kubernetes cluster
- Configuring the site (environment variables, values files, MetalLB, VIPs, preflight)
- Running `setup.sh`
- Connecting the OOB network
- Discovering your first host
- Verifying the deployment
For manual phase-by-phase installation (re-running individual phases, debugging failures), see the Reference Installation guide.

```text
helm-prereqs/
├── setup.sh                    # Main deployment script - runs all phases sequentially
├── preflight.sh                # Pre-flight validation (also run automatically by setup.sh)
├── clean.sh                    # Teardown script - removes everything in reverse order
├── unseal_vault.sh             # Vault init + unseal (called by setup.sh Phase 4)
├── bootstrap_ssh_host_key.sh   # SSH host key generation (called by setup.sh Phase 4)
├── helmfile.yaml               # Helmfile release definitions for all prerequisite components
├── Chart.yaml                  # nico-prereqs Helm chart metadata
├── values.yaml                 # Top-level values (siteName, PostgreSQL tuning)
├── values/
│   ├── nico-core.yaml          # NICo Core deployment values (hostname, siteConfig, VIPs)
│   ├── nico-rest.yaml          # NICo REST deployment values (Keycloak config)
│   ├── nico-site-agent.yaml    # Site-agent deployment values (DB config, gRPC settings)
│   └── metallb-config.yaml     # MetalLB IP pools, BGP peers, and advertisements
├── templates/                  # nico-prereqs Helm chart templates (PKI, ESO, PostgreSQL)
├── operators/                  # Raw manifests and operator values (local-path, MetalLB, cert-manager, Vault, ESO)
└── keycloak/                   # Dev Keycloak deployment and token helper scripts
```

Before running `setup.sh`, walk through these in order. Each step links to the config it edits.

1. Pick your IP plan. Carve out two CIDR blocks reachable from the provisioning network: an external pool for `nico-api` and an internal pool for `nico-dhcp`, `nico-dns`, `nico-pxe`, `nico-ntp`, `nico-ssh-console-rs`. Reserve specific VIPs from those blocks for each service plus one per `nico-ntp` / `nico-dns` replica. → `values/metallb-config.yaml` IPAddressPool blocks.
2. Wire MetalLB to your network. Set per-node `BGPPeer` ASNs / addresses (BGP mode), or switch to `L2Advertisement` for non-BGP environments. → `values/metallb-config.yaml` BGPPeer / BGPAdvertisement / L2Advertisement.
3. Fill in site identity. `siteName` (top-level) plus the TOML block under `nico-api.siteConfig.nicoApiSiteConfig`: sitename, initial_domain_name, site_fabric_prefixes, deny_prefixes, pools, networks. → `values.yaml` and `values/nico-core.yaml`.
4. Pin per-service VIPs into nico-core. Each chart's `externalService.annotations.metallb.universe.tf/loadBalancerIPs` (or `perPodAnnotations` for `nico-ntp` / `nico-dns`) must match a VIP from the pools you carved out in step 1. → `values/nico-core.yaml`.
5. Set the DHCP hook parameters. `nico-dhcp.config.kea.hookParameters` (`nameservers`, `ntpServer`, `provisioningServer`) tells DHCP clients where to find DNS / NTP / PXE. These must equal the VIPs you set in step 4. The chart default is `127.0.0.1`; leaving it there silently breaks DPU bring-up. → `values/nico-core.yaml`.
6. Decide how the `.forge` compatibility zone is served. Built-in unbound (enable in `values/nico-core.yaml`) or your external DNS. Required for existing DPUs that look up `carbide-api.forge`, `carbide-pxe.forge`, `carbide-ntp.forge`, etc. See DPU compatibility DNS below.
7. Export the runtime env vars (registry, image tags, optional pull secret); see Environment variables below.

Once the above is done, run ./setup.sh -y.
Detailed field-by-field instructions for each values file live in the Quick Start Guide — Step 3. The tables below summarize the keys that must be set per site.

Environment variables:

| Variable | Required | Description |
|---|---|---|
| `KUBECONFIG` | No | Path to your cluster kubeconfig. Optional when the current kubectl context already points at the target cluster. |
| `REGISTRY_PULL_SECRET` | No | Raw NGC API key or registry password (e.g. `nvapi-...`). This value is passed verbatim as the docker password; do not point it at a file path or a JSON dockerconfig. Leave unset for public, preloaded, or externally managed image pulls. |
| `REGISTRY_PULL_USERNAME` | No | Username for generated pull secrets. Defaults to `$oauthtoken` (correct for nvcr.io API-key auth). |
| `NICO_IMAGE_REGISTRY` | Yes, unless `--skip-core --skip-rest` | Base image registry for all NICo images (e.g. `my-registry.example.com/nico`) |
| `NICO_CORE_IMAGE_TAG` | Yes, unless `--skip-core` | NICo Core image tag (e.g. `v2025.12.30-rc1`) |
| `NICO_REST_IMAGE_TAG` | Yes, unless `--skip-rest` | NICo REST image tag (e.g. `v1.0.4`) |
| `NICO_SITE_UUID` | No | Stable UUID for this site. Defaults to `a1b2c3d4-e5f6-4000-8000-000000000001`. |
| `NICO_MANAGE_DEFAULT_STORAGE_CLASS` | No | Whether `setup.sh` marks `local-path` as the default StorageClass. Defaults to `true`. Set to `false` when the cluster already has an operator-managed default StorageClass. |
| `NICO_STORAGE_CLASS` | No | StorageClass used by Vault data/audit PVCs. Defaults to `local-path-persistent`. |
| `PREFLIGHT_CHECK_IMAGE` | No | Image used for preflight per-node checks. Defaults to `busybox:1.36`; set to a local mirror for air-gapped clusters. |

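
The "Required" column can be checked before `setup.sh` is invoked. The following is a minimal sketch, assuming bash; `missing_nico_vars` is a hypothetical helper written for illustration and is not part of the repository. It takes the same `--skip-core` / `--skip-rest` flags described for `setup.sh` and prints the required variables that are still unset.

```bash
# Print the names of required variables that are unset or empty, one per line.
# Arguments mirror the setup.sh flags: pass --skip-core and/or --skip-rest.
# Hypothetical helper; the real setup.sh may validate these differently.
missing_nico_vars() {
  local skip_core=0 skip_rest=0 arg name
  for arg in "$@"; do
    case $arg in
      --skip-core) skip_core=1 ;;
      --skip-rest) skip_rest=1 ;;
    esac
  done
  local required=()
  # The registry is needed unless both Core and REST are skipped.
  (( skip_core && skip_rest )) || required+=(NICO_IMAGE_REGISTRY)
  (( skip_core )) || required+=(NICO_CORE_IMAGE_TAG)
  (( skip_rest )) || required+=(NICO_REST_IMAGE_TAG)
  for name in "${required[@]}"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    [[ -n ${!name:-} ]] || echo "$name"
  done
}
```

An empty result from `missing_nico_vars --skip-rest`, for example, suggests the environment is ready for a Core-only run.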

Keys in `values.yaml`:

| Key | Default | Must change? | Description |
|---|---|---|---|
| `siteName` | `"TMP_SITE"` | Yes | Site identifier, injected into postgres pods as `TMP_SITE` |
| `imagePullSecrets.ngcNicoPull` | `""` | No (auto) | Pull secret for NICo Core images. Set automatically by `setup.sh` from `REGISTRY_PULL_SECRET` when provided. |
| `vault.nicoCliClientRole.enabled` | `false` | No | Create an optional Vault PKI role for short-lived NICo CLI client certificates. This only defines the certificate profile; issuance access must be granted separately. |
| `vault.nicoCliClientRole.name` | `"nico-cli-client"` | No | Vault role name and certificate Subject `OU` used to identify NICo CLI client certificates. |
| `vault.nicoCliClientRole.organization` | `""` | No | Optional certificate Subject `O` value for deployments that want an additional identity marker. |
| `postgresql.instances` | `3` | No | Number of PostgreSQL replicas |
| `postgresql.volumeSize` | `"10Gi"` | No | PVC size per PostgreSQL replica |
| `postgresql.storageClass` | `"local-path-persistent"` | No | StorageClass for the nico-prereqs PostgreSQL PVCs. Override through Helm values when using a non-local StorageClass. |


Keys in `values/nico-core.yaml`:

| Key | Default | Must change? | Description |
|---|---|---|---|
| `nico-api.hostname` | `"api-examplesite.example.com"` | Yes | External DNS name for the NICo Core API |
| `nico-api.externalService.annotations...loadBalancerIPs` | `"10.180.126.177"` | Yes | MetalLB VIP for `nico-api` (from external pool) |
| `siteConfig.sitename` | `"examplesite"` | Yes | Short site identifier (must match `siteName` in `values.yaml`) |
| `siteConfig.initial_domain_name` | `"examplesite.example.com"` | Yes | Base DNS domain for the site |
| `siteConfig.dhcp_servers` | `["10.180.126.160"]` | Yes | DHCP service VIP(s) from your MetalLB internal pool |
| `siteConfig.site_fabric_prefixes` | `["10.180.62.72/29"]` | Yes | CIDRs for site fabric (instance-to-instance traffic) |
| `siteConfig.deny_prefixes` | `["10.180.62.64/29", ...]` | Yes | CIDRs instances must not reach (OOB, mgmt, underlay) |
| `siteConfig.[pools.lo-ip]` ranges | `{ start = "10.180.62.84", end = "10.180.62.86" }` | Yes | Loopback IP range for bare-metal hosts |
| `siteConfig.[pools.vlan-id]` ranges | `{ start = "100", end = "501" }` | Yes | VLAN ID allocation range |
| `siteConfig.[pools.vni]` ranges | `{ start = "1024500", end = "1024800" }` | Yes | VXLAN Network Identifier range |
| `siteConfig.[networks.admin]` | example values | Yes | Admin/OOB network: `prefix` (CIDR), `gateway`, `mtu`, `reserve_first`. `prefix` and `gateway` must not be empty; `nico-api` crashes on startup if they are. |
| `siteConfig.[networks.<underlay>]` | `[networks.RNO1-M04-D04-IPMITOR-01]` | Yes | One block per underlay data-plane L3 segment: `type = "underlay"`, `prefix`, `gateway`, `mtu`, `reserve_first`. Rename the block to match your site segment name. Add additional blocks for each underlay segment. |
| `nico-api` / `nico-dhcp` / `nico-dns` / `nico-pxe` / `nico-ssh-console-rs` `.externalService.annotations.metallb.universe.tf/loadBalancerIPs` | example IPs | Yes | Single MetalLB VIP per service. Must be inside the matching IPAddressPool from `metallb-config.yaml` (external pool for `nico-api`, internal pool for the rest). |
| `nico-ntp.externalService.perPodAnnotations` | 3-element example list | Yes | `nico-ntp` is a StatefulSet: one MetalLB VIP per replica (3 by default). List entry `[0]` goes on the LB Service for pod `nico-ntp-0`, `[1]` on `nico-ntp-1`, etc. These three VIPs are what DPUs sync clocks against. |
| `nico-dhcp.config.kea.hookParameters.nameservers` | `"127.0.0.1"` (chart default) | Yes | IP(s) advertised to DHCP clients as their DNS resolver. Must be the `nico-dns` VIP (or whichever DNS the DPUs should use). Leaving the `127.0.0.1` chart default silently breaks DPU name resolution. |
| `nico-dhcp.config.kea.hookParameters.ntpServer` | `"127.0.0.1"` (chart default) | Yes | Comma-separated IPs advertised to DHCP clients as their NTP servers. Must match the three `nico-ntp.externalService.perPodAnnotations` VIPs. DPU pre-ingestion fails on clock divergence if this is left at the default. |
| `nico-dhcp.config.kea.hookParameters.provisioningServer` | `"127.0.0.1"` (chart default) | Yes | IP advertised as the PXE / provisioning server. Must be the `nico-pxe` VIP. |

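
Because a leftover `127.0.0.1` fails silently, it can help to scan the values file before deploying. The sketch below is an illustration only: `check_hook_params` is a hypothetical helper, and it relies on a plain-text match of the three key names, assuming they appear at most once in the file under `hookParameters`. A YAML-aware tool would be more robust.

```bash
# Report hookParameters keys that are missing or still set to 127.0.0.1.
# Returns 1 when at least one key needs attention, 0 otherwise.
# Hypothetical helper: matches key names textually, not by YAML path.
check_hook_params() {
  local file=$1 key line status=0
  for key in nameservers ntpServer provisioningServer; do
    # First line that defines the key; empty when the key is absent.
    line=$(grep -m1 -E "^[[:space:]]*${key}:" "$file" || true)
    if [[ -z $line ]]; then
      echo "$key: not set (chart default 127.0.0.1 applies)"
      status=1
    elif [[ $line == *127.0.0.1* ]]; then
      echo "$key: still 127.0.0.1"
      status=1
    fi
  done
  return "$status"
}
```

Running `check_hook_params values/nico-core.yaml` from `helm-prereqs/` would print one line per key that still needs a real VIP.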

Keys in `values/nico-rest.yaml`:

| Key | Default | Must change? | Description |
|---|---|---|---|
| `nico-rest-api.config.keycloak.enabled` | `true` | No | Use bundled dev Keycloak. Set `false` for BYO IdP. |
| `nico-rest-api.config.keycloak.baseURL` | `"http://keycloak.nico-rest:8082"` | For prod | Internal Keycloak URL. Change if using external Keycloak. |
| `nico-rest-api.config.keycloak.externalBaseURL` | `"http://keycloak.nico-rest:8082"` | For prod | External Keycloak URL returned in tokens |


Keys in `values/nico-site-agent.yaml`:

| Key | Default | Must change? | Description |
|---|---|---|---|
| `envConfig.DB_ADDR` | `"postgres.postgres.svc.cluster.local"` | For prod | PostgreSQL host address |
| `envConfig.DB_DATABASE` | `"elektratest"` | For prod | Database name |
| `envConfig.DEV_MODE` | `"true"` | For prod | Set to `"false"` in production |
| `envConfig.NICO_SEC_OPT` | `"2"` | No | Security mode: 0=insecure, 1=TLS, 2=mTLS (required) |
| `CLUSTER_ID` | - | No (auto) | Site UUID. Set automatically by `setup.sh` via `--set` from `NICO_SITE_UUID`. |
| `TEMPORAL_SUBSCRIBE_NAMESPACE` | - | No (auto) | Temporal namespace. Set automatically by `setup.sh` via `--set` from `NICO_SITE_UUID`. Must match `CLUSTER_ID`. |

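
Since `CLUSTER_ID` and `TEMPORAL_SUBSCRIBE_NAMESPACE` are both derived from `NICO_SITE_UUID`, a malformed value would propagate to both. The sketch below checks the format before a run. It assumes the value is expected in the canonical 8-4-4-4-12 hexadecimal form, which the documented default follows; `is_uuid` is a hypothetical helper, and the source does not state whether `setup.sh` enforces this itself.

```bash
# Return 0 when $1 looks like a canonical UUID (8-4-4-4-12 hex digits).
# Assumption: NICO_SITE_UUID should follow this form, as the default does.
is_uuid() {
  [[ $1 =~ ^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$ ]]
}
```

A guard such as `is_uuid "$NICO_SITE_UUID" || echo "NICO_SITE_UUID is not a UUID" >&2` could precede the call to `setup.sh` when the variable is set explicitly.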

Keys in `values/metallb-config.yaml`:

| Key | Default | Must change? | Description |
|---|---|---|---|
| `IPAddressPool (internal).spec.addresses` | `10.180.126.160/28` | Yes | Internal VIP CIDR for DHCP, DNS, PXE, SSH, NTP |
| `IPAddressPool (external).spec.addresses` | `10.180.126.176/28` | Yes | External VIP CIDR for `nico-api` |
| `BGPPeer[*].spec.myASN` | `4244766850` | Yes | Cluster-side ASN (same for all nodes) |
| `BGPPeer[*].spec.peerASN` | per-node | Yes | TOR router ASN (unique per node) |
| `BGPPeer[*].spec.peerAddress` | per-node | Yes | TOR switch IP reachable from each node |
| `BGPPeer[*].spec.nodeSelectors` | example hostnames | Yes | Actual node hostnames (`kubectl get nodes`) |
| Advertisement mode | BGP | For dev | For non-BGP environments: comment out `BGPPeer`/`BGPAdvertisement`, uncomment `L2Advertisement` |

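
Every VIP pinned in `values/nico-core.yaml` has to fall inside one of these pools. The containment test can be sketched in pure bash as below; `ip_to_int` and `ip_in_cidr` are hypothetical helpers for illustration, handle IPv4 only, and assume octets are written without leading zeros.

```bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Return 0 when the IPv4 address $1 falls inside the CIDR block $2.
ip_in_cidr() {
  local ip=$1 net=${2%/*} bits=${2#*/}
  # Build the netmask for the prefix length, truncated to 32 bits.
  local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  (( ($(ip_to_int "$ip") & mask) == ($(ip_to_int "$net") & mask) ))
}
```

With the example defaults, `ip_in_cidr 10.180.126.177 10.180.126.176/28` succeeds (the `nico-api` VIP sits in the external pool), while the same address tested against the internal pool `10.180.126.160/28` fails.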
setup.sh runs preflight validation automatically before making cluster changes.
It supports these common deployment modes:

| Option | Description |
|---|---|
| `-y` | Non-interactive mode; accept setup prompts automatically. |
| `--skip-core` | Install prerequisites and REST, but skip the NICo Core Helm release. |
| `--skip-rest` | Install prerequisites and Core, but skip all REST phases and REST repo checks. |
| `--skip-core --skip-rest` | Infrastructure-only run; image tags, image registry, and REST repo are not required. |
| `--core-values <file>` | Use site-specific Core values instead of `helm-prereqs/values/nico-core.yaml`. |
| `--metallb-config <path>` | Use a site-specific MetalLB manifest file or kustomize directory. |
| `--site-overlay <dir>` | Apply a site kustomize overlay after Core deploys. |
| `--debug` | Enable bash tracing. This can print secrets, so avoid it in shared logs. |

REGISTRY_PULL_SECRET is optional. When it is unset, setup does not create or
inject image pull secrets; images must be public, preloaded, or configured with
existing imagePullSecrets in values.

```text
local-path-provisioner   (raw manifest - StorageClasses for Vault + PostgreSQL PVCs)
metallb                  (metallb/metallb 0.14.5 - LoadBalancer IPs via BGP or L2)
postgres-operator        (zalando/postgres-operator 1.10.1 - manages nico-pg-cluster)
cert-manager             (jetstack/cert-manager v1.17.1)
vault                    (hashicorp/vault 0.25.0, 3-node HA Raft, TLS)
external-secrets         (external-secrets/external-secrets 0.14.3)
nico-prereqs             (this Helm chart - nico-system namespace)
NICo Core                (../helm - nico-core.yaml values)
├── nico-api             (Deployment - gRPC/REST API, requires PostgreSQL + Vault)
├── nico-bmc-proxy       (Deployment - authenticating Redfish proxy)
├── nico-dhcp            (Deployment - Kea DHCP, advertises hook params to DPUs)
├── nico-dns             (StatefulSet - authoritative DNS, per-pod LB VIPs)
├── nico-hardware-health (Deployment - hardware health collector)
├── nico-ntp             (StatefulSet - chrony, per-pod LB VIPs, on by default)
├── nico-pxe             (Deployment - HTTP PXE boot)
├── nico-ssh-console-rs  (Deployment - SSH console proxy)
└── unbound              (Deployment - .forge zone DNS, opt-in)
NICo REST                (infra-controller-rest/helm/charts/nico-rest)
├── nico-rest-ca-issuer  ClusterIssuer (cert-manager.io)
├── postgres             StatefulSet (temporal + keycloak + NICo databases)
├── keycloak             (dev OIDC IdP, nico-dev realm)
├── temporal             (temporal-helm/temporal, mTLS)
├── nico-rest            (API, cert-manager, workflow, site-manager)
└── nico-rest-site-agent (StatefulSet, bootstrap via site-manager)
```

Existing DPU agent binaries deployed in the field are hardcoded to resolve a
handful of legacy hostnames in the .forge zone:

| Hostname | Port | Used by | Points at |
|---|---|---|---|
| `carbide-api.forge` | 443 | DPU agents, CLI, PXE, DHCP; gRPC/TLS to NICo API | `nico-api` external VIP |
| `carbide-pxe.forge` | 80 | DPU agents (hardcoded in agent binary); HTTP boot artifacts | `nico-pxe` VIP |
| `carbide-static-pxe.forge` | 80 | Host PXE loader (hardcoded in boot images) | `nico-pxe` VIP |
| `carbide-ntp.forge` | 123 | DPU agents (hardcoded in agent binary); NTP/UDP | `nico-ntp` VIPs (one per replica) |
| `unbound.forge` | 53 | DPUs (distributed via DHCP option 6); DNS | unbound VIP |
| `otel-receiver.forge` | 443 | otel-collector sidecars; gRPC/TLS | otel receiver VIP |
| `socks.forge` | 1888 | DPU extension services (hardcoded in agent binary) | socks VIP |

Per the dual-deployment-compat POR, these names stay hardcoded in the binary for now. The deployment is responsible for resolving them, either with the built-in unbound or with your existing DNS. To use the built-in unbound:

1. In `values/nico-core.yaml`, enable the `unbound` block and uncomment the `localData:` example. Each entry takes a `name` and an `addresses` list; fill the addresses with the VIPs you've already assigned to the corresponding service above (those live in the same file under each chart's `externalService.annotations.metallb.universe.tf/loadBalancerIPs`).
2. Assign a MetalLB VIP to unbound itself (so DPUs can reach it via DHCP option 6). Add it as another `externalService` entry the same way.
3. Re-run `setup.sh`. The chart deploys unbound with the `.forge` zone pre-populated; DPUs reach it via DHCP-served DNS.
4. Verify with `helm-prereqs/health-check.sh`; the `.forge DNS Endpoint Reference` section reports per-record status.

If your site already has DNS infrastructure for the OOB management network,
serve the .forge zone there. Point each hostname at the corresponding
MetalLB VIP in values/nico-core.yaml. The cluster has no opinion on which
DNS server provides the records; only that the DPUs can resolve them.
Without one of these in place, DPU bring-up will hang on PXE / NTP / API lookups even though every cluster-side helm chart shows healthy.
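
Whichever DNS server carries the zone, resolution can be spot-checked from a machine on the OOB network. The sketch below assumes `dig` is installed; `resolve`, `check_forge_names`, and the `FORGE_NAMES` list are hypothetical names for illustration, and the list covers only the API, PXE, and NTP names, so extend it for the other records your site serves.

```bash
# Subset of the legacy names that DPUs must be able to resolve.
FORGE_NAMES=(carbide-api.forge carbide-pxe.forge carbide-static-pxe.forge carbide-ntp.forge)

# Resolve name $1 against DNS server $2 and print only IPv4 answers.
# Filtering drops CNAME targets and any dig diagnostics such as timeouts.
resolve() {
  dig +short "$1" A "@$2" 2>/dev/null | grep -E '^[0-9]+(\.[0-9]+){3}$' || true
}

# Report every legacy name that returns no A record from server $1.
# Returns 1 when any name is unresolved, 0 otherwise.
check_forge_names() {
  local server=$1 name status=0
  for name in "${FORGE_NAMES[@]}"; do
    if [[ -z $(resolve "$name" "$server") ]]; then
      echo "UNRESOLVED $name"
      status=1
    fi
  done
  return "$status"
}
```

Call it with the address of the DNS server that DHCP hands to the DPUs, for example `check_forge_names <dns-server-ip>`. This only confirms that records exist, not that they point at the right VIPs.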
Once DNS resolves carbide-api.forge to the nico-api VIP, the TLS handshake
still has to validate the server cert against that hostname. The chart's
default cert SAN list only covers nico-api.<release-ns>.svc.cluster.local
and the short DNS name — connections to carbide-api.forge would fail TLS
verification. To accept the legacy hostnames, add them to
certificate.extraDnsNames for each affected chart in
values/nico-core.yaml:

| Chart | Required `extraDnsNames` |
|---|---|
| `nico-api` | `carbide-api.forge`, `carbide-api.forge-system.svc.cluster.local`, plus the external hostname clients use (matches `nico-api.hostname`) |
| `nico-pxe` | `carbide-pxe.forge`, `carbide-static-pxe.forge`, `carbide-pxe.forge-system.svc.cluster.local` |

The example values/nico-core.yaml in this directory has these entries
pre-populated under each chart's certificate.extraDnsNames block. They're
issued by vault-nico-issuer (set up by nico-prereqs in Phase 5) and
rotated on the usual cert-manager schedule.
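
To confirm that an issued certificate actually carries a legacy name, the served SAN list can be inspected with `openssl`. This is a sketch under assumptions: it needs OpenSSL 1.1.1 or newer for `x509 -ext`, and `fetch_sans` and `san_covers` are hypothetical helpers rather than anything shipped in this directory.

```bash
# Print the subjectAltName extension of the certificate served at $1:$2,
# presenting $3 as the SNI name. Requires network access to the endpoint.
fetch_sans() {
  openssl s_client -connect "$1:$2" -servername "$3" </dev/null 2>/dev/null \
    | openssl x509 -noout -ext subjectAltName
}

# Return 0 when the SAN text in $1 contains an exact DNS entry for name $2.
# Entries are comma-separated ("DNS:a, DNS:b"), so split and match whole lines.
san_covers() {
  tr ',' '\n' <<< "$1" | sed 's/^[[:space:]]*//' | grep -qxF "DNS:$2"
}
```

For example, `san_covers "$(fetch_sans <nico-api-vip> 443 carbide-api.forge)" carbide-api.forge` succeeds only when the certificate lists that hostname; an exact match is used so that a longer name containing it does not pass by accident.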
If you're migrating from an existing forged-kustomize site and want the
DPUs already in the field (which have certs in the forge.local trust
domain) to keep authenticating, also override
global.spiffe.trustDomain to forge.local in your values. See the
dual-deployment-compat POR
for the in-place upgrade caveats.
After setup completes, run the read-only health check from the repo root:

```bash
helm-prereqs/health-check.sh
```

The script auto-detects the Core, Vault, Postgres, cert-manager, External Secrets, and MetalLB namespaces. Override namespace detection if your deployment uses non-default namespaces:

```bash
NICO_NS=nico-system \
VAULT_NS=vault \
POSTGRES_NS=postgres \
CERT_MANAGER_NS=cert-manager \
ESO_NS=external-secrets \
METALLB_NS=metallb-system \
helm-prereqs/health-check.sh
```

It checks component readiness, Vault and PostgreSQL health, required secrets and certificates, External Secrets sync status, LoadBalancer VIP assignment, and basic in-cluster connectivity. Failures exit non-zero; warnings and skipped probes are reported without failing the run.

To tear down the deployment, run the teardown script:

```bash
./clean.sh
```

Removes all components in reverse dependency order: NICo REST → NICo Core → helmfile releases → CRDs → namespaces → PVs → local-path-provisioner.