PCP-6874: Feat: LXD workload cluster / Host Control Plane cluster / maintenance mode #354

Merged
AmitSahastra merged 60 commits into main from feat/lxd-hcp on Jun 19, 2026
Conversation

@Kun483 Kun483 commented Jun 9, 2026

Collaborator

What this PR does / why we need it

  1. Cherry-picked the LXD (DHCP) / HCP experimental features
  2. Fixed make docker-push-all
  3. Removed the spectro folder
  4. Added WLC and HCP cluster templates
  5. Made the cluster role configurable

Type of change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that changes existing behavior)
  • Docs / chore / CI only

Checklist

  • I have read the CONTRIBUTING guide.
  • make test passes locally.
  • make lint passes locally.
  • make verify passes locally (generated code and formatting are committed).
  • I have added/updated tests where appropriate (table-driven, deterministic).
  • I have updated docs / manifests where appropriate.

Special notes for reviewers

guyni and others added 30 commits June 4, 2026 11:10
* Added InternalIP to Machine's status.addresses

* Added InternalDNS to Machine's status.addresses

* support hypershift nodes with FQDN

---------

Co-authored-by: jzhoucliqr <zhoujun06@gmail.com>
)

* PCP-5029 [Palette] [CAPMaas] Support for the Hosted control plane.

* Minor changes (#180)

* Pcp 5029 2 (#181)

* Minor changes

* Refactored IsLXDHostEnabled
)

* Add support for LXD server and VM composer

* Vmhost and lxd initializer daemon

* Fix MAAS client identity to use secret or environment variables and add LXD documentation

* Fix make generate command to exclude lxd-initializer directory

* Update generate-manifests to include controllers directory for RBAC generation

* LXD Support for Controlplane Cluster:

- Support for LXD init via daemonset
- Launch daemonset from capmaas-controller
- Register CP node as LXD host from capmaas-controller

* Fix maas profile creation

* Code refactoring, commented out redundant code for now

* Workload cluster and VM creation fix. Code refactoring

* Update changes

* Code refactor, removed infraclusterref

* PCP-5043: remove accidental .netrc and ignore it

* Removed WorkloadClusterConfigRef from maas cluster. Refactor code.

* Update pkg/maas/machine/machine.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update pkg/maas/machine/machine.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Add support for LXD server and VM composer

* Vmhost and lxd initializer daemon

* Fix MAAS client identity to use secret or environment variables and add LXD documentation

* Fix make generate command to exclude lxd-initializer directory

* Update generate-manifests to include controllers directory for RBAC generation

* LXD Support for Controlplane Cluster:

- Support for LXD init via daemonset
- Launch daemonset from capmaas-controller
- Register CP node as LXD host from capmaas-controller

* Fix maas profile creation

* Code refactoring, commented out redundant code for now

* Workload cluster and VM creation fix. Code refactoring

* Update changes

* Code refactor, removed infraclusterref

* PCP-5043: remove accidental .netrc and ignore it

* Removed WorkloadClusterConfigRef from maas cluster. Refactor code.

* core global template

* Fix vm compose flow. Updated maas-client-go version

* Update machine.go

* Fixed pr comment and constants for min resource

* Minor comment fix
* PCP-5088: Storage input configuration

* PCP-5088: Storage input configuration

* PCP-5088: Storage input configuration

* Update pkg/maas/machine/machine.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
… bootstrap cluster (#189)

* resolved issue that lxd-initializer daemonset is running in bootstrap cluster

* keep one APIServerReadinessLabel
* PCP-5124: HCP cluster deletion, Deregister KVM Host

* Code refactor added new util functions (#193)
* auto detect resource pool, zone, and storage

* restore env var and make build image to use go 1.24

* change to use maas-client-go v0.0.4-beta1

* exclude username for lxd initializer image

* adopt image in daemonset yaml

* Update lxd-initializer/lxd-initializer.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…uster; new LXD VM created on a KVM/LXD host belonging to a different MAAS host-pooling cluster (#215)
* add resourcePool as filter when composing lxd vm

* error out when selecting default resource pool
…ne cluster (#228)

* PCP-5178: Release LXD VM during the upgrade of the hosted control plane cluster
* PCP-5152: Multiple node HCP cluster losing interface reference

* Additional changes (#225)

* Additional changes

* trust password and code cleanup (#226)

* Additional changes

* Add gate to registration logic, separate it into the initializer, and add a grace period. Code cleanup.

* Add gate to registration logic, separate it into the initializer, and add a grace period. Code cleanup. (#234)

* Additional changes

* Add gate to registration logic, separate it into the initializer, and add a grace period. Code cleanup.

* - Gosec fix
- Cleanup daemonset once all host are registered

* Additional changes

* Use hostname for lxd registration. Code refactoring. (#240)

* Use hostname for lxd registration. Code refactoring.

* merge conflict
…#253)

* added number of interfaces check before releasing IP

* If IP is already allocated and interface is referenced, we should avoid releasing such IP
Kun483 and others added 5 commits June 8, 2026 21:01
* PCP-6344: Maas HCP cluster upgrade is stuck

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…nk_subnet (#318)

* Fix wrong subnet in use existing VM flow

* feat: add AUTO link mode and LinkSubnetWithMode for MAAS link_subnet

* fix unit tests

* change to use new maas client go

* fix pkg/maas/lxd/host_maas_client_test.go

* add unit test for resolveLinkMode

---------

Co-authored-by: kun zhou <kun.zhou@spectrocloud.com>
Co-authored-by: Kun Zhou <156021375+Kun483@users.noreply.github.com>
@Kun483 Kun483 requested a review from AmitSahastra June 9, 2026 20:04
@Kun483 Kun483 changed the title Feat/lxd hcp Feat: LXD workload cluster / Host Control Plane cluster / maintenance mode Jun 9, 2026
cPu1 previously approved these changes Jun 9, 2026
@Kun483 Kun483 changed the title Feat: LXD workload cluster / Host Control Plane cluster / maintenance mode PCP-6874: Feat: LXD workload cluster / Host Control Plane cluster / maintenance mode Jun 11, 2026
Comment thread Makefile
@AmitSahastra

Contributor

@Kun483 This could be a blocking issue for clusterctl: --cluster-role is global, not per-cluster.

This is the core Palette assumption that breaks. --cluster-role is a single process-wide flag
(main.go:7254/7266) that gates whether HMC (hcp) or VEC (wlc) is registered. In Palette's
self-hosted/pivoted model each cluster runs its own controller, so per-cluster roles work. With
clusterctl there is one CAPMAAS provider instance in the management cluster managing both the HCP and
WLC MaasCluster objects — and it can only be hcp or wlc, never both.

So in a standard single-management-cluster clusterctl setup you cannot get HMC (HCP maintenance) and
VEC (WLC evacuation) simultaneously. The guide's own troubleshooting line ("make sure the controller
for each cluster is started with the correct --cluster-role", HCP_WLC_GUIDE.md:4323) assumes a
topology clusterctl doesn't produce. clusterctl won't install the same provider twice.

Recommendation: register HMC and VEC unconditionally and have each controller filter the objects it
acts on (e.g. HMC acts on MaasCluster with lxdConfig.enabled, VEC on machines with spec.lxd.enabled),
using predicates in SetupWithManager. Keep --cluster-role only as an optional opt-out. That makes
one management cluster serve mixed HCP+WLC fleets — the actual clusterctl shape.
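
A rough sketch of the filtering idea in the recommendation above, assuming nothing beyond what this comment states. It deliberately avoids controller-runtime so that it runs with the standard library alone; in the real controllers the two `...Handles` functions would be wrapped as predicates in SetupWithManager. The types `MaasCluster` and `MaasMachine` here are hypothetical, trimmed-down stand-ins, and the names `hmcHandles`, `vecHandles`, and `filter` are made up for illustration.

```go
package main

import "fmt"

// Hypothetical stand-ins for the CAPMAAS API types. Only the two
// fields named in the review comment are modeled.
type MaasCluster struct {
	Name       string
	LXDEnabled bool // stands in for lxdConfig.enabled
}

type MaasMachine struct {
	Name       string
	LXDEnabled bool // stands in for spec.lxd.enabled
}

// hmcHandles reports whether the HMC (HCP maintenance) controller
// should act on cluster c.
func hmcHandles(c MaasCluster) bool { return c.LXDEnabled }

// vecHandles reports whether the VEC (WLC evacuation) controller
// should act on machine m.
func vecHandles(m MaasMachine) bool { return m.LXDEnabled }

// filter returns the items accepted by keep, preserving order.
func filter[T any](items []T, keep func(T) bool) []T {
	var out []T
	for _, it := range items {
		if keep(it) {
			out = append(out, it)
		}
	}
	return out
}

func main() {
	// Both controllers are "registered"; each one only sees the
	// objects its own predicate accepts.
	clusters := []MaasCluster{{"hcp-a", true}, {"plain-b", false}}
	machines := []MaasMachine{{"wlc-vm-1", true}, {"bare-metal-2", false}}

	for _, c := range filter(clusters, hmcHandles) {
		fmt.Println("HMC reconciles cluster", c.Name)
	}
	for _, m := range filter(machines, vecHandles) {
		fmt.Println("VEC reconciles machine", m.Name)
	}
}
```

With object-level filtering like this, a single provider instance can serve both kinds of objects, and --cluster-role is only needed to switch one of the controllers off.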

@AmitSahastra

Contributor

The MAAS credential secret is undocumented for clusterctl. Everything reads
capmaas-manager-bootstrap-credentials (keys MAAS_ENDPOINT/MAAS_API_KEY). In Palette the bootstrapper
creates it; with clusterctl the user must create it manually, and the guide never says so. Add a
"create the MAAS credentials secret / set MAAS_ENDPOINT+MAAS_API_KEY" prerequisite step.
(GetMaasClientIdentity also falls back only to MAAS_API_URL/MAAS_API_TOKEN, inconsistent with
GetMAASCredentials which accepts both pairs — worth unifying.)
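
One possible shape for the unification suggested in the parenthetical above, as a sketch only. The helper names `resolveCredentials` and `firstNonEmpty` and the `Credentials` type are hypothetical and do not come from the codebase. The precedence (MAAS_ENDPOINT/MAAS_API_KEY first, then MAAS_API_URL/MAAS_API_TOKEN) is an assumption, as is allowing the endpoint and key to come from different pairs.

```go
package main

import (
	"errors"
	"fmt"
)

// Credentials holds the MAAS endpoint and API key (hypothetical type).
type Credentials struct {
	Endpoint string
	APIKey   string
}

// firstNonEmpty returns the first non-empty value found by lookup
// among the given variable names, or "" if none is set.
func firstNonEmpty(lookup func(string) string, names ...string) string {
	for _, n := range names {
		if v := lookup(n); v != "" {
			return v
		}
	}
	return ""
}

// resolveCredentials is a hypothetical shared helper that both
// GetMaasClientIdentity and GetMAASCredentials could call so that
// they accept the same variable names. The precedence order is an
// assumption, not taken from the source.
func resolveCredentials(lookup func(string) string) (Credentials, error) {
	c := Credentials{
		Endpoint: firstNonEmpty(lookup, "MAAS_ENDPOINT", "MAAS_API_URL"),
		APIKey:   firstNonEmpty(lookup, "MAAS_API_KEY", "MAAS_API_TOKEN"),
	}
	if c.Endpoint == "" || c.APIKey == "" {
		return Credentials{}, errors.New("MAAS endpoint or API key is not set")
	}
	return c, nil
}

func main() {
	// Placeholder values; a real caller would pass os.Getenv as lookup.
	env := map[string]string{
		"MAAS_API_URL": "http://maas.example:5240/MAAS",
		"MAAS_API_KEY": "example-api-key",
	}
	creds, err := resolveCredentials(func(k string) string { return env[k] })
	fmt.Println(creds.Endpoint, creds.APIKey, err)
}
```

Taking the lookup function as a parameter keeps the helper testable without touching the process environment.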

@Kun483

Kun483 commented Jun 18, 2026

Collaborator Author

MAAS credential secret undocumented for clusterctl. Everything reads capmaas-manager-bootstrap-credentials (keys MAAS_ENDPOINT/MAAS_API_KEY). In Palette the bootstrapper creates it; with clusterctl the user must create it manually, and the guide never says so. Add a "create the MAAS credentials secret / set MAAS_ENDPOINT+MAAS_API_KEY" prerequisite step. (GetMaasClientIdentity also falls back only to MAAS_API_URL/MAAS_API_TOKEN, inconsistent with GetMAASCredentials which accepts both pairs — worth unifying.)

@AmitSahastra, the secret is not something the user creates manually. It ships as part of the provider's infrastructure-components.yaml, sourced from config/default/credentials.yaml. So the real flow is:

  1. User sets MAAS_ENDPOINT / MAAS_API_KEY in ~/.cluster-api/clusterctl.yaml (or env).
  2. clusterctl init --infrastructure maas substitutes them and creates capmaas-manager-bootstrap-credentials in
    capmaas-system automatically.
  3. The controller Deployment pulls those into its env via secretKeyRef (confirmed in the rendered components).
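
To illustrate the substitution in step 2, here is a minimal sketch that uses Go's os.Expand as a stand-in; it is not clusterctl's own implementation. The template is a hypothetical approximation of config/default/credentials.yaml (the real file's layout is not shown in this thread), and the values passed in main are placeholders.

```go
package main

import (
	"fmt"
	"os"
)

// secretTemplate is a hypothetical stand-in for the credentials
// manifest; only the secret name, namespace, and key names come
// from this thread.
const secretTemplate = `apiVersion: v1
kind: Secret
metadata:
  name: capmaas-manager-bootstrap-credentials
  namespace: capmaas-system
stringData:
  MAAS_ENDPOINT: ${MAAS_ENDPOINT}
  MAAS_API_KEY: ${MAAS_API_KEY}
`

// render replaces ${NAME} placeholders in tmpl with values from vars.
// Names missing from vars expand to the empty string.
func render(tmpl string, vars map[string]string) string {
	return os.Expand(tmpl, func(name string) string { return vars[name] })
}

func main() {
	// Placeholder values standing in for what the user would set in
	// ~/.cluster-api/clusterctl.yaml or the environment.
	fmt.Print(render(secretTemplate, map[string]string{
		"MAAS_ENDPOINT": "http://maas.example:5240/MAAS",
		"MAAS_API_KEY":  "example-api-key",
	}))
}
```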

Kun483 added 2 commits June 18, 2026 12:29
… README to clarify MAAS credentials handling and added new cluster templates for HCP and LXD. Refactored MAAS client identity retrieval in the codebase for consistency.
@AmitSahastra AmitSahastra merged commit e963885 into main Jun 19, 2026
4 checks passed
@AmitSahastra AmitSahastra deleted the feat/lxd-hcp branch June 19, 2026 06:30
6 participants