Kubernetes operator that auto-assigns public IPs via DHCP and syncs them with Cilium LoadBalancer pools.

serialx/cilium-dhcp-wanip-operator

Cilium DHCP WAN IP Operator

A Kubernetes operator that dynamically allocates public IP addresses from your ISP via DHCP and integrates them with Cilium's LoadBalancer IP pools. Perfect for home labs and edge deployments where you want to expose services with multiple public IPs without static IP assignments.

Description

This operator bridges the gap between ISP-provided DHCP addresses and Kubernetes LoadBalancer services. It:

  1. Allocates Public IPs: SSHes into your router (UDM-Pro, pfSense, etc.) to create macvlan interfaces and obtain DHCP leases
  2. Updates Cilium Pools: Automatically adds allocated IPs to CiliumLoadBalancerIPPool resources
  3. Manages Lifecycle: Handles cleanup when IPs are released, including stopping DHCP daemons and removing router interfaces
  4. Integrates with BGP: Works with Cilium BGP to advertise routes dynamically (no static routes needed)

It suits homelabs that have only a few public IPs but still want proper LoadBalancer support for services such as Ingress controllers, game servers, or VPN endpoints.

How It Works

┌─────────────┐     SSH      ┌──────────────┐
│  Operator   │─────────────>│    Router    │
│   (K8s)     │              │  (UDM/etc)   │
└─────────────┘              └──────────────┘
      │                             │
      │ 1. Create macvlan           │ 2. DHCP lease
      │    interface                │    from ISP
      │                             │
      │ 3. Configure                │ 4. Proxy ARP
      │    proxy ARP                │    enabled
      │                             │
      │ 5. Add IP to Pool           │
      v                             v
┌─────────────┐              ┌──────────────┐
│   Cilium    │<─── BGP ────>│  WAN/ISP     │
│  IP Pool    │              │  Network     │
└─────────────┘              └──────────────┘

Traffic Flow: Internet → Router WAN (proxy ARP) → Router BGP table → K8s via LAN → Cilium LoadBalancer → Service
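
The router-side steps in the diagram roughly correspond to a handful of commands. The sketch below only prints them (a dry run), so the sequence can be reviewed without touching a router. It is not the project's alloc_public_ip.sh: the function name plan_alloc, the use of the busybox udhcpc client, the pidfile path, and enabling proxy_arp on the new interface are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical dry run: print the router-side commands for one claim.
# Arguments: <macvlan-name> <wan-parent> <mac-address>
plan_alloc() {
  iface=$1 parent=$2 mac=$3
  # 1. Create a macvlan child of the WAN interface with its own MAC.
  echo "ip link add link $parent name $iface address $mac type macvlan mode bridge"
  echo "ip link set $iface up"
  # 2. Request a DHCP lease from the ISP on the new interface
  #    (udhcpc is the busybox client; your router may ship a different one).
  echo "udhcpc -i $iface -b -p /run/udhcpc.$iface.pid"
  # 3./4. Let the router answer ARP for the leased address
  #       (which interface needs this is an assumption here).
  echo "sysctl -w net.ipv4.conf.$iface.proxy_arp=1"
}

plan_alloc wan-game eth9 02:aa:bb:cc:dd:01
```

Step 5 (adding the IP to the pool) happens on the Kubernetes side and is therefore not part of this router-side sketch.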

Quick Start

Prerequisites

Infrastructure:

  • Kubernetes v1.16+ cluster with Cilium installed
  • Cilium BGP configured and peering with your router
  • Router with SSH access (UDM-Pro, pfSense, Linux-based routers)

Development (optional - only needed if building from source):

  • Go 1.24.0+
  • Docker 17.03+
  • kubectl 1.11.3+

Installation

Option A: Quick Install (Recommended - Uses pre-built images)

Deploy the operator directly from the release manifest:

kubectl apply -f https://raw.githubusercontent.com/serialx/cilium-dhcp-wanip-operator/v0.2.0/dist/install.yaml

This will:

  • Create the cilium-dhcp-wanip-operator-system namespace
  • Install the PublicIPClaim CRD
  • Deploy the operator controller with image ghcr.io/serialx/cilium-dhcp-wanip-operator:v0.2.0
  • Set up necessary RBAC permissions

Verify the installation:

kubectl get pods -n cilium-dhcp-wanip-operator-system
# You should see the controller manager pod running

Option B: Build and Deploy from Source

1. Install the router script

Copy the allocation script to your router:

scp config/samples/router-script-example.sh <user>@<router-host>:/data/cilium-dhcp-wanip-operator/alloc_public_ip.sh
# The /data path is persistent on UDM Pro routers running UniFi OS 2+
ssh <user>@<router-host> "chmod +x /data/cilium-dhcp-wanip-operator/alloc_public_ip.sh"

2. Create SSH secret

ssh-keygen -t ed25519 -f ssh_id_m2m_router -N "" -C "cilium-dhcp-wanip-operator ssh key"
kubectl -n kube-system create secret generic router-ssh \
  --from-file=id_rsa=ssh_id_m2m_router

3. Install CRDs

make install

4. Deploy operator

export IMG=<your-registry>/cilium-dhcp-wanip-operator:latest

# Multi-arch build and push (recommended; supports AMD64, ARM64, s390x, ppc64le)
make docker-buildx IMG=$IMG

# Alternatively, a single-arch build (faster; builds for your host platform only)
make docker-build docker-push IMG=$IMG

# Deploy to cluster
make deploy IMG=$IMG

Multi-arch build options:

# Build for specific platforms only
make docker-buildx IMG=$IMG PLATFORMS=linux/amd64,linux/arm64

# Default platforms: linux/arm64,linux/amd64,linux/s390x,linux/ppc64le

5. Create a Cilium IP Pool

kubectl apply -f config/samples/cilium-ippool-example.yaml

6. Create a PublicIPClaim

apiVersion: network.serialx.net/v1alpha1
kind: PublicIPClaim
metadata:
  name: ip-wan-001
spec:
  poolName: public-pool
  router:
    host: 192.168.1.1
    user: root
    sshSecretRef: router-ssh
    command: /data/cilium-dhcp-wanip-operator/alloc_public_ip.sh
    wanParent: eth9  # Your router's WAN interface

Apply the sample claim:

kubectl apply -f config/samples/network_v1alpha1_publicipclaim.yaml

7. Verify

kubectl get publicipclaims
# NAME         POOL          IP               PHASE   AGE
# ip-wan-001   public-pool   203.0.113.45     Ready   1m
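
For scripting, the table printed by kubectl can be filtered for claims that have reached the Ready phase. This is a small sketch that assumes the column order shown in the sample output (NAME, POOL, IP, PHASE, AGE); ready_ips is a hypothetical helper name, not something the project ships.

```shell
#!/bin/sh
# Print "<name> <ip>" for every claim whose PHASE column is Ready.
# Reads `kubectl get publicipclaims` output on stdin and assumes the
# NAME POOL IP PHASE AGE column order shown in the sample output.
ready_ips() {
  awk 'NR > 1 && $4 == "Ready" { print $1, $3 }'
}

# Usage against a live cluster:
#   kubectl get publicipclaims | ready_ips
```

Claims without an IP yet have one column fewer, so their fourth field is the age rather than the phase and they are skipped.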

📚 See DEPLOYMENT.md for detailed deployment instructions.

Key Features

  • ✅ Automatic IP Allocation: Creates macvlan interfaces and obtains DHCP leases via SSH
  • ✅ Cilium Integration: Updates CiliumLoadBalancerIPPool with allocated IPs
  • ✅ BGP-Ready: Works with Cilium BGP for dynamic route advertisement
  • ✅ Proxy ARP: Configures router to answer ARP for allocated IPs
  • ✅ Auto-Cleanup: Finalizers ensure proper cleanup on deletion
  • ✅ MAC Generation: Auto-generates unique MAC addresses for each claim
  • ✅ API Version Detection: Supports both Cilium v2 and v2alpha1 APIs
  • ✅ Status Tracking: Full status reporting with phase, IP, interface, and MAC
  • ✅ Automatic Reboot Recovery: Detects router reboots and automatically restores configuration (~40s)
  • ✅ SSH Connection Pooling: Efficient connection management with automatic reconnection
  • ✅ Periodic Verification: Validates router state every 60 minutes to detect configuration drift
  • ✅ Event-Driven Reconciliation: Reacts immediately to connection drops and router state changes
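
To illustrate the MAC Generation feature: one way to get a stable, unique MAC per claim is to hash the claim name and fix the first octet to 02, which marks the address as locally administered and unicast. The README does not say how the operator derives its addresses, so the hashing scheme and the claim_mac name below are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch: derive a deterministic MAC from a claim name.
# The first octet is fixed to 02 (locally administered, unicast);
# the remaining five octets are the first 10 hex digits of a SHA-256
# of the name, split into colon-separated pairs.
claim_mac() {
  printf '02%s\n' "$(printf '%s' "$1" | sha256sum | cut -c1-10 | sed 's/../:&/g')"
}

claim_mac ip-wan-001
```

Because the result depends only on the name, recreating a claim with the same name would request a lease with the same MAC, which tends to help an ISP's DHCP server hand back the same address.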

Examples

Basic Claim

apiVersion: network.serialx.net/v1alpha1
kind: PublicIPClaim
metadata:
  name: ingress-ip
spec:
  poolName: public-pool
  router:
    host: 192.168.1.1
    user: root
    sshSecretRef: router-ssh
    wanParent: eth9

Claim with Custom Interface and MAC

apiVersion: network.serialx.net/v1alpha1
kind: PublicIPClaim
metadata:
  name: game-server-ip
spec:
  poolName: game-pool
  router:
    host: 192.168.1.1
    port: 22
    user: admin
    sshSecretRef: router-ssh
    command: /usr/local/bin/alloc_public_ip.sh
    wanParent: eth9
    wanInterface: wan-game
    macAddress: "02:aa:bb:cc:dd:01"

Architecture Details

See SPEC.md for complete architecture documentation including:

  • Router script implementation (proxy ARP + Cilium BGP)
  • CRD schema and validation
  • Controller reconciliation logic
  • Finalizer cleanup process
  • Networking details (rp_filter, BGP routing, etc.)

Development

Run locally:

make run

Build binary:

make build

Build Docker images:

# Single-arch (for your host platform)
make docker-build IMG=<your-image>

# Multi-arch (cross-platform)
make docker-buildx IMG=<your-image>

# Multi-arch with custom platforms
make docker-buildx IMG=<your-image> PLATFORMS=linux/amd64,linux/arm64

Run tests:

make test

Generate manifests:

make manifests generate

Releasing New Versions

When you're ready to release a new version:

1. Build and push the Docker image

The GitHub Actions workflow automatically builds and pushes images when you create a tag:

git tag -a v0.2.0 -m "Release v0.2.0 - Description of changes"
git push origin v0.2.0

This triggers CI to build multi-platform images and push them to ghcr.io/serialx/cilium-dhcp-wanip-operator:v0.2.0

2. Generate the installer manifest

After the CI completes, update the installer manifest with the new image:

make build-installer IMG=ghcr.io/serialx/cilium-dhcp-wanip-operator:v0.2.0

This updates dist/install.yaml with the new image tag.
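
Before committing, it can help to confirm that the regenerated manifest really references the image you just tagged. The check below is a minimal sketch that assumes the manifest contains a plain, unquoted image: line, as generated Deployments usually do; check_installer_image is a hypothetical helper, not a make target.

```shell
#!/bin/sh
# Succeeds only if <manifest> has an "image:" entry equal to <image>.
# Arguments: <manifest-path> <image-reference>
check_installer_image() {
  awk -v want="$2" '
    $1 == "image:" && $2 == want { found = 1 }                 # "image: ref"
    $1 == "-" && $2 == "image:" && $3 == want { found = 1 }    # "- image: ref"
    END { exit !found }
  ' "$1"
}

# Usage from the repository root:
#   check_installer_image dist/install.yaml \
#     ghcr.io/serialx/cilium-dhcp-wanip-operator:v0.2.0 && echo "manifest matches tag"
```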

3. Commit and push the installer

git add dist/install.yaml config/manager/kustomization.yaml
git commit -m "chore: update installer manifest for v0.2.0"
git push origin main

4. Create a GitHub Release

gh release create v0.2.0 \
  --title "v0.2.0" \
  --notes "Release notes here" \
  dist/install.yaml

Or create it manually in the GitHub UI and attach dist/install.yaml.

Users can then install the new version:

kubectl apply -f https://raw.githubusercontent.com/serialx/cilium-dhcp-wanip-operator/v0.2.0/dist/install.yaml

Uninstall

If installed via Quick Install (Option A):

# Delete all claims first to ensure proper cleanup
kubectl delete publicipclaims --all

# Remove the operator
kubectl delete -f https://raw.githubusercontent.com/serialx/cilium-dhcp-wanip-operator/v0.2.0/dist/install.yaml

If installed from source (Option B):

# Delete all claims
kubectl delete publicipclaims --all

# Undeploy operator
make undeploy

# Remove CRDs
make uninstall

Troubleshooting

Check operator logs:

kubectl -n cilium-dhcp-wanip-operator-system logs deployment/cilium-dhcp-wanip-operator-controller-manager

Check claim status:

kubectl describe publicipclaim <name>

Common issues:

  • SSH authentication fails → Check SSH key in secret
  • DHCP fails → Verify wanParent interface name
  • IP not added to pool → Check RBAC permissions for Cilium resources

Resilience & Recovery

Automatic Reboot Recovery (v0.2.0+)

The operator automatically detects and recovers from router reboots with no manual intervention required:

How It Works:

  • SSH Connection Monitoring: Maintains persistent SSH connections to routers with keep-alive checks every 30 seconds
  • Reboot Detection: Detects router reboots by monitoring uptime changes (~40 second worst-case detection time)
  • Automatic Restoration: Immediately reapplies all configuration (interfaces, DHCP clients, proxy ARP) when reboot detected
  • Periodic Verification: Validates router state every 60 minutes as a safety net to catch any configuration drift
  • Connection Pooling: Multiple claims share a single SSH connection per router for efficiency
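
The uptime-based reboot check can be reduced to a comparison: if the router's uptime is lower than it would be had it kept running since the previous observation, it must have restarted. The sketch below is an assumption about that logic, not the operator's Go code; router_rebooted and the 5-second tolerance are illustrative.

```shell
#!/bin/sh
# Hypothetical sketch of uptime-based reboot detection.
# Arguments: <previous-uptime-s> <seconds-since-previous-check> <current-uptime-s>
# Returns 0 (true) when the router appears to have rebooted in between.
router_rebooted() {
  prev=$1 elapsed=$2 now=$3
  tolerance=5  # allow for clock jitter and SSH latency
  # Without a reboot, uptime should have grown by roughly $elapsed seconds.
  [ "$now" -lt "$((prev + elapsed - tolerance))" ]
}

# Reading the current uptime over SSH could look like:
#   ssh root@192.168.1.1 "cut -d. -f1 /proc/uptime"
if router_rebooted 86400 30 12; then
  echo "reboot detected: reapply interfaces, DHCP clients, proxy ARP"
fi
```

Comparing against the expected uptime, rather than only checking that uptime went down, also catches a reboot when the gap between checks is longer than the time the router has been back up.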

What This Means:

  • ✅ Router reboots are automatically handled
  • ✅ Services recover within ~40 seconds of router reboot
  • ✅ No manual intervention needed
  • ✅ Configuration drift is automatically corrected
  • ✅ Connection drops trigger immediate reconciliation

Observability:

# Check claim status to see last verification time
kubectl get publicipclaim my-claim -o yaml

# Status fields show:
# - lastVerified: timestamp of last successful verification
# - routerUptime: current router uptime in seconds
# - configurationVerified: whether config has been verified
# - lastReconciliationReason: why last reconciliation occurred
#   (router_reboot, interface_missing, periodic, etc.)

Events: The operator emits Kubernetes events for key actions:

kubectl describe publicipclaim my-claim

# Events you may see:
# - RouterRebooted: Router reboot detected, reapplying configuration
# - ConfigurationApplied: Configuration applied successfully
# - ConfigurationDrift: Interface missing, reapplying configuration

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: make test
  5. Submit a pull request

Run make help for all available make targets.

More information: Kubebuilder Documentation

License

Copyright 2025.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
