Add selfhosted authorizer defaults and dashboard fixes #348
aviator-app[bot] merged 1 commit into main
Current Aviator status
This PR was merged using Aviator.
Pull request overview
This PR updates the controlplane Helm chart to make self-hosted authorizer configuration easier (by adding UserClouds client defaults) and fixes Grafana dashboard panels for authorizer mode + metric name mismatches.
Changes:
- Add pre-configured `services.authorizer.configMap.authorizer.userCloudsClient` + bootstrap/internal comm defaults in `charts/controlplane/values.yaml`.
- Update the `union-controlplane-overview` dashboard PromQL queries and value mappings (case-insensitive mode mapping, renamed metrics).
- Update snapshot test inputs/outputs for the UserClouds authorizer scenario and regenerated manifests.
Reviewed changes
Copilot reviewed 3 out of 7 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| charts/controlplane/values.yaml | Adds default userCloudsClient + bootstrap/internal comm defaults for the authorizer config. |
| charts/controlplane/dashboards/union-controlplane-overview.json | Fixes authorizer panels (metric names + mode value mappings). |
| tests/values/controlplane.userclouds.yaml | Switches the test to enable UserClouds authorizer via the new config path. |
| tests/generated/controlplane.userclouds.yaml | Regenerated snapshot reflecting UserClouds authorizer + dashboard changes (now includes union-authz resources). |
| tests/generated/controlplane.external-authz.yaml | Regenerated snapshot reflecting dashboard changes and new authorizer config defaults rendering. |
| tests/generated/controlplane.aws.yaml | Regenerated snapshot reflecting dashboard changes and new authorizer config defaults rendering. |
| tests/generated/controlplane.aws.billing-enable.yaml | Regenerated snapshot reflecting dashboard changes and new authorizer config defaults rendering. |
Comments suppressed due to low confidence (1)
tests/values/controlplane.userclouds.yaml:71
- In this fixture, switching the authorizer type to "UserClouds" will render the union-authz resources, but required globals (notably global.KUBERNETES_SECRET_NAME and global.DB_* used by union.authz defaults) are not set here. The generated snapshot ends up with invalid/empty secret references and empty DB connection fields for union-authz. Update this test values file to set the needed globals (or explicitly override union.authz.database/auth secrets) so the rendered manifests are valid and representative.
    services:
      authorizer:
        configMap:
          authorizer:
            type: "UserClouds"
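One way to address the comment above is to set the missing globals in the fixture itself. The fragment below is only a sketch under stated assumptions: the secret name is a placeholder, and the `global.DB_*` keys are deliberately not spelled out because the comment does not list them.

```yaml
# tests/values/controlplane.userclouds.yaml (sketch, not the committed fixture)
global:
  # Placeholder value (assumption); any non-empty name keeps the rendered
  # union-authz secret references from coming out empty.
  KUBERNETES_SECRET_NAME: "union-test-secret"
  # The global.DB_* keys consumed by the union.authz defaults would also be
  # set here; their exact names are not given in the review comment.

services:
  authorizer:
    configMap:
      authorizer:
        type: "UserClouds"
```

The alternative the comment mentions, overriding the `union.authz` database and auth secret settings directly, would work equally well for the snapshot.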
    # --- UserClouds client defaults (pre-configured) ---
    # These defaults are used when type is set to "UserClouds" (Union RBAC).
    # They are ignored when type is "Noop" or "External".
    # To enable Union RBAC, just change type to "UserClouds" — no other
    # configuration is needed. Override individual fields only if your
    # deployment uses non-standard naming or secrets.
    userCloudsClient:
      tenantUrl: 'http://{{ .Release.Name }}-union-authz.{{ .Release.Namespace }}.svc.cluster.local:8080'
      tenantID: '623771e7-ddd6-4575-bedb-7c970ec75b87'
      clientID: '{{ .Values.union.authz.clientID }}'
      clientSecretName: 'union/client_secret'
      enableLogging: true
    internalCommunicationConfig:
      enabled: false
    bootstrap:
      organization: ""
      domains:
        - development
        - staging
        - production
      projects: []
      serviceAccounts: []
      adminUsers: []
      retryInterval: 5s
      maxRetries: 30
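The comments in the diff say that enabling Union RBAC needs only the type change, with individual fields overridden for non-standard deployments. A minimal sketch of such an override file follows, assuming the defaults sit under `services.authorizer.configMap.authorizer` as the review overview describes. The secret path and organization name are hypothetical.

```yaml
# my-values.yaml (sketch; the values below are illustrative assumptions)
services:
  authorizer:
    configMap:
      authorizer:
        type: "UserClouds"
        userCloudsClient:
          # Override only what differs from the chart default.
          clientSecretName: "my-org/authz_client_secret"  # hypothetical
        bootstrap:
          organization: "my-org"  # hypothetical
```

Helm merges maps from override files key by key, so the remaining `userCloudsClient` and `bootstrap` defaults stay in effect. Lists such as `domains` are replaced wholesale when overridden.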
The PR description mentions adding a defaultIdentityToSubject authorizer config default for non-Okta IdPs, but that setting doesn’t appear to be introduced anywhere in the chart/values changes in this PR. Either add the corresponding value + wiring, or update the PR description/migration notes to avoid implying this behavior exists.
- Add defaultIdentityToSubject config for non-Okta IdPs (FAB-189) that don't include identitytype claim natively
- Provide UserClouds client connection defaults in controlplane values so terraform doesn't need to deep-merge them
- Fix Authorizer Mode dashboard panel value mappings for case sensitivity
- Fix dashboard metric name mismatches and query bugs

## Migration Notes

No migration required. Additive defaults only; existing deployments are not affected.

ref FAB-189
ref FAB-178

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from 9bd24b6 to 89d28c3.
/aviator merge

Aviator has accepted the merge request. It will enter the queue when all of the required status checks have passed. Aviator will update the sticky status comment as the pull request moves through the queue.
…350)

## Summary

- **server-alias annotation**: Allows DP→CP intra-cluster traffic through nginx ingress by matching the internal service DNS name
- **gRPC configuration-snippet**: Forwards identity headers (X-User-Subject, X-User-Claim-Identitytype) from auth subrequest to gRPC backends. Required for BYOC since Oct 2024, now ported to selfhosted chart.
- **Organizations service**: New controlplane service for settings/org APIs. Required by CreateRun since #15107.
- **Fix organizations connectPort** in configmap

Stacked on #349 → #348 → main

## Migration Notes

**New behavior:** Intra-cluster DP→CP gRPC traffic now goes through nginx auth subrequests (via server-alias). This ensures consistent identity resolution for all callers. Previously, intra-cluster traffic could bypass auth if the `:authority` header didn't match the ingress host.

**Action required:** None for new deployments. Existing deployments will get the server-alias annotation on next helm upgrade.

## Risk Assessment

**Medium**: Changes ingress behavior. Test: DP→CP traffic, browser login.

## Test Plan

- [ ] Verify DP→CP gRPC traffic still works through ingress
- [ ] Verify browser login flow unchanged
- [ ] Verify organizations service endpoints respond
- [ ] `helm template` produces expected manifests

ref FAB-178, FAB-195

🤖 Generated with [Claude Code](https://claude.com/claude-code)

- `main` <!-- branch-stack -->
- **Add selfhosted ingress identity forwarding and organizations service** :point\_left:
- \#351
- \#352
- \#353
- \#354
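The server-alias and configuration-snippet changes described in this summary could look roughly like the Ingress metadata fragment below. This is a sketch using standard ingress-nginx annotations and nginx directives, not the chart's actual template. The alias hostname and the nginx variable names are assumptions, and the auth subrequest itself (for example via the `auth-url` annotation) is assumed to be configured elsewhere.

```yaml
# Fragment of an Ingress manifest (sketch only).
metadata:
  annotations:
    # Extra server_name so requests addressed to the in-cluster service DNS
    # name match this ingress host. The hostname is hypothetical.
    nginx.ingress.kubernetes.io/server-alias: "controlplane-nginx.union.svc.cluster.local"
    nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
    # Capture identity headers from the auth subrequest response and set
    # them on the request forwarded to the gRPC backend.
    nginx.ingress.kubernetes.io/configuration-snippet: |
      auth_request_set $user_subject $upstream_http_x_user_subject;
      auth_request_set $user_identitytype $upstream_http_x_user_claim_identitytype;
      grpc_set_header X-User-Subject $user_subject;
      grpc_set_header X-User-Claim-Identitytype $user_identitytype;
```

`grpc_set_header` is used rather than `proxy_set_header` because gRPC backends are served through nginx's gRPC module, which does not apply the proxy module's header directives.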
## Summary

- Move all OIDC/OAuth2 auth configuration into a documented `flyte.configmap.adminServer.auth` block in base values.yaml with inline Okta/Entra ID examples
- Move `adminClient.connection` to base (was duplicated in AWS/GCP overlays)
- Wire `OIDC_S2S_SCOPE` for CP S2S auth (Entra requires `/.default`, Okta uses `all`)
- Add `OIDC_BROWSER_SCOPE` for Entra browser auth (AADSTS90009 fix)
- Deprecate scattered auth globals: `OIDC_BASE_URL`, `OIDC_CLIENT_ID`, `CLI_CLIENT_ID`
- Authz templates accept both `"Union"` and `"UserClouds"` type

Stacked on #350 → #349 → #348 → main

## Migration Notes

**Auth config location changed:** All auth config now lives in `flyte.configmap.adminServer.auth` in base values.yaml. Cloud overlays no longer set auth values; terraform authn modules output complete `app_auth`/`user_auth` blocks.

**Deprecated globals:** `OIDC_BASE_URL`, `OIDC_CLIENT_ID`, `CLI_CLIENT_ID` still work but are deprecated. New deployments should use the auth block directly.

**New globals:** `OIDC_S2S_SCOPE` (default: `all`), `OIDC_BROWSER_SCOPE`, `INTERNAL_SUBJECT_ID` (default: `INTERNAL_CLIENT_ID`).

**UserClouds → Union:** Authz templates accept `type: "Union"` alongside `"UserClouds"`. Both are equivalent; `"Union"` preferred for new deployments.

## Risk Assessment

**Medium**: Restructures auth config. Verified by `compare_manifests.py` producing zero structural diffs against baseline.

## Test Plan

- [ ] `helm template` with Okta defaults produces identical auth config
- [ ] `helm template` with Entra overrides produces correct scopes/audiences
- [ ] Verify deprecated globals still work for backward compatibility
- [ ] `compare_manifests.py` zero diffs against baseline

ref FAB-178, FAB-195

🤖 Generated with [Claude Code](https://claude.com/claude-code)

- `main` <!-- branch-stack -->
- **Consolidate auth config into base values.yaml** :point\_left:
- \#352
- \#353
- \#354
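To illustrate the scope difference noted in this summary, an Entra ID deployment might override the new global as sketched below. The placement under a top-level `global:` key and the `api://<application-id>` prefix are assumptions. Only the `/.default` suffix, the `all` default, and the `"Union"` type come from the description.

```yaml
# Entra ID override (sketch; <application-id> is a placeholder to fill in).
global:
  # Okta deployments can leave this at the documented default, "all".
  OIDC_S2S_SCOPE: "api://<application-id>/.default"

services:
  authorizer:
    configMap:
      authorizer:
        # "Union" and "UserClouds" are described as equivalent;
        # "Union" is preferred for new deployments.
        type: "Union"
```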
## Summary

- Move cloud-agnostic config from AWS/GCP overlays into base values.yaml
- **AWS overlay**: \~400 → 102 lines
- **GCP overlay**: 504 → 105 lines
- Base chart is now self-contained for selfhosted deployments
- Overlays reduced to only cloud-specific items: IAM, storage, region, scylla provisioner

Stacked on #351 → #350 → #349 → #348 → main

### What moved to base

- Namespace-derived FQDNs (`admin.endpoint`, `rootTenantURLPattern`)
- Ingress configuration (server-alias, protectedIngress annotations)
- ingress-nginx (enabled, ClusterIP, fullnameOverride)
- Monitoring (kube\* disabled, serviceMonitor enabled)
- Image repos (default: `registry.unionai.cloud/controlplane`)
- Secrets (union-operator, union-secrets)
- ScyllaDB generic config
- envoy-gateway defaults

### What stays in overlays (\~100 lines each)

- Cloud region, DB, storage bucket, IAM identifiers
- `flyte.storage` (type:s3/gcs, region/projectId)
- IAM annotations (IRSA / Workload Identity)
- scylla storageClass provisioner
- dataproxy endpoint, artifacts stow config

## Migration Notes

**Base chart is now self-contained:** New selfhosted deployments only need cloud-specific overlay values (IAM, storage, DB).

**Image repository default:** Base defaults to `registry.unionai.cloud/controlplane`. Internal deployments using ECR must set `IMAGE_REPOSITORY_PREFIX` via terraform.

**Namespace-derived FQDNs:** Services use `{{ .Release.Namespace }}` instead of hardcoded values. Resolves identically in standard deployments.

## Risk Assessment

**Higher**: Touches most of values.yaml. Verified by `compare_manifests.py` producing zero structural diffs against baseline from `mike/overlay-consolidated-backup-2026-04-20`.

## Test Plan

- [ ] `helm template` with AWS overlay produces identical manifests to baseline
- [ ] `helm template` with GCP overlay produces identical manifests to baseline
- [ ] `compare_manifests.py` zero diffs against baseline
- [ ] Deploy to identity-testing environment and verify all services healthy

ref FAB-178, FAB-195, FAB-276

🤖 Generated with [Claude Code](https://claude.com/claude-code)

- `main` <!-- branch-stack -->
- **Move generic selfhosted config from overlays to base values.yaml** :point\_left:
- \#353
- \#354
## Summary

- Overhaul authorizer row in union-controlplane-overview dashboard for standardized `BackendMetrics`
- Fix metric name mismatches, query bugs, and Authorizer Mode value mappings
- Add `identity_type` breakdown (User/App/External/Unknown) to auth panels
- Add proper units (ops, ms, percentunit), thresholds, and zero-state handling
- Fix legendFormat double-escaping
- Remove "(V1 + V2)" labels from row titles
- Regenerate test snapshots

Stacked on #352 → #351 → #350 → #349 → #348 → main

## Migration Notes

**Dashboard auto-updates:** Deployed via ConfigMap sidecar; updates on next helm upgrade.

**New metrics required:** Panels reference `authz_backend_*` metrics from standardized `BackendMetrics`. Emitted by cloud v2026.4.x+. Older versions show "No data" for backend panels.

## Risk Assessment

**Low**: Dashboard JSON only + generated test snapshots.

## Test Plan

- [ ] Dashboard loads in Grafana without errors
- [ ] Panels show data with cloud v2026.4.x+
- [ ] Zero-state panels show "0" instead of "No data"
- [ ] `make generate-expected` matches committed snapshots

ref FAB-178

🤖 Generated with [Claude Code](https://claude.com/claude-code)

- `main` <!-- branch-stack -->
- **Update authorizer dashboard: standardized backend metrics** :point\_left:
- \#354
Overview
Adds authorizer configuration defaults for selfhosted deployments and fixes dashboard panel issues.
`identitytype` claim natively. When enabled, subjects without `identitytype` are treated as users.
Migration Notes
No migration required. Additive defaults only — existing deployments are not affected.
Test Plan
`helm template` renders correctly
ref FAB-189
ref FAB-178
🤖 Generated with Claude Code
main