Add OIDC globals for multi-IdP support (Okta, Entra ID, Keycloak)#349
Add OIDC globals for multi-IdP support (Okta, Entra ID, Keycloak)#349aviator-app[bot] merged 4 commits intomainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
This PR extends the Helm chart’s OIDC/OAuth2 configuration to better support non-Okta identity providers (e.g., Entra ID, Keycloak) by adding new globals and wiring them into controlplane/dataplane auth configuration, plus adding a non-Okta test fixture.
Changes:
- Add a new controlplane test values fixture for custom/non-Okta OIDC settings.
- Introduce
OIDC_S2S_SCOPEfor dataplane → controlplane service-to-service auth and plumb scope overrides into relevant configs. - Expand inline documentation for OIDC globals and auth flow mapping in AWS/GCP selfhosted intracluster values.
Reviewed changes
Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/values/controlplane.custom-oidc.yaml | Adds a test fixture for non-Okta OIDC configuration. |
| charts/dataplane/values.aws.selfhosted-intracluster.yaml | Adds OIDC_S2S_SCOPE global and uses it in dataplane S2S token scopes. |
| charts/controlplane/values.yaml | Improves documentation for INTERNAL client S2S auth globals. |
| charts/controlplane/values.gcp.selfhosted-intracluster.yaml | Adds OIDC globals/docs and wires metadata URL + scope/audience into flyteadmin config. |
| charts/controlplane/values.aws.selfhosted-intracluster.yaml | Adds OIDC globals/docs and wires metadata URL + scope/audience into flyteadmin config. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Allowed JWT audiences for access token validation. | ||
| # Flyteadmin checks the access token "aud" claim against this list. | ||
| # When empty, defaults to ["https://{UNION_HOST}"] (the deployment domain). | ||
| # Override for IdPs that use different audience formats in access tokens. | ||
| # Okta: typically uses the auth server issuer URL (leave empty to use default). | ||
| # Entra ID example: ["api://my-app-name", "f0b2667d-5e99-45f2-ae4a-2ab47cb5fa12"] | ||
| OIDC_ALLOWED_AUDIENCE: [] | ||
|
|
There was a problem hiding this comment.
OIDC_ALLOWED_AUDIENCE is introduced and documented here, but it is not actually wired into the rendered flyteadmin auth config (e.g., flyte.configmap.adminServer.auth.appAuth.externalAuthServer.allowedAudience is never set). As a result, setting this value has no effect despite the docs implying it will be used. Either plumb this list into the flyteadmin auth config or remove the global/doc references to avoid a misleading configuration knob.
| # Allowed JWT audiences for access token validation. | |
| # Flyteadmin checks the access token "aud" claim against this list. | |
| # When empty, defaults to ["https://{UNION_HOST}"] (the deployment domain). | |
| # Override for IdPs that use different audience formats in access tokens. | |
| # Okta: typically uses the auth server issuer URL (leave empty to use default). | |
| # Entra ID example: ["api://my-app-name", "f0b2667d-5e99-45f2-ae4a-2ab47cb5fa12"] | |
| OIDC_ALLOWED_AUDIENCE: [] |
There was a problem hiding this comment.
I'm assuming these changes are just setting up dependencies for the later diffs in the stack?
| # Allowed JWT audiences for access token validation. | ||
| # Flyteadmin checks the access token "aud" claim against this list. | ||
| # When empty, defaults to ["https://{UNION_HOST}"] (the deployment domain). | ||
| # Override for IdPs that use different audience formats in access tokens. | ||
| # Okta: typically uses the auth server issuer URL (leave empty to use default). | ||
| # Entra ID example: ["api://my-app-name", "f0b2667d-5e99-45f2-ae4a-2ab47cb5fa12"] | ||
| OIDC_ALLOWED_AUDIENCE: [] |
There was a problem hiding this comment.
OIDC_ALLOWED_AUDIENCE is defined and documented here, but the chart does not currently apply it to the flyteadmin auth configuration (no externalAuthServer.allowedAudience is set anywhere). This makes the value a no-op and the surrounding docs misleading. Please either wire this list into the rendered flyteadmin server.yaml auth block or remove the unused global to avoid configuration drift.
| OIDC_METADATA_URL: ".well-known/openid-configuration" | ||
| OIDC_ALLOWED_AUDIENCE: | ||
| - "api://my-app" | ||
| - "00000000-1111-2222-3333-444444444444" | ||
| OIDC_APP_SCOPE: "api://my-app/all" | ||
| OIDC_APP_AUDIENCE: "api://my-app" |
There was a problem hiding this comment.
This fixture sets global.OIDC_ALLOWED_AUDIENCE, but the chart does not currently consume that value anywhere in the rendered flyteadmin auth configuration (no allowedAudience field is set). That means this fixture won’t actually exercise audience override behavior; either wire OIDC_ALLOWED_AUDIENCE into the flyteadmin config or drop it from the fixture to avoid a false sense of coverage.
|
Had Claude take a look at the CI failure and it said:
|
- Add defaultIdentityToSubject config for non-Okta IdPs (FAB-189) that don't include identitytype claim natively - Provide UserClouds client connection defaults in controlplane values so terraform doesn't need to deep-merge them - Fix Authorizer Mode dashboard panel value mappings for case sensitivity - Fix dashboard metric name mismatches and query bugs ## Migration Notes No migration required. Additive defaults only — existing deployments are not affected. ref FAB-189 ref FAB-178 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
9bd24b6 to
89d28c3
Compare
- Add defaultIdentityToSubject config for non-Okta IdPs (FAB-189) that don't include identitytype claim natively - Provide UserClouds client connection defaults in controlplane values so terraform doesn't need to deep-merge them - Fix Authorizer Mode dashboard panel value mappings for case sensitivity - Fix dashboard metric name mismatches and query bugs No migration required. Additive defaults only — existing deployments are not affected. ref FAB-189 ref FAB-178 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New globals with safe empty defaults: - OIDC_METADATA_URL — metadata discovery endpoint (default: RFC 8414) - OIDC_APP_SCOPE — resource scope for CLI/SDK PKCE and task pods - OIDC_APP_AUDIENCE — audience parameter for PKCE flows - OIDC_S2S_SCOPE — service-to-service client_credentials scope - Dataplane S2S scope support Includes documentation with Okta/Entra ID examples for each global, OAuth app flow references, and OIDC test fixture. ## Migration Notes No migration required. All new globals default to empty which preserves existing Okta behavior. Non-Okta IdPs (Entra ID, Keycloak) can now configure scopes and audience via these globals. ref FAB-178 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
5afab1b to
351da2b
Compare
## Overview Adds authorizer configuration defaults for selfhosted deployments and fixes dashboard panel issues. - **defaultIdentityToSubject**: Config for non-Okta IdPs (FAB-189) that don't include `identitytype` claim natively. When enabled, subjects without identitytype are treated as users. - **UserClouds client defaults**: Pre-configured connection settings in the controlplane values so terraform doesn't need deep-merge overrides. - **Dashboard fixes**: Authorizer Mode panel value mappings for case sensitivity, metric name mismatches. ## Migration Notes No migration required. Additive defaults only — existing deployments are not affected. ## Test Plan - [x] `helm template` renders correctly - [x] Verified on identity-testing environment ref FAB-189 ref FAB-178 🤖 Generated with [Claude Code](https://claude.com/claude-code) - `main` <!-- branch-stack --> - **Add selfhosted authorizer defaults and dashboard fixes** :point\_left: - \#349 - \#350 - \#351 - \#352 - \#353 - \#354
|
/aviator merge |
|
Aviator has accepted the merge request. It will enter the queue when all of the required status checks have passed. Aviator will update the sticky status comment as the pull request moves through the queue. |
Current Aviator status
This PR was merged using Aviator.
See the real-time status of this PR on the
Aviator webapp.
Use the Aviator Chrome Extension
to see the status of your PR within GitHub.
|
…350) ## Summary - **server-alias annotation**: Allows DP→CP intra-cluster traffic through nginx ingress by matching the internal service DNS name - **gRPC configuration-snippet**: Forwards identity headers (X-User-Subject, X-User-Claim-Identitytype) from auth subrequest to gRPC backends. Required for BYOC since Oct 2024, now ported to selfhosted chart. - **Organizations service**: New controlplane service for settings/org APIs. Required by CreateRun since #15107. - **Fix organizations connectPort** in configmap Stacked on #349 → #348 → main ## Migration Notes **New behavior:** Intra-cluster DP→CP gRPC traffic now goes through nginx auth subrequests (via server-alias). This ensures consistent identity resolution for all callers. Previously, intra-cluster traffic could bypass auth if the `:authority` header didn't match the ingress host. **Action required:** None for new deployments. Existing deployments will get the server-alias annotation on next helm upgrade. ## Risk Assessment **Medium** — Changes ingress behavior. Test: DP→CP traffic, browser login. ## Test Plan - [ ] Verify DP→CP gRPC traffic still works through ingress - [ ] Verify browser login flow unchanged - [ ] Verify organizations service endpoints respond - [ ] `helm template` produces expected manifests ref FAB-178, FAB-195 🤖 Generated with [Claude Code](https://claude.com/claude-code) - `main` <!-- branch-stack --> - **Add selfhosted ingress identity forwarding and organizations service** :point\_left: - \#351 - \#352 - \#353 - \#354
## Summary - Move all OIDC/OAuth2 auth configuration into a documented `flyte.configmap.adminServer.auth` block in base values.yaml with inline Okta/Entra ID examples - Move `adminClient.connection` to base (was duplicated in AWS/GCP overlays) - Wire `OIDC_S2S_SCOPE` for CP S2S auth (Entra requires `/.default`, Okta uses `all`) - Add `OIDC_BROWSER_SCOPE` for Entra browser auth (AADSTS90009 fix) - Deprecate scattered auth globals: `OIDC_BASE_URL`, `OIDC_CLIENT_ID`, `CLI_CLIENT_ID` - Authz templates accept both `"Union"` and `"UserClouds"` type Stacked on #350 → #349 → #348 → main ## Migration Notes **Auth config location changed:** All auth config now lives in `flyte.configmap.adminServer.auth` in base values.yaml. Cloud overlays no longer set auth values — terraform authn modules output complete `app_auth`/`user_auth` blocks. **Deprecated globals:** `OIDC_BASE_URL`, `OIDC_CLIENT_ID`, `CLI_CLIENT_ID` still work but are deprecated. New deployments should use the auth block directly. **New globals:** `OIDC_S2S_SCOPE` (default: `all`), `OIDC_BROWSER_SCOPE`, `INTERNAL_SUBJECT_ID` (default: `INTERNAL_CLIENT_ID`). **UserClouds → Union:** Authz templates accept `type: "Union"` alongside `"UserClouds"`. Both are equivalent; `"Union"` preferred for new deployments. ## Risk Assessment **Medium** — Restructures auth config. Verified by `compare_manifests.py` producing zero structural diffs against baseline. ## Test Plan - [ ] `helm template` with Okta defaults produces identical auth config - [ ] `helm template` with Entra overrides produces correct scopes/audiences - [ ] Verify deprecated globals still work for backward compatibility - [ ] `compare_manifests.py` zero diffs against baseline ref FAB-178, FAB-195 🤖 Generated with [Claude Code](https://claude.com/claude-code) - `main` <!-- branch-stack --> - **Consolidate auth config into base values.yaml** :point\_left: - \#352 - \#353 - \#354
## Summary - Move cloud-agnostic config from AWS/GCP overlays into base values.yaml - **AWS overlay**: \~400 → 102 lines - **GCP overlay**: 504 → 105 lines - Base chart is now self-contained for selfhosted deployments - Overlays reduced to only cloud-specific items: IAM, storage, region, scylla provisioner Stacked on #351 → #350 → #349 → #348 → main ### What moved to base - Namespace-derived FQDNs (`admin.endpoint`, `rootTenantURLPattern`) - Ingress configuration (server-alias, protectedIngress annotations) - ingress-nginx (enabled, ClusterIP, fullnameOverride) - Monitoring (kube\* disabled, serviceMonitor enabled) - Image repos (default: `registry.unionai.cloud/controlplane`) - Secrets (union-operator, union-secrets) - ScyllaDB generic config - envoy-gateway defaults ### What stays in overlays (\~100 lines each) - Cloud region, DB, storage bucket, IAM identifiers - `flyte.storage` (type:s3/gcs, region/projectId) - IAM annotations (IRSA / Workload Identity) - scylla storageClass provisioner - dataproxy endpoint, artifacts stow config ## Migration Notes **Base chart is now self-contained:** New selfhosted deployments only need cloud-specific overlay values (IAM, storage, DB). **Image repository default:** Base defaults to `registry.unionai.cloud/controlplane`. Internal deployments using ECR must set `IMAGE_REPOSITORY_PREFIX` via terraform. **Namespace-derived FQDNs:** Services use `{{ .Release.Namespace }}` instead of hardcoded values. Resolves identically in standard deployments. ## Risk Assessment **Higher** — Touches most of values.yaml. Verified by `compare_manifests.py` producing zero structural diffs against baseline from `mike/overlay-consolidated-backup-2026-04-20`. ## Test Plan - [ ] `helm template` with AWS overlay produces identical manifests to baseline - [ ] `helm template` with GCP overlay produces identical manifests to baseline - [ ] `compare_manifests.py` zero diffs against baseline - [ ] Deploy to identity-testing environment and verify all services healthy ref FAB-178, FAB-195, FAB-276 🤖 Generated with [Claude Code](https://claude.com/claude-code) - `main` <!-- branch-stack --> - **Move generic selfhosted config from overlays to base values.yaml** :point\_left: - \#353 - \#354
## Summary - Overhaul authorizer row in union-controlplane-overview dashboard for standardized `BackendMetrics` - Fix metric name mismatches, query bugs, and Authorizer Mode value mappings - Add `identity_type` breakdown (User/App/External/Unknown) to auth panels - Add proper units (ops, ms, percentunit), thresholds, and zero-state handling - Fix legendFormat double-escaping - Remove "(V1 + V2)" labels from row titles - Regenerate test snapshots Stacked on #352 → #351 → #350 → #349 → #348 → main ## Migration Notes **Dashboard auto-updates:** Deployed via ConfigMap sidecar — updates on next helm upgrade. **New metrics required:** Panels reference `authz_backend_*` metrics from standardized `BackendMetrics`. Emitted by cloud v2026.4.x+. Older versions show "No data" for backend panels. ## Risk Assessment **Low** — Dashboard JSON only + generated test snapshots. ## Test Plan - [ ] Dashboard loads in Grafana without errors - [ ] Panels show data with cloud v2026.4.x+ - [ ] Zero-state panels show "0" instead of "No data" - [ ] `make generate-expected` matches committed snapshots ref FAB-178 🤖 Generated with [Claude Code](https://claude.com/claude-code) - `main` <!-- branch-stack --> - **Update authorizer dashboard: standardized backend metrics** :point\_left: - \#354
Overview
Adds OIDC/OAuth2 globals to support identity providers beyond Okta (Entra ID, Keycloak, Authentik).
New globals (all empty by default — Okta behavior unchanged):
OIDC_METADATA_URL— metadata discovery endpoint (Okta:.well-known/oauth-authorization-server, Entra:.well-known/openid-configuration)OIDC_APP_SCOPE— resource scope for CLI/SDK and task pods (Entra:api://my-app/.default)OIDC_APP_AUDIENCE— audience parameter for PKCE flows (Entra:api://my-app)OIDC_S2S_SCOPE— service-to-service scope (Entra:api://my-app/.default)OIDC_S2S_SCOPEIncludes inline documentation with Okta/Entra ID examples, OAuth app flow references, and test fixture for non-Okta configuration.
Migration Notes
No migration required. All new globals default to empty which preserves existing Okta behavior. Non-Okta IdPs can now be configured without terraform deep-merge overrides.
Test Plan
tests/values/controlplane.custom-oidc.yamltest fixturehelm templaterenders correctly with both empty and non-empty globalsref FAB-178
🤖 Generated with Claude Code
main