Consolidate selfhosted auth config: globals, scopes, Entra ID support (FAB-178) #313

Open
mhotan wants to merge 23 commits into main from mike/userclouds-defaults

Conversation

Contributor

@mhotan mhotan commented Mar 31, 2026

Summary

Consolidates all OIDC/OAuth2 authentication configuration into the base values.yaml using globals, eliminating the need for terraform deep-merge overrides. Adds full Entra ID support alongside existing Okta support.

Auth config consolidation

  • Move all auth globals (OIDC_BASE_URL, OIDC_CLIENT_ID, CLI_CLIENT_ID, scopes, audiences, etc.) from cloud-specific overlays to the base values.yaml
  • Move flyte.configmap.adminServer.auth block (resource server, PKCE client, browser login) from overlays to base using globals
  • Move configMap.union.connection.trustedIdentityClaims to base using INTERNAL_SUBJECT_ID global
  • Move executions.adminClient.connection to base using existing globals

Entra ID scope separation

  • OIDC_APP_SCOPE — CLI/SDK PKCE + task pod client_credentials (Entra: /.default)
  • OIDC_BROWSER_SCOPE — browser authorization_code (Entra: /all, needed because Entra rejects /.default for same-app flows)
  • OIDC_S2S_SCOPE — internal S2S client_credentials (Entra: /.default)
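
The split above can be read as a lookup from each global to the scope suffix Entra ID expects for that flow. A minimal Python sketch, assuming a placeholder application ID URI (`api://my-app`) and a hypothetical helper name; it models the mapping described in this PR and is not the chart's templating:

```python
# Hypothetical helper: maps each global named in this PR to the Entra ID scope
# suffix the description gives for it. The app ID URI is a placeholder.
ENTRA_SCOPE_SUFFIX = {
    "OIDC_APP_SCOPE": "/.default",   # CLI/SDK PKCE + task pod client_credentials
    "OIDC_BROWSER_SCOPE": "/all",    # browser authorization_code
    "OIDC_S2S_SCOPE": "/.default",   # internal S2S client_credentials
}


def entra_scopes(app_id_uri: str) -> dict[str, str]:
    """Build the three scope values for one Entra ID app registration."""
    base = app_id_uri.rstrip("/")
    return {name: base + suffix for name, suffix in ENTRA_SCOPE_SUFFIX.items()}


if __name__ == "__main__":
    for name, scope in entra_scopes("api://my-app").items():
        print(f"{name}={scope}")
```

Keeping the browser scope separate is the point of the split: the two `client_credentials` flows share `/.default`, while the browser flow needs its own delegated scope.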

Other changes

  • Add server-alias annotation for intra-cluster DP→CP traffic through ingress
  • Add gRPC identity header forwarding via configuration-snippet
  • Add organizations service to controlplane chart

Test plan

  • helm_template.sh renders identical auth config (cosmetic quoting diffs only)
  • E2E: browser login on identity-testing (Entra ID)
  • E2E: SDK flyte run creates and executes task pods
  • E2E: UserClouds RBAC authorizes browser + DP service requests

🤖 Generated with Claude Code

@aviator-app
Contributor

aviator-app Bot commented Mar 31, 2026

Current Aviator status

Aviator will automatically update this comment as the status of the PR changes.
Comment /aviator refresh to force Aviator to re-examine your PR (or learn about other /aviator commands).

This pull request is currently open (not queued).

How to merge

To merge this PR, comment /aviator merge or add the mergequeue label.



@mhotan mhotan force-pushed the mike/userclouds-defaults branch 4 times, most recently from 3a2b2e9 to 78a13a2 Compare March 31, 2026 05:43
@mhotan mhotan marked this pull request as ready for review March 31, 2026 07:24
@mhotan mhotan marked this pull request as draft March 31, 2026 07:38
@mhotan mhotan marked this pull request as ready for review March 31, 2026 17:54
@mhotan mhotan force-pushed the mike/userclouds-defaults branch 4 times, most recently from dd52b64 to a4dd636 Compare April 2, 2026 23:01
@github-actions github-actions Bot mentioned this pull request Apr 3, 2026
@mhotan mhotan force-pushed the mike/userclouds-defaults branch from a4dd636 to cc0d763 Compare April 3, 2026 20:21
@jmonty42
Contributor

E2E tested on `monty-selfhosted` staging (AWS intra-cluster, Entra ID as IdP) as part of FAB-189 combined testing.

Confirmed working:

  • internalCommunicationConfig.enabled: false is critical for selfhosted. Without it, every Authorize() call fails with "l5d-client-id header missing" before shadow mode can run, because the l5d check fires before UC evaluation and before shadow passthrough.
  • userCloudsClient defaults worked without any override in the values overlay
  • defaultIdentityToSubject: true confirmed working for browser login: Entra ID's ID tokens carry no identitytype or idtyp claim, so the missing claim correctly falls back to user identity

Combined with cloud#15134 (bootstrap goroutine) and cloud#15267 + flyte#950 (identityTypeClaimsForApps), full RBAC enablement in Active mode required only a values overlay change — no manual userclouds-lite API calls, no port-forwarding.

@mhotan mhotan force-pushed the mike/userclouds-defaults branch from d8561a8 to 0ff278a Compare April 11, 2026 00:16
@mhotan mhotan changed the title Add UserClouds client defaults and disable Linkerd for selfhosted Add OAuth2 globals and documentation for non-Okta IdP support (FAB-178) Apr 16, 2026
mhotan and others added 12 commits April 17, 2026 15:56
When true, defaults to user identity if x-user-claim-identitytype
header is missing from gRPC metadata. Enables selfhosted deployments
with non-Okta IdPs (Apple IdMS, Entra ID) that cannot easily add
custom JWT claims. BYOC overrides to false.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
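
The fallback described in the commit message above can be modeled in a few lines. This is an illustrative Python sketch, not the authorizer's code: the function name, the lowercase metadata key, and the `"user"` return value are assumptions; only the header name and the flag's behavior come from the commit message.

```python
from typing import Mapping, Optional

# Header named in the commit message; gRPC metadata keys are lowercase.
IDENTITY_TYPE_HEADER = "x-user-claim-identitytype"


def resolve_identity_type(
    metadata: Mapping[str, str],
    default_identity_to_subject: bool,
) -> Optional[str]:
    """Return the caller's identity type, or None if it cannot be determined."""
    value = metadata.get(IDENTITY_TYPE_HEADER)
    if value:
        return value
    # Header absent: fall back to "user" only when the flag is enabled.
    return "user" if default_identity_to_subject else None
```

With the flag off (the BYOC override), a missing header yields no identity type, so the caller can reject the request instead of guessing.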
Add pre-configured userCloudsClient defaults under
services.authorizer.configMap.authorizer so that enabling Union RBAC
only requires setting type: "UserClouds" — no other configuration
needed. All connection details (tenantUrl, tenantID, clientID,
clientSecretName) are derived from existing chart values.

This eliminates the need for Terraform or manual overrides to supply
the userCloudsClient block, reducing configuration surface for
selfhosted deployments.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Authorizer: use full metric path (authorizer:authorizer:cloudauthorizer:connect:*)
- CacheService: add _unlabeled suffix to match actual metric names
- Usage: processing_time → processing_time_ms
- Cluster API Latency: fix histogram_quantile on summary type (use quantile selector)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The type label value is capitalized (e.g., "UserClouds") but mappings
used lowercase keys. Add both cases to ensure matching.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Flyteadmin defaults to .well-known/oauth-authorization-server (RFC 8414)
for OIDC metadata discovery. Entra ID and some other providers only serve
.well-known/openid-configuration. This global lets operators override
the discovery endpoint without a manual values-overrides.yaml.

Default: ".well-known/oauth-authorization-server" (preserves existing behavior).
Entra ID: set to ".well-known/openid-configuration".

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
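
To make the override concrete, here is a small Python sketch of how a discovery URL could be assembled from a base URL and the configurable path. It assumes the path is simply appended to the base URL; the function name and the example hosts are hypothetical, and Flyteadmin's actual joining logic may differ.

```python
# Default named in the commit message (RFC 8414 metadata path).
DEFAULT_METADATA_PATH = ".well-known/oauth-authorization-server"


def discovery_url(base_url: str, metadata_path: str = DEFAULT_METADATA_PATH) -> str:
    """Join the IdP base URL and the metadata path with exactly one slash."""
    return base_url.rstrip("/") + "/" + metadata_path.lstrip("/")


if __name__ == "__main__":
    # Default behavior (placeholder issuer).
    print(discovery_url("https://idp.example.com/oauth2/default"))
    # Entra ID style override (TENANT_ID is a placeholder).
    print(discovery_url(
        "https://login.microsoftonline.com/TENANT_ID/v2.0",
        ".well-known/openid-configuration",
    ))
```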
New globals in selfhosted-intracluster values:
- OIDC_ALLOWED_AUDIENCE: custom JWT audiences for access token validation
- OIDC_APP_SCOPE: resource scope for app-specific access tokens
- OIDC_APP_AUDIENCE: audience for CLI/SDK PKCE flow

Improved documentation on all existing OAuth globals with Okta and
Entra ID examples, provider-specific guidance, and cross-references
to identityTypeClaimsForApps (configured in values overlay, not global).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Tests the new OAuth2 globals (OIDC_ALLOWED_AUDIENCE, OIDC_APP_SCOPE,
OIDC_APP_AUDIENCE, OIDC_METADATA_URL) and identityTypeClaimsForApps
with generic values — no internal names, customer details, or
environment-specific configuration.

Snapshot generated but globals not yet wired into chart templates.
Template wiring is the next step.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
flyteClient.scopes uses OIDC_APP_SCOPE (default: "all").
flyteClient.audience uses OIDC_APP_AUDIENCE (default: "").
Test fixture updated with selfhosted-intracluster overlay to verify.

openId.scopes and allowedAudience remain in base values — terraform
handles appending app_scope and setting IdP-specific audiences via
the direct merge path (lists can't be conditionally built in static YAML).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Each config section now documents:
- Which OAuth app (1-5) it maps to
- Which authentication flow it's used in (browser, CLI/SDK, service-to-service)
- How it relates to the globals in Section 1

Cross-references the five OAuth apps from the authentication architecture:
  App 1: Browser (confidential) — userAuth.openId
  App 2: CLI (public, PKCE) — thirdPartyConfig.flyteClient
  App 3: Internal S2S — INTERNAL_CLIENT_ID global
  Apps 4,5: Operator, EAGER — dataplane values

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Values files should not assume specific deployment tooling. Replaced
"Terraform-generated values" with "environment-specific values overlay"
and "Terraform authn module output" with "values overlay".

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Entra ID requires scope "api://{app}/.default" for client_credentials
grants. The default "all" causes AADSTS1002012 invalid_scope.

New global OIDC_S2S_SCOPE: used by internal service auth config.
Default: "all" (Okta). Entra ID: "api://my-app/.default".

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@mhotan mhotan force-pushed the mike/userclouds-defaults branch from 6aaed3e to b45a4fb Compare April 17, 2026 05:58
mhotan and others added 8 commits April 17, 2026 16:21
Dataplane services (operator, proxy, executor) use client_credentials
to authenticate with the control plane. Entra ID requires /.default
scope for this grant type.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
In selfhosted-intracluster mode, DP services connect to the CP nginx
controller via internal K8s DNS. Without server-alias, the :authority
header doesn't match the ingress host and auth subrequests are bypassed.
This ensures all DP→CP traffic goes through nginx auth regardless of
whether it arrives via internal DNS or the external domain.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
For gRPC backends (backend-protocol: GRPC), nginx uses grpc_pass instead
of proxy_pass. The auth-response-headers annotation only sets proxy
headers, not gRPC headers. This configuration-snippet bridges identity
headers (X-User-Subject, X-User-Claim-Identitytype, etc.) from the auth
subrequest response into the upstream gRPC request.

This has been in BYOC since Oct 2024 (cdf8f5c6f6) but was never ported
to the selfhosted/selfmanaged controlplane chart.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
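
Conceptually, the snippet copies a fixed set of identity headers from the auth subrequest response onto the upstream gRPC request. The Python model below only illustrates that bridging step; the function name is hypothetical, the header list is limited to the two headers named in the commit message, and the real work is done by nginx configuration, not by application code.

```python
from typing import Mapping

# The two identity headers named in the commit message ("etc." is not expanded here).
FORWARDED_IDENTITY_HEADERS = ("X-User-Subject", "X-User-Claim-Identitytype")


def forward_identity_headers(
    auth_response_headers: Mapping[str, str],
    upstream_metadata: Mapping[str, str],
) -> dict[str, str]:
    """Return upstream metadata with identity headers copied from the auth response."""
    # HTTP header names are case-insensitive; normalize before lookup.
    lowered = {k.lower(): v for k, v in auth_response_headers.items()}
    merged = dict(upstream_metadata)
    for name in FORWARDED_IDENTITY_HEADERS:
        key = name.lower()  # gRPC metadata keys are lowercase
        if key in lowered:
            merged[key] = lowered[key]
    return merged
```

Headers outside the allow-list are never copied, which mirrors the idea that only identity headers cross from the auth response to the backend.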
The v2 SDK CreateRun path calls SettingsService/GetSettings (served by
the organizations service) to resolve task resource defaults. This
service exists in BYOC but was missing from selfhosted, causing
"no children to pick from" when the executions service tried to reach it
via internalConnectionConfig.

Adds a minimal organizations service (cloudorganizations binary, shared
DB, no cloud-specific features like externalIDProvisioning or Redis).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The connectPort must be in both the top-level sharedService (for the
container port in the Deployment) and configMap.sharedService (for the
application config). Without it in the configmap, the connect server
defaults to port 8080 which conflicts with the gRPC server.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Entra ID requires api://{app}/.default for client_credentials grants.
The hardcoded "all" scope works for Okta but fails for Entra with
AADSTS1002012. Use the OIDC_S2S_SCOPE global (already set by terraform)
with fallback to "all" for backwards compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
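
The backwards-compatible fallback can be sketched as below. This Python model assumes the globals arrive as a plain dictionary and that an unset or empty `OIDC_S2S_SCOPE` means "use the old default"; the helper names are hypothetical, and client authentication (secret or assertion) is deliberately left out of the form.

```python
from urllib.parse import urlencode

DEFAULT_S2S_SCOPE = "all"  # previous hardcoded value, still correct for Okta


def s2s_scope(globals_: dict) -> str:
    """Use OIDC_S2S_SCOPE when set and non-empty, else fall back to "all"."""
    return globals_.get("OIDC_S2S_SCOPE") or DEFAULT_S2S_SCOPE


def client_credentials_form(globals_: dict) -> str:
    """Form-encode the body of a client_credentials token request."""
    return urlencode({
        "grant_type": "client_credentials",
        "client_id": globals_["INTERNAL_CLIENT_ID"],
        "scope": s2s_scope(globals_),
    })


if __name__ == "__main__":
    print(client_credentials_form({"INTERNAL_CLIENT_ID": "internal-svc"}))
    print(client_credentials_form({
        "INTERNAL_CLIENT_ID": "internal-svc",
        "OIDC_S2S_SCOPE": "api://my-app/.default",
    }))
```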
The adminClient config was duplicated in the AWS and GCP selfhosted
overlays. Since every field uses globals (FLYTEADMIN_ENDPOINT,
INTERNAL_CLIENT_ID, AUTH_TOKEN_URL, OIDC_S2S_SCOPE), it belongs in
the base chart values so terraform doesn't need deep-merge overrides
just to set the S2S scope.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Entra rejects /.default for same-app authorization_code flows
(AADSTS90009). Browser login needs a specific delegated scope (/all)
while task pods need /.default for client_credentials.

Add OIDC_BROWSER_SCOPE to AWS and GCP overlay globals. The actual
userAuth.openId.scopes injection is still done by terraform until
the adminServer.auth block is fully extracted to base values.yaml.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@mhotan mhotan changed the title Add OAuth2 globals and documentation for non-Okta IdP support (FAB-178) Consolidate selfhosted auth config: globals, scopes, Entra ID support (FAB-178) Apr 18, 2026
Move all OIDC/OAuth2 globals and the full adminServer.auth block from
cloud-specific overlays to the base values.yaml. Every auth field is
now either a static default or a global variable, eliminating the need
for terraform deep merge overrides.

New globals: OIDC_BASE_URL, OIDC_CLIENT_ID, CLI_CLIENT_ID,
OIDC_METADATA_URL, OIDC_ALLOWED_AUDIENCE, OIDC_APP_SCOPE,
OIDC_APP_AUDIENCE, OIDC_BROWSER_SCOPE, OIDC_S2S_SCOPE,
OIDC_SUBJECT_CLAIM_NAMES, OIDC_IDENTITY_TYPE_CLAIMS,
INTERNAL_SUBJECT_ID.

Also moves configMap.union.connection.trustedIdentityClaims to base
using INTERNAL_SUBJECT_ID (defaults to INTERNAL_CLIENT_ID).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@mhotan mhotan force-pushed the mike/userclouds-defaults branch 6 times, most recently from 813bebf to 2e08849 Compare April 19, 2026 00:13
- Replace External-specific panels with backend-agnostic panels
- Backend Latency uses backend_authorize_duration_ms (works for all types)
- Backend Errors uses backend_authorize_errors (works for all types)
- Allow/Deny Rate now shows identity_type breakdown (user/app/external)
- Authorizer Mode shows authz_type_info{type="Union"}
- Consistent metric prefix (no type-specific sub-scope)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@mhotan mhotan force-pushed the mike/userclouds-defaults branch from 2e08849 to 26ecbd6 Compare April 19, 2026 00:17
Move all OIDC auth configuration from scattered globals into one
documented flyte.configmap.adminServer.auth block with inline
Okta/Entra ID examples.

- Deprecate auth globals: OIDC_BASE_URL, OIDC_CLIENT_ID, CLI_CLIENT_ID
- Remove new globals: OIDC_METADATA_URL, OIDC_APP_SCOPE, etc.
- Keep S2S globals: INTERNAL_CLIENT_ID, AUTH_TOKEN_URL, OIDC_S2S_SCOPE
- Auth block uses literal values, not tpl global references
- Authn modules output complete appAuth/userAuth blocks
- Authz templates accept both "Union" and "UserClouds" type
- Dashboard: standardized backend metrics, identity_type, V1+V2 removed

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

2 participants