feat(lens): Add SaFE SSO authentication support for Lens API #286
haishuok0525 wants to merge 30 commits into main
…se 1-6)
- Phase 1: Control Plane database architecture
  - Add ClusterManager extensions for Control Plane support
  - Implement DAL layer with GORM Gen
  - Create database migrations for auth tables
- Phase 2: System initialization and admin APIs
  - Add system initializer with SaFE detection
  - Implement root user creation flow
  - Add admin APIs for auth mode and password management
- Phase 3: Auth provider management APIs
  - Implement CRUD APIs for auth providers
  - Add provider configuration types (LDAP, OIDC)
  - Add system config management APIs
- Phase 4: LDAP provider implementation
  - Add LDAP connection pool with TLS support
  - Implement user search and authentication
  - Add attribute mapping and group membership
- Phase 5: Token sync adapter
  - Add TokenSyncService to sync SaFE tokens
  - Add TokenCleanupService for session cleanup
  - Extend bootstrap to initialize sync tasks
- Phase 6: Session management
  - Implement DB-based Session Manager
  - Add token generation, validation, and refresh
  - Add login audit service for security logging

- Add login/logout API handlers (POST /auth/login, POST /auth/logout)
- Add session refresh API (POST /auth/refresh)
- Add current user API (GET /auth/me)
- Implement SessionAuthMiddleware for session-based authentication
- Implement AdminAuthMiddleware for admin privilege checks
- Implement OptionalAuthMiddleware for optional auth
- Register auth routes in main router.go
- Add auth mode management (GetCurrentAuthMode, SetCurrentAuthMode)
- Add authentication error definitions
- Add CreateFromLDAP method to UserFacade

New routes (independent from existing APIs):
- POST /auth/login - Public, supports Local/LDAP auth
- POST /auth/logout - Public
- POST /auth/refresh - Public
- GET /auth/me - Requires session auth
- GET/PUT /auth/mode - Requires admin auth
- CRUD /auth/providers/* - Requires admin auth
- GET/POST /init/status, /init/setup - Public
- CRUD /configs/* - Requires admin auth
- POST /root/change-password - Requires admin auth

…s Secret
- Initialize auth system during API startup
- Auto-create root user if not exists
- Store generated password in Kubernetes Secret instead of logging
- Support multi-pod concurrent startup with database unique constraint
- Password priority: env var > existing secret > auto-generate
- Add InitializeAuthHandlers call to enable auth API endpoints

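The password priority above (env var > existing secret > auto-generate) could look roughly like the sketch below. `resolveRootPassword`, its parameters, and the 16-byte hex format are hypothetical; the PR's actual initializer is not shown on this page.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// resolveRootPassword is a hypothetical helper. It applies the priority
// env var > existing secret > auto-generate and reports whether the
// password was freshly generated (and so still has to be stored).
func resolveRootPassword(envValue, secretValue string) (string, bool, error) {
	if envValue != "" {
		return envValue, false, nil
	}
	if secretValue != "" {
		return secretValue, false, nil
	}
	// Assumption: 16 random bytes, hex encoded, as the generated password.
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", false, fmt.Errorf("generate root password: %w", err)
	}
	return hex.EncodeToString(buf), true, nil
}

func main() {
	pw, generated, err := resolveRootPassword("", "value-from-secret")
	fmt.Println(pw, generated, err)
}
```

Returning the `generated` flag lets the caller write to the Kubernetes Secret only when a new password was actually created.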
- Move auth initialization to preInit callback of InitServerWithPreInitFunc
- This ensures ClusterManager is initialized before accessing K8s client
- Fixes panic: cluster manager not initialized

…ation
- Add controlPlane.enabled config option in config.go
- Add NewControlPlaneConfigFromEnv() to read DB config from env vars
- Modify server.go to initialize Control Plane when enabled
- Update preInitAuthSystem to check Control Plane availability
- Skip auth initialization gracefully when Control Plane is disabled

Environment variables for Control Plane DB:
- CONTROL_PLANE_DB_HOST
- CONTROL_PLANE_DB_PORT (default: 5432)
- CONTROL_PLANE_DB_NAME
- CONTROL_PLANE_DB_USER
- CONTROL_PLANE_DB_PASSWORD
- CONTROL_PLANE_DB_SSL_MODE (default: require)

- Remove environment variable based config reading
- Add NewControlPlaneConfigFromSecret() to read DB config from K8s Secret
- Add ClusterManager.InitControlPlane() for delayed initialization
- server.go now: 1) init K8s client, 2) read secret, 3) init Control Plane
- Config only needs controlPlane.enabled flag, DB info auto-read from secret
- Default secret: primus-lens-control-plane-pguser-primus-lens-control-plane
- Default namespace: POD_NAMESPACE env or primus-lens

GORM error callback converts ErrRecordNotFound to nil, causing GetByUsername to return an empty user struct with nil error. Added check for non-empty user.ID to correctly detect existing users.
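To illustrate the fix, here is a minimal sketch that assumes a facade returning an empty struct and a nil error for a missing row, as the commit describes. `fakeUserFacade` and `userExists` are stand-ins, not the PR's types.

```go
package main

import "fmt"

// User mirrors only the fields needed for the check.
type User struct {
	ID       string
	Username string
}

// fakeUserFacade is a stand-in for the real facade. Like the behaviour
// described above, it returns an empty struct and a nil error when no
// row matches.
type fakeUserFacade struct {
	rows map[string]User
}

func (f fakeUserFacade) GetByUsername(name string) (*User, error) {
	u := f.rows[name] // zero value when the user is missing
	return &u, nil
}

// userExists treats a result as "found" only when the ID is populated,
// because a nil error alone does not prove that a row was returned.
func userExists(f fakeUserFacade, name string) (bool, error) {
	u, err := f.GetByUsername(name)
	if err != nil {
		return false, err
	}
	return u != nil && u.ID != "", nil
}

func main() {
	f := fakeUserFacade{rows: map[string]User{"root": {ID: "u-1", Username: "root"}}}
	fmt.Println(userExists(f, "root"))
	fmt.Println(userExists(f, "nobody"))
}
```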
- AuthModeNone now supports local authentication
- Root user can always login using local credentials regardless of auth mode
- LDAP/SaFE/SSO modes now allow root user fallback to local auth
- Unknown auth modes fall back to local authentication with warning

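The fallback rules above can be sketched as a single decision function. Only `AuthModeNone` is named in the commit; the other constants, their string values, the `"root"` username, and the backend labels are assumptions made for illustration.

```go
package main

import "fmt"

// AuthMode values other than AuthModeNone are hypothetical names.
type AuthMode string

const (
	AuthModeNone  AuthMode = "none"
	AuthModeLocal AuthMode = "local"
	AuthModeLDAP  AuthMode = "ldap"
	AuthModeSaFE  AuthMode = "safe"
	AuthModeSSO   AuthMode = "sso"
)

// resolveBackend returns which backend should check the credentials and
// whether a warning should be logged for an unknown mode.
func resolveBackend(mode AuthMode, username string) (backend string, warn bool) {
	// Root can always use local credentials, whatever the mode.
	if username == "root" {
		return "local", false
	}
	switch mode {
	case AuthModeNone, AuthModeLocal:
		return "local", false
	case AuthModeLDAP:
		return "ldap", false
	case AuthModeSaFE, AuthModeSSO:
		return "external", false
	default:
		// Unknown modes fall back to local authentication with a warning.
		return "local", true
	}
}

func main() {
	fmt.Println(resolveBackend(AuthModeLDAP, "alice"))
	fmt.Println(resolveBackend(AuthModeLDAP, "root"))
	fmt.Println(resolveBackend(AuthMode("kerberos"), "alice"))
}
```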
…to nil
- provider_handler.go: Add ID check for GetByID and GetByName calls
- login_handler.go: Add ID check for GetByUsername calls
- This fixes the Auth Provider creation bug where empty struct was incorrectly detected as existing provider
- Also improves authenticateLocal to return proper ErrUserNotFound when user doesn't exist

- New UserSyncService syncs users from SaFE User CRD to lens_users table
- User admin status determined by SaFE roles (system-admin, system-admin-readonly)
- User restricted status maps to disabled status in Lens
- Sync runs every 60 seconds
- Also fixed GORM callback issue in token_sync_service.go

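As a rough sketch of the mapping rules above: the two role names come from the commit message, while the `safeUser` struct and the `"active"`/`"disabled"` status strings are assumptions.

```go
package main

import "fmt"

// safeUser is a hypothetical, trimmed-down view of a SaFE User object.
type safeUser struct {
	Name       string
	Roles      []string
	Restricted bool
}

// isAdmin reports whether any role grants admin status in Lens.
func isAdmin(roles []string) bool {
	for _, r := range roles {
		if r == "system-admin" || r == "system-admin-readonly" {
			return true
		}
	}
	return false
}

// lensStatus maps the SaFE restricted flag to a Lens status string.
// The status values are placeholders for whatever lens_users stores.
func lensStatus(u safeUser) string {
	if u.Restricted {
		return "disabled"
	}
	return "active"
}

func main() {
	u := safeUser{Name: "alice", Roles: []string{"system-admin-readonly"}, Restricted: true}
	fmt.Println(isAdmin(u.Roles), lensStatus(u))
}
```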
Added corev1.AddToScheme in controller manager's init function to ensure core Kubernetes types (Secret, ConfigMap, etc.) are available when K8s client is created, before any preInit callbacks are called. This fixes the 'no kind is registered for the type v1.Secret in scheme' error when primus-safe-adapter tries to read Control Plane config from Secret.
corev1 is now registered by default in controller/manager.go init(), so app-specific schemes only need to register their custom types.
Added GetByNodeNameAndNamespaceNameIncludingDeleted and Recover methods to NodeNamespaceMappingFacade. Modified namespace_sync_service to check for soft-deleted mappings before creating new ones. If a soft-deleted mapping exists, it recovers the record instead of attempting to create a duplicate that would violate the unique constraint.
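The recover-before-create flow might be sketched as follows, using an in-memory map in place of the database. The `store` type and `ensure` method are hypothetical; the real code goes through NodeNamespaceMappingFacade.

```go
package main

import "fmt"

// mapping is a simplified row with a soft-delete flag.
type mapping struct {
	Node      string
	Namespace string
	Deleted   bool
}

// store stands in for the table; the map key plays the role of the
// unique constraint on (node, namespace).
type store struct {
	rows map[string]*mapping
}

func key(node, ns string) string { return node + "/" + ns }

// ensure looks up the mapping including soft-deleted rows. A deleted row
// is recovered instead of inserting a duplicate that would violate the
// unique constraint.
func (s *store) ensure(node, ns string) string {
	if m, ok := s.rows[key(node, ns)]; ok {
		if m.Deleted {
			m.Deleted = false
			return "recovered"
		}
		return "unchanged"
	}
	s.rows[key(node, ns)] = &mapping{Node: node, Namespace: ns}
	return "created"
}

func main() {
	s := &store{rows: map[string]*mapping{}}
	fmt.Println(s.ensure("node-1", "team-a"))
	s.rows[key("node-1", "team-a")].Deleted = true
	fmt.Println(s.ensure("node-1", "team-a"))
}
```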
…notations
- Add annotation key constants for primus-safe.user.email and primus-safe.user.name
- Extract email and display name from User CRD annotations during sync
- Update existing users if email or display name changed
- Include email and display name when creating new users

- Add SafeTokenCookieName constant for SaFE 'token' cookie
- Add getTokenFromRequest helper to extract token from multiple sources
- Token lookup priority: lens_session cookie -> SaFE token cookie -> Bearer header
- This enables users authenticated via SaFE SSO to access Lens APIs
- Works with primus-safe-adapter which syncs SaFE sessions to Lens DB

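The lookup priority can be sketched with the standard net/http package. This is an assumption-laden illustration: the PR's helper may take a framework context rather than `*http.Request`, `newRequest` exists only for the demo, and the SaFE cookie name uses the capital-T `Token` spelling that a later commit in this PR settles on.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

const (
	lensSessionCookieName = "lens_session"
	safeTokenCookieName   = "Token" // cookie names are case-sensitive
)

// getTokenFromRequest checks the lens_session cookie first, then the SaFE
// cookie, then an "Authorization: Bearer ..." header.
func getTokenFromRequest(r *http.Request) string {
	for _, name := range []string{lensSessionCookieName, safeTokenCookieName} {
		if c, err := r.Cookie(name); err == nil && c.Value != "" {
			return c.Value
		}
	}
	const prefix = "Bearer "
	if h := r.Header.Get("Authorization"); strings.HasPrefix(h, prefix) {
		return strings.TrimSpace(h[len(prefix):])
	}
	return ""
}

// newRequest is a demo helper that builds a request with the given
// cookies and an optional bearer token.
func newRequest(cookies map[string]string, bearer string) *http.Request {
	r := httptest.NewRequest(http.MethodGet, "/", nil)
	for name, value := range cookies {
		r.AddCookie(&http.Cookie{Name: name, Value: value})
	}
	if bearer != "" {
		r.Header.Set("Authorization", "Bearer "+bearer)
	}
	return r
}

func main() {
	r := newRequest(map[string]string{"Token": "safe-token"}, "header-token")
	fmt.Println(getTokenFromRequest(r))
}
```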
… format
- Add DisplayName field to SessionInfo struct
- Populate DisplayName from user record in Validate method
- Add CurrentUserResponse struct for /auth/me endpoint
- Use rest.SuccessResp for standard response format
- Return display_name alongside username for better UX

…llout
- Apply OptionalAuthMiddleware to /nodes/* routes
- Allows authenticated and unauthenticated access during migration
- Session info available in context when user is authenticated

- Add SessionAuthMiddleware to all business API route groups
- Protected routes: nodes, pods, clusters, workloads, storage, alerts, gpu-aggregation, job-history, ai-metadata, weekly-reports, detection-configs, profiler, tracelens, perfetto, registry, system-config, realtime, pyspy, github-workflow-metrics, github-runners
- Keep /detection-status/log-report public for telemetry-processor internal use
- All routes now require valid SaFE SSO session cookie for access

…ience
- Token sync: 30s -> 1s (near real-time token availability)
- User sync: 60s -> 5s (quick user info propagation)
- Users can now access Lens within seconds after SaFE login

…delay
- Retry session validation once after 2s delay if first attempt fails
- Handles race condition when user just logged in via SaFE but token hasn't synced to Lens yet
- Respects request context cancellation during retry wait
- Log retry attempts for debugging

SaFE uses 'Token' (capital T) as the cookie name, not 'token'. Cookie names are case-sensitive.
Pull request overview
This PR implements SaFE SSO cookie authentication support for the Lens API, enabling seamless single sign-on integration between SaFE and Lens systems. The implementation includes a comprehensive authentication system with database-backed user management, session handling, and LDAP support.
Changes:
- Added Control Plane database infrastructure with PostgreSQL support for user/session management
- Implemented SaFE SSO cookie authentication with 2-second retry logic for token sync delays
- Protected all business API routes with session authentication middleware while keeping telemetry endpoints public
Reviewed changes
Copilot reviewed 63 out of 65 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| modules/core/pkg/server/server.go | Reads DB config from K8s Secret after client initialization |
| modules/core/pkg/controlplane/database/* | Implements Control Plane database facades for users, sessions, auth providers |
| modules/core/pkg/controlplane/auth/* | Core authentication logic including session management, LDAP, and SafeDetector |
| modules/api/pkg/api/auth/* | Authentication API handlers and middleware with SaFE SSO support |
| modules/api/pkg/api/router.go | Applies session auth middleware to business routes |
| modules/core/pkg/clientsets/* | Control Plane database connection management |
Comments suppressed due to low confidence (1)
Lens/modules/core/pkg/controlplane/auth/ldap/provider.go:1
- The error return from crypto/rand.Read is ignored. If random number generation fails, the bytes slice will contain zeros, leading to weak or predictable IDs. Check and handle the error: if _, err := rand.Read(bytes); err != nil { return "", err }
// Copyright (C) 2025-2026, Advanced Micro Devices, Inc. All rights reserved.

    func (e ExtType) Value() (driver.Value, error) {
        b, err := json.Marshal(e)
        return *(*string)(unsafe.Pointer(&b)), err

Using unsafe.Pointer to convert []byte to string bypasses Go's type safety. The resulting string aliases the slice's backing array, so it stays valid only as long as nothing mutates 'b', and the cast depends on the internal layout of slice and string headers. Use the safe conversion: return string(b), err
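A minimal sketch of the safe variant suggested above. The definition of `ExtType` is not shown in this excerpt, so a JSON-serialisable map is assumed here.

```go
package main

import (
	"database/sql/driver"
	"encoding/json"
	"fmt"
)

// ExtType is assumed to be a JSON-serialisable map; the real definition
// lives in the generated ext_type.go and may differ.
type ExtType map[string]interface{}

// Value implements driver.Valuer using a plain string conversion, which
// copies the bytes instead of aliasing them through unsafe.Pointer.
func (e ExtType) Value() (driver.Value, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return nil, err
	}
	return string(b), nil
}

func main() {
	v, err := ExtType{"role": "admin"}.Value()
	fmt.Println(v, err)
}
```

The copy costs one allocation per call, which is usually negligible next to the database round trip.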

    func generateAuditID() string {
        bytes := make([]byte, 16)
        rand.Read(bytes)

The error return from crypto/rand.Read is ignored. If random number generation fails, audit IDs could collide, compromising audit integrity. Check and handle the error: if _, err := rand.Read(bytes); err != nil { return "" }

    - rand.Read(bytes)
    + if _, err := rand.Read(bytes); err != nil {
    +     log.Errorf("failed to generate audit ID: %v", err)
    +     return ""
    + }

    // randomHex generates a random hex string of the given length
    func randomHex(n int) string {
        bytes := make([]byte, n/2)
        rand.Read(bytes)

The error return from crypto/rand.Read is ignored. If random number generation fails, user IDs could collide, creating serious security issues. Check and handle the error: if _, err := rand.Read(bytes); err != nil { return "" }

    - rand.Read(bytes)
    + if _, err := rand.Read(bytes); err != nil {
    +     return ""
    + }

    func generateSessionID() string {
        bytes := make([]byte, 16)
        rand.Read(bytes)

The error return from crypto/rand.Read is ignored. If random number generation fails, session IDs could collide, allowing session hijacking. Check and handle the error: if _, err := rand.Read(bytes); err != nil { panic(err) }

    - rand.Read(bytes)
    + if _, err := rand.Read(bytes); err != nil {
    +     panic(fmt.Errorf("failed to generate session ID: %w", err))
    + }

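An alternative to returning "" or panicking is to propagate the error to the caller. The sketch below assumes hex encoding and a changed signature; the PR's generateSessionID returns only a string.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// generateSessionID returns a 32-character hex ID built from 16 random
// bytes, or an error if the random source fails. Returning the error
// lets the caller reject the login instead of issuing a weak ID.
func generateSessionID() (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", fmt.Errorf("failed to generate session ID: %w", err)
	}
	return hex.EncodeToString(buf), nil
}

func main() {
	id, err := generateSessionID()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(id))
}
```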
    log.Debugf("Session validation failed (attempt 1): %v, retrying in %v", err, sessionValidationRetryDelay)

    select {
    case <-time.After(sessionValidationRetryDelay):

The retry logic blocks the request for 2 seconds on every authentication failure, which could be exploited for DoS attacks by sending many invalid tokens. Consider implementing exponential backoff or limiting retries to specific error types (e.g., only retry on 'token not found' errors that might indicate sync lag, not on 'token expired' or 'invalid format' errors).
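One way to limit the retry to sync-lag cases is sketched below. The sentinel errors, `validateFunc`, and `runDemo` are hypothetical; the session manager's real error values are not shown in this PR excerpt.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Hypothetical sentinel errors standing in for the session manager's own.
var (
	errTokenNotFound = errors.New("token not found")
	errTokenExpired  = errors.New("token expired")
)

type validateFunc func(ctx context.Context, token string) error

// validateWithRetry retries once, and only when the first failure looks
// like sync lag (token not found). Other failures return immediately, so
// invalid or expired tokens do not hold the request open.
func validateWithRetry(ctx context.Context, token string, delay time.Duration, validate validateFunc) error {
	err := validate(ctx, token)
	if err == nil || !errors.Is(err, errTokenNotFound) {
		return err
	}
	timer := time.NewTimer(delay)
	defer timer.Stop()
	select {
	case <-timer.C:
		return validate(ctx, token)
	case <-ctx.Done():
		return ctx.Err() // request cancelled while waiting
	}
}

// runDemo feeds scripted results to the validator and reports how many
// times it was called. The last result repeats if more calls are made.
func runDemo(results ...error) (int, error) {
	calls := 0
	validate := func(ctx context.Context, token string) error {
		i := calls
		if i >= len(results) {
			i = len(results) - 1
		}
		calls++
		return results[i]
	}
	err := validateWithRetry(context.Background(), "tok", 10*time.Millisecond, validate)
	return calls, err
}

func main() {
	fmt.Println(runDemo(errTokenNotFound, nil))
	fmt.Println(runDemo(errTokenExpired))
}
```

A rate limit per client address would address the same DoS concern from a different angle; the two approaches are complementary.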
    func contains(s, substr string) bool {
        return len(s) >= len(substr) && (s == substr || len(s) > 0 && containsImpl(s, substr))
    }

Use strings.Contains from the standard library instead of implementing a custom contains function. The standard library function is well-tested and more efficient.

    // Write custom type file
    customFilePath := fmt.Sprintf("%s/ext_type.go", outPath)
    err = os.WriteFile(customFilePath, []byte(customTypeFileContent), 0644)

File permissions 0644 allow group and world read access to generated code files. Use 0600 or 0640 to restrict access since these files may contain sensitive database schema information.

    - err = os.WriteFile(customFilePath, []byte(customTypeFileContent), 0644)
    + err = os.WriteFile(customFilePath, []byte(customTypeFileContent), 0600)

- Add auto_register service to automatically enable safe mode from DB config
- Add session validator for direct SaFE DB validation
- Remove token sync (replaced by direct DB validation)
- Add SafeSetupConfig to init API for adapter_url and sso_url configuration
- Add new config keys: safe.adapter_url, safe.sso_url
- Add oidc package with types and safe_validator
- Add auth provider interfaces (local, ldap, oidc)
- Update constants with new config keys

- Removed oidc/router.go which referenced undefined NewHandlers()
- Removed import of oidc package from auth/router.go
- Authentication routes will use existing login handler

- Created middleware.go with SessionAuthMiddleware() and AdminAuthMiddleware()
- SessionAuthMiddleware validates session using global HandleAuth context
- AdminAuthMiddleware checks for admin/root user type

- Created initializer.go with Initializer type, NewInitializer, NewInitializerWithK8s
- Added SafeSetupConfig, InitializeOptions, InitializeResult types
- Added InitializeAuthHandlers function to init_handler.go

- Remove NewInitializerWithK8s, keep only NewInitializer with K8s client
- Remove NewSafeDetectorWithoutK8s - system always runs in K8s
- Simplify bootstrap.go initialization logic

SaFE stores user info in SSO/LDAP, not in database. Only UserToken table exists for session validation.
Backend changes for Safe authentication mode:
1. AuthConfigService (config_service.go):
   - Read auth mode from database with TTL cache
   - Support Safe/LDAP/Local/SSO modes
   - Get Safe config (adapter_url, login_url, callback_path)
2. Safe Login Handler (safe_login_handler.go):
   - GET /api/v1/auth/config - return auth configuration
   - GET /api/v1/auth/login - redirect to SaFE login (302)
   - Sanitize redirect URLs to prevent open redirect
3. Dynamic Auth Middleware (auth.go):
   - HandleDynamicAuth() reads config from database
   - Safe mode: validate Token cookie via primus-safe-adapter
   - Local/LDAP mode: validate lens_session via session manager
4. Router updates:
   - Register new auth endpoints
   - Use HandleDynamicAuth with configurable exclude paths

Related: safe-auth-design.md

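The commit mentions sanitizing redirect URLs but does not show how. The sketch below assumes one common policy: accept only same-site relative paths and fall back to a default otherwise. `sanitizeRedirect` and that policy are assumptions, not the handler's confirmed behaviour.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// sanitizeRedirect returns raw only when it is a relative path on the
// same site; anything else yields fallback.
func sanitizeRedirect(raw, fallback string) string {
	if raw == "" {
		return fallback
	}
	u, err := url.Parse(raw)
	if err != nil {
		return fallback
	}
	// Absolute and scheme-relative URLs carry a scheme or a host.
	if u.Scheme != "" || u.Host != "" {
		return fallback
	}
	// Require a single leading slash and reject backslashes, which some
	// browsers treat like forward slashes.
	if !strings.HasPrefix(raw, "/") || strings.HasPrefix(raw, "//") || strings.Contains(raw, "\\") {
		return fallback
	}
	return raw
}

func main() {
	fmt.Println(sanitizeRedirect("/nodes?cluster=a", "/"))
	fmt.Println(sanitizeRedirect("https://evil.example/", "/"))
}
```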
Summary
This PR implements SaFE SSO cookie authentication support for the Lens API, enabling seamless single sign-on between SaFE and Lens systems.
Changes
Authentication Middleware (modules/api/pkg/api/auth/middleware.go)
- Read the SaFE Token cookie (capital T)
Session Management (modules/core/pkg/controlplane/auth/session/)
- Add DisplayName field to SessionInfo struct
- Populate DisplayName from user record in Validate method
API Response (modules/api/pkg/api/auth/login_handler.go)
- Return display_name in /auth/me API response
- Use rest.SuccessResp format for consistent API responses
Route Protection (modules/api/pkg/api/router.go)
- Add SessionAuthMiddleware to all business API routes
- Keep /detection-status/log-report public for internal telemetry-processor use
Token/User Sync (modules/adapter/primus-safe-adapter/)
- Extract email and display_name from SaFE User CRD annotations
Testing
Related
This enables Lens to seamlessly authenticate users who are already logged into SaFE via SSO.