unionai · ppiegaze · Apr 10, 2026 · Apr 10, 2026 · Apr 17, 2026 · Apr 17, 2026
@@ -7,25 +7,48 @@ top_menu: true
 
 # Security
 
-Union.ai provides a production-grade workflow orchestration platform built on Flyte, designed for AI/ML and data-intensive workloads.
-Security is foundational to Union.ai’s architecture, not an afterthought.
-This document provides a comprehensive overview of Union.ai’s security practices, architecture, and compliance posture for enterprise security professionals evaluating the platform.
-
-Union.ai’s security model is built on several core principles:
-
-* **Data residency:** Customer data is stored and computed only within the customer's data plane. The Union.ai control plane stores only orchestration metadata—no task inputs, outputs, code, logs, secrets, or container images.
-* **Architectural isolation:** A strict separation between the Union-hosted control plane and the customer-hosted data plane ensures that the blast radius of any control plane compromise does not extend to customer data.
-* **Outbound only connectivity:** The Cloudflare Tunnel connecting the control plane to the data plane is outbound-only from the customer’s network, requiring no inbound firewall rules. All communication uses mutual TLS (mTLS) and is authenticated using the customer's Auth / SSO.
-* **Compliance:** Union.ai is SOC 2 Type II certified for Security, Availability, and Integrity, with practices aligned to ISO 27001 and GDPR standards. Union is designed to meet HIPAA compliance requirements for handling Protected Health Information (PHI) and maintains CIS 1.4 AWS certification while pursuing CIS 3.0 certification (in progress). The Union.ai trust portal can be found at [trust.union.ai](https://trust.union.ai)
-* **Defense in depth:** Multiple layers of encryption, authentication, authorization, and network segmentation protect data throughout its lifecycle.
-* **Human / operational isolation:** Union.ai personnel access the customer's control plane UI only through authenticated, RBAC-controlled channels. Personnel do not have IAM credentials for customer cloud accounts and cannot directly access customer data stores, secrets, or compute infrastructure. In BYOC deployments, Union.ai additionally has [K8s cluster management access](./byoc-differences#human-access-to-customer-environments).
+This section provides a comprehensive overview of Union.ai's security architecture, practices, and compliance posture for enterprise security professionals evaluating the platform.
+Beyond describing the security model, it provides concrete verification steps so that reviewers can independently confirm each claim against a running system.
+
+## Overview
+
+**[Architecture](./architecture/_index)**
+The system is divided into a control plane hosted by Union.ai and a data plane hosted on the customer's infrastructure.
+The only connections between the two planes are outbound-only routes from the customer data plane to the control plane.
+Consequently, no inbound firewall rules are required on the customer's network.
+
+**[Data protection](./data-protection/_index)**
+Bulk customer data items (files, DataFrames, code bundles, container images) are stored in the customer's data plane and never enter the control plane.
+Smaller inline data items (structured task inputs/outputs, secret values during creation, log streams) pass through control plane memory only transiently and are not persisted there.
+The control plane does persist orchestration and task metadata, but these are always encrypted at rest.
+
+**[Identity and access](./identity-and-access/_index)**
+Authentication is done via OIDC/SSO, API keys, and service accounts.
+Role-based access control enforces least privilege.
+Union.ai personnel cannot access customer data or secrets.
+
+**[Threat model](./threat-model)**
+This page analyzes potential threats and how they are mitigated.
+It examines control plane compromise, tunnel interception, and presigned URL leakage scenarios,
+and describes the architectural design and security controls that mitigate these risks.
+The goal is to demonstrate that even in worst-case scenarios, customer data remains protected.
+
+**[Compliance and governance](./compliance/_index)**
+Union.ai is SOC 2 Type II certified for Security, Availability, and Processing Integrity, with practices aligned to ISO 27001 and CIS benchmarks.
+The platform is designed to meet HIPAA requirements.
+Details are available in the Public Trust Center at [trust.union.ai](https://trust.union.ai).
+This includes organizational security practices, vulnerability management, and a shared responsibility model.
 
 ## Deployment models
 
-Union.ai offers two deployment models, both sharing the same control plane / data plane architecture and security controls described in this document.
+Union.ai offers two deployment models, both sharing the same control plane / data plane architecture and security controls described in this section.
 
-In **Self-Managed** deployments, the customer operates their data plane independently; Union.ai has zero access to the customer’s infrastructure, with the Cloudflare tunnel as the only connection.
+In **BYOC** deployments, Union.ai manages the data plane in the customer's cloud account via private connectivity (PrivateLink/PSC).
+Union.ai handles upgrades, monitoring, and provisioning, while maintaining strict separation from customer data, secrets, and logs.
 
-In **BYOC** deployments, Union.ai manages the Kubernetes cluster in the customer’s cloud account via private connectivity (PrivateLink/PSC), handling upgrades, monitoring, and provisioning while maintaining strict separation from customer data, secrets, and logs.
+In **Self-managed** deployments, the customer operates their data plane independently.
+The customer is responsible for all aspects of data plane management, including upgrades, monitoring, and provisioning.
+Union.ai has no access to the customer's infrastructure, with the Cloudflare Tunnel and gRPC connections being the only pathways between Union.ai and the customer's network
+(and even then, only outbound from the customer to Union.ai).
 
-The core security architecture—encryption, RBAC, tenant isolation, presigned URL data access, and audit logging—is identical across both models. Sections where operational responsibilities differ are noted inline. [BYOC deployment differences](./byoc-differences) provides a detailed comparison.
+For details, see [Deployment models](./architecture/deployment-models).
@@ -0,0 +1,26 @@
+---
+title: Architecture
+weight: 1
+variants: -flyte +union
+sidebar_expanded: true
+---
+
+# Architecture
+
+Union.ai's security architecture rests on a foundational division between the Union.ai-hosted control plane, which orchestrates execution, and the customer-hosted data plane, where all computation occurs and all customer data resides. The two planes are connected by an outbound-only route that requires no inbound firewall rules on the customer side.
+
+In the BYOC model, Union.ai manages the data plane over a private connection. In the self-managed model, the customer manages the data plane themselves. In both cases, the same security controls apply, and the same [data residency guarantees](../data-protection/classification-and-residency) hold.
+
+This section covers:
+
+* **[Two-plane separation](./two-plane-separation)**: The division between the Union.ai-hosted control plane and the customer-hosted data plane is the foundation of the security architecture.
+
+* **[Control plane](./control-plane)**: The control plane is the Union.ai-hosted orchestration component. It stores only orchestration and task metadata, which is encrypted at rest. Bulk data is referenced via signed URIs only; the actual bulk data never touches the control plane.
+
+* **[Data plane](./data-plane)**: The data plane runs entirely within the customer's cloud account. All computation occurs here and all customer data resides here. It uses workload identity federation (IRSA / Workload Identity / Azure Workload Identity) instead of static credentials, so no long-lived access keys are stored on the data plane.
+
+* **[Network architecture](./network)**: The data plane initiates all connections to the control plane via two outbound-only routes. There is no inbound attack surface on the customer's network, and therefore no inbound firewall rules are required.
+
+* **[Private connectivity (BYOC)](./private-connectivity)**: In the BYOC model, Union.ai manages the customer's Kubernetes cluster via PrivateLink, Private Service Connect, or Azure Private Link. The Kubernetes API is never exposed to the public internet.
+
+* **[Deployment models](./deployment-models)**: Self-managed and BYOC share the same two-plane architecture and security controls, differing only in who operates the data plane's Kubernetes cluster.
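
A reviewer could sanity-check the outbound-only claim above against an exported firewall or security group rule set. The sketch below assumes a hypothetical, simplified rule record (`direction`, `peer`, `port`); real cloud provider exports use different field names and would need to be mapped onto it first. It is an illustration of the check, not a Union.ai tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Simplified firewall rule; the field names are assumptions for illustration."""
    direction: str  # "inbound" or "outbound"
    peer: str       # remote CIDR block the rule applies to
    port: int


def inbound_rules(rules: list[Rule]) -> list[Rule]:
    """Return every rule that admits traffic into the data plane network."""
    return [r for r in rules if r.direction == "inbound"]


# Hypothetical export: only outbound HTTPS is allowed.
exported = [Rule("outbound", "0.0.0.0/0", 443)]
offending = inbound_rules(exported)
print("inbound rules found:", len(offending))
```

An empty result is consistent with the outbound-only design; any returned rule is worth reviewing, since it opens a path that the architecture described here does not need.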
@@ -0,0 +1,39 @@
+---
+title: Control plane
+weight: 2
+variants: -flyte +union
+---
+
+# Control plane
+
+The control plane is the Union.ai-hosted component that orchestrates task execution, manages user access, and provides the web interface. It runs on AWS infrastructure managed by Union.ai and is covered by Union.ai's SOC 2 Type II certification.
+
+## What it does and does not store
+
+The control plane stores the information required for orchestration:
+
+- **Orchestration metadata**: Identifiers, action state (phase, timestamps, cluster assignment), user profiles, and scheduling configuration.
+- **Task and run definitions**: Each run submission includes a full TaskSpec (container image, typed interface, resource requirements, security context) and a RunSpec (environment variables, labels, annotations). Trigger specs carry default input values for scheduled runs.
+- **Error and event information**: Error messages from task executions (which may contain customer data from Python tracebacks), Kubernetes event messages, and per-attempt plugin state.
+
+The control plane does not store:
+
+- **Bulk customer data payloads**: When the control plane references such data, it stores only URIs pointing to objects in the customer's object store (for example, `s3://customer-bucket/org/project/domain/run/action/output.pb`).
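
As a minimal sketch of what such a reference contains, the example URI above can be split with Python's standard library. The helper name `split_object_uri` is hypothetical and not part of any Union.ai SDK; it only illustrates that the stored value is a bucket name plus an object key, not the payload itself.

```python
from urllib.parse import urlparse


def split_object_uri(uri: str) -> tuple[str, str, str]:
    """Split an object-store URI into (scheme, bucket, key).

    Hypothetical helper for illustration; not part of any Union.ai SDK.
    """
    parsed = urlparse(uri)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"not an object-store URI: {uri!r}")
    # The parsed path starts with "/", which is not part of the object key.
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")


scheme, bucket, key = split_object_uri(
    "s3://customer-bucket/org/project/domain/run/action/output.pb"
)
print(scheme, bucket, key)
```

Reading the object behind the key still requires credentials or a presigned URL for the customer's bucket, which is why holding the URI alone does not expose the data.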
+
+For the full classification of what is and isn't stored in the control plane, the sensitive fields that may appear in task definitions, and how inline data (structured I/O, secret values during creation, log streams) transits control plane memory without being persisted, see [Data classification and residency](../data-protection/classification-and-residency).
+
+## Infrastructure
+
+The control plane runs on AWS with multi-AZ redundancy to ensure high availability. It uses managed cloud database services for orchestration metadata, task/run definitions, execution events, and error messages. All backends are encrypted at rest and isolated within a VPC with restricted security groups that permit access only from control plane application services. See [Encryption](../data-protection/encryption) for at-rest encryption details by data type.
+
+TLS terminates at the edge, and all internal communication occurs over encrypted channels. Automated backups run on a defined schedule with point-in-time recovery capability. Union.ai maintains disaster recovery procedures and applies security patches on a regular cadence. The SOC 2 Type II report covers the availability, security, and operational controls of this infrastructure.
+
+## Capabilities
+
+The control plane exposes the following capabilities:
+
+- **API and UI gateway** -- an authenticated HTTPS API and web console for users, the SDK, and the CLI. All requests are subject to authentication and RBAC enforcement before any orchestration logic runs.
+- **Scheduling and execution tracking** -- schedules TaskActions across registered data plane clusters and records execution state (phase transitions, timestamps, errors) reported back from the data plane.
+- **Cluster registry** -- maintains the inventory of registered data plane clusters and their health, and routes orchestration traffic accordingly.
+- **Data gateway** -- proxies structured task inputs and outputs between clients and the data plane object store, streams execution logs from the data plane to clients, and brokers presigned URL signing requests for bulk data access. See [Data flow](../data-protection/data-flow) for what these pathways carry and how data is handled in transit.
+