41 commits
a3d1bba
Add security-2 documentation section
ppiegaze Apr 10, 2026
3bac80e
Tighten security-2 prose and deduplicate content
ppiegaze Apr 10, 2026
a4eee09
Move security-2 back to security and replace old flat structure
ppiegaze Apr 17, 2026
b17e339
Add security-2 section based on annotated topic tree
ppiegaze Apr 17, 2026
c27e161
Move security-2 back to security and replace old flat structure
ppiegaze Apr 17, 2026
9dd77e7
Add missing H1 headings to security content pages
ppiegaze Apr 17, 2026
48f9b75
Rewrite security index: replace Core principles with Overview, swap O…
ppiegaze Apr 20, 2026
74930c2
Clarify Architecture and Data summaries in security index
ppiegaze Apr 21, 2026
3bd5b1a
Add audit annotations to security content (42 findings across 21 files)
ppiegaze Apr 21, 2026
b623001
Add explicit CP database field enumeration to audit annotations
ppiegaze Apr 21, 2026
93841ff
Remove trailing periods from section link headings in security index
ppiegaze Apr 21, 2026
67e6308
Restructure security index overview as bulleted sections
ppiegaze Apr 21, 2026
c91b391
Update security annotations for v2 architecture
ppiegaze Apr 21, 2026
2fa85f5
Remove v1/v2 contrast language from annotations
ppiegaze Apr 21, 2026
873f636
Rewrite security content to incorporate audit findings into prose
ppiegaze Apr 21, 2026
69bab5d
Add detailed encryption tables and field enumerations from audit
ppiegaze Apr 21, 2026
414af5a
Review and fix verification sections across all security pages
ppiegaze Apr 21, 2026
cfc2132
Fix stale terminology: State Service, workflow execution, registration
ppiegaze Apr 21, 2026
f65e947
Remove all em-dash (--) usage from security section
ppiegaze Apr 21, 2026
88f3718
Add optional customer-side support access section
ppiegaze Apr 21, 2026
9abf167
Restructure security section: eliminate Operations, promote Threat model
ppiegaze Apr 21, 2026
ea3b9e3
Merge branch 'main' into peeter/security-2
ppiegaze Apr 21, 2026
9367118
Merge remote-tracking branch 'origin/main' into peeter/security-2
ppiegaze Apr 28, 2026
3902236
Sanitize control plane implementation details and tighten security prose
ppiegaze Apr 28, 2026
222344b
Merge branch 'main' into peeter/security-2
ppiegaze Apr 28, 2026
fc2b3f3
Separate architecture from data residency in security docs
ppiegaze Apr 29, 2026
762cc43
Refactor security architecture intro and move CMK kill-switch test
ppiegaze Apr 29, 2026
a3e0b95
Generalize control plane Components into Capabilities
ppiegaze Apr 29, 2026
13981d8
Move control-plane Verification to canonical-claim pages
ppiegaze Apr 29, 2026
02e8584
Tighten data-plane proofreading nits
ppiegaze Apr 30, 2026
48863b0
Document second outbound channel (direct gRPC) and proofread network.md
ppiegaze Apr 30, 2026
b651369
Fix orchestration-traffic claim in private-connectivity.md
ppiegaze Apr 30, 2026
35c5964
Proofread deployment-models.md (two-channel followups)
ppiegaze Apr 30, 2026
dbcd969
Proofread all data-protection pages
ppiegaze Apr 30, 2026
d85522a
Proofread all identity-and-access pages
ppiegaze Apr 30, 2026
1c40adc
Rewrite threat-model as a thin index of scenarios
ppiegaze Apr 30, 2026
ccd722f
Proofread compliance section
ppiegaze Apr 30, 2026
20b0d34
clean up
ppiegaze Apr 30, 2026
5678f70
typo
ppiegaze Apr 30, 2026
3f559ee
Convert data-flow ASCII diagram and presigned URL table to bullet prose
ppiegaze Apr 30, 2026
d3b3f06
Split inline proxy prose into shorter paragraphs
ppiegaze Apr 30, 2026
55 changes: 39 additions & 16 deletions content/security/_index.md
@@ -7,25 +7,48 @@ top_menu: true

# Security

Union.ai provides a production-grade workflow orchestration platform built on Flyte, designed for AI/ML and data-intensive workloads.
Security is foundational to Union.ai’s architecture, not an afterthought.
This document provides a comprehensive overview of Union.ai’s security practices, architecture, and compliance posture for enterprise security professionals evaluating the platform.

Union.ai’s security model is built on several core principles:

* **Data residency:** Customer data is stored and computed only within the customer's data plane. The Union.ai control plane stores only orchestration metadata—no task inputs, outputs, code, logs, secrets, or container images.
* **Architectural isolation:** A strict separation between the Union-hosted control plane and the customer-hosted data plane ensures that the blast radius of any control plane compromise does not extend to customer data.
* **Outbound only connectivity:** The Cloudflare Tunnel connecting the control plane to the data plane is outbound-only from the customer’s network, requiring no inbound firewall rules. All communication uses mutual TLS (mTLS) and is authenticated using the customer's Auth / SSO.
* **Compliance:** Union.ai is SOC 2 Type II certified for Security, Availability, and Integrity, with practices aligned to ISO 27001 and GDPR standards. Union is designed to meet HIPAA compliance requirements for handling Protected Health Information (PHI) and maintains CIS 1.4 AWS certification while pursuing CIS 3.0 certification (in progress). The Union.ai trust portal can be found at [trust.union.ai](https://trust.union.ai)
* **Defense in depth:** Multiple layers of encryption, authentication, authorization, and network segmentation protect data throughout its lifecycle.
* **Human / operational isolation:** Union.ai personnel access the customer's control plane UI only through authenticated, RBAC-controlled channels. Personnel do not have IAM credentials for customer cloud accounts and cannot directly access customer data stores, secrets, or compute infrastructure. In BYOC deployments, Union.ai additionally has [K8s cluster management access](./byoc-differences#human-access-to-customer-environments).
This section provides a comprehensive overview of Union.ai's security architecture, practices, and compliance posture for enterprise security professionals evaluating the platform.
Beyond describing the security model, it provides concrete verification steps so that reviewers can independently confirm each claim against a running system.

## Overview

**[Architecture](./architecture/_index)**
The system is divided into a control plane hosted by Union.ai and a data plane hosted on the customer's infrastructure.
The only connections between the two planes are outbound-only routes from the customer data plane to the control plane.
Consequently, no inbound firewall rules are required on the customer's network.

**[Data protection](./data-protection/_index)**
Bulk customer data items (files, DataFrames, code bundles, container images) are stored in the customer's data plane and never enter the control plane.
Smaller inline data items (structured task inputs/outputs, secret values during creation, log streams) pass through control plane memory only transiently and are not persisted there.
The control plane does persist orchestration and task metadata, but these are always encrypted at rest.
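
The three handling classes described above can be sketched as a small lookup table. This is a minimal illustration, not Union.ai code: the category names and the `Handling` enum are hypothetical labels derived from the prose in this section.

```python
from enum import Enum


class Handling(Enum):
    """How a category of data is treated, per the summary above."""

    DATA_PLANE_ONLY = "stored in the data plane, never enters the control plane"
    TRANSIENT = "transits control plane memory, not persisted there"
    PERSISTED_ENCRYPTED = "persisted in the control plane, encrypted at rest"


# Hypothetical category names chosen for this sketch.
HANDLING_BY_CATEGORY = {
    "file": Handling.DATA_PLANE_ONLY,
    "dataframe": Handling.DATA_PLANE_ONLY,
    "code_bundle": Handling.DATA_PLANE_ONLY,
    "container_image": Handling.DATA_PLANE_ONLY,
    "structured_io": Handling.TRANSIENT,
    "secret_value_at_creation": Handling.TRANSIENT,
    "log_stream": Handling.TRANSIENT,
    "orchestration_metadata": Handling.PERSISTED_ENCRYPTED,
    "task_metadata": Handling.PERSISTED_ENCRYPTED,
}


def handling_for(category: str) -> Handling:
    """Return the handling class for a category, failing loudly if unclassified."""
    try:
        return HANDLING_BY_CATEGORY[category]
    except KeyError:
        raise ValueError(f"unclassified data category: {category!r}") from None
```

Raising on an unknown category, rather than defaulting, mirrors the intent that every data type has an explicit classification.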

**[Identity and access](./identity-and-access/_index)**
Authentication is done via OIDC/SSO, API keys, and service accounts.
Role-based access control enforces least privilege.
Union.ai personnel cannot access customer data or secrets.

**[Threat model](./threat-model)**
This page analyzes potential threats and how they are mitigated.
It examines control plane compromise, tunnel interception, and presigned URL leakage scenarios,
and describes the architectural design and security controls that mitigate each risk.
The goal is to demonstrate that even in worst-case scenarios, customer data remains protected.

**[Compliance and governance](./compliance/_index)**
Union.ai is SOC 2 Type II certified for Security, Availability, and Processing Integrity, with practices aligned to ISO 27001 and CIS benchmarks.
The platform is designed to meet HIPAA requirements.
Details are available in the Public Trust Center at [trust.union.ai](https://trust.union.ai).
This includes organizational security practices, vulnerability management, and a shared responsibility model.

## Deployment models

Union.ai offers two deployment models, both sharing the same control plane / data plane architecture and security controls described in this document.
Union.ai offers two deployment models, both sharing the same control plane / data plane architecture and security controls described in this section.

In **Self-Managed** deployments, the customer operates their data plane independently; Union.ai has zero access to the customer’s infrastructure, with the Cloudflare tunnel as the only connection.
In **BYOC** deployments, Union.ai manages the data plane in the customer's cloud account via private connectivity (PrivateLink/PSC).
Union.ai handles upgrades, monitoring, and provisioning, while maintaining strict separation from customer data, secrets, and logs.

In **BYOC** deployments, Union.ai manages the Kubernetes cluster in the customer’s cloud account via private connectivity (PrivateLink/PSC), handling upgrades, monitoring, and provisioning while maintaining strict separation from customer data, secrets, and logs.
In **Self-managed** deployments, the customer operates their data plane independently.
The customer is responsible for all aspects of data plane management, including upgrades, monitoring, and provisioning.
Union.ai has no access to the customer's infrastructure, with the Cloudflare Tunnel and gRPC connections being the only pathways between Union.ai and the customer's network
(and even then, only outbound from the customer to Union.ai).

The core security architecture—encryption, RBAC, tenant isolation, presigned URL data access, and audit logging—is identical across both models. Sections where operational responsibilities differ are noted inline. [BYOC deployment differences](./byoc-differences) provides a detailed comparison.
For details, see [Deployment models](./architecture/deployment-models).
26 changes: 26 additions & 0 deletions content/security/architecture/_index.md
@@ -0,0 +1,26 @@
---
title: Architecture
weight: 1
variants: -flyte +union
sidebar_expanded: true
---

# Architecture

Union.ai's security architecture rests on a foundational division between the Union.ai-hosted control plane, which orchestrates execution, and the customer-hosted data plane, where all computation occurs and all customer data resides. The two planes are connected by an outbound-only route that requires no inbound firewall rules on the customer side.

In the BYOC model, Union.ai manages the data plane over a private connection. In the self-managed model, the customer manages the data plane themselves. In both cases, the same security controls apply, and the same [data residency guarantees](../data-protection/classification-and-residency) hold.

This section covers:

* **[Two-plane separation](./two-plane-separation)**: The division between the Union.ai-hosted control plane and the customer-hosted data plane is the foundation of the security architecture.

* **[Control plane](./control-plane)**: The control plane is the Union.ai-hosted orchestration component. It stores only orchestration and task metadata, which is encrypted at rest. Bulk data is referenced via signed URIs only; the actual bulk data never touches the control plane.

* **[Data plane](./data-plane)**: The data plane runs entirely within the customer's cloud account. All computation occurs here and all customer data resides here. It uses workload identity federation (IRSA / Workload Identity / Azure Workload Identity) instead of static credentials, so no long-lived access keys are stored on the data plane.

* **[Network architecture](./network)**: The data plane initiates all connections to the control plane via two outbound-only routes. There is no inbound attack surface on the customer's network, and therefore no inbound firewall rules are required.

* **[Private connectivity (BYOC)](./private-connectivity)**: In the BYOC model, Union.ai manages the customer's Kubernetes cluster via PrivateLink, Private Service Connect, or Azure Private Link. The Kubernetes API is never exposed to the public internet.

* **[Deployment models](./deployment-models)**: Self-managed and BYOC share the same two-plane architecture and security controls, differing only in who operates the data plane's Kubernetes cluster.
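
A reviewer could check the "no inbound firewall rules" claim with a script along these lines. This is a sketch under stated assumptions: the `FirewallRule` shape is hypothetical and does not correspond to any cloud provider's API, and `10.0.0.0/16` stands in for the customer's own network range. Rules exported from a real environment would need to be mapped into this shape first.

```python
from dataclasses import dataclass
from ipaddress import ip_network
from typing import List


@dataclass(frozen=True)
class FirewallRule:
    """Hypothetical, provider-neutral view of one firewall rule."""

    direction: str  # "ingress" or "egress"
    action: str     # "allow" or "deny"
    peer_cidr: str  # remote range the rule applies to


def external_inbound_allows(rules: List[FirewallRule], internal_cidr: str) -> List[FirewallRule]:
    """Return ingress allow rules whose peer range lies outside the internal network."""
    internal = ip_network(internal_cidr)
    return [
        rule
        for rule in rules
        if rule.direction == "ingress"
        and rule.action == "allow"
        and not ip_network(rule.peer_cidr).subnet_of(internal)
    ]


rules = [
    FirewallRule("egress", "allow", "0.0.0.0/0"),    # outbound to the control plane
    FirewallRule("ingress", "allow", "10.0.1.0/24"),  # node-to-node traffic inside the network
]

# An empty result is consistent with an outbound-only design.
violations = external_inbound_allows(rules, "10.0.0.0/16")
```

The check deliberately ignores ingress rules scoped to the internal range, since intra-cluster traffic is not part of the control plane connection.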
39 changes: 39 additions & 0 deletions content/security/architecture/control-plane.md
@@ -0,0 +1,39 @@
---
title: Control plane
weight: 2
variants: -flyte +union
---

# Control plane

The control plane is the Union.ai-hosted component that orchestrates task execution, manages user access, and provides the web interface. It runs on AWS infrastructure managed by Union.ai and is covered by Union.ai's SOC 2 Type II certification.

## What it does and does not store

The control plane stores the information required for orchestration:

- **Orchestration metadata**: Identifiers, action state (phase, timestamps, cluster assignment), user profiles, and scheduling configuration.
- **Task and run definitions**: Each run submission includes a full TaskSpec (container image, typed interface, resource requirements, security context) and a RunSpec (environment variables, labels, annotations). Trigger specs carry default input values for scheduled runs.
- **Error and event information**: Error messages from task executions (which may contain customer data from Python tracebacks), Kubernetes event messages, and per-attempt plugin state.

The control plane does not store:

- **Bulk customer data payloads**: When the control plane references such data, it stores only URIs pointing to objects in the customer's object store (for example, `s3://customer-bucket/org/project/domain/run/action/output.pb`).
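
To make the distinction between a reference and a payload concrete, the following sketch splits the example URI above into its parts using only the Python standard library. The function name `split_object_uri` is hypothetical; the point is that the stored string identifies a location in the customer's bucket and carries none of the object's bytes.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_object_uri(uri: str) -> Tuple[str, str, str]:
    """Split an object-store URI into (scheme, bucket, key)."""
    parsed = urlparse(uri)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"not an object-store URI: {uri!r}")
    # The path keeps its leading slash; strip it to get the object key.
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")


scheme, bucket, key = split_object_uri(
    "s3://customer-bucket/org/project/domain/run/action/output.pb"
)
```

Reading the object itself still requires credentials or a presigned URL valid for the customer's object store.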

For the full classification of what is and isn't stored in the control plane, the sensitive fields that may appear in task definitions, and how inline data (structured I/O, secret values during creation, log streams) transits control plane memory without being persisted, see [Data classification and residency](../data-protection/classification-and-residency).

## Infrastructure

The control plane runs on AWS with multi-AZ redundancy to ensure high availability. It uses managed cloud database services for orchestration metadata, task/run definitions, execution events, and error messages. All backends are encrypted at rest and isolated within a VPC with restricted security groups that permit access only from control plane application services. See [Encryption](../data-protection/encryption) for at-rest encryption details by data type.

TLS terminates at the edge, and all internal communication occurs over encrypted channels. Automated backups run on a defined schedule with point-in-time recovery capability. Union.ai maintains disaster recovery procedures and applies security patches on a regular cadence. The SOC 2 Type II report covers the availability, security, and operational controls of this infrastructure.

## Capabilities

The control plane exposes the following capabilities:

- **API and UI gateway** -- an authenticated HTTPS API and web console for users, the SDK, and the CLI. All requests are subject to authentication and RBAC enforcement before any orchestration logic runs.
- **Scheduling and execution tracking** -- schedules TaskActions across registered data plane clusters and records execution state (phase transitions, timestamps, errors) reported back from the data plane.
- **Cluster registry** -- maintains the inventory of registered data plane clusters and their health, and routes orchestration traffic accordingly.
- **Data gateway** -- proxies structured task inputs and outputs between clients and the data plane object store, streams execution logs from the data plane to clients, and brokers presigned URL signing requests for bulk data access. See [Data flow](../data-protection/data-flow) for what these pathways carry and how data is handled in transit.
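
The ordering described for the API and UI gateway (authentication, then RBAC, then orchestration logic) can be sketched as follows. This is an illustrative model only: the token table, permission strings, and function names are hypothetical, and a real gateway would validate OIDC tokens or API keys rather than look up a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


class Unauthenticated(Exception):
    """Raised when the caller's credential is not recognized."""


class Forbidden(Exception):
    """Raised when an authenticated caller lacks the required permission."""


@dataclass(frozen=True)
class Principal:
    name: str
    permissions: FrozenSet[str]


# Hypothetical credential table standing in for real token validation.
KNOWN_TOKENS: Dict[str, Principal] = {
    "token-alice": Principal("alice", frozenset({"runs:read"})),
}


def authenticate(token: str) -> Principal:
    principal = KNOWN_TOKENS.get(token)
    if principal is None:
        raise Unauthenticated("unknown credential")
    return principal


def authorize(principal: Principal, permission: str) -> None:
    if permission not in principal.permissions:
        raise Forbidden(f"{principal.name} lacks {permission}")


def handle_request(token: str, permission: str, handler: Callable[[Principal], str]) -> str:
    """Run the handler only after both checks have passed."""
    principal = authenticate(token)   # 1. who is calling?
    authorize(principal, permission)  # 2. are they allowed to do this?
    return handler(principal)         # 3. orchestration logic runs last
```

Because both checks raise before the handler is invoked, a rejected request never reaches orchestration code.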
