
Add security-2 documentation section #907

Open
ppiegaze wants to merge 41 commits into main from peeter/security-2
@ppiegaze
Collaborator

Summary

  • Adds draft security-2/ section with 45 pages covering architecture, auth, compliance, encryption, network security, operations, secrets, and reference material
  • Organized into 8 subsections for the restructured security documentation

Test plan

  • Review content for accuracy
  • Verify Hugo build (make dist)
  • Check variant frontmatter on all pages
  • Validate internal links

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings April 10, 2026 09:39
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Apr 10, 2026

Deploying docs with Cloudflare Pages

Latest commit: d3b3f06
Status: ✅  Deploy successful!
Preview URL: https://319b0bf7.docs-dog.pages.dev
Branch Preview URL: https://peeter-security-2.docs-dog.pages.dev

ppiegaze and others added 2 commits April 15, 2026 13:53
Draft security documentation covering architecture, auth, compliance,
encryption, network security, operations, secrets, and reference material.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Rewrite security-2 section for conciseness (~26% line reduction) while
preserving all information. Replace duplicated tables/paragraphs with
cross-reference links, tighten prose to match user-guide style, and add
Cloudflare tunnel firewall configuration link.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@ppiegaze force-pushed the peeter/security-2 branch from 7a931bd to 3bac80e on April 15, 2026 11:53
ppiegaze and others added 20 commits April 17, 2026 12:28
Rename security-2/ to security/, replacing the original flat file
layout with the new organized subsection structure (architecture,
auth, compliance, keys, network, operations, reference, secrets).

Co-Authored-By: Claude Opus 4.6 <[email protected]>
New documentation section organized under five top-level categories:
- Architecture: two-plane separation, control/data plane, network, deployment models
- Data: classification/residency, data flow, encryption, secrets, workflow lifecycle
- Access: authentication, RBAC, tenant isolation, human access controls
- Compliance: certifications, HIPAA, GDPR, standards, shared responsibility
- Operations: logging/audit, vulnerability mgmt, threat modeling, org security

Each page includes prose content expanding the topic tree claims plus
Verification sections with reviewer focus ratings and concrete CLI
commands for security reviewers to independently confirm claims.

31 files total (6 section indexes + 25 content pages).

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Moves the new topic-tree-based security content from security-2/ into
security/, replacing the old flat subsection structure (auth/, keys/,
network/, secrets/, reference/) with the new five-category organization
(architecture/, data/, access/, compliance/, operations/).

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…perations/Compliance order

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Annotate security documentation with WARNING and NOTE callouts based on
source code audit of unionai/cloud and flyteorg/flyte-sdk repos. Key
findings: structured task I/O transits control plane memory, task
definition closures contain potentially sensitive fields, log streams
pass through unredacted, and tenant isolation has identified gaps.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Include task function names, workflow names, execution names, user
identity, and other identifier columns alongside the closure blob
contents already listed. Note encryption at rest (AES-256/KMS).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Break Data summary into three bullets distinguishing bulk data (never
enters CP), inline data (transits CP transiently), and CP database
metadata. Reformat all overview sections as heading + bullet list.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Replace v1 terminology (task definition closures, launch plan specs,
FlyteAdmin database) with v2 equivalents (TaskSpec/RunSpec blobs,
triggers, three CP databases). Note that v2 sends full TaskSpec inline
on every run and stores across PostgreSQL + 2x Cassandra.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Make the current architecture the default: no version labels or
contrasts with legacy behavior. Union.ai is simply Union.ai.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Replace 39 WARNING/NOTE callouts across 20 files with accurate inline
prose. Distinguish bulk data (never enters CP, presigned URLs), inline
data (transits CP memory, encrypted in transit, not persisted), and
metadata (stored in CP databases, encrypted at rest). Be precise about
encryption state at each phase.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- data-flow.md: per-hop encryption tables for all three data flow
  patterns, plus data flow path diagrams
- classification-and-residency.md: encryption columns in classification
  table (at rest, in transit, enters CP memory)
- control-plane.md: TaskSpec field enumeration table with sensitivity
  classifications
- secrets.md: per-phase encryption table for secret creation lifecycle
- encryption.md: comprehensive data protection summary table covering
  every data category across all phases

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- Remove importance labels (Critical/High/Medium/Low) from all 37
  verification headings across 25 files
- Fix verification steps that claimed "no customer data in CP" to
  reflect the three-tier model (bulk/inline/metadata)
- Update tunnel traffic verification to acknowledge structured I/O
  transits the tunnel (not just "metadata-sized" traffic)
- Update control-plane verification to reference TaskSpec field table
- Fix workflow-data-flow retrieval step to distinguish binary vs
  structured outputs

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
- "State Service" → "Actions Service" (actual code name)
- "workflow execution reaches a point" → "run requires a task to execute"
- "execution graph" → "run state"
- "Task registration" section merged into "Task deployment and run
  creation" (tasks are sent inline with each run, not registered
  separately)
- "register tasks" → "deploy tasks" in RBAC table

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Replace 103 occurrences of space-dash-dash-space across 27 files with
colons, parentheses, periods, or restructured sentences.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
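
A cleanup like the one in the commit above can be checked with a short script. The following is a minimal sketch, not the tooling used for the commit: the function names are hypothetical, and it assumes "space-dash-dash-space" means a double hyphen with exactly one space on each side, sitting between non-space characters (so CLI flags such as `--flag` are not counted).

```python
import re

# Assumption: a "spaced double dash" is " -- " between two non-space characters.
# Lookarounds keep neighbouring characters out of the match, so back-to-back
# separators such as "c -- d -- e" are each counted.
SPACED_DOUBLE_DASH = re.compile(r"(?<=\S) -- (?=\S)")


def count_spaced_double_dashes(text: str) -> int:
    """Return the number of ' -- ' separators in text."""
    return len(SPACED_DOUBLE_DASH.findall(text))


def lines_with_spaced_double_dashes(text: str) -> list[int]:
    """Return 1-based numbers of lines containing at least one ' -- ' separator."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if SPACED_DOUBLE_DASH.search(line)
    ]
```

Running both functions over each Markdown file before and after the change would give a per-file count to compare against the totals quoted in the commit message.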
Document the optional service where customers can grant Union.ai staff
time-limited RBAC access to their view of the system for
troubleshooting. Distinguish from BYOC K8s cluster management access.
Available for both self-managed and BYOC deployments.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Reorganize from 5 sections to 4 sections + 1 standalone page:

- data/ → data-protection/ (renamed; gains logging-and-audit from ops)
- access/ → identity-and-access/ (renamed)
- operations/ → eliminated (content distributed)
  - logging-and-audit → data-protection/
  - threat-modeling → threat-model.md (top-level, high visibility)
  - organizational-security → compliance/
  - vulnerability-management → compliance/
  - _index.md benefits table → architecture/_index.md
- compliance/ → "Compliance and governance" (gains 2 pages from ops)

New structure:
  Architecture (planes, network, deployment, tunnel)
  Data protection (classification, flow, encryption, secrets, logging)
  Identity and access (auth, RBAC, tenant isolation, human access)
  Threat model (standalone page, promoted for visibility)
  Compliance and governance (certs, HIPAA, GDPR, org security, vuln mgmt)

All cross-references updated. No broken links.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@ppiegaze added the labels security ("Security section. Do not merge") and do-not-merge ("PR is ready for review, but should not be merged just yet") on Apr 23, 2026
ppiegaze and others added 19 commits April 28, 2026 12:35
Replaces internal RPC method names, service identifiers, and protocol
framework references in security docs with audience-appropriate generic
terms (intentional names retained only in the Components section).
Refines factual claims on user records and task metadata fields against
source code, adds a verification step demonstrating customer key
authority over bulk data, and tightens several long sentences.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Establishes data-protection/ as the canonical source for data classification,
residency, encryption, and flow patterns. Architecture pages describe structure
(planes, components, network paths, deployment models) and link to data-protection
for residency facts rather than restating them.

- Trim residency restatements from architecture/control-plane.md, data-plane.md,
  network.md, two-plane-separation.md
- Replace specific datastore mentions in control-plane.md with generic phrasing
  and link to encryption page
- Standardize size limits to 10 MiB submission / 20 MiB retrieval (drop "MB"
  and "10-20 MiB" combined phrasings)
- Fix deployment-models.md "AES-256 for all data" claim
- Trim workflow-data-flow.md restatements with link to classification-and-residency
- Apply user's semantic-line-break edits to security/_index.md

Co-Authored-By: Claude Opus 4.6 <[email protected]>
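
The move from "MB" to "MiB" in the commit above matters because the two units differ by about 5%. The sketch below only illustrates that arithmetic: the constant and function names are hypothetical, and treating each limit as inclusive is an assumption, not something the commit states.

```python
MIB = 1024 * 1024  # one mebibyte (binary unit) in bytes
MB = 1000 * 1000   # one megabyte (decimal unit) in bytes

# Limits as phrased in the commit message; the names are illustrative only.
SUBMISSION_LIMIT_BYTES = 10 * MIB
RETRIEVAL_LIMIT_BYTES = 20 * MIB


def within_limit(size_bytes: int, limit_bytes: int) -> bool:
    """Return True if a payload size fits the limit (assumed inclusive)."""
    return 0 <= size_bytes <= limit_bytes
```

Under these assumptions a 10 MB payload (10,000,000 bytes) fits a 10 MiB limit (10,485,760 bytes), which is why mixing the two units in documentation can mislead readers near the boundary.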
- Rewrite architecture/_index.md: intro paragraph reflecting all six subsections,
  followed by a single bulleted list with one bullet per subsection
- Move "Customer authority over data" CMK-disable test from
  architecture/two-plane-separation.md to data-protection/encryption.md
  (where customer-managed keys belong topically)
- Remove the Verification section from architecture/two-plane-separation.md;
  the residency portion is covered (better) in
  data-protection/classification-and-residency.md
- Apply user's "What it does and does not store" restructuring in control-plane.md

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Replace the named internal microservice breakdown (Admin, Queue Service,
Actions Service, Cluster Service, DataProxy) with a capability-level list.
The internal microservice decomposition isn't directly verifiable by
reviewers (control plane runs on Union.ai infrastructure) and naming each
service discloses implementation detail without aiding security review.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Add protobuf schema review step (TaskTemplate, RunSpec) to
  data-protection/classification-and-residency.md, where the field
  enumeration lives
- Remove Verification section from architecture/control-plane.md;
  "What it stores" claims are verified in classification-and-residency,
  and Infrastructure / SOC 2 verification is already covered in
  compliance/certifications.md

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Move Data flow cross-reference to end of Components (was orphaned mid-list)
- Collapse duplicate workload identity description in Kubernetes security to a
  one-liner pointing at IAM and workload identity
- Trim duplicate Image Builder paragraph at the start of Container security
- Drop "operates as a standard Kubernetes controller" redundancy in Executor
- Drop "natural" qualifier in object store layout

Co-Authored-By: Claude Opus 4.6 <[email protected]>
A reviewer noted that the data plane initiates two distinct outbound-only
channels to the control plane, not one. Verified against unionai/cloud
source code: a Cloudflare Tunnel (cloudflared sidecar) and a separate
direct gRPC connection (operator/propeller dialing the regional cloudUrl).
The tunnel handles reverse-proxy traffic from CP to DP services
(DataProxy, log streaming, ingress); the gRPC channel handles
orchestration RPCs (cluster registration, action lifecycle, events,
catalog, admin).

network.md changes:
- Top-of-page intro mentions both channels
- New "Direct gRPC connection" section
- Tunnel traffic list corrected: "Orchestration instructions" and
  "State transitions" moved from tunnel to gRPC (they ride the gRPC
  channel, not the tunnel)
- Added "Apps & Serving ingress" to tunnel traffic list
- Communication paths table split into two cross-plane rows
- Verification reviewer focus expanded to cover both channels

data-plane.md: Tunnel Service component now notes the separate gRPC
connection alongside.

deployment-models.md: self-managed "only connection is Cloudflare Tunnel"
corrected to two outbound-only channels.

Also folds in earlier proofreading nits on network.md:
- Container images bypass via container registry pull, not presigned URLs
- Region table simplified (removed inconsistent placeholder Domain column)
- "VPN configuration is needed" -> "are needed"
- "simplified to permitting outbound HTTPS" rephrased
- "rotate implicitly" / "operator polling" rephrased to plain language
- Bidirectional tunnel-traffic note added before list
- Communication paths direction notation corrected (DP-initiated)
- Removed weak "request or build a tunnel audit mode" verification step

Co-Authored-By: Claude Opus 4.6 <[email protected]>
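
The two-channel split described in the commit above can be summarized as a lookup table. This is an illustrative sketch only: the enum, dictionary, and function names are hypothetical, and the traffic categories are copied from the commit message, not from the product's source code.

```python
from enum import Enum


class Channel(Enum):
    """The two outbound-only channels named in the commit message."""
    CLOUDFLARE_TUNNEL = "Cloudflare Tunnel"
    DIRECT_GRPC = "direct gRPC"


# Traffic categories as listed in the commit message (keys are illustrative labels).
TRAFFIC_CHANNEL = {
    # Reverse-proxy traffic from the control plane to data plane services
    "dataproxy": Channel.CLOUDFLARE_TUNNEL,
    "log streaming": Channel.CLOUDFLARE_TUNNEL,
    "ingress": Channel.CLOUDFLARE_TUNNEL,
    # Orchestration RPCs
    "cluster registration": Channel.DIRECT_GRPC,
    "action lifecycle": Channel.DIRECT_GRPC,
    "events": Channel.DIRECT_GRPC,
    "catalog": Channel.DIRECT_GRPC,
    "admin": Channel.DIRECT_GRPC,
}


def channel_for(traffic: str) -> Channel:
    """Look up which channel carries a traffic category (raises KeyError if unknown)."""
    return TRAFFIC_CHANNEL[traffic.strip().lower()]
```

A table like this makes the correction in network.md easy to audit: any page that places an orchestration category under the tunnel disagrees with the mapping.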
- Correct claim that orchestration traffic flows through the Cloudflare
  Tunnel; orchestration RPCs ride the direct gRPC channel. Defer data
  and orchestration details to network.md instead of restating
  (incorrectly) here.
- Add GCP/Azure equivalents to the AWS endpoint-listing verification
  command.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
- Common properties bullet now reflects both outbound channels (Cloudflare
  Tunnel and direct gRPC), not just the tunnel
- Resilience claim: "If the tunnel connection drops" -> "If either
  outbound channel drops" to match the two-channel reality
- Verification: rephrase "Simulate a control plane outage by disconnecting
  the tunnel" to "Simulate a connectivity disruption" since scaling down
  the Tunnel Service alone doesn't disable the direct gRPC channel
- Self-managed: clarify that the eliminated third-party access is to the
  data plane infrastructure specifically

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Across all 8 pages in content/security/data-protection/:

Factual / structural fixes:
- logging-and-audit.md weight: 1 -> 7 (was conflicting with classification-and-residency)
- Persisted logs no longer listed as living in S3 (they're in CloudWatch /
  Cloud Logging / Azure Monitor only)
- Generic "Data plane object store" replaces AWS-specific "Data plane S3"
- Outdated "Stackdriver" replaced with "Cloud Logging" throughout
- Stale cross-references to control-plane.md retargeted: field enumeration
  now lives in classification-and-residency.md
- "Two-plane separation" link replaced with "Network architecture" where
  the topic is network paths, and "Data flow" where the topic is log flow
- workflow-data-flow.md "two data flow patterns" -> "data flow patterns"
  (there are three)

Two-channel followups:
- encryption.md "Encryption in transit" lists both Cloudflare Tunnel and
  direct gRPC channels
- classification-and-residency.md orchestration metadata transit column:
  "TLS+mTLS+tunnel (events)" -> "TLS (gRPC events)" (events ride gRPC)
- encryption.md data protection summary: orchestration rows now show
  "TLS (gRPC)" instead of "TLS/mTLS/tunnel"
- multi-cloud.md cross-plane connectivity defers to network.md and
  acknowledges both channels

Tech generalization (matching control-plane.md treatment):
- All explicit "PostgreSQL" / "Cassandra" / "AWS RDS" references in
  control-plane storage tables removed; replaced with "control plane
  databases (AES-256/KMS)" or "managed cloud database service"
- "ClickHouse" replaced with "Observability metrics store (per-cluster)"
  in encryption-at-rest table

Other:
- _index.md long sentence split into bulleted list of three patterns
- encryption.md drops defensive "This is standard for any service that
  processes data"
- data-flow.md nested parens in bulk-data list flattened
- logging-and-audit.md audit verification section restructured: drops
  invented "union security audit" command, lists actual sources today
- logging-and-audit.md sentence-fragment "Self-service verification using
  existing features." replaced with standard phrasing
- secrets.md verification: competitor comparison moved out of the test
  step into a closing note

Co-Authored-By: Claude Opus 4.6 <[email protected]>
authentication.md:
- Drop "(Okta)" from the OIDC method row; any OIDC/SAML 2.0 provider works
- Rephrase confusing primary-IdP statement; customers configure their own
  identity provider
- Fix "a MFA prompt" -> "an MFA prompt"
- Standardize closing self-service note to match other pages

human-access.md:
- Disambiguate "control plane tenant" (was ambiguous between customer
  tenant and Union.ai-hosted infrastructure); clarify it's the Union.ai-
  hosted control plane
- Capitalize "Helm"
- Replace "the customer's own view of the system" with "the customer's
  tenant for troubleshooting"
- Scope "Access scope" section to the actual conditions under which Union.ai
  personnel access a customer's tenant (BYOC or optional support service);
  it had implied routine access
- Collapse redundant "cloud account or IAM roles, or access customer object
  stores..." into one list
- Self-managed verification: reflect both outbound channels (Cloudflare
  Tunnel and direct gRPC), not just the tunnel

rbac.md:
- Standardize "# Expect ..." comment alignment
- "a RBAC policy" -> "an RBAC policy"
- Standardize closing self-service note

tenant-isolation.md:
- Rename "## Isolation verification" -> "## Defense in depth" to avoid
  two H2s called "verification"
- Lowercase "protobuf" (it's the term, not a proper name)

Co-Authored-By: Claude Opus 4.6 <[email protected]>
The previous version restated impact analysis from Architecture, Encryption,
and Data flow, which caused drift (size limits, single-channel framing,
stale "see X for verification" links pointing at sections that had been
removed or moved).

The rewrite enumerates principal threat scenarios as one-paragraph framings
with links to the canonical pages where the controls and verification live.
Coverage expanded from 3 scenarios to 5: control plane compromise, cross-
plane network interception, presigned URL leakage, secret exfiltration,
and cross-tenant data access.

The Verification section is removed; verification now lives entirely on
each canonical page (no more "see referenced sections" indirection).

Page stays at top-level so it remains where security reviewers expect to
find a threat-modeling artifact.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
Weight collisions and order:
- vulnerability-management.md weight 2 -> 7 (was colliding with hipaa.md)
- organizational-security.md weight 4 -> 6 (was colliding with standards.md)
- Order now matches the section index in _index.md

certifications.md:
- SOC 2 Type I trust criteria: "Integrity" -> "Processing Integrity"
  (the official trust service criterion name; matches Type II)
- "70+ verified controls" -> "73 verified controls" (exact)

gdpr.md:
- EU Central -> EU Central (Frankfurt) for symmetry with the other regions
- Stale verification link to two-plane-separation retargeted to
  data-protection/classification-and-residency (which now owns the
  residency verification)
- "For details" link similarly retargeted

hipaa.md:
- Drop "container images" from the bulk-PHI list (container images aren't
  typical PHI containers)
- Pronoun agreement: "If these contain PHI, it would be persisted" ->
  "If they contain PHI, they would be persisted"
- Restructured to consolidate the duplicate residency claim that
  previously appeared in both opening and closing paragraphs
- Stale verification link retargeted to classification-and-residency

standards.md:
- "complies with" -> "aligns with" (ISO 27001 is not a current Union.ai
  certification, per certifications.md)
- Corrected ISO 27001:2022 control titles:
  * A.8.20 "Network security" -> "Networks security" (official plural)
  * Replaced A.8.28 "Secure configuration" (actual title is "Secure
    coding") with A.8.22 "Segregation of networks" (which actually fits
    the described control about management plane separation)
  * A.8.21 "Cryptography" -> A.8.24 "Use of cryptography" (8.21 is
    actually "Security of network services"; cryptography is 8.24)
  * A.5.23 "Cloud service security" -> "Information security for use of
    cloud services" (official title)
- Replaced specific CIS v8 sub-control numbers (4.4, 12.11, 13.2 -- some
  of which I couldn't verify against the official CIS Controls v8) with
  alignment to top-level CIS controls 12 and 13

vulnerability-management.md:
- Cloudflare row "Tunnel connectivity" -> "Cross-plane connectivity
  (Tunnel and gRPC ingress)" to reflect that both outbound channels
  terminate at Cloudflare's edge

architecture/private-connectivity.md (consistency with standards.md):
- Same ISO/CIS control title corrections applied

Co-Authored-By: Claude Opus 4.6 <[email protected]>
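
Weight collisions like the ones fixed in the commit above can be caught mechanically. The sketch below is a hedged example, not part of the PR: the function name is hypothetical, and it assumes each page carries a top-level `weight: N` line in its frontmatter. It takes page text as input, so reading the files is left to the caller.

```python
import re
from collections import defaultdict

# Assumption: the weight appears on its own line as "weight: <integer>".
WEIGHT_LINE = re.compile(r"^weight:\s*(\d+)\s*$", re.MULTILINE)


def find_weight_collisions(pages: dict[str, str]) -> dict[int, list[str]]:
    """Map each weight used by more than one page to the sorted page names."""
    by_weight: dict[int, list[str]] = defaultdict(list)
    for name, text in pages.items():
        match = WEIGHT_LINE.search(text)
        if match:  # pages without a weight line are ignored
            by_weight[int(match.group(1))].append(name)
    return {
        weight: sorted(names)
        for weight, names in by_weight.items()
        if len(names) > 1
    }
```

Called once per section directory, a check like this would have flagged the hipaa.md and standards.md collisions before review.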
- data-flow.md: replace the 3-pattern ASCII diagram with three prose
  bullet points describing each pattern's flow and encryption
- data-flow.md: convert the presigned URL phase table to bullets
- Apply user edits to deployment-models.md (BYOC ordered before
  Self-managed) and data-protection/_index.md

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Labels

do-not-merge: PR is ready for review, but should not be merged just yet
security: Security section. Do not merge
