Add design for managing Kopia repositories via BSL and new BSLR #1827
base: master
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: mpryc
One thing that isn't clear from the design doc is what changes may be needed upstream to allow BackupRepositories to be created and managed from outside Velero. I don't think Velero has any pluggability here currently, but if you want two different VM backups in the same namespace and BSL to use different Kopia repositories, then I would think some Velero-level integration would be required.
## Background

The current architecture of OADP tightly couples each BackupStorageLocation (BSL) with a single Kopia repository. This repository is provisioned and controlled entirely by Velero’s core components. While this setup is adequate for standard backup scenarios, it introduces significant limitations when more flexible or granular configurations are needed.
Kopia repos are scoped to BSL+namespace, not just BSL.
@sseago true, will update.
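The scoping point can be sketched in Go. This is only an illustration, not Velero code: the `repoKey` type and its `name` method are hypothetical, and the `<namespace>-<BSL>-<type>` pattern simply mirrors names such as `ns1-default-kopia` that appear elsewhere in this conversation.

```go
package main

import "fmt"

// repoKey is a hypothetical type (not part of Velero or OADP) that models
// the scope of a repository: volume namespace + BSL + repository type.
type repoKey struct {
	VolumeNamespace string
	BSLName         string
	RepositoryType  string
}

// name builds an identifier following the "<namespace>-<BSL>-<type>"
// pattern used in this thread's examples, e.g. "ns1-default-kopia".
func (k repoKey) name() string {
	return fmt.Sprintf("%s-%s-%s", k.VolumeNamespace, k.BSLName, k.RepositoryType)
}

func main() {
	// Two workloads in the same namespace, backed by the same BSL.
	vmA := repoKey{VolumeNamespace: "ns1", BSLName: "default", RepositoryType: "kopia"}
	vmB := repoKey{VolumeNamespace: "ns1", BSLName: "default", RepositoryType: "kopia"}
	// A workload in a different namespace, same BSL.
	other := repoKey{VolumeNamespace: "ns2", BSLName: "default", RepositoryType: "kopia"}

	fmt.Println(vmA.name(), vmA == vmB)     // same key: both share one repository
	fmt.Println(other.name(), vmA == other) // different namespace: separate repository
}
```

Under this model, two VMs in one namespace cannot be told apart by the key alone, which is the limitation the multiple-repositories goal is aimed at.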
## Goals

* **Support Multiple Repository Instances per BSL**
"Support Multiple Repository Instances per BSL in the same namespace"
woot! very excited about this one, thank you @mpryc
@sseago currently, the design does not include any pluggability within Velero itself. The main goal is to enable managing Kopia repositories independently from Velero by using OADP. Specifically, OADP will handle initializing and managing Kopia repositories through functions that interact with the Velero codebase via the OADP controller. This creates an abstraction layer allowing users to access and manage Kopia server repositories outside of Velero. Another goal is to give access to Kopia repositories outside of Velero, for example to allow retrieving files with the Kopia CLI.
@mpryc How do we keep Velero and OADP from stepping on each other if they're both trying to manage Kopia repos? For example, if we have 2 VMs in the same namespace and they need to be stored in different Kopia repos, how does that work if Velero creates one repo and uses it for both? I have a feeling I'm missing something here.
Thank you for this proposal! After analyzing the design and comparing it with Velero's current BackupRepository implementation, I have some findings and questions that need clarification.

### Current Understanding

Based on the clarification that "the design does not include any pluggability within Velero itself" and that OADP will manage Kopia repositories independently:

### Current Velero Architecture

In Velero today, BackupRepository and BackupStorageLocation have a many-to-one relationship:

graph TB
subgraph "Current Velero Architecture"
BSL["BackupStorageLocation"]
BR1["BackupRepository: ns1-default-kopia"]
BR2["BackupRepository: ns2-default-restic"]
BR3["BackupRepository: ns3-default-kopia"]
BR1 -->|References| BSL
BR2 -->|References| BSL
BR3 -->|References| BSL
BR1 -.->|Stores data in| S3[("S3/Azure/GCS Bucket")]
BR2 -.->|Stores data in| S3
BR3 -.->|Stores data in| S3
BSL -->|Points to| S3
end
style BSL fill:#1976d2,stroke:#0d47a1,stroke-width:2px,color:#fff
style BR1 fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
style BR2 fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
style BR3 fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
style S3 fill:#616161,stroke:#424242,stroke-width:2px,color:#fff
Key characteristics:
### Proposed OADP BSLR Architecture

Based on my understanding, BSLR operates as a parallel system:

graph TB
subgraph "OADP Layer"
OADP["OADP Controller"]
BSLR1["BSLR: vm-backups"]
BSLR2["BSLR: database-backups"]
OADP -->|Manages| BSLR1
OADP -->|Manages| BSLR2
OADP -->|Initializes| KopiaRepos[("Independent Kopia Repositories")]
end
subgraph "Velero Layer - Unchanged"
Velero["Velero"]
BSL["BackupStorageLocation"]
BR["BackupRepository"]
Velero -->|Uses| BSL
Velero -->|Creates| BR
BR -->|References| BSL
end
subgraph "External Access"
KopiaCLI["Kopia CLI"]
KopiaCLI -.->|Direct access| KopiaRepos
end
OADP -.->|Interacts with| Velero
BSLR1 -.->|Maps to| BSL
BSLR2 -.->|Maps to| BSL
style OADP fill:#00796b,stroke:#004d40,stroke-width:2px,color:#fff
style BSLR1 fill:#388e3c,stroke:#1b5e20,stroke-width:2px,color:#fff
style BSLR2 fill:#388e3c,stroke:#1b5e20,stroke-width:2px,color:#fff
style Velero fill:#1976d2,stroke:#0d47a1,stroke-width:2px,color:#fff
style BSL fill:#1565c0,stroke:#0d47a1,stroke-width:2px,color:#fff
style BR fill:#f57c00,stroke:#e65100,stroke-width:2px,color:#fff
style KopiaCLI fill:#5d4037,stroke:#3e2723,stroke-width:2px,color:#fff
style KopiaRepos fill:#424242,stroke:#212121,stroke-width:2px,color:#fff
### Critical Questions

#### 1. Data Flow and Integration

How does OADP intercept or redirect the actual backup data flow? Since Velero will continue creating its own BackupRepository objects and managing backups as usual, how does OADP ensure that data goes to the appropriate BSLR-managed repository instead?

#### 2. Namespace Conflict Resolution

As @sseago pointed out: If two VMs in the same namespace need different Kopia repositories, but Velero creates one repository per namespace-BSL combination, how does OADP handle this conflict? For example:

#### 3. Repository Initialization Race Condition

Who initializes the Kopia repositories - OADP or Velero? If both systems try to initialize repositories in the same storage location, how are conflicts avoided?

#### 4. Backup Operation Flow

During

sequenceDiagram
participant User
participant Velero
participant OADP
participant Storage
User->>Velero: velero backup create
Velero->>Velero: Create Backup CR
Note over Velero: Creates PodVolumeBackup CRs
rect rgb(200, 50, 50)
Note right of Velero: UNCLEAR: How does OADP<br/>intercept this flow?
Velero-->>OADP: ???
OADP-->>Storage: Use BSLR repository?
end
alt Current Understanding
Velero->>Storage: Writes to namespace-based repo
else Proposed BSLR
OADP->>Storage: Writes to workload-based repo
end
#### 5. Repository Mapping

How does OADP map between:

#### 6. Credential Management

If BSLR repositories have independent credentials, but Velero is still performing the actual backup operations, how are the BSLR credentials injected into the backup process?

### Suggested Clarifications

This would help address the "stepping on each other" concern and clarify how these two systems can coexist.

### Summary

The core architectural question is: How do OADP and Velero coexist without conflicts when they have fundamentally different repository models?

Understanding this interaction is crucial for evaluating the design's feasibility and implementation approach.
This design introduces the BackupStorageLocationRepository (BSLR) as a new custom resource that models and manages Kopia repositories on a per-BSL basis.

Signed-off-by: Michal Pryc <[email protected]>
Velero and the new OADP BSLR mechanism avoid stepping on each other by clearly separating responsibility through the use of default vs non-default BackupStorageLocationRepository (BSLR) objects. Velero continues to manage the default BSLR (associated with the default BSL), while OADP only manages non-default BSLRs. In those cases, BSLR acts as a pointer to the Kopia repository, but OADP takes over orchestration and lifecycle management. This separation is handled explicitly in the logic: and
Velero itself will not create or manage the Kopia repositories for the above use case in this design. That responsibility lies with the OADP controller, via the new BackupStorageLocationRepository (BSLR) custom resource. So to address your example: if you have 2 VMs in the same namespace and want them to be stored in separate Kopia repositories, you would define 2 separate BSLRs (each tied to the same or a different BSL, depending on your setup). Each VM would then reference its own BSLS, ensuring separation at the repository level. The BSLS is another layer, covered in a separate design: #1830. Another possibility is to have one BSLR with one BSLS and the users specified within the BSLS. Each VM user gets their own credentials, and the Kopia repository separates access to the backups (snapshots) for them.
@mpryc: all tests passed!
@kaovilai Thank you for your detailed review. I've added an additional design proposal that may address some of your concerns and complements the current BSLR design: #1830. This is a separate design, but it is intended to work in tandem with the current BSLR approach. Below are answers to your questions. Because your review was made outside of the code, I will copy some of your points inline:
All three of those points are correct, meaning your understanding is accurate, which suggests the design is self-explanatory.
First of all, let's drop restic from the equation, as it is not relevant to current or future OADP, nor to the BSLR (as per the design). Your description of the current Velero architecture is essentially correct; the BSLR will be similar to BackupRepository, just created and managed by the OADP controller. We do not want to reuse Velero's CR for this parallel mechanism. It would also be possible to use BackupRepository and Velero as the management mechanism and drop the BSLR altogether, but that would require the BSLS (from the other design) to point to a BR and to have its own naming scheme so that the two do not step on each other. In that case, repository password management per BR would also need to be included.
Yes, you are correct: it is very much a parallel mechanism, similar to BR.
OADP does not redirect or intercept the actual backup data flow managed by Velero. Instead, it introduces the option for cluster administrators to explicitly enable the BackupStorageLocationServer (BSLS) for a given BackupStorageLocationRepository (BSLR), as outlined in the BSLS design proposal. Velero continues to manage its own backup lifecycle as usual, and any BSLR-enabled repository is managed in parallel by OADP. OADP does not override or influence which repository Velero writes to. Administrators could optionally configure access control (ACL) to allow read-only access to a repository created by Velero, for example enabling users to restore specific files without allowing them to write backups. However, such ACL support is outside the scope of the current design. In short, it is up to the cluster administrator to enable BSLS for selected BSLRs and manage access credentials accordingly.
With the new OADP design:
For example:
Velero will still perform the backup and restore operations, but it will operate against repositories that Velero itself created and that are represented by BackupRepository objects. OADP provisions the repositories referenced by BSLRs and makes them available for backup/restore via the BSLS proxy mechanism. Velero does not know about the BSLRs directly; OADP configures and controls access to them, resolving the namespace-level conflict externally.
OADP does not rely on Velero’s namespace-based repository model (namespace-bsl-kopia) for repository management. Instead, it decouples repository creation and selection from Velero entirely by introducing a new model:
cool cool, interested in seeing the updates we discussed today. Thank you @mpryc really cool work here!
This design introduces the BackupStorageLocationRepository (BSLR) as a new custom resource that models and manages Kopia repositories on a per-BSL basis.
Why the changes were made
This PR proposes a new design that provides a clear separation between BSL config and repository state, and enables early provisioning of Kopia repos before backups run. In the future, this will allow a BSL Server to be built on top of BSLR.
How to test the changes made
Read the design.