
Conversation

@Allda (Collaborator) commented Oct 17, 2025

A new backup controller orchestrates a backup process for workspace PVCs. A new configuration option is added to DevWorkspaceOperatorConfig that enables a recurring cronjob responsible for the backup mechanism. The job executes the following steps:

  • Find all workspaces
  • Determine whether a workspace has recently been stopped
  • Detect the workspace PVC
  • Execute a Job in the same namespace that performs the backup

The last step is not yet fully implemented, as it requires running buildah inside the container; it will be delivered as a separate feature.
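For orientation, here is a minimal, hypothetical sketch of that flow. The helper names (runBackupPass, findStoppedWorkspaces, findWorkspacePVC, newBackupJob) are illustrative only and are not the actual implementation in this PR:

// Hypothetical sketch of the cron-triggered backup pass described above.
// Helper names are illustrative; the real controller code differs.
func (r *BackupCronJobReconciler) runBackupPass(ctx context.Context) error {
	workspaces, err := r.findStoppedWorkspaces(ctx) // workspaces stopped since the last backup
	if err != nil {
		return err
	}
	for _, ws := range workspaces {
		pvc, err := r.findWorkspacePVC(ctx, ws) // common (per-user) or per-workspace PVC
		if err != nil {
			return err
		}
		// The Job runs in the workspace namespace and archives the PVC contents.
		job := newBackupJob(ws, pvc)
		if err := r.Client.Create(ctx, job); err != nil && !k8serrors.IsAlreadyExists(err) {
			return err
		}
	}
	return nil
}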

Issue: eclipse-che/che#23570

What does this PR do?

What issues does this PR fix or reference?

Is it tested? How?

The feature has been tested locally and with integration tests. The following configuration should be added to enable this feature:

config:                                                                         
  workspace:                                                                    
    backupCronJob:                                                              
      enable: true                                                              
      registry: kind-registry:5000/backup                                       
      schedule: '* * * * *'

After the config is added, stop any workspace and wait until a backup job is created.

$ kubectl get jobs
devworkspace-backup-2l679   Running    0/1           138m       138m
devworkspace-backup-2xvgl   Running    0/1           139m       139m
devworkspace-backup-45vxb   Running    0/1           145m       145m

The job creates a backup and pushes the image to the registry:

+ set -e
+ exec /workspace-recovery.sh --backup
+ set -e
+ for i in "$@"
+ case $i in
+ backup
+ BACKUP_IMAGE=kind-registry:5000/backup/backup-default-common-pvc-test:latest
++ buildah from scratch
+ NEW_IMAGE=working-container
+ buildah copy working-container /workspace/workspacedfd9f53065ea452c//projects /
f099c09f924cf051a01d78cd34ca87a4c161d7c217df5ac627e90e66926fbe9f
+ buildah config --label DEVWORKSPACE=common-pvc-test working-container
+ buildah config --label NAMESPACE=default working-container
+ buildah commit working-container kind-registry:5000/backup/backup-default-common-pvc-test:latest
Getting image source signatures
Copying blob sha256:137b2a0909654325b7eff0a9dfe623e5abdc685c4d6ad8e4c8d163e0984cf805
Copying config sha256:86693ca728855121a4dce059d91c6c9a196b4611fea4cb17d7b38015310cf193
Writing manifest to image destination
86693ca728855121a4dce059d91c6c9a196b4611fea4cb17d7b38015310cf193
+ buildah umount working-container
+ buildah push --tls-verify=false kind-registry:5000/backup/backup-default-common-pvc-test:latest
Getting image source signatures
Copying blob sha256:137b2a0909654325b7eff0a9dfe623e5abdc685c4d6ad8e4c8d163e0984cf805
Copying config sha256:86693ca728855121a4dce059d91c6c9a196b4611fea4cb17d7b38015310cf193
Writing manifest to image destination
stream closed: EOF for default/devworkspace-backup-zjzk5-82psq (backup-workspace)

PR Checklist

  • E2E tests pass (when PR is ready, comment /test v8-devworkspace-operator-e2e, v8-che-happy-path to trigger)
    • v8-devworkspace-operator-e2e: DevWorkspace e2e test
    • v8-che-happy-path: Happy path for verification integration with Che

openshift-ci bot commented Oct 17, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Allda
Once this PR has been reviewed and has the lgtm label, please assign dkwon17 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Allda force-pushed the 23570 branch 2 times, most recently from 42dd45c to dffd7e6 on October 17, 2025 at 11:06
@rohanKanojia (Member) commented:

@Allda: Really appreciate you taking the time to contribute this in such a short time. 🎉

Could you please also fill out the “Is it tested? How?” section in the PR template? It’ll help reviewers and future contributors verify the change more easily.

Thanks again for your effort! 🙌

@rohanKanojia (Member) commented:

I tested this PR and it seems to work.

  1. Created a DevWorkspaceOperatorConfig with this BackupCronJobConfig (backup every 3 minutes):
config:
  workspace:
    backupCronJob:
      enable: true
      schedule: "*/3 * * * *"
  2. Created a DevWorkspace and waited for it to become running
  3. Stopped the workspace
  4. The controller detected the stopped workspace and started creating backup jobs:
NAME               STATUS    COMPLETIONS   DURATION   AGE
backup-job-8tnsp   Running   0/1                      0s
backup-job-8tnsp   Running   0/1           0s         0s
backup-job-8tnsp   Running   0/1           16s        16s
backup-job-8tnsp   Running   0/1           17s        17s
backup-job-8tnsp   Running   0/1           18s        18s
backup-job-8tnsp   Complete   1/1           18s        18s
backup-job-kc8rm   Running    0/1                      0s
backup-job-kc8rm   Running    0/1           0s         0s
backup-job-kc8rm   Running    0/1           6s         6s
backup-job-kc8rm   Running    0/1           7s         7s
backup-job-kc8rm   Running    0/1           8s         8s
backup-job-kc8rm   Complete   1/1           8s         8s

@Allda force-pushed the 23570 branch 3 times, most recently from 0bc74b1 to 8427ba5 on October 29, 2025 at 10:24
@Allda (Collaborator, Author) commented Oct 29, 2025

/retest

codecov bot commented Nov 3, 2025

Codecov Report

❌ Patch coverage is 64.13043% with 165 lines in your changes missing coverage. Please review.
✅ Project coverage is 35.30%. Comparing base (d92e750) to head (2679783).
⚠️ Report is 16 commits behind head on main.

Files with missing lines Patch % Lines
...trollers/backupcronjob/backupcronjob_controller.go 71.95% 87 Missing and 19 partials ⚠️
apis/controller/v1alpha1/zz_generated.deepcopy.go 0.00% 43 Missing ⚠️
main.go 0.00% 9 Missing ⚠️
internal/images/image.go 0.00% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1530      +/-   ##
==========================================
+ Coverage   34.09%   35.30%   +1.21%     
==========================================
  Files         160      161       +1     
  Lines       13348    13802     +454     
==========================================
+ Hits         4551     4873     +322     
- Misses       8487     8599     +112     
- Partials      310      330      +20     

☔ View full report in Codecov by Sentry.

@ibuziuk (Contributor) left a comment:

@Allda great job!
I discussed the overall PR with @dkwon17 and I believe we should target it to be merged in the DWO 0.39.0 release.

@rohanKanojia (Member) commented Nov 25, 2025

@Allda: Thank you! I can confirm that this approach works without any explicit configuration.

I can see the backup image being pushed to the configured registry. However, I see the image has a different format, application/vnd.oci.empty.v1+json. I guess this is due to using oras, right?

I tested on CRC with OpenShift 4.20.1.

@Allda (Collaborator, Author) commented Nov 27, 2025

@Allda: Thank you! I can confirm that this approach works without any explicit configuration.

I can see the backup image being pushed to the configured registry. However, I see the image has a different format, application/vnd.oci.empty.v1+json. I guess this is due to using oras, right?

I tested on CRC with OpenShift 4.20.1.

Yes, an oras artifact is not a "real" image, so it has a different config. The images produced by oras should have the following manifest:

{
  "schemaVersion": 2,
  "mediaType": "application/vnd.oci.image.manifest.v1+json",
  "artifactType": "application/vnd.devworkspace.backup.artifact.v1+json",
  "config": {
    "mediaType": "application/vnd.oci.empty.v1+json",
    "digest": "sha256:44136fa355b3678a1146ad16f7e8649e94fb4fc21fe77e8310c060f61caaff8a",
    "size": 2,
    "data": "e30="
  },
  "layers": [
    {
      "mediaType": "application/vnd.oci.image.layer.v1.tar",
      "digest": "sha256:95ad8b1b94796dca816be47952b21d8a3f568345c72dba58836ab80a2bd7a433",
      "size": 454760,
      "annotations": {
        "org.opencontainers.image.title": "devworkspace-backup.tar.gz"
      }
    }
  ],
  "annotations": {
    "devworkspace.name": "per-workspace-pvc-test",
    "devworkspace.namespace": "araszka",
    "org.opencontainers.image.created": "2025-11-25T09:27:07Z"
  }
}

@dkwon17 (Collaborator) commented Dec 2, 2025

@tolusha for some reason GitHub is not letting me comment on your comment: #1530 (comment)

IMHO I would prefer if we didn't add an annotation to the DevWorkspaces to avoid potentially sending a high number of requests to the apiserver

- Moving extraArgs to Oras config section
- Unify default values
- Change UBI base image
- Use constant for the PVC name

Signed-off-by: Ales Raszka <[email protected]>
Allda added 5 commits December 4, 2025 10:51
Instead of using a global secret for the whole cluster, the controller searches
for a namespace-specific secret and uses it if available. If not found, it
falls back to the global secret.

Signed-off-by: Ales Raszka <[email protected]>
If the user uses the internal OCP registry, the image repository path
aligns with the OCP namespace and an image stream named after the
workspace is created.

Signed-off-by: Ales Raszka <[email protected]>
The internal OCP registry is supported by default without the need to
provide any registry auth secret. The backup image is pushed to the
same namespace where the workspace is running. The token is
auto-generated and mounted from the SA definition.

Signed-off-by: Ales Raszka <[email protected]>
@Allda (Collaborator, Author) commented Dec 4, 2025

/retest

@Allda (Collaborator, Author) commented Dec 4, 2025

With this comment I would like to give you an overview of the feature, how to set it up, and the results. The current implementation focused on the priority cases: the built-in OCP registry and any other OCI-compliant registry.

Let's start with the default OCP registry. For this variant the user doesn't need to provide any auth secrets.

apiVersion: controller.devfile.io/v1alpha1                                      
config:                                                                         
  routing:                                                                      
    defaultRoutingClass: basic                                                  
  workspace:                                                                    
    backupCronJob:                                                              
      enable: true                                                              
      registry:                                                                 
        path: default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com
      schedule: '* * * * *'                                                     
    imagePullPolicy: Always                                                     
kind: DevWorkspaceOperatorConfig
# create namespace, deploy workspace and stop it
kubectl create namespace demo-backup

kubectl apply -f controllers/workspace/testdata/per-workspace-pvc-test-devworkspace.yaml -n demo-backup

kubectl patch devworkspace per-workspace-pvc-test --type=merge -p '{"spec": {"started": false}}' -n demo-backup

On the next cronjob iteration the workspace will be backed up by a Job.

k get jobs                           
NAME                        STATUS     COMPLETIONS   DURATION   AGE
devworkspace-backup-6n4f9   Complete   1/1           11s        26m

Logs from the Job:


Backing up devworkspace 'per-workspace-pvc-test' in namespace 'demo-backup' to image 'default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com/demo-backup/per-workspace-pvc-test:latest'
Using mounted service account token for registry authentication
Logging in to OpenShift internal registry 'default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com' using service account token
Login Succeeded
Exists 0128acec2099 devworkspace-backup.tar.gz
Pushed [registry] default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com/demo-backup/per-workspace-pvc-test:latest
ArtifactType: application/vnd.devworkspace.backup.artifact.v1+json
Digest: sha256:d3d90974855e9d12f8791a87690ed55b05e885ad25eb3b03f31859b44c8acd12
Backup completed successfully.

The image is then available in the internal registry:

oc get is             
NAME                     IMAGE REPOSITORY                                                                                                           TAGS     UPDATED
per-workspace-pvc-test   default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com/demo-backup/per-workspace-pvc-test   latest   4 minutes ago

The second option is to use any generic OCI-compliant registry with custom authentication. Here is an example config using a custom quay.io registry.

apiVersion: controller.devfile.io/v1alpha1                                      
config:                                                                         
  routing:                                                                      
    defaultRoutingClass: basic                                                  
  workspace:                                                                    
    backupCronJob:                                                              
      enable: true                                                              
      registry:                                                                 
        authSecret: my-secret                                                   
        path: quay.io/araszka                                                   
      schedule: '* * * * *'                                                     
    imagePullPolicy: Always  

The user needs to provide an access token secret in either the workspace namespace or the operator namespace.

kubectl create secret docker-registry my-secret --from-file=config.json -n demo-backup

Here is the log of the backup job:

Backing up devworkspace 'per-workspace-pvc-test' in namespace 'demo-backup' to image 'quay.io/araszka/demo-backup/per-workspace-pvc-test:latest'
Uploading 0128acec2099 devworkspace-backup.tar.gz
Uploaded 0128acec2099 devworkspace-backup.tar.gz
Pushed [registry] quay.io/araszka/demo-backup/per-workspace-pvc-test:latest
ArtifactType: application/vnd.devworkspace.backup.artifact.v1+json
Digest: sha256:bfa036e01ebfefb390b7dfa96a50dc87b2caab469c8333ec6863cdd0941d2175
Backup completed successfully.

@ibuziuk (Contributor) commented Dec 4, 2025

@Allda could you please clarify if the backup is expected to work with both per-user & per-workspace PVC strategies? Also, please consider contributing documentation (could be a separate PR) - https://github.com/devfile/devworkspace-operator/blob/main/docs/dwo-configuration.md

@Allda (Collaborator, Author) commented Dec 5, 2025

@Allda could you please clarify if the backup is expected to work with both per-user & per-workspace PVC strategies? Also, please consider contributing documentation (could be a separate PR) - https://github.com/devfile/devworkspace-operator/blob/main/docs/dwo-configuration.md

Yes, the controller supports both types of volumes and is based on the volume provisioner logic that was already available in the operator.

I'll create a separate PR with documentation.

@rohanKanojia (Member) commented:

@Allda: Thanks for providing the steps. I followed these steps on CRC (2.57.0 / OpenShift 4.20.5).

External Registry Backup Scenario

  • ✅ Works with the external quay.io registry
  • ❌ Doesn't work when I specified the DockerHub registry (it seems DockerHub doesn't support pushing to docker.io/user/namespace/image:tag-style URLs)

DWOC Configuration:

  config:
    workspace:
      backupCronJob:
        enable: true
        registry:
          authSecret: dockerhub-push-secret
          path: docker.io/rohankanojia
        schedule: '* * * * *'

When I stop the DevWorkspace, I see the job pods in Error state:

oc get pods
NAME                              READY   STATUS   RESTARTS   AGE
devworkspace-backup-sx7xp-bqbr7   0/1     Error    0          2m26s
devworkspace-backup-sx7xp-hvk28   0/1     Error    0          2m52s
devworkspace-backup-sx7xp-hwwkx   0/1     Error    0          3m8s
devworkspace-backup-sx7xp-m7m2d   0/1     Error    0          14s
devworkspace-backup-sx7xp-pv2v5   0/1     Error    0          100s

When inspecting individual pod logs, it seems it's trying to push to the wrong URL (maybe DockerHub doesn't support it):

oc logs pod/devworkspace-backup-sx7xp-bqbr7
+ set -e
+ exec /workspace-recovery.sh --backup
+ : docker.io/rohankanojia
+ : rokumar-dev
+ : code-latest
+ : /workspace/workspace68618d70645f44b4/projects
+ BACKUP_IMAGE=docker.io/rohankanojia/rokumar-dev/code-latest:latest
+ echo

+ [[ 1 -eq 0 ]]
+ for arg in "$@"
+ case "$arg" in
+ backup
+ TARBALL_NAME=devworkspace-backup.tar.gz
+ cd /tmp
Backing up devworkspace 'code-latest' in namespace 'rokumar-dev' to image 'docker.io/rohankanojia/rokumar-dev/code-latest:latest'
+ echo 'Backing up devworkspace '\''code-latest'\'' in namespace '\''rokumar-dev'\'' to image '\''docker.io/rohankanojia/rokumar-dev/code-latest:latest'\'''
+ tar -czvf devworkspace-backup.tar.gz -C /workspace/workspace68618d70645f44b4/projects .
./
./web-nodejs-sample/
./web-nodejs-sample/.git/
./web-nodejs-sample/.git/branches/
./web-nodejs-sample/.git/description
./web-nodejs-sample/.git/hooks/
./web-nodejs-sample/.git/hooks/applypatch-msg.sample
./web-nodejs-sample/.git/hooks/commit-msg.sample
./web-nodejs-sample/.git/hooks/fsmonitor-watchman.sample
./web-nodejs-sample/.git/hooks/post-update.sample
./web-nodejs-sample/.git/hooks/pre-applypatch.sample
./web-nodejs-sample/.git/hooks/pre-commit.sample
./web-nodejs-sample/.git/hooks/pre-merge-commit.sample
./web-nodejs-sample/.git/hooks/pre-push.sample
./web-nodejs-sample/.git/hooks/pre-rebase.sample
./web-nodejs-sample/.git/hooks/pre-receive.sample
./web-nodejs-sample/.git/hooks/prepare-commit-msg.sample
./web-nodejs-sample/.git/hooks/push-to-checkout.sample
./web-nodejs-sample/.git/hooks/sendemail-validate.sample
./web-nodejs-sample/.git/hooks/update.sample
./web-nodejs-sample/.git/info/
./web-nodejs-sample/.git/info/exclude
./web-nodejs-sample/.git/config
./web-nodejs-sample/.git/objects/
./web-nodejs-sample/.git/objects/pack/
./web-nodejs-sample/.git/objects/pack/pack-00c9cb529f78a543deaf778dfaffcb012aef3c11.pack
./web-nodejs-sample/.git/objects/pack/pack-00c9cb529f78a543deaf778dfaffcb012aef3c11.rev
./web-nodejs-sample/.git/objects/pack/pack-00c9cb529f78a543deaf778dfaffcb012aef3c11.idx
./web-nodejs-sample/.git/objects/info/
./web-nodejs-sample/.git/HEAD
./web-nodejs-sample/.git/refs/
./web-nodejs-sample/.git/refs/heads/
./web-nodejs-sample/.git/refs/heads/main
./web-nodejs-sample/.git/refs/tags/
./web-nodejs-sample/.git/refs/remotes/
./web-nodejs-sample/.git/refs/remotes/origin/
./web-nodejs-sample/.git/refs/remotes/origin/HEAD
./web-nodejs-sample/.git/packed-refs
./web-nodejs-sample/.git/logs/
./web-nodejs-sample/.git/logs/refs/
./web-nodejs-sample/.git/logs/refs/remotes/
./web-nodejs-sample/.git/logs/refs/remotes/origin/
./web-nodejs-sample/.git/logs/refs/remotes/origin/HEAD
./web-nodejs-sample/.git/logs/refs/heads/
./web-nodejs-sample/.git/logs/refs/heads/main
./web-nodejs-sample/.git/logs/HEAD
./web-nodejs-sample/.git/index
./web-nodejs-sample/.git/FETCH_HEAD
./web-nodejs-sample/.gitattributes
./web-nodejs-sample/.github/
./web-nodejs-sample/.github/CODEOWNERS
./web-nodejs-sample/.gitignore
./web-nodejs-sample/.vscode/
./web-nodejs-sample/.vscode/launch.json
./web-nodejs-sample/LICENSE
./web-nodejs-sample/README.md
./web-nodejs-sample/app/
./web-nodejs-sample/app/app.js
./web-nodejs-sample/devfile.yaml
./web-nodejs-sample/package-lock.json
./web-nodejs-sample/package.json
./.code-workspace
+ oras_args=(push "$BACKUP_IMAGE" --artifact-type application/vnd.devworkspace.backup.artifact.v1+json --annotation devworkspace.name="$DEVWORKSPACE_NAME" --annotation devworkspace.namespace="$DEVWORKSPACE_NAMESPACE" --disable-path-validation)
+ [[ -n /tmp/.docker/.dockerconfigjson ]]
+ oras_args+=(--registry-config "$REGISTRY_AUTH_FILE")
+ [[ -n '' ]]
+ oras_args+=("$TARBALL_NAME")
+ oras push docker.io/rohankanojia/rokumar-dev/code-latest:latest --artifact-type application/vnd.devworkspace.backup.artifact.v1+json --annotation devworkspace.name=code-latest --annotation devworkspace.namespace=rokumar-dev --disable-path-validation --registry-config /tmp/.docker/.dockerconfigjson devworkspace-backup.tar.gz
Error response from registry: recognizable error message not found: HEAD "https://registry-1.docker.io/v2/rohankanojia/rokumar-dev/code-latest/manifests/sha256:31fe9a87280cfc81c416b86719385a65aace4dbb761015544295397a9f3001dd": response status code 401: Unauthorized

OpenShift Internal Registry Backup Scenario

❌ I wasn't able to get it working on CRC

When I stop the DevWorkspace, I see the job pods in Error state:

NAME                              READY   STATUS   RESTARTS   AGE
devworkspace-backup-tj4sj-dcsff   0/1     Error    0          107s
devworkspace-backup-tj4sj-f5tzb   0/1     Error    0          68s
devworkspace-backup-tj4sj-s2zvl   0/1     Error    0          92s
devworkspace-backup-tj4sj-t46f6   0/1     Error    0          24s

I see the error below in the job pod logs; it seems oras doesn't trust the certificates during login:

oc logs pod/devworkspace-backup-tj4sj-dcsff
+ set -e
+ exec /workspace-recovery.sh --backup
+ : default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com
+ : rokumar-dev
+ : code-latest
+ : /workspace/workspace1ec1a03646d04952/projects
+ BACKUP_IMAGE=default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com/rokumar-dev/code-latest:latest

+ echo
+ [[ 1 -eq 0 ]]
+ for arg in "$@"
+ case "$arg" in
+ backup
+ TARBALL_NAME=devworkspace-backup.tar.gz
+ cd /tmp
Backing up devworkspace 'code-latest' in namespace 'rokumar-dev' to image 'default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com/rokumar-dev/code-latest:latest'
+ echo 'Backing up devworkspace '\''code-latest'\'' in namespace '\''rokumar-dev'\'' to image '\''default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com/rokumar-dev/code-latest:latest'\'''
+ tar -czvf devworkspace-backup.tar.gz -C /workspace/workspace1ec1a03646d04952/projects .
./
./web-nodejs-sample/
./web-nodejs-sample/.git/
./web-nodejs-sample/.git/branches/
./web-nodejs-sample/.git/description
./web-nodejs-sample/.git/hooks/
./web-nodejs-sample/.git/hooks/applypatch-msg.sample
./web-nodejs-sample/.git/hooks/commit-msg.sample
./web-nodejs-sample/.git/hooks/fsmonitor-watchman.sample
./web-nodejs-sample/.git/hooks/post-update.sample
./web-nodejs-sample/.git/hooks/pre-applypatch.sample
./web-nodejs-sample/.git/hooks/pre-commit.sample
./web-nodejs-sample/.git/hooks/pre-merge-commit.sample
./web-nodejs-sample/.git/hooks/pre-push.sample
./web-nodejs-sample/.git/hooks/pre-rebase.sample
./web-nodejs-sample/.git/hooks/pre-receive.sample
./web-nodejs-sample/.git/hooks/prepare-commit-msg.sample
./web-nodejs-sample/.git/hooks/push-to-checkout.sample
./web-nodejs-sample/.git/hooks/sendemail-validate.sample
./web-nodejs-sample/.git/hooks/update.sample
./web-nodejs-sample/.git/info/
./web-nodejs-sample/.git/info/exclude
./web-nodejs-sample/.git/config
./web-nodejs-sample/.git/objects/
./web-nodejs-sample/.git/objects/pack/
./web-nodejs-sample/.git/objects/pack/pack-00c9cb529f78a543deaf778dfaffcb012aef3c11.pack
./web-nodejs-sample/.git/objects/pack/pack-00c9cb529f78a543deaf778dfaffcb012aef3c11.rev
./web-nodejs-sample/.git/objects/pack/pack-00c9cb529f78a543deaf778dfaffcb012aef3c11.idx
./web-nodejs-sample/.git/objects/info/
./web-nodejs-sample/.git/HEAD
./web-nodejs-sample/.git/refs/
./web-nodejs-sample/.git/refs/heads/
./web-nodejs-sample/.git/refs/heads/main
./web-nodejs-sample/.git/refs/tags/
./web-nodejs-sample/.git/refs/remotes/
./web-nodejs-sample/.git/refs/remotes/origin/
./web-nodejs-sample/.git/refs/remotes/origin/HEAD
./web-nodejs-sample/.git/packed-refs
./web-nodejs-sample/.git/logs/
./web-nodejs-sample/.git/logs/refs/
./web-nodejs-sample/.git/logs/refs/remotes/
./web-nodejs-sample/.git/logs/refs/remotes/origin/
./web-nodejs-sample/.git/logs/refs/remotes/origin/HEAD
./web-nodejs-sample/.git/logs/refs/heads/
./web-nodejs-sample/.git/logs/refs/heads/main
./web-nodejs-sample/.git/logs/HEAD
./web-nodejs-sample/.git/index
./web-nodejs-sample/.git/FETCH_HEAD
./web-nodejs-sample/.gitattributes
./web-nodejs-sample/.github/
./web-nodejs-sample/.github/CODEOWNERS
./web-nodejs-sample/.gitignore
./web-nodejs-sample/.vscode/
./web-nodejs-sample/.vscode/launch.json
./web-nodejs-sample/LICENSE
./web-nodejs-sample/README.md
./web-nodejs-sample/app/
./web-nodejs-sample/app/app.js
./web-nodejs-sample/devfile.yaml
./web-nodejs-sample/package-lock.json
./web-nodejs-sample/package.json
./.code-workspace
+ oras_args=(push "$BACKUP_IMAGE" --artifact-type application/vnd.devworkspace.backup.artifact.v1+json --annotation devworkspace.name="$DEVWORKSPACE_NAME" --annotation devworkspace.namespace="$DEVWORKSPACE_NAMESPACE" --disable-path-validation)
+ [[ -n '' ]]
+ [[ -f /var/run/secrets/kubernetes.io/serviceaccount/token ]]
+ echo 'Using mounted service account token for registry authentication'
Using mounted service account token for registry authentication
++ cat /var/run/secrets/kubernetes.io/serviceaccount/token
+ TOKEN=eyJhbGciOiJSUzI1NiIsImtpZCI6Il9sRnl2SmFhSGlGTjllSmVmT3NqX3lYT0ZqZTBvR0doT0ZRTFptSU5sZm8ifQ.eyJhdWQiOlsiaHR0cHM6Ly9rdWJlcm5ldGVzLmRlZmF1bHQuc3ZjIl0sImV4cCI6MTc5NjQ3MzM4MCwiaWF0IjoxNzY0OTM3MzgwLCJpc3MiOiJodHRwczovL2t1YmVybmV0ZXMuZGVmYXVsdC5zdmMiLCJqdGkiOiJkMjQ1NDliMS02ZDAzLTRhMjQtOWI0ZC1hMmZhNzE4MGViODciLCJrdWJlcm5ldGVzLmlvIjp7Im5hbWVzcGFjZSI6InJva3VtYXItZGV2Iiwibm9kZSI6eyJuYW1lIjoiY3JjIiwidWlkIjoiNzYyOWMxYzMtZDUyOS00ZTU0LTg2ZmQtNzIzYTFjZmNiZGZjIn0sInBvZCI6eyJuYW1lIjoiZGV2d29ya3NwYWNlLWJhY2t1cC10ajRzai1kY3NmZiIsInVpZCI6IjViYTlkZmRmLWE0ZTYtNDEyYS05NTgxLTY3MjM3OTJmYjJlZCJ9LCJzZXJ2aWNlYWNjb3VudCI6eyJuYW1lIjoiZGV2d29ya3NwYWNlLWpvYi1ydW5uZXItd29ya3NwYWNlMWVjMWEwMzY0NmQwNDk1MiIsInVpZCI6IjI4ZjE5ZTUyLTc2ZTItNDMxNS1hYjU0LTcwMzk3ZGNkN2MyYiJ9LCJ3YXJuYWZ0ZXIiOjE3NjQ5NDA5ODd9LCJuYmYiOjE3NjQ5MzczODAsInN1YiI6InN5c3RlbTpzZXJ2aWNlYWNjb3VudDpyb2t1bWFyLWRldjpkZXZ3b3Jrc3BhY2Utam9iLXJ1bm5lci13b3Jrc3BhY2UxZWMxYTAzNjQ2ZDA0OTUyIn0.LCqMI4Bw-HhXiUhfxjbIDwFTm8dxOEvCa7uPlbUtN3Z2ANQ7eaVEOSelfhZ_kGxUcM3yaylcq3PdQgYsQueYOZRtOTviODYMt-yx8wl7ijKuD1IJZFwsyHUWjf1IBP9kWqboyKo5eqc4sBnQPYJklO_IEC3FgFhcfgmv7QfCacc6H25_NWALQ6AjCJgCn88gH7V6mvJSgwGMm8BiO6ptYUq5wug7LFq8kE1Me3-KXX8D8jYYUTpLQYpQ_Gt2hyOOVpuGG16gXES40tSX76AHet9rBKSHZ8vjuLVbKctTK2zrCWlBYNU--_U1vLBFgYVijV2JOCPrYKvS7SQKdX7pV3F4VbS-D65IYeqEnDzVrycdeFuChqZG4JlF8OHCV-JpDozx3PmDyIX-fmB4cKB7TpDLRGAuHRbedLo3wb1E_Omkyr3UrzmRGAnklcjUGV69Rg1Ha6_z0JFV0rX74PLZpZPgM1fIfn9DqZUiCMwzQTvpnM8Smg7Rrsl1l0xbToaaqL5HqDpfnHhLxFQD5kWxT15JZ1Fj1hzdkcd-0AKBaeJzBs72ekEYyjzgCtI8Cd_ef2wkahcuUkwT3NaD1MsVj4a12iPyJRIuiFAGdCAwFeGlqW_ryDYmgydaGILbrG-0tbund2afqBlSaNud-BEREX38ekYKO8DsIU6wA9fP6R8
++ echo default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com/rokumar-dev/code-latest:latest
++ cut -d/ -f1
+ REGISTRY_HOST=default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com
+ REGISTRY_AUTH_FILE=/tmp/registry_auth.json
+ [[ default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com == *\o\p\e\n\s\h\i\f\t* ]]
+ [[ -f /var/run/secrets/kubernetes.io/serviceaccount/ca.crt ]]
+ oras login --password-stdin --ca-file /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -u serviceaccount --registry-config /tmp/registry_auth.json default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com
Error: failed to validate the credentials for default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com: Get "https://default-route-openshift-image-registry.apps.pipelines-stage.0ce8.p1.openshiftapps.com/v2/": tls: failed to verify certificate: x509: certificate signed by unknown authority

I can see the ImageStream gets created, but the tag is empty:

oc get is
NAME          IMAGE REPOSITORY                                                                  TAGS   UPDATED
code-latest   default-route-openshift-image-registry.apps-crc.testing/rokumar-dev/code-latest 

sa := &corev1.ServiceAccount{
	ObjectMeta: metav1.ObjectMeta{Name: saName, Namespace: workspace.Namespace, Labels: map[string]string{
		constants.DevWorkspaceIDLabel:          workspace.Status.DevWorkspaceId,
		constants.DevWorkspaceWatchSecretLabel: "true",
Review comment (Contributor):

Suggested change
constants.DevWorkspaceWatchSecretLabel: "true",

return err
}

if _, err := controllerutil.CreateOrUpdate(ctx, r.Client, sa, func() error { return nil }); err != nil {
Review comment (Contributor):

Please use SyncObjectWithCluster [1]. NotInSyncError is ok, it is always thrown on first creation [2]

[1]

func SyncObjectWithCluster(specObj crclient.Object, api ClusterAPI) (crclient.Object, error) {

[2]
func createObjectGeneric(specObj crclient.Object, api ClusterAPI) error {
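
A hedged sketch of the suggested pattern, based only on the signatures quoted above; the exact package path, ClusterAPI wiring, and error shape may differ in this repository:

// Sketch only: replace CreateOrUpdate with SyncObjectWithCluster and tolerate
// the not-in-sync error that is expected on first creation.
if _, err := sync.SyncObjectWithCluster(sa, clusterAPI); err != nil {
	switch err.(type) {
	case *sync.NotInSyncError:
		// Expected on the first pass; the object converges on a later reconcile.
	default:
		return err
	}
}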

log.Info("Backup Job created for DevWorkspace", "id", dwID)

}
origConfig := client.MergeFrom(dwOperatorConfig.DeepCopy())
Review comment (Contributor):

Could you explain why we need to make a deep copy?
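
For context on the idiom (general controller-runtime behavior, not this PR's specifics): client.MergeFrom keeps a reference to the object you hand it and computes the patch as the diff against that object at Patch time, so the base must be a snapshot taken before the mutation, hence the DeepCopy. A minimal sketch:

// General controller-runtime pattern; dwOperatorConfig stands in for any API object.
origConfig := client.MergeFrom(dwOperatorConfig.DeepCopy()) // snapshot the pre-mutation state
dwOperatorConfig.Status.LastBackupTime = &metav1.Time{Time: metav1.Now().Time}
// The patch sent to the apiserver is the diff between the snapshot and the mutated object;
// without the DeepCopy both sides would be the same object and the computed diff would be empty.
if err := r.Client.Status().Patch(ctx, dwOperatorConfig, origConfig); err != nil {
	return err
}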

}
dwOperatorConfig.Status.LastBackupTime = &metav1.Time{Time: metav1.Now().Time}

err = r.NonCachingClient.Status().Patch(ctx, dwOperatorConfig, origConfig)
Review comment (Contributor):

I think we don't need to use a NonCachingClient for patching DWOC


// wasStoppedSinceLastBackup checks if the DevWorkspace was stopped since the last backup time.
func (r *BackupCronJobReconciler) wasStoppedSinceLastBackup(workspace *dw.DevWorkspace, lastBackupTime *metav1.Time, log logr.Logger) bool {
if workspace.Status.Phase != dw.DevWorkspaceStatusStopped {
Review comment (Contributor):

@dkwon17 @Allda
Could you please confirm that we back up only stopped workspaces?

Review comment (Collaborator):

Yes, backup should only be for stopped workspaces

if workspace.Status.Phase != dw.DevWorkspaceStatusStopped {
return false
}
log.Info("DevWorkspace is currently stopped, checking if it was stopped since last backup", "namespace", workspace.Namespace, "name", workspace.Name)
Review comment (Contributor):

I am a bit concerned about the extra logging.
It will look like spam in the case of hundreds of workspaces.

Review comment (Collaborator):

Good point, if hundreds of workspaces are started and stopped in a day, there can be thousands of backup-related log lines out of the box. For now, I think we should set https://github.com/Allda/devworkspace-operator/blob/6af5fb855a0512cff8f90c9a7d128c17269eb587/main.go#L195 to:

Log:              ctrl.Log.WithName("controllers").WithName("BackupCronJob").V(1)

There's no built-in functionality for changing the log level for DWO: https://issues.redhat.com/browse/WTO-296; maybe it can be set manually for now (either in the operand's deployment or in the CSV) as an env var.
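
Roughly, the suggestion above would look like this where the reconciler is wired up in main.go; the struct fields other than Log are illustrative, not necessarily the ones in this PR:

// Sketch only: wire the BackupCronJob reconciler with a V(1) logger so its Info
// messages are emitted only when the operator runs at verbosity >= 1.
if err := (&backupcronjob.BackupCronJobReconciler{
	Client: mgr.GetClient(),
	Scheme: mgr.GetScheme(),
	Log:    ctrl.Log.WithName("controllers").WithName("BackupCronJob").V(1),
}).SetupWithManager(mgr); err != nil {
	setupLog.Error(err, "unable to create controller", "controller", "BackupCronJob")
	os.Exit(1)
}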

// in the namespace where the operator is running.
// as the DevWorkspaceOperatorConfig.
// +kubebuilder:validation:Optional
AuthSecret string `json:"authSecret,omitempty"`
Review comment (Contributor):

@dkwon17
We can add a requirement for all secrets to have controller.devfile.io/watch-secret=true; in that case there is no need to use the non-caching client.

@dkwon17 (Collaborator) commented Dec 6, 2025:

+1 to do that,

@Allda could we add this line, or something similar in the description (line 86)?

The secret must have the controller.devfile.io/watch-secret=true label set.

and use the regular client in the handleRegistryAuthSecret function?
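
If that requirement is adopted, a hedged sketch of what the lookup in handleRegistryAuthSecret could look like with the cache-backed client; the names authSecretName and operatorNamespace are illustrative, not from the PR:

// Sketch only: assumes the auth secret carries controller.devfile.io/watch-secret=true
// so it is present in the informer cache used by the regular client.
secret := &corev1.Secret{}
key := types.NamespacedName{Namespace: workspace.Namespace, Name: authSecretName}
if err := r.Client.Get(ctx, key, secret); err != nil {
	if !k8serrors.IsNotFound(err) {
		return nil, err
	}
	// Fall back to the operator namespace, as described earlier in the thread.
	key.Namespace = operatorNamespace
	if err := r.Client.Get(ctx, key, secret); err != nil {
		return nil, err
	}
}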

// to the OpenShift internal registry.
func (r *BackupCronJobReconciler) ensureImagePushRoleBinding(ctx context.Context, saName string, workspace *dw.DevWorkspace) error {
// Create ClusterRoleBinding for system:image-builder role
clusterRoleBinding := &rbacv1.ClusterRoleBinding{
Review comment (Contributor):

Could you confirm that a RoleBinding doesn't suffice and that we specifically need a ClusterRoleBinding?
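
For reference, a namespaced RoleBinding can reference the system:image-builder ClusterRole, which grants push access only within that namespace; a hedged sketch with illustrative names:

// Sketch only: scope image-push rights to the workspace namespace instead of cluster-wide.
roleBinding := &rbacv1.RoleBinding{
	ObjectMeta: metav1.ObjectMeta{
		Name:      saName + "-image-builder",
		Namespace: workspace.Namespace,
	},
	Subjects: []rbacv1.Subject{{
		Kind:      rbacv1.ServiceAccountKind,
		Name:      saName,
		Namespace: workspace.Namespace,
	}},
	RoleRef: rbacv1.RoleRef{
		APIGroup: rbacv1.GroupName,
		Kind:     "ClusterRole",
		Name:     "system:image-builder",
	},
}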

},
}

if _, err := controllerutil.CreateOrUpdate(ctx, r.Client, clusterRoleBinding, func() error { return nil }); err != nil {
Review comment (Contributor):

Please use SyncObjectWithCluster

@dkwon17 (Collaborator) commented Dec 5, 2025

@rohanKanojia for the error you're facing, I was able to work around it by having this in my config:

config:
  workspace:
    backupCronJob:
      oras:
        extraArgs: '--insecure'

Spec: corev1.PodSpec{
ServiceAccountName: JobRunnerSAName + "-" + workspace.Status.DevWorkspaceId,
RestartPolicy: corev1.RestartPolicyNever,
Containers: []corev1.Container{
@dkwon17 (Collaborator) commented Dec 6, 2025:

For the backup job, I noticed that if there's a failure, the Job is retried a maximum of 6 times, potentially leaving many pods. For example, here are the pods in my namespace from testing this feature while trying to back up a workspace (not all at once; I tried multiple times throughout the day):

(screenshot: leftover failed backup job pods in the namespace)

For the cronjob implementation, is it possible to set successfulJobsHistoryLimit to 0 and failedJobsHistoryLimit to only 1?
https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#jobs-history-limits

I would like to limit the number of pods because I am a bit concerned we might overwhelm etcd if we have thousands of devworkspaces and many failed pods remaining on the cluster. cc @ibuziuk
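
If a Kubernetes CronJob ends up backing this (or for capping retries on the Jobs themselves via backoffLimit), a hedged sketch of those limits; schedule and backupPodTemplate are illustrative placeholders, and the values are only the ones suggested above:

// Sketch only: keep no successful jobs, at most one failed job, and cap Job retries
// (the default backoffLimit is 6, which is what produces the leftover pods above).
zero, one := int32(0), int32(1)
cronJob := &batchv1.CronJob{
	Spec: batchv1.CronJobSpec{
		Schedule:                   schedule,
		SuccessfulJobsHistoryLimit: &zero,
		FailedJobsHistoryLimit:     &one,
		JobTemplate: batchv1.JobTemplateSpec{
			Spec: batchv1.JobSpec{
				BackoffLimit: &one,
				Template:     backupPodTemplate,
			},
		},
	},
}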
