diff --git a/kubernetes/README-ZH.md b/kubernetes/README-ZH.md index 0ac8fc0f4..f185d4dea 100644 --- a/kubernetes/README-ZH.md +++ b/kubernetes/README-ZH.md @@ -25,22 +25,29 @@ BatchSandbox 自定义资源允许您创建和管理多个相同的沙箱环境 ### 资源池化 Pool 自定义资源维护一个预热的计算资源池,以实现快速沙箱供应: -- 可配置的缓冲区大小(最小和最大)以平衡资源可用性和成本 -- 池容量限制以控制总体资源消耗 -- 基于需求的自动资源分配和释放 -- 实时状态监控,显示总数、已分配和可用资源 +- **可配置的缓冲区大小**:设置最小和最大缓冲区,以确保资源可用性同时控制成本。 +- **池容量限制**:通过池范围的最小和最大限制来控制总体资源消耗。 +- **回收策略 (Recycle Policies)**:支持不同的 Pod 回收策略: + - **Delete (默认)**:Pod 在返回池时会被删除并根据模板重新创建,确保环境绝对纯净。 + - **Restart**:通过向所有容器的 PID 1 发送 SIGTERM 信号优雅终止进程,并依赖 Kubernetes 的 `restartPolicy` 触发重启。这种方式比 `Delete` 更快,但要求 `PodTemplateSpec` 中的 `restartPolicy` 设置为 `Always`。可通过 annotation `pool.opensandbox.io/recycle-timeout-sec` 自定义重启超时时间(默认 90 秒)。 +- **自动扩展**:基于当前需求和缓冲区设置进行动态资源分配和释放。 +- **实时状态监控**:显示总数、已分配、可用以及正在重启中的 Pod 数量。 ### 任务编排 集成的任务管理系统,在沙箱内执行自定义工作负载: -- **可选执行**:任务调度完全可选 - 可以在不带任务的情况下创建沙箱 -- **基于进程的任务**:支持在沙箱环境中执行基于进程的任务 -- **异构任务分发**:使用 shardTaskPatches 为批处理中的每个沙箱定制单独的任务 +- **可选执行**:任务调度完全可选 - 可以在不带任务的情况下创建沙箱。 +- **基于进程的任务**:支持在沙箱环境中执行基于进程的任务。 +- **异构任务分发**:使用 `shardTaskPatches` 为批处理中的每个沙箱定制单独的任务。 +- **资源释放策略**:通过 `taskResourcePolicyWhenCompleted` 控制任务完成后资源何时返回池: + - **Retain (默认)**:保持沙箱资源,直到 `BatchSandbox` 被删除或过期。 + - **Release**:在任务达到终态(SUCCEEDED 或 FAILED)后,立即自动将沙箱释放回资源池。 ### 高级调度 智能资源管理功能: -- 最小和最大缓冲区设置,以确保资源可用性同时控制成本 -- 池范围的容量限制,防止资源耗尽 -- 基于需求的自动扩展 +- **基于需求的自动扩展**:根据实时的沙箱分配请求,自动扩展和收缩资源池中的 Pod 数量。 +- **缓冲区管理**:通过 `bufferMin` 和 `bufferMax` 设置平衡即时可用性与资源开销。 +- **池约束**:使用 `poolMin` 和 `poolMax` 设置资源使用的硬边界。 +- **滚动更新**:当修改 `PodTemplateSpec` 时,自动进行池更新和 Pod 轮转。 ## 运行时 API 支持说明 @@ -390,6 +397,7 @@ spec: bufferMin: 2 poolMax: 20 poolMin: 5 + podRecyclePolicy: Delete ``` 应用资源池配置: @@ -442,6 +450,7 @@ spec: bufferMin: 2 poolMax: 20 poolMin: 5 + podRecyclePolicy: Delete ``` 使用我们刚刚创建的资源池创建一批带有基于进程的异构任务的沙箱: @@ -454,6 +463,7 @@ metadata: spec: replicas: 2 poolRef: task-example-pool + taskResourcePolicyWhenCompleted: Release taskTemplate: spec: process: diff 
--git a/kubernetes/README.md b/kubernetes/README.md index b668c8de2..7dcff9334 100644 --- a/kubernetes/README.md +++ b/kubernetes/README.md @@ -25,22 +25,29 @@ The BatchSandbox custom resource allows you to create and manage multiple identi ### Resource Pooling The Pool custom resource maintains a pool of pre-warmed compute resources to enable rapid sandbox provisioning: -- Configurable buffer sizes (minimum and maximum) to balance resource availability and cost -- Pool capacity limits to control overall resource consumption -- Automatic resource allocation and deallocation based on demand -- Real-time status monitoring showing total, allocated, and available resources +- **Configurable Buffer Sizes**: Minimum and maximum buffer settings to ensure resource availability while controlling costs. +- **Pool Capacity Limits**: Overall resource consumption control with pool-wide minimum and maximum limits. +- **Recycle Policies**: Support for different pod recycling strategies: + - **Delete (Default)**: Pods are deleted and recreated from the template when returned to the pool, ensuring a completely clean environment. + - **Restart**: PID 1 in all containers is gracefully terminated (SIGTERM), and the Kubernetes `restartPolicy` triggers a restart. This is faster than `Delete` but requires the `restartPolicy` in `PodTemplateSpec` to be set to `Always`. The restart timeout can be customized per-pool via the annotation `pool.opensandbox.io/recycle-timeout-sec` (default: 90s). +- **Automatic Scaling**: Dynamic resource allocation and deallocation based on current demand and buffer settings. +- **Real-time Status**: Monitoring of total, allocated, available, and restarting pods. 
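For illustration, a Pool manifest combining the new `podRecyclePolicy: Restart` field with the per-pool timeout annotation might look like the following sketch (field layout inferred from `pool_types.go` in this diff; the pool name and pod template contents are placeholders):

```yaml
apiVersion: sandbox.opensandbox.io/v1alpha1
kind: Pool
metadata:
  name: restart-pool                # illustrative name
  annotations:
    # Override the default 90s restart timeout for this pool.
    pool.opensandbox.io/recycle-timeout-sec: "120"
spec:
  podRecyclePolicy: Restart         # default is Delete
  capacitySpec:
    bufferMax: 5
    bufferMin: 2
    poolMax: 20
    poolMin: 5
  template:
    spec:
      restartPolicy: Always         # required by the Restart policy
      containers:
        - name: sandbox             # placeholder container
          image: busybox
          command: ["sleep", "infinity"]
```

Because the Restart policy relies on Kubernetes restarting PID 1 after SIGTERM, a pool whose template cannot use `restartPolicy: Always` should keep the default `Delete` policy.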
### Task Orchestration Integrated task management system that executes custom workloads within sandboxes: -- **Optional Execution**: Task scheduling is completely optional - sandboxes can be created without tasks -- **Process-Based Tasks**: Support for process-based tasks that execute within the sandbox environment -- **Heterogeneous Task Distribution**: Customize individual tasks for each sandbox in a batch using shardTaskPatches +- **Optional Execution**: Task scheduling is completely optional - sandboxes can be created without tasks. +- **Process-Based Tasks**: Support for process-based tasks that execute within the sandbox environment. +- **Heterogeneous Task Distribution**: Customize individual tasks for each sandbox in a batch using `shardTaskPatches`. +- **Resource Release Policy**: Control when resources are returned to the pool after task completion via `taskResourcePolicyWhenCompleted`: + - **Retain (Default)**: Keeps the sandbox resources until the `BatchSandbox` is deleted or expires. + - **Release**: Automatically releases the sandbox back to the pool immediately after the task reaches a terminal state (SUCCEEDED or FAILED). ### Advanced Scheduling Intelligent resource management features: -- Minimum and maximum buffer settings to ensure resource availability while controlling costs -- Pool-wide capacity limits to prevent resource exhaustion -- Automatic scaling based on demand +- **Demand-based Scaling**: Automatically scales the number of pods in the pool based on real-time sandbox allocation requests. +- **Buffer Management**: `bufferMin` and `bufferMax` settings to balance instant availability with resource overhead. +- **Pool Constraints**: `poolMin` and `poolMax` to set hard boundaries on resource usage. +- **Rolling Updates**: Automatic pool update and pod rotation when the `PodTemplateSpec` is modified. 
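As a sketch of how the new task and scheduling fields compose (the metadata name and the `process` contents are placeholders; the other fields follow the samples later in this diff):

```yaml
apiVersion: sandbox.opensandbox.io/v1alpha1
kind: BatchSandbox
metadata:
  name: release-example             # illustrative name
spec:
  replicas: 2
  poolRef: task-example-pool
  # Return each sandbox's pod to the pool as soon as its task reaches
  # SUCCEEDED or FAILED; the default, Retain, holds the pod until the
  # BatchSandbox is deleted or expires.
  taskResourcePolicyWhenCompleted: Release
  taskTemplate:
    spec:
      process:                      # placeholder task definition
        command: ["echo", "done"]
```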
## Runtime API Support Notes @@ -389,6 +396,7 @@ spec: bufferMin: 2 poolMax: 20 poolMin: 5 + podRecyclePolicy: Delete ``` Apply the pool configuration: @@ -441,6 +449,7 @@ spec: bufferMin: 2 poolMax: 20 poolMin: 5 + podRecyclePolicy: Delete ``` Create a batch of sandboxes with process-based heterogeneous tasks using the pool we just created: @@ -453,6 +462,7 @@ metadata: spec: replicas: 2 poolRef: task-example-pool + taskResourcePolicyWhenCompleted: Release taskTemplate: spec: process: diff --git a/kubernetes/apis/sandbox/v1alpha1/pool_types.go b/kubernetes/apis/sandbox/v1alpha1/pool_types.go index 3c8b7e812..b78ba1ab3 100644 --- a/kubernetes/apis/sandbox/v1alpha1/pool_types.go +++ b/kubernetes/apis/sandbox/v1alpha1/pool_types.go @@ -32,8 +32,23 @@ type PoolSpec struct { // CapacitySpec controls the size of the resource pool. // +kubebuilder:validation:Required CapacitySpec CapacitySpec `json:"capacitySpec"` + // PodRecyclePolicy controls the recycle policy for Pods released from BatchSandbox. + // +optional + // +kubebuilder:default=Delete + PodRecyclePolicy PodRecyclePolicy `json:"podRecyclePolicy,omitempty"` } +// PodRecyclePolicy defines the recycle policy for Pods released from BatchSandbox. +// +kubebuilder:validation:Enum=Delete;Restart +type PodRecyclePolicy string + +const ( + // PodRecyclePolicyDelete deletes the Pod directly when released from BatchSandbox. + PodRecyclePolicyDelete PodRecyclePolicy = "Delete" + // PodRecyclePolicyRestart restarts containers before reusing the Pod. + PodRecyclePolicyRestart PodRecyclePolicy = "Restart" +) + type CapacitySpec struct { // BufferMax is the maximum number of nodes kept in the warm buffer. // +kubebuilder:validation:Minimum=0 @@ -66,6 +81,9 @@ type PoolStatus struct { Allocated int32 `json:"allocated"` // Available is the number of nodes currently available in the pool. Available int32 `json:"available"` + // Restarting is the number of Pods that are being restarted for recycle. 
+ // +optional + Restarting int32 `json:"restarting,omitempty"` } // +genclient diff --git a/kubernetes/charts/opensandbox-controller/README.md b/kubernetes/charts/opensandbox-controller/README.md index fe177c14e..89d9d68d8 100644 --- a/kubernetes/charts/opensandbox-controller/README.md +++ b/kubernetes/charts/opensandbox-controller/README.md @@ -186,6 +186,7 @@ spec: bufferMin: 2 poolMax: 20 poolMin: 5 + podRecyclePolicy: Delete ``` ### Create a Batch Sandbox diff --git a/kubernetes/charts/opensandbox-controller/templates/clusterrole.yaml b/kubernetes/charts/opensandbox-controller/templates/clusterrole.yaml index 4ba42d397..575e3173f 100644 --- a/kubernetes/charts/opensandbox-controller/templates/clusterrole.yaml +++ b/kubernetes/charts/opensandbox-controller/templates/clusterrole.yaml @@ -73,6 +73,12 @@ rules: - get - patch - update +- apiGroups: + - "" + resources: + - pods/exec + verbs: + - create - apiGroups: - sandbox.opensandbox.io resources: diff --git a/kubernetes/cmd/controller/main.go b/kubernetes/cmd/controller/main.go index 1e95cc281..958a72774 100644 --- a/kubernetes/cmd/controller/main.go +++ b/kubernetes/cmd/controller/main.go @@ -19,6 +19,7 @@ import ( "flag" "os" "path/filepath" + "time" // Import all Kubernetes client auth plugins (e.g. Azure, GCP, OIDC, etc.) // to ensure that exec-entrypoint and run can make use of them. @@ -26,6 +27,7 @@ import ( "k8s.io/apimachinery/pkg/runtime" utilruntime "k8s.io/apimachinery/pkg/util/runtime" + "k8s.io/client-go/kubernetes" clientgoscheme "k8s.io/client-go/kubernetes/scheme" ctrl "sigs.k8s.io/controller-runtime" "sigs.k8s.io/controller-runtime/pkg/certwatcher" @@ -77,6 +79,9 @@ func main() { var kubeClientQPS float64 var kubeClientBurst int + // Restart timeout configuration + var restartTimeout time.Duration + flag.StringVar(&metricsAddr, "metrics-bind-address", "0", "The address the metrics endpoint binds to. 
"+ "Use :8443 for HTTPS or :8080 for HTTP, or leave as 0 to disable the metrics service.") flag.StringVar(&probeAddr, "health-probe-bind-address", ":8081", "The address the probe endpoint binds to.") @@ -104,6 +109,7 @@ func main() { flag.BoolVar(&logCompress, "log-compress", true, "Compress determines if the rotated log files should be compressed using gzip") flag.Float64Var(&kubeClientQPS, "kube-client-qps", 100, "QPS for Kubernetes client rate limiter.") flag.IntVar(&kubeClientBurst, "kube-client-burst", 200, "Burst for Kubernetes client rate limiter.") + flag.DurationVar(&restartTimeout, "restart-timeout", 90*time.Second, "Timeout for Pod restart operations. If a Pod fails to restart within this duration, it will be deleted.") opts := zap.Options{} opts.BindFlags(flag.CommandLine) @@ -259,11 +265,15 @@ func main() { setupLog.Error(err, "unable to create controller", "controller", "BatchSandbox") os.Exit(1) } + kubeClient := kubernetes.NewForConfigOrDie(mgr.GetConfig()) + restartTracker := controller.NewRestartTracker(mgr.GetClient(), kubeClient, mgr.GetConfig()) if err := (&controller.PoolReconciler{ - Client: mgr.GetClient(), - Scheme: mgr.GetScheme(), - Recorder: mgr.GetEventRecorderFor("pool-controller"), - Allocator: controller.NewDefaultAllocator(mgr.GetClient()), + Client: mgr.GetClient(), + Scheme: mgr.GetScheme(), + Recorder: mgr.GetEventRecorderFor("pool-controller"), + Allocator: controller.NewDefaultAllocator(mgr.GetClient()), + RestartTracker: restartTracker, + RestartTimeout: restartTimeout, }).SetupWithManager(mgr); err != nil { setupLog.Error(err, "unable to create controller", "controller", "Pool") os.Exit(1) diff --git a/kubernetes/config/crd/bases/sandbox.opensandbox.io_pools.yaml b/kubernetes/config/crd/bases/sandbox.opensandbox.io_pools.yaml index 8b987cada..b975a14e4 100644 --- a/kubernetes/config/crd/bases/sandbox.opensandbox.io_pools.yaml +++ b/kubernetes/config/crd/bases/sandbox.opensandbox.io_pools.yaml @@ -84,6 +84,14 @@ spec: - 
poolMax - poolMin type: object + podRecyclePolicy: + default: Delete + description: PodRecyclePolicy controls the recycle policy for Pods + released from BatchSandbox. + enum: + - Delete + - Restart + type: string template: description: Pod Template used to create pre-warmed nodes in the pool. x-kubernetes-preserve-unknown-fields: true @@ -109,6 +117,11 @@ spec: BatchSandbox's generation, which is updated on mutation by the API Server. format: int64 type: integer + restarting: + description: Restarting is the number of Pods that are being restarted + for recycle. + format: int32 + type: integer revision: description: Revision is the latest version of pool type: string diff --git a/kubernetes/config/rbac/role.yaml b/kubernetes/config/rbac/role.yaml index 87fb96026..1e574956f 100644 --- a/kubernetes/config/rbac/role.yaml +++ b/kubernetes/config/rbac/role.yaml @@ -17,6 +17,12 @@ rules: - patch - update - watch +- apiGroups: + - "" + resources: + - pods/exec + verbs: + - create - apiGroups: - "" resources: diff --git a/kubernetes/config/samples/sandbox_v1alpha1_batchsandbox-with-task.yaml b/kubernetes/config/samples/sandbox_v1alpha1_batchsandbox-with-task.yaml index 41d83985c..1be3bb0e5 100644 --- a/kubernetes/config/samples/sandbox_v1alpha1_batchsandbox-with-task.yaml +++ b/kubernetes/config/samples/sandbox_v1alpha1_batchsandbox-with-task.yaml @@ -21,6 +21,7 @@ spec: - -f - /dev/null expireTime: "2025-12-03T12:55:41Z" + taskResourcePolicyWhenCompleted: Release taskTemplate: spec: process: diff --git a/kubernetes/config/samples/sandbox_v1alpha1_pool.yaml b/kubernetes/config/samples/sandbox_v1alpha1_pool.yaml index 80973c353..6b1ec4df5 100644 --- a/kubernetes/config/samples/sandbox_v1alpha1_pool.yaml +++ b/kubernetes/config/samples/sandbox_v1alpha1_pool.yaml @@ -71,3 +71,4 @@ spec: bufferMin: 1 poolMax: 5 poolMin: 0 + podRecyclePolicy: Delete diff --git a/kubernetes/go.mod b/kubernetes/go.mod index 594a2c3ed..59073f1b3 100644 --- a/kubernetes/go.mod +++ 
b/kubernetes/go.mod @@ -15,6 +15,12 @@ require ( sigs.k8s.io/controller-runtime v0.21.0 ) +require ( + github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674 // indirect + github.com/moby/spdystream v0.5.0 // indirect + github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f // indirect +) + require ( cel.dev/expr v0.19.1 // indirect github.com/antlr4-go/antlr/v4 v4.13.0 // indirect diff --git a/kubernetes/go.sum b/kubernetes/go.sum index d43d2d694..577fd2fa7 100644 --- a/kubernetes/go.sum +++ b/kubernetes/go.sum @@ -2,6 +2,8 @@ cel.dev/expr v0.19.1 h1:NciYrtDRIR0lNCnH1LFJegdjspNx9fI59O7TWcua/W4= cel.dev/expr v0.19.1/go.mod h1:MrpN08Q+lEBs+bGYdLxxHkZoUSsCp0nSKTs0nTymJgw= github.com/antlr4-go/antlr/v4 v4.13.0 h1:lxCg3LAv+EUK6t1i0y1V6/SLeUi0eKEKdhQAlS8TVTI= github.com/antlr4-go/antlr/v4 v4.13.0/go.mod h1:pfChB/xh/Unjila75QW7+VU4TSnWnnk9UTnmpPaOR2g= +github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5 h1:0CwZNZbxp69SHPdPJAN/hZIm0C4OItdklCFmMRWYpio= +github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5/go.mod h1:wHh0iHkYZB8zMSxRWpUBQtwG5a7fFgvEO+odwuTv2gs= github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM= github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw= github.com/blang/semver/v4 v4.0.0 h1:1PFHFE6yCCTv8C1TeyNNarDzntLi7wMI5i/pzqYIsAM= @@ -66,6 +68,8 @@ github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db h1:097atOisP2aRj7vFgY github.com/google/pprof v0.0.0-20241029153458-d1b30febd7db/go.mod h1:vavhavw2zAxS5dIdcRluK6cSGGPlZynqzFM8NdvU144= github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0= github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= +github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674 h1:JeSE6pjso5THxAzdVpqr6/geYxZytqFMBCOtn/ujyeo= +github.com/gorilla/websocket v1.5.4-0.20250319132907-e064f32e3674/go.mod h1:r4w70xmWCQKmi1ONH4KIaBptdivuRPyosB9RmPlGEwA= 
github.com/grpc-ecosystem/grpc-gateway/v2 v2.24.0 h1:TmHmbvxPmaegwhDubVz0lICL0J5Ka2vwTzhoePEXsGE= github.com/grpc-ecosystem/grpc-gateway/v2 v2.24.0/go.mod h1:qztMSjm835F2bXf+5HKAPIS5qsmQDqZna/PgVt4rWtI= github.com/inconshreveable/mousetrap v1.1.0 h1:wN+x4NVGpMsO7ErUn/mUI3vEoE6Jt13X2s0bqwp9tc8= @@ -89,6 +93,8 @@ github.com/kylelemons/godebug v1.1.0 h1:RPNrshWIDI6G2gRW9EHilWtl7Z6Sb1BR0xunSBf0 github.com/kylelemons/godebug v1.1.0/go.mod h1:9/0rRGxNHcop5bhtWyNeEfOS8JIWk580+fNqagV/RAw= github.com/mailru/easyjson v0.7.7 h1:UGYAvKxe3sBsEDzO8ZeWOSlIQfWFlxbzLZe7hwFURr0= github.com/mailru/easyjson v0.7.7/go.mod h1:xzfreul335JAWq5oZzymOObrkdz5UnU4kGfJJLY9Nlc= +github.com/moby/spdystream v0.5.0 h1:7r0J1Si3QO/kjRitvSLVVFUjxMEb/YLj6S9FF62JBCU= +github.com/moby/spdystream v0.5.0/go.mod h1:xBAYlnt/ay+11ShkdFKNAG7LsyK/tmNBVvVOwrfMgdI= github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg= github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= @@ -96,6 +102,8 @@ github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9G github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk= github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 h1:C3w9PqII01/Oq1c1nUAm88MOHcQC9l5mIlSMApZMrHA= github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822/go.mod h1:+n7T8mK8HuQTcFwEeznm/DIxMOiR9yIdICNftLE1DvQ= +github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f h1:y5//uYreIhSUg3J1GEMiLbxo1LJaP8RfCpH6pymGZus= +github.com/mxk/go-flowrate v0.0.0-20140419014527-cca7078d478f/go.mod h1:ZdcZmHo+o7JKHSa8/e818NopupXU1YMK5fe1lsApnBw= github.com/onsi/ginkgo/v2 v2.22.0 h1:Yed107/8DjTr0lKCNt7Dn8yQ6ybuDRQoMGrNFKzMfHg= github.com/onsi/ginkgo/v2 v2.22.0/go.mod 
h1:7Du3c42kxCUegi0IImZ1wUQzMBVecgIHjR1C+NkhLQo= github.com/onsi/gomega v1.36.1 h1:bJDPBO7ibjxcbHMgSCoo4Yj18UWbKDlLwX1x9sybDcw= diff --git a/kubernetes/internal/controller/allocator.go b/kubernetes/internal/controller/allocator.go index fc038b460..800d1cf46 100644 --- a/kubernetes/internal/controller/allocator.go +++ b/kubernetes/internal/controller/allocator.go @@ -23,6 +23,7 @@ import ( "strconv" corev1 "k8s.io/api/core/v1" + "k8s.io/apimachinery/pkg/util/sets" "sigs.k8s.io/controller-runtime/pkg/client" logf "sigs.k8s.io/controller-runtime/pkg/log" @@ -177,12 +178,14 @@ type AllocSpec struct { Pool *sandboxv1alpha1.Pool // all pods of pool Pods []*corev1.Pod + + RecyclingPods sets.Set[string] } type AllocStatus struct { - // pod allocated to sandbox + // PodAllocation maps pod name to sandbox name for currently allocated pods. PodAllocation map[string]string - // pod request count + // PodSupplement is the number of additional pods needed to meet sandbox demands. PodSupplement int32 } @@ -212,7 +215,6 @@ func NewDefaultAllocator(client client.Client) Allocator { func (allocator *defaultAllocator) Schedule(ctx context.Context, spec *AllocSpec) (*AllocStatus, []SandboxSyncInfo, bool, error) { log := logf.FromContext(ctx) - log.Info("Schedule started", "pool", spec.Pool.Name, "totalPods", len(spec.Pods), "sandboxes", len(spec.Sandboxes)) status, err := allocator.initAllocation(ctx, spec) if err != nil { return nil, nil, false, err @@ -222,21 +224,23 @@ func (allocator *defaultAllocator) Schedule(ctx context.Context, spec *AllocSpec if _, ok := status.PodAllocation[pod.Name]; ok { continue } + if spec.RecyclingPods.Has(pod.Name) { + continue + } if pod.Status.Phase != corev1.PodRunning { continue } availablePods = append(availablePods, pod.Name) } - log.V(1).Info("Schedule init", "existingAllocations", len(status.PodAllocation), "availablePods", len(availablePods)) sandboxToPods := make(map[string][]string) for podName, sandboxName := range status.PodAllocation { 
sandboxToPods[sandboxName] = append(sandboxToPods[sandboxName], podName) } - sandboxAlloc, dirtySandboxes, poolAllocate, err := allocator.allocate(ctx, status, sandboxToPods, availablePods, spec.Sandboxes, spec.Pods) + sandboxAlloc, dirtySandboxes, poolAllocate, err := allocator.allocate(ctx, status, sandboxToPods, availablePods, spec.Sandboxes) if err != nil { log.Error(err, "allocate failed") } - poolDeallocate, err := allocator.deallocate(ctx, status, sandboxToPods, spec.Sandboxes) + poolDeallocate, err := allocator.deallocate(ctx, status, sandboxToPods, spec.Sandboxes, spec.RecyclingPods) if err != nil { log.Error(err, "deallocate failed") } @@ -276,7 +280,7 @@ func (allocator *defaultAllocator) initAllocation(ctx context.Context, spec *All return status, nil } -func (allocator *defaultAllocator) allocate(ctx context.Context, status *AllocStatus, sandboxToPods map[string][]string, availablePods []string, sandboxes []*sandboxv1alpha1.BatchSandbox, pods []*corev1.Pod) (map[string][]string, []string, bool, error) { +func (allocator *defaultAllocator) allocate(ctx context.Context, status *AllocStatus, sandboxToPods map[string][]string, availablePods []string, sandboxes []*sandboxv1alpha1.BatchSandbox) (map[string][]string, []string, bool, error) { errs := make([]error, 0) sandboxAlloc := make(map[string][]string) dirtySandboxes := make([]string, 0) @@ -358,7 +362,7 @@ func (allocator *defaultAllocator) doAllocate(ctx context.Context, status *Alloc return sandboxAlloc, remainAvailablePods, sandboxDirty, poolAllocate, nil } -func (allocator *defaultAllocator) deallocate(ctx context.Context, status *AllocStatus, sandboxToPods map[string][]string, sandboxes []*sandboxv1alpha1.BatchSandbox) (bool, error) { +func (allocator *defaultAllocator) deallocate(ctx context.Context, status *AllocStatus, sandboxToPods map[string][]string, sandboxes []*sandboxv1alpha1.BatchSandbox, recycling sets.Set[string]) (bool, error) { log := logf.FromContext(ctx) poolDeallocate := false errs 
:= make([]error, 0) @@ -385,6 +389,9 @@ func (allocator *defaultAllocator) deallocate(ctx context.Context, status *Alloc pods := sandboxToPods[name] log.Info("GC deleted sandbox allocation", "sandbox", name, "podCount", len(pods)) for _, pod := range pods { + if recycling.Has(pod) { + continue + } delete(status.PodAllocation, pod) poolDeallocate = true } @@ -394,7 +401,6 @@ func (allocator *defaultAllocator) deallocate(ctx context.Context, status *Alloc } func (allocator *defaultAllocator) doDeallocate(ctx context.Context, status *AllocStatus, sandboxToPods map[string][]string, sbx *sandboxv1alpha1.BatchSandbox) (bool, error) { - log := logf.FromContext(ctx) deallocate := false name := sbx.Name allocatedPods, ok := sandboxToPods[name] @@ -406,9 +412,10 @@ func (allocator *defaultAllocator) doDeallocate(ctx context.Context, status *All return false, err } for _, pod := range toRelease.Pods { - delete(status.PodAllocation, pod) - deallocate = true - log.V(1).Info("Pod released from sandbox", "pod", pod, "sandbox", name) + if _, ok := status.PodAllocation[pod]; ok { + delete(status.PodAllocation, pod) + deallocate = true + } } pods := make([]string, 0) for _, pod := range allocatedPods { diff --git a/kubernetes/internal/controller/allocator_test.go b/kubernetes/internal/controller/allocator_test.go index 76f56dabc..3f4a54d1d 100644 --- a/kubernetes/internal/controller/allocator_test.go +++ b/kubernetes/internal/controller/allocator_test.go @@ -16,11 +16,13 @@ package controller import ( "context" + "encoding/json" "reflect" "testing" corev1 "k8s.io/api/core/v1" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" + "k8s.io/apimachinery/pkg/util/sets" sandboxv1alpha1 "github.com/alibaba/OpenSandbox/sandbox-k8s/apis/sandbox/v1alpha1" "github.com/golang/mock/gomock" @@ -270,24 +272,97 @@ func TestAllocatorSchedule(t *testing.T) { }, }, }, + // Pod1 is allocated to sbx1 in pool level poolAlloc: &PoolAllocation{ - PodAllocation: map[string]string{}, + PodAllocation: 
map[string]string{ + "pod1": "sbx1", + }, }, + // Sandbox has pod1 allocated sandboxAlloc: &SandboxAllocation{ Pods: []string{ "pod1", }, }, + // Sandbox releases pod1 release: &AllocationRelease{ Pods: []string{ - "pod1", "sbx1", + "pod1", }, }, + // Pod1 should be removed from allocation and added to recycle wantStatus: &AllocStatus{ PodAllocation: map[string]string{}, PodSupplement: 0, }, }, + { + name: "pod with deallocated-from label is excluded", + spec: &AllocSpec{ + Pods: []*corev1.Pod{ + { + ObjectMeta: metav1.ObjectMeta{ + Name: "pod-normal", + }, + Status: corev1.PodStatus{ + Phase: corev1.PodRunning, + }, + }, + { + ObjectMeta: metav1.ObjectMeta{ + Name: "pod-deallocated", + Labels: map[string]string{ + "pool.opensandbox.io/deallocated-from": "bsx-uid-123", + }, + }, + Status: corev1.PodStatus{ + Phase: corev1.PodRunning, + }, + }, + }, + Pool: &sandboxv1alpha1.Pool{ + ObjectMeta: metav1.ObjectMeta{ + Name: "pool1", + }, + }, + RecyclingPods: sets.New("pod-deallocated"), + Sandboxes: []*sandboxv1alpha1.BatchSandbox{ + { + ObjectMeta: metav1.ObjectMeta{ + Name: "sbx1", + }, + Spec: sandboxv1alpha1.BatchSandboxSpec{ + PoolRef: "pool1", + Replicas: &replica1, + }, + }, + { + ObjectMeta: metav1.ObjectMeta{ + Name: "sbx2", + }, + Spec: sandboxv1alpha1.BatchSandboxSpec{ + PoolRef: "pool1", + Replicas: &replica1, + }, + }, + }, + }, + poolAlloc: &PoolAllocation{ + PodAllocation: map[string]string{}, + }, + sandboxAlloc: &SandboxAllocation{ + Pods: []string{}, + }, + release: &AllocationRelease{ + Pods: []string{}, + }, + wantStatus: &AllocStatus{ + PodAllocation: map[string]string{ + "pod-normal": "sbx1", + }, + PodSupplement: 1, // sbx2 needs a pod but only normal pod available + }, + }, } for _, c := range cases { t.Run(c.name, func(t *testing.T) { @@ -490,3 +565,71 @@ func TestSyncSandboxAllocationError(t *testing.T) { err := allocator.SyncSandboxAllocation(context.Background(), sandbox, pods) assert.Error(t, err) } + +func 
TestScheduleExcludesRestartingPods(t *testing.T) { + ctrl := gomock.NewController(t) + defer ctrl.Finish() + store := NewMockAllocationStore(ctrl) + syncer := NewMockAllocationSyncer(ctrl) + allocator := &defaultAllocator{ + store: store, + syncer: syncer, + } + replica1 := int32(1) + + // Create pods: one normal, one restarting (should be excluded from allocation) + restartingMeta := PodRecycleMeta{ + State: RecycleStateRestarting, + TriggeredAt: 1234567890, + } + restartingMetaJSON, _ := json.Marshal(restartingMeta) + + pods := []*corev1.Pod{ + { + ObjectMeta: metav1.ObjectMeta{ + Name: "pod-normal", + }, + Status: corev1.PodStatus{Phase: corev1.PodRunning}, + }, + { + ObjectMeta: metav1.ObjectMeta{ + Name: "pod-restarting", + Annotations: map[string]string{ + AnnoPodRecycleMeta: string(restartingMetaJSON), + }, + }, + Status: corev1.PodStatus{Phase: corev1.PodRunning}, + }, + } + sandboxes := []*sandboxv1alpha1.BatchSandbox{ + { + ObjectMeta: metav1.ObjectMeta{Name: "sbx1"}, + Spec: sandboxv1alpha1.BatchSandboxSpec{Replicas: &replica1}, + }, + { + ObjectMeta: metav1.ObjectMeta{Name: "sbx2"}, + Spec: sandboxv1alpha1.BatchSandboxSpec{Replicas: &replica1}, + }, + } + spec := &AllocSpec{ + Pods: pods, + Sandboxes: sandboxes, + Pool: &sandboxv1alpha1.Pool{ObjectMeta: metav1.ObjectMeta{Name: "pool1"}}, + RecyclingPods: sets.New("pod-restarting"), + } + + store.EXPECT().GetAllocation(gomock.Any(), gomock.Any()).Return(&PoolAllocation{PodAllocation: map[string]string{}}, nil).Times(1) + syncer.EXPECT().GetAllocation(gomock.Any(), gomock.Any()).Return(&SandboxAllocation{Pods: []string{}}, nil).Times(2) + syncer.EXPECT().GetRelease(gomock.Any(), gomock.Any()).Return(&AllocationRelease{Pods: []string{}}, nil).Times(2) + + status, pendingSyncs, poolDirty, err := allocator.Schedule(context.Background(), spec) + + assert.NoError(t, err) + assert.True(t, poolDirty) + // Only the normal pod should be allocated, sbx2 should have no pod + assert.Contains(t, status.PodAllocation, 
"pod-normal") + assert.NotContains(t, status.PodAllocation, "pod-restarting") + // sbx2 should need supplement since restarting pod is excluded + assert.Equal(t, int32(1), status.PodSupplement) + assert.Len(t, pendingSyncs, 1) +} diff --git a/kubernetes/internal/controller/apis.go b/kubernetes/internal/controller/apis.go index c32964aff..8999dd4a8 100644 --- a/kubernetes/internal/controller/apis.go +++ b/kubernetes/internal/controller/apis.go @@ -17,6 +17,7 @@ package controller import ( "encoding/json" + corev1 "k8s.io/api/core/v1" metav1 "k8s.io/apimachinery/pkg/apis/meta/v1" "github.com/alibaba/OpenSandbox/sandbox-k8s/internal/utils" @@ -31,9 +32,77 @@ const ( AnnoPoolAllocStatusKey = "pool.opensandbox.io/alloc-status" AnnoPoolAllocGenerationKey = "pool.opensandbox.io/alloc-generation" + // Pod Recycle 相关 Annotation + AnnoPodRecycleMeta = "pool.opensandbox.io/recycle-meta" + FinalizerTaskCleanup = "batch-sandbox.sandbox.opensandbox.io/task-cleanup" + FinalizerPoolRecycle = "batch-sandbox.sandbox.opensandbox.io/pool-recycle" + + // Value is the BatchSandbox UID. + LabelPodDeallocatedFrom = "pool.opensandbox.io/deallocated-from" + // LabelPodRecycleConfirmed marks that Pool has confirmed recycling. + // Value is the BatchSandbox UID from deallocated-from label. + LabelPodRecycleConfirmed = "pool.opensandbox.io/recycle-confirmed" + + AnnoPodRecycleTimeoutSec = "pool.opensandbox.io/recycle-timeout-sec" +) + +// PodRecycleState defines the state of Pod recycle. +type PodRecycleState string + +const ( + // RecycleStateNone indicates the Pod is in normal state and can be allocated. + RecycleStateNone PodRecycleState = "None" + // RecycleStateRestarting indicates the Pod containers are restarting. + // This is the only active recycle state. The Pod transitions from None → Restarting + // when a restart is triggered, and back to None when all containers are restarted and ready. 
+ RecycleStateRestarting PodRecycleState = "Restarting" ) +// PodRecycleMeta holds metadata for Pod recycle state machine. +type PodRecycleMeta struct { + // State: None or Restarting + State PodRecycleState `json:"state"` + + // TriggeredAt: Restart trigger timestamp (milliseconds) + TriggeredAt int64 `json:"triggeredAt"` + + // ContainerRestartCounts stores the restart count of each container before triggering restart. + // Used to detect if container has restarted even when StartedAt timestamp has second-level precision. + ContainerRestartCounts map[string]int32 `json:"containerRestartCounts,omitempty"` + + // ContainerStartedAt stores the StartedAt time of each container before triggering restart. + // Used to detect if container has restarted by comparing StartedAt timestamps. + ContainerStartedAt map[string]int64 `json:"containerStartedAt,omitempty"` +} + +// parsePodRecycleMeta parses the recycle metadata from Pod annotations. +func parsePodRecycleMeta(obj metav1.Object) (*PodRecycleMeta, error) { + meta := &PodRecycleMeta{} + if raw := obj.GetAnnotations()[AnnoPodRecycleMeta]; raw != "" { + if err := json.Unmarshal([]byte(raw), meta); err != nil { + return nil, err + } + } + return meta, nil +} + +// setPodRecycleMeta sets the recycle metadata to Pod annotations. 
+func setPodRecycleMeta(obj metav1.Object, meta *PodRecycleMeta) { + if obj.GetAnnotations() == nil { + obj.SetAnnotations(map[string]string{}) + } + obj.GetAnnotations()[AnnoPodRecycleMeta] = utils.DumpJSON(meta) +} + +func isRestarting(pod *corev1.Pod) bool { + return pod.Annotations[AnnoPodRecycleMeta] != "" +} + +func isRecycling(pod *corev1.Pod) bool { + return pod.Labels[LabelPodDeallocatedFrom] != "" || pod.Annotations[AnnoPodRecycleMeta] != "" +} + // AnnotationSandboxEndpoints Use the exported constant from pkg/utils var AnnotationSandboxEndpoints = pkgutils.AnnotationEndpoints diff --git a/kubernetes/internal/controller/batchsandbox_controller.go b/kubernetes/internal/controller/batchsandbox_controller.go index 6008d3cde..f39a13c1c 100644 --- a/kubernetes/internal/controller/batchsandbox_controller.go +++ b/kubernetes/internal/controller/batchsandbox_controller.go @@ -123,18 +123,31 @@ func (r *BatchSandboxReconciler) Reconcile(ctx context.Context, req ctrl.Request // handle finalizers if batchSbx.DeletionTimestamp == nil { + // Add FinalizerTaskCleanup if task scheduling is needed if taskStrategy.NeedTaskScheduling() { - if !controllerutil.ContainsFinalizer(batchSbx, FinalizerTaskCleanup) { - err := utils.UpdateFinalizer(r.Client, batchSbx, utils.AddFinalizerOpType, FinalizerTaskCleanup) - if err != nil { - log.Error(err, "failed to add finalizer", "finalizer", FinalizerTaskCleanup) - } else { - log.Info("added finalizer", "finalizer", FinalizerTaskCleanup) - } + if added, err := r.ensureFinalizer(ctx, batchSbx, FinalizerTaskCleanup); err != nil || !added { + return ctrl.Result{}, err + } + } + // Add FinalizerPoolRecycle for pool mode with Restart policy + if poolStrategy.IsPooledMode() { + if added, err := r.ensureFinalizer(ctx, batchSbx, FinalizerPoolRecycle); err != nil || !added { return ctrl.Result{}, err + } + } + } else { + // Handle deletion: task cleanup runs first; pool recycle starts only + // once FinalizerTaskCleanup has been removed + 
if !controllerutil.ContainsFinalizer(batchSbx, FinalizerTaskCleanup) {
+			// Pool recycle must run after task cleanup has finished
+			needReconcile, err := r.handlePoolRecycle(ctx, batchSbx)
+			if err != nil {
+				return ctrl.Result{}, err
+			}
+			if needReconcile {
+				return ctrl.Result{RequeueAfter: 3 * time.Second}, nil
+			}
+		}
 		if !taskStrategy.NeedTaskScheduling() {
 			return ctrl.Result{}, nil
 		}
@@ -272,47 +285,61 @@ func calPodIndex(poolStrategy strategy.PoolStrategy, batchSbx *sandboxv1alpha1.B
 }
 
 func (r *BatchSandboxReconciler) listPods(ctx context.Context, poolStrategy strategy.PoolStrategy, batchSbx *sandboxv1alpha1.BatchSandbox) ([]*corev1.Pod, error) {
-	var ret []*corev1.Pod
 	if poolStrategy.IsPooledMode() {
-		var (
-			allocSet    = make(sets.Set[string])
-			releasedSet = make(sets.Set[string])
-		)
-		alloc, err := parseSandboxAllocation(batchSbx)
+		ret, err := r.getCurrentPoolPods(ctx, batchSbx)
 		if err != nil {
 			return nil, err
 		}
-		allocSet.Insert(alloc.Pods...)
+		return ret, nil
+	}
+	var ret []*corev1.Pod
+	podList := &corev1.PodList{}
+	if err := r.Client.List(ctx, podList, &client.ListOptions{
+		Namespace:     batchSbx.Namespace,
+		FieldSelector: fields.SelectorFromSet(fields.Set{fieldindex.IndexNameForOwnerRefUID: string(batchSbx.UID)}),
+	}); err != nil {
+		return nil, err
+	}
+	for i := range podList.Items {
+		ret = append(ret, &podList.Items[i])
+	}
+	return ret, nil
+}
 
-		released, err := parseSandboxReleased(batchSbx)
-		if err != nil {
-			return nil, err
-		}
-		releasedSet.Insert(released.Pods...)
+func (r *BatchSandboxReconciler) getCurrentPoolPods(ctx context.Context, batchSbx *sandboxv1alpha1.BatchSandbox) ([]*corev1.Pod, error) { - activePods := allocSet.Difference(releasedSet) - for name := range activePods { - pod := &corev1.Pod{} - // TODO maybe performance is problem - if err := r.Client.Get(ctx, types.NamespacedName{Namespace: batchSbx.Namespace, Name: name}, pod); err != nil { - if errors.IsNotFound(err) { - continue - } - return nil, err + var ( + allocSet = make(sets.Set[string]) + releasedSet = make(sets.Set[string]) + ) + alloc, err := parseSandboxAllocation(batchSbx) + if err != nil { + return nil, err + } + allocSet.Insert(alloc.Pods...) + + released, err := parseSandboxReleased(batchSbx) + if err != nil { + return nil, err + } + releasedSet.Insert(released.Pods...) + + activePods := allocSet.Difference(releasedSet) + return r.getPodsByNames(ctx, batchSbx, activePods) +} + +func (r *BatchSandboxReconciler) getPodsByNames(ctx context.Context, batchSbx *sandboxv1alpha1.BatchSandbox, podNames sets.Set[string]) ([]*corev1.Pod, error) { + var ret []*corev1.Pod + for name := range podNames { + pod := &corev1.Pod{} + // TODO maybe performance is problem + if err := r.Client.Get(ctx, types.NamespacedName{Namespace: batchSbx.Namespace, Name: name}, pod); err != nil { + if errors.IsNotFound(err) { + continue } - ret = append(ret, pod) - } - } else { - podList := &corev1.PodList{} - if err := r.Client.List(ctx, podList, &client.ListOptions{ - Namespace: batchSbx.Namespace, - FieldSelector: fields.SelectorFromSet(fields.Set{fieldindex.IndexNameForOwnerRefUID: string(batchSbx.UID)}), - }); err != nil { return nil, err } - for i := range podList.Items { - ret = append(ret, &podList.Items[i]) - } + ret = append(ret, pod) } return ret, nil } @@ -425,6 +452,13 @@ func (r *BatchSandboxReconciler) getTasksCleanupUnfinished(batchSbx *sandboxv1al } func (r *BatchSandboxReconciler) releasePods(ctx context.Context, batchSbx *sandboxv1alpha1.BatchSandbox, 
toReleasePods []string) error { + pods, err := r.getPodsByNames(ctx, batchSbx, sets.New(toReleasePods...)) + if err != nil { + return err + } + if err = r.addDeallocatedFromLabel(ctx, batchSbx, pods); err != nil { + return err + } releasedSet := make(sets.Set[string]) released, err := parseSandboxReleased(batchSbx) if err != nil { @@ -464,7 +498,6 @@ func (r *BatchSandboxReconciler) scaleBatchSandbox(ctx context.Context, batchSan for i := range pods { pod := pods[i] BatchSandboxScaleExpectations.ObserveScale(controllerutils.GetControllerKey(batchSandbox), expectations.Create, pod.Name) - pods = append(pods, pod) idx, err := parseIndex(pod) if err != nil { return fmt.Errorf("failed to parse idx Pod %s, err %w", pod.Name, err) @@ -557,3 +590,111 @@ func (r *BatchSandboxReconciler) SetupWithManager(mgr ctrl.Manager) error { WithOptions(controller.Options{MaxConcurrentReconciles: 32}). Complete(r) } + +// ensureFinalizer ensures the given finalizer is present on the object. +// Returns (true, nil) if finalizer was already present, (false, nil) if finalizer was added successfully, +// or (false, err) if an error occurred. +func (r *BatchSandboxReconciler) ensureFinalizer(ctx context.Context, batchSbx *sandboxv1alpha1.BatchSandbox, finalizer string) (bool, error) { + log := logf.FromContext(ctx) + if controllerutil.ContainsFinalizer(batchSbx, finalizer) { + return true, nil + } + err := utils.UpdateFinalizer(r.Client, batchSbx, utils.AddFinalizerOpType, finalizer) + if err != nil { + log.Error(err, "failed to add finalizer", "finalizer", finalizer) + return false, err + } + log.Info("added finalizer", "finalizer", finalizer) + return false, nil +} + +// checkPoolRecycleFinalizer checks if all pods are recycled or confirmed. +// Returns true if Finalizer can be removed. 
+func (r *BatchSandboxReconciler) checkPoolRecycleFinalizer(ctx context.Context, bsx *sandboxv1alpha1.BatchSandbox) (bool, error) { + alloc, err := parseSandboxAllocation(bsx) + if err != nil { + return false, err + } + + for _, podName := range alloc.Pods { + pod := &corev1.Pod{} + err := r.Get(ctx, types.NamespacedName{Namespace: bsx.Namespace, Name: podName}, pod) + if errors.IsNotFound(err) { + continue // Pod deleted, OK + } + if err != nil { + return false, err + } + + // Check if recycle is confirmed + confirmedUID := pod.Labels[LabelPodRecycleConfirmed] + if confirmedUID != string(bsx.UID) { + // Not yet confirmed, keep waiting + return false, nil + } + } + // All pods confirmed or deleted + return true, nil +} + +// addDeallocatedFromLabel adds deallocated-from label to pods. +func (r *BatchSandboxReconciler) addDeallocatedFromLabel(ctx context.Context, bsx *sandboxv1alpha1.BatchSandbox, pods []*corev1.Pod) error { + for _, pod := range pods { + if pod.Labels[LabelPodRecycleConfirmed] == string(bsx.UID) { + continue + } + // Check if label already exists with correct value + if pod.Labels[LabelPodDeallocatedFrom] == string(bsx.UID) { + continue + } + // Add label + old := pod.DeepCopy() + if pod.Labels == nil { + pod.Labels = make(map[string]string) + } + pod.Labels[LabelPodDeallocatedFrom] = string(bsx.UID) + patch := client.MergeFrom(old) + if err := r.Patch(ctx, pod, patch); err != nil { + return err + } + } + return nil +} + +func (r *BatchSandboxReconciler) handlePoolRecycle(ctx context.Context, batchSbx *sandboxv1alpha1.BatchSandbox) (needReconcile bool, err error) { + log := logf.FromContext(ctx) + if !controllerutil.ContainsFinalizer(batchSbx, FinalizerPoolRecycle) { + return false, nil + } + pods, err := r.getCurrentPoolPods(ctx, batchSbx) + if err != nil { + return false, err + } + if err = r.addDeallocatedFromLabel(ctx, batchSbx, pods); err != nil { + log.Error(err, "failed to add deallocated-from label") + return false, err + } + + // Check if 
all pods are recycled or confirmed + allRecycled, err := r.checkPoolRecycleFinalizer(ctx, batchSbx) + if err != nil { + log.Error(err, "failed to check pool recycle finalizer") + return false, err + } + + if !allRecycled { + log.Info("waiting for pods to be recycled") + // Requeue to check again + return true, nil + } + + err = utils.UpdateFinalizer(r.Client, batchSbx, utils.RemoveFinalizerOpType, FinalizerPoolRecycle) + if err != nil { + if !errors.IsNotFound(err) { + log.Error(err, "failed to remove finalizer", "finalizer", FinalizerPoolRecycle) + } + return false, err + } + log.Info("pool recycle completed, removed finalizer", "finalizer", FinalizerPoolRecycle) + return false, nil +} diff --git a/kubernetes/internal/controller/batchsandbox_controller_test.go b/kubernetes/internal/controller/batchsandbox_controller_test.go index 5767f1cdb..dd635913e 100644 --- a/kubernetes/internal/controller/batchsandbox_controller_test.go +++ b/kubernetes/internal/controller/batchsandbox_controller_test.go @@ -400,6 +400,16 @@ var _ = Describe("BatchSandbox Controller", func() { Expect(k8sClient.Delete(ctx, resource)).To(Succeed()) }) + It("should successfully add pool recycle finalizer for pool mode BatchSandbox", func() { + Eventually(func(g Gomega) { + bs := &sandboxv1alpha1.BatchSandbox{} + if err := k8sClient.Get(ctx, typeNamespacedName, bs); err != nil { + return + } + g.Expect(controllerutil.ContainsFinalizer(bs, FinalizerPoolRecycle)).To(BeTrue(), "FinalizerPoolRecycle should be present for pool mode BatchSandbox") + }, timeout, interval).Should(Succeed()) + }) + It("should successfully update batch sandbox status, sbx endpoints info when get pod from pool alloc", func() { // mock pool allocation mockPods := []string{} diff --git a/kubernetes/internal/controller/pool_controller.go b/kubernetes/internal/controller/pool_controller.go index db053537e..55c8252e8 100644 --- a/kubernetes/internal/controller/pool_controller.go +++ 
b/kubernetes/internal/controller/pool_controller.go @@ -31,6 +31,7 @@ import ( "k8s.io/apimachinery/pkg/runtime" "k8s.io/apimachinery/pkg/types" "k8s.io/apimachinery/pkg/util/json" + "k8s.io/apimachinery/pkg/util/sets" "k8s.io/client-go/tools/record" "k8s.io/client-go/util/retry" ctrl "sigs.k8s.io/controller-runtime" @@ -62,12 +63,28 @@ var ( PoolScaleExpectations = expectations.NewScaleExpectations() ) +// scheduleResult holds the result of scheduleSandbox operation. +type scheduleResult struct { + // podAllocation maps pod name to sandbox name for currently allocated pods. + podAllocation map[string]string + // pendingSyncs contains sandboxes that need allocation sync. + pendingSyncs []SandboxSyncInfo + // idlePods are pods not allocated to any sandbox and available for use. + idlePods []string + // supplySandbox is the number of additional sandboxes waiting for pods. + supplySandbox int32 + // poolDirty indicates if pool allocation annotation needs update. + poolDirty bool +} + // PoolReconciler reconciles a Pool object type PoolReconciler struct { client.Client - Scheme *runtime.Scheme - Recorder record.EventRecorder - Allocator Allocator + Scheme *runtime.Scheme + Recorder record.EventRecorder + Allocator Allocator + RestartTracker RestartTracker + RestartTimeout time.Duration } // +kubebuilder:rbac:groups=sandbox.opensandbox.io,resources=pools,verbs=get;list;watch;create;update;patch;delete @@ -76,6 +93,7 @@ type PoolReconciler struct { // +kubebuilder:rbac:groups=sandbox.opensandbox.io,resources=batchsandboxes,verbs=get;list;watch;patch // +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch;create;update;patch;delete // +kubebuilder:rbac:groups=core,resources=pods/status,verbs=get;update;patch +// +kubebuilder:rbac:groups=core,resources=pods/exec,verbs=create // +kubebuilder:rbac:groups=core,resources=events,verbs=get;list;watch;create;update;patch;delete func (r *PoolReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, 
error) {
@@ -140,6 +158,34 @@ func (r *PoolReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.
 	return r.reconcilePool(ctx, pool, batchSandboxes, pods)
 }
 
+func (r *PoolReconciler) collectRecyclingPods(batchSandboxes []*sandboxv1alpha1.BatchSandbox, pods []*corev1.Pod) (sets.Set[string], error) {
+	recycling := make(sets.Set[string])
+	for _, batchSandbox := range batchSandboxes {
+		if batchSandbox.DeletionTimestamp != nil {
+			allocation, err := parseSandboxAllocation(batchSandbox)
+			if err != nil {
+				return nil, err
+			}
+			for _, podName := range allocation.Pods {
+				recycling.Insert(podName)
+			}
+			released, err := parseSandboxReleased(batchSandbox)
+			if err != nil {
+				return nil, err
+			}
+			for _, podName := range released.Pods {
+				recycling.Insert(podName)
+			}
+		}
+	}
+	for _, pod := range pods {
+		if isRecycling(pod) {
+			recycling.Insert(pod.Name)
+		}
+	}
+	return recycling, nil
+}
+
 // reconcilePool contains the main reconciliation logic
 func (r *PoolReconciler) reconcilePool(ctx context.Context, pool *sandboxv1alpha1.Pool, batchSandboxes []*sandboxv1alpha1.BatchSandbox, pods []*corev1.Pod) (ctrl.Result, error) {
 	log := logf.FromContext(ctx)
@@ -152,33 +198,41 @@ func (r *PoolReconciler) reconcilePool(ctx context.Context, pool *sandboxv1alpha
 		return err
 	}
 
-	// 2. Schedule and allocate
-	podAllocation, pendingSyncs, idlePods, supplySandbox, poolDirty, err := r.scheduleSandbox(ctx, latestPool, batchSandboxes, pods)
+	// 2. First, handle Pod Recycle to ensure pods are ready for scheduling
+	recyclingPods, err := r.collectRecyclingPods(batchSandboxes, pods)
 	if err != nil {
+		log.Error(err, "Failed to collect recycling pods")
 		return err
 	}
-	needReconcile := false
-	delay := time.Duration(0)
-	if supplySandbox > 0 && len(idlePods) > 0 {
+	delay := defaultRetryTime
+	needReconcile, err := r.handlePodRecycle(ctx, latestPool, pods)
+	if err != nil {
+		log.Error(err, "Failed to handle pod recycle")
+		return err
+	}
+	// 3.
Schedule and allocate (after recycling is handled)
+	scheRes, err := r.scheduleSandbox(ctx, latestPool, batchSandboxes, pods, recyclingPods)
+	if err != nil {
+		return err
+	}
+
+	if scheRes.supplySandbox > 0 && len(scheRes.idlePods) > 0 {
 		needReconcile = true
 		delay = defaultRetryTime
 	}
-	if int32(len(idlePods)) >= supplySandbox {
-		supplySandbox = 0
+	if int32(len(scheRes.idlePods)) >= scheRes.supplySandbox {
+		scheRes.supplySandbox = 0
 	} else {
-		supplySandbox -= int32(len(idlePods))
+		scheRes.supplySandbox -= int32(len(scheRes.idlePods))
 	}
-
-	if poolDirty {
-		if err := r.Allocator.PersistPoolAllocation(ctx, latestPool, &AllocStatus{PodAllocation: podAllocation}); err != nil {
+	if scheRes.poolDirty {
+		if err := r.Allocator.PersistPoolAllocation(ctx, latestPool, &AllocStatus{PodAllocation: scheRes.podAllocation}); err != nil {
 			log.Error(err, "Failed to persist pool allocation")
 			return err
 		}
 	}
-	var syncErrs []error
-	for _, syncInfo := range pendingSyncs {
+	var syncErrs []error
+	for _, syncInfo := range scheRes.pendingSyncs {
 		if err := r.Allocator.SyncSandboxAllocation(ctx, syncInfo.Sandbox, syncInfo.Pods); err != nil {
 			log.Error(err, "Failed to sync sandbox allocation", "sandbox", syncInfo.SandboxName)
 			syncErrs = append(syncErrs, fmt.Errorf("failed to sync sandbox %s: %w", syncInfo.SandboxName, err))
@@ -194,23 +248,24 @@
 	if err != nil {
 		return err
 	}
-	latestIdlePods, deleteOld, supplyNew := r.updatePool(ctx, latestRevision, pods, idlePods)
+	latestIdlePods, deleteOld, supplyNew := r.updatePool(ctx, latestRevision, pods, scheRes.idlePods)
 	args := &scaleArgs{
 		latestRevision: latestRevision,
 		pool:           latestPool,
 		pods:           pods,
-		allocatedCnt:   int32(len(podAllocation)),
+		allocatedCnt:   int32(len(scheRes.podAllocation)),
+		recycling:      recyclingPods,
 		idlePods:       latestIdlePods,
 		redundantPods:  deleteOld,
-		supplyCnt:      supplySandbox + supplyNew,
+		supplyCnt:      scheRes.supplySandbox + supplyNew,
 	}
 	if err :=
r.scalePool(ctx, args); err != nil { return err } // 6. Update Status - if err := r.updatePoolStatus(ctx, latestRevision, latestPool, pods, podAllocation); err != nil { + if err := r.updatePoolStatus(ctx, latestRevision, latestPool, pods, scheRes.podAllocation); err != nil { return err } @@ -260,6 +315,9 @@ func (r *PoolReconciler) SetupWithManager(mgr ctrl.Manager) error { if oldObj.Spec.Replicas != newObj.Spec.Replicas { return true } + if oldObj.DeletionTimestamp == nil && newObj.DeletionTimestamp != nil { + return true + } return false }, DeleteFunc: func(e event.DeleteEvent) bool { @@ -307,26 +365,37 @@ func (r *PoolReconciler) SetupWithManager(mgr ctrl.Manager) error { Complete(r) } -func (r *PoolReconciler) scheduleSandbox(ctx context.Context, pool *sandboxv1alpha1.Pool, batchSandboxes []*sandboxv1alpha1.BatchSandbox, pods []*corev1.Pod) (map[string]string, []SandboxSyncInfo, []string, int32, bool, error) { +func (r *PoolReconciler) scheduleSandbox(ctx context.Context, pool *sandboxv1alpha1.Pool, batchSandboxes []*sandboxv1alpha1.BatchSandbox, pods []*corev1.Pod, recyclingPods sets.Set[string]) (*scheduleResult, error) { log := logf.FromContext(ctx) spec := &AllocSpec{ - Sandboxes: batchSandboxes, - Pool: pool, - Pods: pods, + Sandboxes: batchSandboxes, + Pool: pool, + Pods: pods, + RecyclingPods: recyclingPods, } status, pendingSyncs, poolDirty, err := r.Allocator.Schedule(ctx, spec) if err != nil { - return nil, nil, nil, 0, false, err + return nil, err } idlePods := make([]string, 0) for _, pod := range pods { - if _, ok := status.PodAllocation[pod.Name]; !ok { - idlePods = append(idlePods, pod.Name) + if _, ok := status.PodAllocation[pod.Name]; ok { + continue } + if recyclingPods.Has(pod.Name) { + continue + } + idlePods = append(idlePods, pod.Name) } log.Info("Schedule result", "pool", pool.Name, "allocated", len(status.PodAllocation), "idlePods", len(idlePods), "supplement", status.PodSupplement, "pendingSyncs", len(pendingSyncs), "poolDirty", 
poolDirty) - return status.PodAllocation, pendingSyncs, idlePods, status.PodSupplement, poolDirty, nil + return &scheduleResult{ + podAllocation: status.PodAllocation, + pendingSyncs: pendingSyncs, + idlePods: idlePods, + supplySandbox: status.PodSupplement, + poolDirty: poolDirty, + }, nil } func (r *PoolReconciler) updatePool(ctx context.Context, latestRevision string, pods []*corev1.Pod, idlePods []string) ([]string, []string, int32) { @@ -364,7 +433,8 @@ type scaleArgs struct { pool *sandboxv1alpha1.Pool pods []*corev1.Pod allocatedCnt int32 - supplyCnt int32 // to create + recycling sets.Set[string] // pods that are restarting and not available + supplyCnt int32 // to create idlePods []string redundantPods []string } @@ -380,9 +450,11 @@ func (r *PoolReconciler) scalePool(ctx context.Context, args *scaleArgs) error { } totalCnt := int32(len(args.pods)) allocatedCnt := args.allocatedCnt + recycling := args.recycling supplyCnt := args.supplyCnt redundantPods := args.redundantPods - bufferCnt := totalCnt - allocatedCnt + // Buffer count excludes allocated and restarting pods + bufferCnt := totalCnt - allocatedCnt - int32(len(recycling)) // Calculate desired buffer cnt. 
desiredBufferCnt := bufferCnt @@ -399,7 +471,7 @@ func (r *PoolReconciler) scalePool(ctx context.Context, args *scaleArgs) error { } log.Info("Scale pool decision", "pool", pool.Name, - "totalCnt", totalCnt, "allocatedCnt", allocatedCnt, "bufferCnt", bufferCnt, + "totalCnt", totalCnt, "allocatedCnt", allocatedCnt, "recycling", recycling, "bufferCnt", bufferCnt, "desiredBufferCnt", desiredBufferCnt, "supplyCnt", supplyCnt, "desiredTotalCnt", desiredTotalCnt, "redundantPods", len(redundantPods), "idlePods", len(args.idlePods)) @@ -418,7 +490,7 @@ func (r *PoolReconciler) scalePool(ctx context.Context, args *scaleArgs) error { if desiredTotalCnt < totalCnt { scaleIn = totalCnt - desiredTotalCnt } - podsToDelete := r.pickPodsToDelete(pods, args.idlePods, args.redundantPods, scaleIn) + podsToDelete := r.pickPodsToDelete(pods, args.idlePods, args.redundantPods, scaleIn, args.recycling) log.Info("Scaling down pool", "pool", pool.Name, "scaleIn", scaleIn, "redundantPods", len(redundantPods), "podsToDelete", len(podsToDelete)) for _, pod := range podsToDelete { log.Info("Deleting pool pod", "pool", pool.Name, "pod", pod.Name) @@ -434,33 +506,41 @@ func (r *PoolReconciler) scalePool(ctx context.Context, args *scaleArgs) error { func (r *PoolReconciler) updatePoolStatus(ctx context.Context, latestRevision string, pool *sandboxv1alpha1.Pool, pods []*corev1.Pod, podAllocation map[string]string) error { oldStatus := pool.Status.DeepCopy() availableCnt := int32(0) + restartingCnt := int32(0) for _, pod := range pods { if _, ok := podAllocation[pod.Name]; ok { continue } - if pod.Status.Phase != corev1.PodRunning { + // Count restarting pods regardless of phase + if isRestarting(pod) { + restartingCnt++ continue } - availableCnt++ + if pod.Status.Phase == corev1.PodRunning && !isRecycling(pod) { + availableCnt++ + } + // Non-running, non-restarting, or non-allocatable pods are implicitly counted in Total - Allocated - Available - Restarting } + pool.Status.ObservedGeneration = 
pool.Generation pool.Status.Total = int32(len(pods)) pool.Status.Allocated = int32(len(podAllocation)) pool.Status.Available = availableCnt + pool.Status.Restarting = restartingCnt pool.Status.Revision = latestRevision if equality.Semantic.DeepEqual(oldStatus, pool.Status) { return nil } log := logf.FromContext(ctx) log.Info("Update pool status", "ObservedGeneration", pool.Status.ObservedGeneration, "Total", pool.Status.Total, - "Allocated", pool.Status.Allocated, "Available", pool.Status.Available, "Revision", pool.Status.Revision) + "Allocated", pool.Status.Allocated, "Available", pool.Status.Available, "Restarting", pool.Status.Restarting, "Revision", pool.Status.Revision) if err := r.Status().Update(ctx, pool); err != nil { return err } return nil } -func (r *PoolReconciler) pickPodsToDelete(pods []*corev1.Pod, idlePodNames []string, redundantPodNames []string, scaleIn int32) []*corev1.Pod { +func (r *PoolReconciler) pickPodsToDelete(pods []*corev1.Pod, idlePodNames []string, redundantPodNames []string, scaleIn int32, recycling sets.Set[string]) []*corev1.Pod { var idlePods []*corev1.Pod podMap := make(map[string]*corev1.Pod) for _, pod := range pods { @@ -483,12 +563,18 @@ func (r *PoolReconciler) pickPodsToDelete(pods []*corev1.Pod, idlePodNames []str if !ok { continue } + if recycling.Has(pod.Name) { + continue + } podsToDelete = append(podsToDelete, pod) } for _, pod := range idlePods { // delete pod from pool scale if scaleIn <= 0 { break } + if recycling.Has(pod.Name) { + continue + } if pod.DeletionTimestamp == nil { podsToDelete = append(podsToDelete, pod) } @@ -520,3 +606,43 @@ func (r *PoolReconciler) createPoolPod(ctx context.Context, pool *sandboxv1alpha r.Recorder.Eventf(pool, corev1.EventTypeNormal, "SuccessfulCreate", "Created pool pod: %v", pod.Name) return nil } + +// handlePodRecycle handles Pod recycle based on PodRecyclePolicy. +// It should be called when a Pod is released from BatchSandbox. 
+func (r *PoolReconciler) handlePodRecycle(ctx context.Context, pool *sandboxv1alpha1.Pool, pods []*corev1.Pod) (bool, error) { + log := logf.FromContext(ctx) + errs := make([]error, 0) + policy := sandboxv1alpha1.PodRecyclePolicyDelete + if pool.Spec.PodRecyclePolicy != "" { + policy = pool.Spec.PodRecyclePolicy + } + timeout := r.RestartTimeout + if pool.Annotations != nil { + if timeoutSec := pool.Annotations[AnnoPodRecycleTimeoutSec]; timeoutSec != "" { + if sec, err := time.ParseDuration(timeoutSec + "s"); err != nil { + log.V(1).Error(err, "Failed to parse pod recycle timeout, use default timeout", "timeoutSec", timeoutSec) + } else { + timeout = sec + } + } + } + needReconcile := false + for _, pod := range pods { + if !isRecycling(pod) { + continue + } + needReconcile = true + var err error + if policy == sandboxv1alpha1.PodRecyclePolicyRestart { + err = r.RestartTracker.HandleRestart(ctx, pod, timeout) + } else { + log.Info("Deleting Pod with Delete policy", "pod", pod.Name) + err = r.Delete(ctx, pod) + } + if err != nil { + log.Error(err, "Failed to recycle pod", "pod", pod.Name) + errs = append(errs, err) + } + } + return needReconcile, gerrors.Join(errs...) +} diff --git a/kubernetes/internal/controller/restart_tracker.go b/kubernetes/internal/controller/restart_tracker.go new file mode 100644 index 000000000..5d2b850bc --- /dev/null +++ b/kubernetes/internal/controller/restart_tracker.go @@ -0,0 +1,281 @@ +// Copyright 2025 Alibaba Group Holding Ltd. +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package controller
+
+import (
+	"context"
+	"io"
+	"strings"
+	"time"
+
+	corev1 "k8s.io/api/core/v1"
+	"k8s.io/client-go/kubernetes"
+	"k8s.io/client-go/kubernetes/scheme"
+	"k8s.io/client-go/rest"
+	"k8s.io/client-go/tools/remotecommand"
+	"sigs.k8s.io/controller-runtime/pkg/client"
+	logf "sigs.k8s.io/controller-runtime/pkg/log"
+
+	"github.com/alibaba/OpenSandbox/sandbox-k8s/internal/utils"
+)
+
+// Restart timeout configurations
+const (
+	defaultRestartTimeout = 90 * time.Second
+	killTimeout           = 30 * time.Second
+)
+
+// restartTracker manages the Pod restart lifecycle as part of the PoolReconciler.
+// It encapsulates all restart-related logic including triggering kills, tracking
+// restart progress, and determining when Pods are ready for reuse.
+//
+// Simplified state machine:
+//
+//	None → Restarting (trigger kill, fire-and-forget)
+//	↓ (each reconcile: check final result)
+//	all restarted & ready → None (clear annotation, reuse)
+//	timeout / CrashLoop → delete Pod
+type restartTracker struct {
+	client     client.Client
+	kubeClient kubernetes.Interface
+	restConfig *rest.Config
+}
+
+type RestartTracker interface {
+	HandleRestart(ctx context.Context, pod *corev1.Pod, timeout time.Duration) error
+}
+
+// NewRestartTracker creates a new restartTracker; the restart timeout is passed
+// per call to HandleRestart rather than stored on the tracker.
+func NewRestartTracker(c client.Client, kubeClient kubernetes.Interface, restConfig *rest.Config) RestartTracker {
+	r := &restartTracker{
+		client:     c,
+		kubeClient: kubeClient,
+		restConfig: restConfig,
+	}
+	return r
+}
+
+// HandleRestart handles the Restart recycle policy for a Pod.
+// If the Pod has already been triggered for restart, it checks the restart status.
+// Otherwise, it initializes the restart and kicks off a fire-and-forget kill goroutine.
+func (t *restartTracker) HandleRestart(ctx context.Context, pod *corev1.Pod, timeout time.Duration) error { + log := logf.FromContext(ctx) + // Parse existing meta + meta, err := parsePodRecycleMeta(pod) + if err != nil { + log.Error(err, "Failed to parse recycle meta, will reset and retry", "pod", pod.Name) + meta = &PodRecycleMeta{} + } + // If already triggered, check restart progress + if meta.TriggeredAt > 0 && meta.State == RecycleStateRestarting { + return t.checkRestartStatus(ctx, pod, timeout) + } + + meta.TriggeredAt = time.Now().UnixMilli() + meta.State = RecycleStateRestarting + // Record current restart counts to detect restart even when StartedAt has second-level precision + meta.ContainerRestartCounts = make(map[string]int32) + meta.ContainerStartedAt = make(map[string]int64) + for _, container := range pod.Status.ContainerStatuses { + meta.ContainerRestartCounts[container.Name] = container.RestartCount + if container.State.Running != nil { + meta.ContainerStartedAt[container.Name] = container.State.Running.StartedAt.UnixMilli() + } + } + if err = t.updatePodRecycleMeta(ctx, pod, meta); err != nil { + log.Error(err, "Failed to update recycle meta", "pod", pod.Name) + return err + } + // Fire-and-forget: kill containers in background. + // This is done after updating the annotation to ensure the restart is tracked. + t.killPodContainers(ctx, pod) + log.Info("Triggered restart for Pod", "pod", pod.Name, "triggeredAt", meta.TriggeredAt) + return nil +} + +// updatePodRecycleMeta updates the recycle metadata to Pod annotations and sets the recycle-confirmed label. +// It reads the deallocated-from label value and sets it as recycle-confirmed label. 
+func (t *restartTracker) updatePodRecycleMeta(ctx context.Context, pod *corev1.Pod, meta *PodRecycleMeta) error { + old := pod.DeepCopy() + setPodRecycleMeta(pod, meta) + + // Set recycle-confirmed label from deallocated-from label value + deallocatedFrom := pod.Labels[LabelPodDeallocatedFrom] + if deallocatedFrom != "" { + pod.Labels[LabelPodRecycleConfirmed] = deallocatedFrom + } + + patch := client.MergeFrom(old) + return t.client.Patch(ctx, pod, patch) +} + +// killPodContainers kills all containers in the Pod (excluding initContainers) +func (t *restartTracker) killPodContainers(ctx context.Context, pod *corev1.Pod) { + log := logf.FromContext(ctx) + for _, container := range pod.Spec.Containers { + go func(cName string) { + killCtx, cancel := context.WithTimeout(context.Background(), killTimeout) + defer cancel() + + if err := t.execGracefulKill(killCtx, pod, cName); err != nil { + log.Info("Graceful kill exec finished with error (may be expected)", + "pod", pod.Name, "container", cName, "err", err) + } else { + log.V(1).Info("Successfully triggered graceful kill", "pod", pod.Name, "container", cName) + } + }(container.Name) + } +} + +// execGracefulKill attempts to trigger a SIGTERM (15) signal to the container's PID 1. +func (t *restartTracker) execGracefulKill(ctx context.Context, pod *corev1.Pod, containerName string) error { + // Common shell entry points in various container images. 
+ shellEntries := []string{"/bin/sh", "/usr/bin/sh", "sh"} + + var lastErr error + for _, entry := range shellEntries { + cmd := []string{ + entry, "-c", + "if [ -x /bin/kill ]; then /bin/kill -15 1; " + + "elif [ -x /usr/bin/kill ]; then /usr/bin/kill -15 1; " + + "else kill -15 1; fi", + } + err := t.executeExec(ctx, pod, containerName, cmd) + if err == nil { + return nil + } + lastErr = err + if !strings.Contains(err.Error(), "executable file not found") && + !strings.Contains(err.Error(), "no such file or directory") { + break + } + } + return lastErr +} + +// executeExec performs a low-level Pod exec operation. +func (t *restartTracker) executeExec(ctx context.Context, pod *corev1.Pod, containerName string, cmd []string) error { + req := t.kubeClient.CoreV1().RESTClient(). + Post(). + Namespace(pod.Namespace). + Resource("pods"). + Name(pod.Name). + SubResource("exec"). + VersionedParams(&corev1.PodExecOptions{ + Container: containerName, + Command: cmd, + Stdin: false, + Stdout: true, + Stderr: true, + }, scheme.ParameterCodec) + + executor, err := remotecommand.NewSPDYExecutor(t.restConfig, "POST", req.URL()) + if err != nil { + return err + } + return executor.StreamWithContext(ctx, remotecommand.StreamOptions{ + Stdout: io.Discard, + Stderr: io.Discard, + }) +} + +// checkRestartStatus checks if the Pod has completed restart and is ready to be reused. 
+func (t *restartTracker) checkRestartStatus(ctx context.Context, pod *corev1.Pod, timeout time.Duration) error {
+	log := logf.FromContext(ctx)
+
+	meta, err := parsePodRecycleMeta(pod)
+	if err != nil {
+		log.Error(err, "Failed to parse recycle meta", "pod", pod.Name)
+		return err
+	}
+
+	elapsed := time.Duration(time.Now().UnixMilli()-meta.TriggeredAt) * time.Millisecond
+
+	allRestarted := true
+	triggerAt := time.UnixMilli(meta.TriggeredAt)
+	for _, container := range pod.Status.ContainerStatuses {
+		restarted := false
+		running := container.State.Running
+		// A container counts as restarted if any of the following holds:
+		// 1. StartedAt is after the trigger time
+		// 2. RestartCount has increased (handles same-second restarts)
+		// 3. StartedAt moved past the recorded original StartedAt
+		if running != nil && running.StartedAt.Time.After(triggerAt) {
+			restarted = true
+			log.Info("Container restart detected: start time is after trigger",
+				"pod", pod.Name, "container", container.Name,
+				"trigger", triggerAt, "current", running.StartedAt.Time)
+		} else if originalCount, ok := meta.ContainerRestartCounts[container.Name]; ok && container.RestartCount > originalCount {
+			restarted = true
+			log.Info("Container restart detected: restart count increased",
+				"pod", pod.Name, "container", container.Name,
+				"originalCount", originalCount, "currentCount", container.RestartCount)
+		} else if running != nil {
+			if originalStartedAt, ok := meta.ContainerStartedAt[container.Name]; ok {
+				if running.StartedAt.UnixMilli() > originalStartedAt {
+					restarted = true
+					log.Info("Container restart detected: startedAt increased",
+						"pod", pod.Name, "container", container.Name,
+						"originalStartedAt", time.UnixMilli(originalStartedAt), "currentStartedAt", running.StartedAt.Time)
+				}
+			}
+		}
+		if !restarted || !container.Ready {
+			allRestarted = false
+		}
+	}
+
+	podReady := utils.IsPodReady(pod)
+	if allRestarted && podReady {
+		if err = t.clearPodRecycleMeta(ctx,
pod); err != nil { + return err + } + log.Info("Pod restart completed, ready for reuse", "pod", pod.Name, "elapsed", elapsed) + // Trigger requeue to ensure subsequent checks see the updated pod state. + // This prevents race conditions where another reconcile reads stale cached data. + } + restartTimeout := timeout + if restartTimeout == 0 { + restartTimeout = defaultRestartTimeout + } + if elapsed > restartTimeout { + log.Info("Pod restart timeout, deleting", "pod", pod.Name, + "elapsed", elapsed, "timeout", restartTimeout, + "allRestarted", allRestarted) + return t.client.Delete(ctx, pod) + } + log.Info("Pod still restarting", "pod", pod.Name, "elapsed", elapsed, + "allRestarted", allRestarted, "podReady", podReady, "timeout", restartTimeout, "elapsed", elapsed) + return nil +} + +// clearPodRecycleMeta clears the recycle metadata annotation from Pod and the deallocated-from label. +// It keeps the recycle-confirmed label as a receipt that recycling was processed. +// After successful patch, it re-fetches the pod to ensure the local object reflects the latest state. +func (t *restartTracker) clearPodRecycleMeta(ctx context.Context, pod *corev1.Pod) error { + old := pod.DeepCopy() + if pod.Annotations != nil { + delete(pod.Annotations, AnnoPodRecycleMeta) + } + if pod.Labels != nil { + delete(pod.Labels, LabelPodDeallocatedFrom) + } + + patch := client.MergeFrom(old) + if err := t.client.Patch(ctx, pod, patch); err != nil { + return err + } + return nil +} diff --git a/kubernetes/internal/controller/restart_tracker_test.go b/kubernetes/internal/controller/restart_tracker_test.go new file mode 100644 index 000000000..cb3bd4698 --- /dev/null +++ b/kubernetes/internal/controller/restart_tracker_test.go @@ -0,0 +1,15 @@ +// Copyright 2025 Alibaba Group Holding Ltd. +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +package controller diff --git a/kubernetes/test/e2e/e2e_test.go b/kubernetes/test/e2e/e2e_test.go index 0e3c57c8c..3fc027790 100644 --- a/kubernetes/test/e2e/e2e_test.go +++ b/kubernetes/test/e2e/e2e_test.go @@ -16,1284 +16,19 @@ package e2e import ( "bytes" - "encoding/json" "fmt" "os" - "os/exec" "path/filepath" - "strings" "text/template" - "time" - - . "github.com/onsi/ginkgo/v2" - . "github.com/onsi/gomega" "github.com/alibaba/OpenSandbox/sandbox-k8s/test/utils" + . "github.com/onsi/ginkgo/v2" ) // namespace where the project is deployed in const namespace = "opensandbox-system" -var _ = Describe("Manager", Ordered, func() { - var controllerPodName string - - // Before running the tests, set up the environment by creating the namespace, - // enforce the restricted security policy to the namespace, installing CRDs, - // and deploying the controller. 
- BeforeAll(func() { - By("creating manager namespace") - cmd := exec.Command("kubectl", "create", "ns", namespace) - _, err := utils.Run(cmd) - Expect(err).NotTo(HaveOccurred(), "Failed to create namespace") - - By("labeling the namespace to enforce the restricted security policy") - cmd = exec.Command("kubectl", "label", "--overwrite", "ns", namespace, - "pod-security.kubernetes.io/enforce=restricted") - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred(), "Failed to label namespace with restricted policy") - - By("installing CRDs") - cmd = exec.Command("make", "install") - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred(), "Failed to install CRDs") - - By("deploying the controller-manager") - cmd = exec.Command("make", "deploy", fmt.Sprintf("CONTROLLER_IMG=%s", utils.ControllerImage)) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred(), "Failed to deploy the controller-manager") - }) - - // After all tests have been executed, clean up by undeploying the controller, uninstalling CRDs, - // and deleting the namespace. - AfterAll(func() { - By("cleaning up the curl pod for metrics") - cmd := exec.Command("kubectl", "delete", "pod", "curl-metrics", "-n", namespace) - _, _ = utils.Run(cmd) - - By("undeploying the controller-manager") - cmd = exec.Command("make", "undeploy") - _, _ = utils.Run(cmd) - - By("uninstalling CRDs") - cmd = exec.Command("make", "uninstall") - _, _ = utils.Run(cmd) - - By("removing manager namespace") - cmd = exec.Command("kubectl", "delete", "ns", namespace) - _, _ = utils.Run(cmd) - }) - - // After each test, check for failures and collect logs, events, - // and pod descriptions for debugging. 
- AfterEach(func() { - specReport := CurrentSpecReport() - if specReport.Failed() { - By("Fetching controller manager pod logs") - cmd := exec.Command("kubectl", "logs", controllerPodName, "-n", namespace) - controllerLogs, err := utils.Run(cmd) - if err == nil { - _, _ = fmt.Fprintf(GinkgoWriter, "Controller logs:\n %s", controllerLogs) - } else { - _, _ = fmt.Fprintf(GinkgoWriter, "Failed to get Controller logs: %s", err) - } - - By("Fetching Kubernetes events") - cmd = exec.Command("kubectl", "get", "events", "-n", namespace, "--sort-by=.lastTimestamp") - eventsOutput, err := utils.Run(cmd) - if err == nil { - _, _ = fmt.Fprintf(GinkgoWriter, "Kubernetes events:\n%s", eventsOutput) - } else { - _, _ = fmt.Fprintf(GinkgoWriter, "Failed to get Kubernetes events: %s", err) - } - - By("Fetching curl-metrics logs") - cmd = exec.Command("kubectl", "logs", "curl-metrics", "-n", namespace) - metricsOutput, err := utils.Run(cmd) - if err == nil { - _, _ = fmt.Fprintf(GinkgoWriter, "Metrics logs:\n %s", metricsOutput) - } else { - _, _ = fmt.Fprintf(GinkgoWriter, "Failed to get curl-metrics logs: %s", err) - } - - By("Fetching controller manager pod description") - cmd = exec.Command("kubectl", "describe", "pod", controllerPodName, "-n", namespace) - podDescription, err := utils.Run(cmd) - if err == nil { - fmt.Println("Pod description:\n", podDescription) - } else { - fmt.Println("Failed to describe controller pod") - } - } - }) - - SetDefaultEventuallyTimeout(2 * time.Minute) - SetDefaultEventuallyPollingInterval(time.Second) - - Context("Manager", func() { - It("should run successfully", func() { - By("validating that the controller-manager pod is running as expected") - verifyControllerUp := func(g Gomega) { - // Get the name of the controller-manager pod - goTemplate := `{{ range .items }}` + - `{{ if not .metadata.deletionTimestamp }}` + - `{{ .metadata.name }}` + - `{{ "\n" }}{{ end }}{{ end }}` - cmd := exec.Command("kubectl", "get", - "pods", "-l", 
"control-plane=controller-manager", - "-o", "go-template="+goTemplate, - "-n", namespace, - ) - - podOutput, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred(), "Failed to retrieve controller-manager pod information") - podNames := utils.GetNonEmptyLines(podOutput) - g.Expect(podNames).To(HaveLen(1), "expected 1 controller pod running") - controllerPodName = podNames[0] - g.Expect(controllerPodName).To(ContainSubstring("controller-manager")) - - // Validate the pod's status - cmd = exec.Command("kubectl", "get", - "pods", controllerPodName, "-o", "jsonpath={.status.phase}", - "-n", namespace, - ) - output, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(output).To(Equal("Running"), "Incorrect controller-manager pod status") - } - Eventually(verifyControllerUp).Should(Succeed()) - }) - }) - - Context("Pool", func() { - BeforeAll(func() { - By("waiting for controller to be ready") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pods", "-l", "control-plane=controller-manager", - "-n", namespace, "-o", "jsonpath={.items[0].status.phase}") - output, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(output).To(Equal("Running")) - }, 2*time.Minute).Should(Succeed()) - }) - - It("should correctly create pods and maintain pool status", func() { - const poolName = "test-pool-basic" - const testNamespace = "default" - const poolMin = 2 - const poolMax = 5 - const bufferMin = 1 - const bufferMax = 3 - - By("creating a basic Pool") - poolYAML, err := renderTemplate("testdata/pool-basic.yaml", map[string]interface{}{ - "PoolName": poolName, - "SandboxImage": utils.SandboxImage, - "Namespace": testNamespace, - "BufferMax": bufferMax, - "BufferMin": bufferMin, - "PoolMax": poolMax, - "PoolMin": poolMin, - }) - Expect(err).NotTo(HaveOccurred()) - - poolFile := filepath.Join("/tmp", "test-pool-basic.yaml") - err = os.WriteFile(poolFile, []byte(poolYAML), 0644) - Expect(err).NotTo(HaveOccurred()) - defer 
os.Remove(poolFile) - - cmd := exec.Command("kubectl", "apply", "-f", poolFile) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred(), "Failed to create Pool") - - By("verifying Pool creates pods and maintains correct status") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, - "-o", "jsonpath={.status}") - statusOutput, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - - g.Expect(statusOutput).To(ContainSubstring(`"total":`), "Pool status should have total field") - g.Expect(statusOutput).To(ContainSubstring(`"allocated":`), "Pool status should have allocated field") - g.Expect(statusOutput).To(ContainSubstring(`"available":`), "Pool status should have available field") - - cmd = exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, - "-o", "jsonpath={.status.total}") - totalStr, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - total := 0 - if totalStr != "" { - fmt.Sscanf(totalStr, "%d", &total) - } - g.Expect(total).To(BeNumerically(">=", poolMin), "Pool total should be >= poolMin") - g.Expect(total).To(BeNumerically("<=", poolMax), "Pool total should be <= poolMax") - }, 2*time.Minute).Should(Succeed()) - - By("verifying pods are created") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pods", "-n", testNamespace, - "-l", fmt.Sprintf("sandbox.opensandbox.io/pool-name=%s", poolName), - "-o", "jsonpath={.items[*].metadata.name}") - output, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(output).NotTo(BeEmpty(), "Pool should create pods") - }, 2*time.Minute).Should(Succeed()) - - By("cleaning up the Pool") - cmd = exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - }) - - It("should correctly manage capacity when poolMin and poolMax change", func() { - const poolName = "test-pool-capacity" - const testNamespace = "default" - - 
By("creating a Pool with initial capacity") - poolYAML, err := renderTemplate("testdata/pool-basic.yaml", map[string]interface{}{ - "PoolName": poolName, - "SandboxImage": utils.SandboxImage, - "Namespace": testNamespace, - "BufferMax": 3, - "BufferMin": 1, - "PoolMax": 5, - "PoolMin": 2, - }) - Expect(err).NotTo(HaveOccurred()) - - poolFile := filepath.Join("/tmp", "test-pool-capacity.yaml") - err = os.WriteFile(poolFile, []byte(poolYAML), 0644) - Expect(err).NotTo(HaveOccurred()) - defer os.Remove(poolFile) - - cmd := exec.Command("kubectl", "apply", "-f", poolFile) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("waiting for initial Pool to be ready") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, - "-o", "jsonpath={.status.total}") - totalStr, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - total := 0 - if totalStr != "" { - fmt.Sscanf(totalStr, "%d", &total) - } - g.Expect(total).To(BeNumerically(">=", 2)) - }, 2*time.Minute).Should(Succeed()) - - By("increasing poolMin to trigger scale up") - poolYAML, err = renderTemplate("testdata/pool-basic.yaml", map[string]interface{}{ - "PoolName": poolName, - "SandboxImage": utils.SandboxImage, - "Namespace": testNamespace, - "BufferMax": 3, - "BufferMin": 1, - "PoolMax": 10, - "PoolMin": 5, - }) - Expect(err).NotTo(HaveOccurred()) - - err = os.WriteFile(poolFile, []byte(poolYAML), 0644) - Expect(err).NotTo(HaveOccurred()) - - cmd = exec.Command("kubectl", "apply", "-f", poolFile) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("verifying Pool scales up to meet new poolMin") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, - "-o", "jsonpath={.status.total}") - totalStr, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - total := 0 - if totalStr != "" { - fmt.Sscanf(totalStr, "%d", &total) - } - g.Expect(total).To(BeNumerically(">=", 
5), "Pool should scale up to meet poolMin=5") - g.Expect(total).To(BeNumerically("<=", 10), "Pool should not exceed poolMax=10") - }, 2*time.Minute).Should(Succeed()) - - By("decreasing poolMax to below current total") - poolYAML, err = renderTemplate("testdata/pool-basic.yaml", map[string]interface{}{ - "PoolName": poolName, - "SandboxImage": utils.SandboxImage, - "Namespace": testNamespace, - "BufferMax": 2, - "BufferMin": 1, - "PoolMax": 3, - "PoolMin": 2, - }) - Expect(err).NotTo(HaveOccurred()) - - err = os.WriteFile(poolFile, []byte(poolYAML), 0644) - Expect(err).NotTo(HaveOccurred()) - - cmd = exec.Command("kubectl", "apply", "-f", poolFile) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("verifying Pool respects new poolMax constraint") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, - "-o", "jsonpath={.status.total}") - totalStr, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - total := 0 - if totalStr != "" { - fmt.Sscanf(totalStr, "%d", &total) - } - g.Expect(total).To(BeNumerically("<=", 3), "Pool should scale down to meet poolMax=3") - }, 2*time.Minute).Should(Succeed()) - - By("cleaning up the Pool") - cmd = exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - }) - - It("should upgrade pool template correctly", func() { - const poolName = "test-pool-upgrade" - const testNamespace = "default" - const batchSandboxName = "test-bs-for-upgrade" - - By("creating a Pool with initial template") - poolYAML, err := renderTemplate("testdata/pool-basic.yaml", map[string]interface{}{ - "PoolName": poolName, - "SandboxImage": utils.SandboxImage, - "Namespace": testNamespace, - "BufferMax": 3, - "BufferMin": 2, - "PoolMax": 5, - "PoolMin": 2, - }) - Expect(err).NotTo(HaveOccurred()) - - poolFile := filepath.Join("/tmp", "test-pool-upgrade.yaml") - err = os.WriteFile(poolFile, 
[]byte(poolYAML), 0644) - Expect(err).NotTo(HaveOccurred()) - defer os.Remove(poolFile) - - cmd := exec.Command("kubectl", "apply", "-f", poolFile) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("waiting for Pool to be ready") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, - "-o", "jsonpath={.status.total}") - totalStr, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(totalStr).NotTo(BeEmpty()) - }, 2*time.Minute).Should(Succeed()) - - By("allocating a pod from the pool via BatchSandbox") - batchSandboxYAML, err := renderTemplate("testdata/batchsandbox-pooled-no-expire.yaml", map[string]interface{}{ - "BatchSandboxName": batchSandboxName, - "Namespace": testNamespace, - "Replicas": 1, - "PoolName": poolName, - }) - Expect(err).NotTo(HaveOccurred()) - - bsFile := filepath.Join("/tmp", "test-bs-upgrade.yaml") - err = os.WriteFile(bsFile, []byte(batchSandboxYAML), 0644) - Expect(err).NotTo(HaveOccurred()) - defer os.Remove(bsFile) - - cmd = exec.Command("kubectl", "apply", "-f", bsFile) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("waiting for BatchSandbox to allocate pod") - var allocatedPodNames []string - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace, - "-o", "jsonpath={.status.allocated}") - output, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(output).To(Equal("1")) - - cmd = exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace, - "-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}") - allocStatusJSON, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(allocStatusJSON).NotTo(BeEmpty(), "alloc-status annotation should exist") - - var allocStatus struct { - Pods []string `json:"pods"` - } - err = json.Unmarshal([]byte(allocStatusJSON), &allocStatus) - 
g.Expect(err).NotTo(HaveOccurred()) - - allocatedPodNames = allocStatus.Pods - g.Expect(len(allocatedPodNames)).To(Equal(1), "Should have 1 allocated pod") - }, 2*time.Minute).Should(Succeed()) - - By("getting all pool pods") - cmd = exec.Command("kubectl", "get", "pods", "-n", testNamespace, - "-l", fmt.Sprintf("sandbox.opensandbox.io/pool-name=%s", poolName), - "-o", "jsonpath={.items[*].metadata.name}") - allPoolPodsStr, err := utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - allPoolPods := strings.Fields(allPoolPodsStr) - - By("calculating available pods (all pool pods - allocated pods)") - availablePodsBeforeUpgrade := []string{} - allocatedPodMap := make(map[string]bool) - for _, podName := range allocatedPodNames { - allocatedPodMap[podName] = true - } - for _, podName := range allPoolPods { - if !allocatedPodMap[podName] { - availablePodsBeforeUpgrade = append(availablePodsBeforeUpgrade, podName) - } - } - - By("updating Pool template with new environment variable") - updatedPoolYAML, err := renderTemplate("testdata/pool-with-env.yaml", map[string]interface{}{ - "PoolName": poolName, - "Namespace": testNamespace, - "SandboxImage": utils.SandboxImage, - "BufferMax": 3, - "BufferMin": 2, - "PoolMax": 5, - "PoolMin": 2, - }) - Expect(err).NotTo(HaveOccurred()) - - err = os.WriteFile(poolFile, []byte(updatedPoolYAML), 0644) - Expect(err).NotTo(HaveOccurred()) - - cmd = exec.Command("kubectl", "apply", "-f", poolFile) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("verifying allocated pod is NOT upgraded") - Consistently(func(g Gomega) { - for _, allocatedPod := range allocatedPodNames { - cmd := exec.Command("kubectl", "get", "pod", allocatedPod, "-n", testNamespace, - "-o", "jsonpath={.metadata.name}") - output, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(output).To(Equal(allocatedPod), "Allocated pod should not be recreated") - } - }, 30*time.Second, 3*time.Second).Should(Succeed()) - - By("verifying 
available pods are recreated with new template") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pods", "-n", testNamespace, - "-l", fmt.Sprintf("sandbox.opensandbox.io/pool-name=%s", poolName), - "-o", "jsonpath={.items[*].metadata.name}") - allPodsAfterStr, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - allPodsAfter := strings.Fields(allPodsAfterStr) - - // Get currently allocated pods - cmd = exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace, - "-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}") - allocStatusJSON, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - - var allocStatus struct { - Pods []string `json:"pods"` - } - err = json.Unmarshal([]byte(allocStatusJSON), &allocStatus) - g.Expect(err).NotTo(HaveOccurred()) - - currentAllocatedPods := make(map[string]bool) - for _, podName := range allocStatus.Pods { - currentAllocatedPods[podName] = true - } - - // Calculate available pods after upgrade - availablePodsAfterUpgrade := []string{} - for _, podName := range allPodsAfter { - if !currentAllocatedPods[podName] { - availablePodsAfterUpgrade = append(availablePodsAfterUpgrade, podName) - } - } - - // Check if at least one available pod was recreated - recreated := false - for _, oldPod := range availablePodsBeforeUpgrade { - found := false - for _, newPod := range availablePodsAfterUpgrade { - if oldPod == newPod { - found = true - break - } - } - if !found { - recreated = true - break - } - } - g.Expect(recreated).To(BeTrue(), "At least one available pod should be recreated") - }, 3*time.Minute).Should(Succeed()) - - By("verifying new pods have the upgraded environment variable") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pods", "-n", testNamespace, - "-l", fmt.Sprintf("sandbox.opensandbox.io/pool-name=%s", poolName), - "-o", "json") - output, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - - var 
podList struct { - Items []struct { - Metadata struct { - Name string `json:"name"` - } `json:"metadata"` - Spec struct { - Containers []struct { - Name string `json:"name"` - Env []struct { - Name string `json:"name"` - Value string `json:"value"` - } `json:"env"` - } `json:"containers"` - } `json:"spec"` - } `json:"items"` - } - err = json.Unmarshal([]byte(output), &podList) - g.Expect(err).NotTo(HaveOccurred()) - - // Get currently allocated pods - cmd = exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace, - "-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}") - allocStatusJSON, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - - var allocStatus struct { - Pods []string `json:"pods"` - } - err = json.Unmarshal([]byte(allocStatusJSON), &allocStatus) - g.Expect(err).NotTo(HaveOccurred()) - - allocatedPodMap := make(map[string]bool) - for _, podName := range allocStatus.Pods { - allocatedPodMap[podName] = true - } - - // Find at least one available pod with UPGRADED=true - foundUpgraded := false - for _, pod := range podList.Items { - if !allocatedPodMap[pod.Metadata.Name] { - // This is an available pod - for _, container := range pod.Spec.Containers { - if container.Name == "sandbox-container" { - for _, env := range container.Env { - if env.Name == "UPGRADED" && env.Value == "true" { - foundUpgraded = true - break - } - } - } - } - } - } - g.Expect(foundUpgraded).To(BeTrue(), "At least one available pod should have UPGRADED=true env var") - }, 2*time.Minute).Should(Succeed()) - - By("cleaning up BatchSandbox and Pool") - cmd = exec.Command("kubectl", "delete", "batchsandbox", batchSandboxName, "-n", testNamespace) - _, _ = utils.Run(cmd) - - cmd = exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - }) - }) - - Context("BatchSandbox", func() { - BeforeAll(func() { - By("waiting for controller to be ready") 
- Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pods", "-l", "control-plane=controller-manager", - "-n", namespace, "-o", "jsonpath={.items[0].status.phase}") - output, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(output).To(Equal("Running")) - }, 2*time.Minute).Should(Succeed()) - }) - - It("should work correctly in non-pooled mode", func() { - const batchSandboxName = "test-bs-non-pooled" - const testNamespace = "default" - const replicas = 2 - - By("creating a non-pooled BatchSandbox") - bsYAML, err := renderTemplate("testdata/batchsandbox-non-pooled.yaml", map[string]interface{}{ - "BatchSandboxName": batchSandboxName, - "SandboxImage": utils.SandboxImage, - "Namespace": testNamespace, - "Replicas": replicas, - }) - Expect(err).NotTo(HaveOccurred()) - - bsFile := filepath.Join("/tmp", "test-bs-non-pooled.yaml") - err = os.WriteFile(bsFile, []byte(bsYAML), 0644) - Expect(err).NotTo(HaveOccurred()) - defer os.Remove(bsFile) - - cmd := exec.Command("kubectl", "apply", "-f", bsFile) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("verifying pods are created directly from template") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pods", "-n", testNamespace, - "-o", "json") - output, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - - var podList struct { - Items []struct { - Metadata struct { - Name string `json:"name"` - OwnerReferences []struct { - Kind string `json:"kind"` - Name string `json:"name"` - UID string `json:"uid"` - } `json:"ownerReferences"` - } `json:"metadata"` - } `json:"items"` - } - err = json.Unmarshal([]byte(output), &podList) - g.Expect(err).NotTo(HaveOccurred()) - - // Find pods owned by this BatchSandbox - ownedPods := []string{} - for _, pod := range podList.Items { - for _, owner := range pod.Metadata.OwnerReferences { - if owner.Kind == "BatchSandbox" && owner.Name == batchSandboxName { - ownedPods = append(ownedPods, 
pod.Metadata.Name) - break - } - } - } - g.Expect(len(ownedPods)).To(Equal(replicas), "Should create %d pods", replicas) - }, 2*time.Minute).Should(Succeed()) - - By("verifying BatchSandbox status is correctly updated") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace, - "-o", "jsonpath={.status}") - statusOutput, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(statusOutput).To(ContainSubstring(fmt.Sprintf(`"replicas":%d`, replicas))) - g.Expect(statusOutput).To(ContainSubstring(fmt.Sprintf(`"allocated":%d`, replicas))) - g.Expect(statusOutput).To(ContainSubstring(fmt.Sprintf(`"ready":%d`, replicas))) - }, 2*time.Minute).Should(Succeed()) - - By("verifying endpoint annotation is set") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace, - "-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/endpoints}") - output, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(output).NotTo(BeEmpty()) - endpoints := strings.Split(output, ",") - g.Expect(len(endpoints)).To(Equal(replicas)) - }, 30*time.Second).Should(Succeed()) - - By("cleaning up BatchSandbox") - cmd = exec.Command("kubectl", "delete", "batchsandbox", batchSandboxName, "-n", testNamespace) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("verifying pods are deleted") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pods", "-n", testNamespace, "-o", "json") - output, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - - var podList struct { - Items []struct { - Metadata struct { - Name string `json:"name"` - DeletionTimestamp *string `json:"deletionTimestamp"` - OwnerReferences []struct { - Kind string `json:"kind"` - Name string `json:"name"` - } `json:"ownerReferences"` - } `json:"metadata"` - } `json:"items"` - } - err = json.Unmarshal([]byte(output), 
&podList) - g.Expect(err).NotTo(HaveOccurred()) - - // Check no pods are owned by this BatchSandbox or they have deletionTimestamp - for _, pod := range podList.Items { - for _, owner := range pod.Metadata.OwnerReferences { - if owner.Kind == "BatchSandbox" && owner.Name == batchSandboxName { - g.Expect(pod.Metadata.DeletionTimestamp).NotTo(BeNil(), - "Pod %s owned by BatchSandbox should have deletionTimestamp set", pod.Metadata.Name) - } - } - } - }, 2*time.Minute).Should(Succeed()) - }) - - It("should work correctly in pooled mode", func() { - const poolName = "test-pool-for-bs" - const batchSandboxName = "test-bs-pooled" - const testNamespace = "default" - const replicas = 2 - - By("creating a Pool") - poolYAML, err := renderTemplate("testdata/pool-basic.yaml", map[string]interface{}{ - "PoolName": poolName, - "SandboxImage": utils.SandboxImage, - "Namespace": testNamespace, - "BufferMax": 3, - "BufferMin": 2, - "PoolMax": 5, - "PoolMin": 2, - }) - Expect(err).NotTo(HaveOccurred()) - - poolFile := filepath.Join("/tmp", "test-pool-for-bs.yaml") - err = os.WriteFile(poolFile, []byte(poolYAML), 0644) - Expect(err).NotTo(HaveOccurred()) - defer os.Remove(poolFile) - - cmd := exec.Command("kubectl", "apply", "-f", poolFile) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("waiting for Pool to be ready") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, - "-o", "jsonpath={.status.total}") - totalStr, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(totalStr).NotTo(BeEmpty()) - }, 2*time.Minute).Should(Succeed()) - - By("creating a pooled BatchSandbox") - bsYAML, err := renderTemplate("testdata/batchsandbox-pooled-no-expire.yaml", map[string]interface{}{ - "BatchSandboxName": batchSandboxName, - "SandboxImage": utils.SandboxImage, - "Namespace": testNamespace, - "Replicas": replicas, - "PoolName": poolName, - }) - Expect(err).NotTo(HaveOccurred()) - - bsFile := 
filepath.Join("/tmp", "test-bs-pooled.yaml") - err = os.WriteFile(bsFile, []byte(bsYAML), 0644) - Expect(err).NotTo(HaveOccurred()) - defer os.Remove(bsFile) - - cmd = exec.Command("kubectl", "apply", "-f", bsFile) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("verifying BatchSandbox allocates pods from pool") - Eventually(func(g Gomega) { - // Verify alloc-status annotation contains pool pod names - cmd = exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace, - "-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}") - allocStatusJSON, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(allocStatusJSON).NotTo(BeEmpty(), "alloc-status annotation should exist") - - var allocStatus struct { - Pods []string `json:"pods"` - } - err = json.Unmarshal([]byte(allocStatusJSON), &allocStatus) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(len(allocStatus.Pods)).To(Equal(replicas), "Should have %d pods in alloc-status", replicas) - - // Verify the pods in alloc-status are from the pool - for _, podName := range allocStatus.Pods { - cmd = exec.Command("kubectl", "get", "pod", podName, "-n", testNamespace, - "-o", "jsonpath={.metadata.labels.sandbox\\.opensandbox\\.io/pool-name}") - poolLabel, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(poolLabel).To(Equal(poolName), "Pod %s should be from pool %s", podName, poolName) - } - }, 2*time.Minute).Should(Succeed()) - - By("verifying BatchSandbox status is correctly updated") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace, - "-o", "jsonpath={.status}") - statusOutput, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(statusOutput).To(ContainSubstring(fmt.Sprintf(`"replicas":%d`, replicas))) - g.Expect(statusOutput).To(ContainSubstring(fmt.Sprintf(`"ready":%d`, replicas))) - }, 30*time.Second).Should(Succeed()) - - 
By("verifying endpoint annotation is set") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace, - "-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/endpoints}") - output, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - g.Expect(output).NotTo(BeEmpty()) - endpoints := strings.Split(output, ",") - g.Expect(len(endpoints)).To(Equal(replicas)) - }, 30*time.Second).Should(Succeed()) - - By("recording Pool allocated count") - cmd = exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, - "-o", "jsonpath={.status.allocated}") - allocatedBefore, err := utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("cleaning up BatchSandbox") - cmd = exec.Command("kubectl", "delete", "batchsandbox", batchSandboxName, "-n", testNamespace) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - - By("verifying pods are returned to pool") - Eventually(func(g Gomega) { - cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, - "-o", "jsonpath={.status.allocated}") - allocatedAfter, err := utils.Run(cmd) - g.Expect(err).NotTo(HaveOccurred()) - - before := 0 - if allocatedBefore != "" { - fmt.Sscanf(allocatedBefore, "%d", &before) - } - after := 0 - if allocatedAfter != "" { - fmt.Sscanf(allocatedAfter, "%d", &after) - } - g.Expect(after).To(BeNumerically("<", before), "Allocated count should decrease") - }, 30*time.Second).Should(Succeed()) - - By("cleaning up Pool") - cmd = exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace) - _, err = utils.Run(cmd) - Expect(err).NotTo(HaveOccurred()) - }) - - It("should expire and delete non-pooled BatchSandbox correctly", func() { - const batchSandboxName = "test-bs-expire-non-pooled" - const testNamespace = "default" - const replicas = 1 - - By("creating a non-pooled BatchSandbox with expireTime") - expireTime := time.Now().Add(45 * time.Second).UTC().Format(time.RFC3339) - - 
-			bsYAML, err := renderTemplate("testdata/batchsandbox-non-pooled-expire.yaml", map[string]interface{}{
-				"BatchSandboxName": batchSandboxName,
-				"Namespace":        testNamespace,
-				"Replicas":         replicas,
-				"ExpireTime":       expireTime,
-				"SandboxImage":     utils.SandboxImage,
-			})
-			Expect(err).NotTo(HaveOccurred())
-
-			bsFile := filepath.Join("/tmp", "test-bs-expire-non-pooled.yaml")
-			err = os.WriteFile(bsFile, []byte(bsYAML), 0644)
-			Expect(err).NotTo(HaveOccurred())
-			defer os.Remove(bsFile)
-
-			cmd := exec.Command("kubectl", "apply", "-f", bsFile)
-			_, err = utils.Run(cmd)
-			Expect(err).NotTo(HaveOccurred())
-
-			By("verifying BatchSandbox is created")
-			Eventually(func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace,
-					"-o", "jsonpath={.status.allocated}")
-				output, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-				g.Expect(output).To(Equal(fmt.Sprintf("%d", replicas)))
-			}, 2*time.Minute).Should(Succeed())
-
-			By("recording pod names")
-			cmd = exec.Command("kubectl", "get", "pods", "-n", testNamespace, "-o", "json")
-			output, err := utils.Run(cmd)
-			Expect(err).NotTo(HaveOccurred())
-
-			var podList struct {
-				Items []struct {
-					Metadata struct {
-						Name            string `json:"name"`
-						OwnerReferences []struct {
-							Kind string `json:"kind"`
-							Name string `json:"name"`
-						} `json:"ownerReferences"`
-					} `json:"metadata"`
-				} `json:"items"`
-			}
-			err = json.Unmarshal([]byte(output), &podList)
-			Expect(err).NotTo(HaveOccurred())
-
-			podNamesList := []string{}
-			for _, pod := range podList.Items {
-				for _, owner := range pod.Metadata.OwnerReferences {
-					if owner.Kind == "BatchSandbox" && owner.Name == batchSandboxName {
-						podNamesList = append(podNamesList, pod.Metadata.Name)
-						break
-					}
-				}
-			}
-			Expect(len(podNamesList)).To(BeNumerically(">", 0), "Should have pods owned by BatchSandbox")
-
-			By("waiting for BatchSandbox to expire and be deleted")
-			Eventually(func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace)
-				_, err := utils.Run(cmd)
-				g.Expect(err).To(HaveOccurred())
-				g.Expect(err.Error()).To(ContainSubstring("not found"))
-			}, 2*time.Minute).Should(Succeed())
-
-			By("verifying pods are deleted")
-			Eventually(func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "pods", "-n", testNamespace, "-o", "json")
-				output, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-
-				var currentPodList struct {
-					Items []struct {
-						Metadata struct {
-							Name              string  `json:"name"`
-							DeletionTimestamp *string `json:"deletionTimestamp"`
-							OwnerReferences   []struct {
-								Kind string `json:"kind"`
-								Name string `json:"name"`
-							} `json:"ownerReferences"`
-						} `json:"metadata"`
-					} `json:"items"`
-				}
-				err = json.Unmarshal([]byte(output), &currentPodList)
-				g.Expect(err).NotTo(HaveOccurred())
-
-				// Verify no pods are owned by the deleted BatchSandbox or they have deletionTimestamp
-				for _, pod := range currentPodList.Items {
-					for _, owner := range pod.Metadata.OwnerReferences {
-						if owner.Kind == "BatchSandbox" && owner.Name == batchSandboxName {
-							g.Expect(pod.Metadata.DeletionTimestamp).NotTo(BeNil(),
-								"Pod %s owned by BatchSandbox should have deletionTimestamp set", pod.Metadata.Name)
-						}
-					}
-				}
-			}, 30*time.Second).Should(Succeed())
-		})
-
-		It("should expire and return pooled BatchSandbox pods to pool", func() {
-			const poolName = "test-pool-for-expire"
-			const batchSandboxName = "test-bs-expire-pooled"
-			const testNamespace = "default"
-			const replicas = 1
-
-			By("creating a Pool")
-			poolYAML, err := renderTemplate("testdata/pool-basic.yaml", map[string]interface{}{
-				"PoolName":     poolName,
-				"SandboxImage": utils.SandboxImage,
-				"Namespace":    testNamespace,
-				"BufferMax":    3,
-				"BufferMin":    2,
-				"PoolMax":      5,
-				"PoolMin":      2,
-			})
-			Expect(err).NotTo(HaveOccurred())
-
-			poolFile := filepath.Join("/tmp", "test-pool-for-expire.yaml")
-			err = os.WriteFile(poolFile, []byte(poolYAML), 0644)
-			Expect(err).NotTo(HaveOccurred())
-			defer os.Remove(poolFile)
-
-			cmd := exec.Command("kubectl", "apply", "-f", poolFile)
-			_, err = utils.Run(cmd)
-			Expect(err).NotTo(HaveOccurred())
-
-			By("waiting for Pool to be ready")
-			Eventually(func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace,
-					"-o", "jsonpath={.status.total}")
-				totalStr, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-				g.Expect(totalStr).NotTo(BeEmpty())
-			}, 2*time.Minute).Should(Succeed())
-
-			By("recording Pool allocated count before BatchSandbox creation")
-			cmd = exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace,
-				"-o", "jsonpath={.status.allocated}")
-			allocatedBeforeBS, err := utils.Run(cmd)
-			Expect(err).NotTo(HaveOccurred())
-
-			By("creating a pooled BatchSandbox with expireTime")
-			expireTime := time.Now().Add(45 * time.Second).UTC().Format(time.RFC3339)
-			bsYAML, err := renderTemplate("testdata/batchsandbox-pooled.yaml", map[string]interface{}{
-				"BatchSandboxName": batchSandboxName,
-				"SandboxImage":     utils.SandboxImage,
-				"Namespace":        testNamespace,
-				"Replicas":         replicas,
-				"PoolName":         poolName,
-				"ExpireTime":       expireTime,
-			})
-			Expect(err).NotTo(HaveOccurred())
-
-			bsFile := filepath.Join("/tmp", "test-bs-expire-pooled.yaml")
-			err = os.WriteFile(bsFile, []byte(bsYAML), 0644)
-			Expect(err).NotTo(HaveOccurred())
-			defer os.Remove(bsFile)
-
-			cmd = exec.Command("kubectl", "apply", "-f", bsFile)
-			_, err = utils.Run(cmd)
-			Expect(err).NotTo(HaveOccurred())
-
-			By("recording pod names from alloc-status")
-			var podNamesList []string
-			Eventually(func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace,
-					"-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}")
-				allocStatusJSON, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-				g.Expect(allocStatusJSON).NotTo(BeEmpty())
-
-				var allocStatus struct {
-					Pods []string `json:"pods"`
-				}
-				err = json.Unmarshal([]byte(allocStatusJSON), &allocStatus)
-				g.Expect(err).NotTo(HaveOccurred())
-				podNamesList = allocStatus.Pods
-				g.Expect(len(podNamesList)).To(BeNumerically(">", 0), "Should have allocated pods")
-			}, 2*time.Minute).Should(Succeed())
-
-			allocatedAfterBS := ""
-			By("verifying Pool allocated count increased after BatchSandbox allocation")
-			Eventually(func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace,
-					"-o", "jsonpath={.status.allocated}")
-				_allocatedAfterBS, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-				allocatedAfterBS = _allocatedAfterBS
-
-				before := 0
-				if allocatedBeforeBS != "" {
-					fmt.Sscanf(allocatedBeforeBS, "%d", &before)
-				}
-
-				after := 0
-				if _allocatedAfterBS != "" {
-					fmt.Sscanf(allocatedAfterBS, "%d", &after)
-				}
-
-				g.Expect(after).To(BeNumerically(">", before), "Pool allocated count should increase after BatchSandbox allocation")
-			}, 30*time.Second).Should(Succeed())
-
-			By("waiting for BatchSandbox to expire and be deleted")
-			Eventually(func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace)
-				_, err := utils.Run(cmd)
-				g.Expect(err).To(HaveOccurred())
-				g.Expect(err.Error()).To(ContainSubstring("not found"))
-			}, 2*time.Minute).Should(Succeed())
-
-			By("verifying pods still exist and are returned to pool")
-			Eventually(func(g Gomega) {
-				for _, podName := range podNamesList {
-					cmd := exec.Command("kubectl", "get", "pod", podName, "-n", testNamespace,
-						"-o", "jsonpath={.metadata.name}")
-					output, err := utils.Run(cmd)
-					g.Expect(err).NotTo(HaveOccurred())
-					g.Expect(output).To(Equal(podName), "Pod should still exist")
-				}
-			}, 30*time.Second).Should(Succeed())
-
-			By("verifying Pool allocated count decreased after BatchSandbox expiration")
-			Eventually(func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace,
-					"-o", "jsonpath={.status.allocated}")
-				allocatedAfterExpiration, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-
-				before := 0
-				if allocatedAfterBS != "" {
-					fmt.Sscanf(allocatedAfterBS, "%d", &before)
-				}
-				after := 0
-				if allocatedAfterExpiration != "" {
-					fmt.Sscanf(allocatedAfterExpiration, "%d", &after)
-				}
-				g.Expect(after).To(BeNumerically("<", before), "Allocated count should decrease")
-			}, 30*time.Second).Should(Succeed())
-
-			By("cleaning up Pool")
-			cmd = exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace)
-			_, err = utils.Run(cmd)
-			Expect(err).NotTo(HaveOccurred())
-		})
-	})
-
-	Context("Task", func() {
-		BeforeAll(func() {
-			By("waiting for controller to be ready")
-			Eventually(func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "pods", "-l", "control-plane=controller-manager",
-					"-n", namespace, "-o", "jsonpath={.items[0].status.phase}")
-				output, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-				g.Expect(output).To(Equal("Running"))
-			}, 2*time.Minute).Should(Succeed())
-		})
-
-		It("should successfully manage Pool with task scheduling", func() {
-			const poolName = "test-pool"
-			const batchSandboxName = "test-batchsandbox-with-task"
-			const testNamespace = "default"
-			const replicas = 2
-
-			By("creating a Pool with task-executor sidecar")
-			poolTemplateFile := filepath.Join("testdata", "pool-with-task-executor.yaml")
-			poolYAML, err := renderTemplate(poolTemplateFile, map[string]interface{}{
-				"PoolName":          poolName,
-				"Namespace":         testNamespace,
-				"TaskExecutorImage": utils.TaskExecutorImage,
-			})
-			Expect(err).NotTo(HaveOccurred())
-
-			poolFile := filepath.Join("/tmp", "test-pool.yaml")
-			err = os.WriteFile(poolFile, []byte(poolYAML), 0644)
-			Expect(err).NotTo(HaveOccurred())
-
-			cmd := exec.Command("kubectl", "apply", "-f", poolFile)
-			_, err = utils.Run(cmd)
-			Expect(err).NotTo(HaveOccurred(), "Failed to create Pool")
-
-			By("waiting for Pool to be ready")
-			verifyPoolReady := func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace,
-					"-o", "jsonpath={.status.total}")
-				output, err := utils.Run(cmd)
-				By(fmt.Sprintf("waiting for Pool to be ready, output %s", output))
-				g.Expect(err).NotTo(HaveOccurred())
-				g.Expect(output).NotTo(BeEmpty(), "Pool status.total should not be empty")
-			}
-			Eventually(verifyPoolReady, 2*time.Minute).Should(Succeed())
-
-			By("creating a BatchSandbox with process-based tasks using the Pool")
-			batchSandboxTemplateFile := filepath.Join("testdata", "batchsandbox-with-process-task.yaml")
-			batchSandboxYAML, err := renderTemplate(batchSandboxTemplateFile, map[string]interface{}{
-				"BatchSandboxName":  batchSandboxName,
-				"Namespace":         testNamespace,
-				"Replicas":          replicas,
-				"PoolName":          poolName,
-				"TaskExecutorImage": utils.TaskExecutorImage,
-			})
-			Expect(err).NotTo(HaveOccurred())
-
-			batchSandboxFile := filepath.Join("/tmp", "test-batchsandbox.yaml")
-			err = os.WriteFile(batchSandboxFile, []byte(batchSandboxYAML), 0644)
-			Expect(err).NotTo(HaveOccurred())
-
-			cmd = exec.Command("kubectl", "apply", "-f", batchSandboxFile)
-			_, err = utils.Run(cmd)
-			Expect(err).NotTo(HaveOccurred(), "Failed to create BatchSandbox")
-
-			By("verifying BatchSandbox successfully allocated endpoints")
-			verifyBatchSandboxAllocated := func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace,
-					"-o", "jsonpath={.status.allocated}")
-				output, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-				g.Expect(output).To(Equal(fmt.Sprintf("%d", replicas)), "BatchSandbox should allocate %d replicas", replicas)
-			}
-			Eventually(verifyBatchSandboxAllocated, 2*time.Minute).Should(Succeed())
-
-			By("verifying BatchSandbox endpoints are available")
-			verifyEndpoints := func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace,
-					"-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/endpoints}")
-				output, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-				g.Expect(output).NotTo(BeEmpty(), "BatchSandbox should have sandbox.opensandbox.io/endpoints annotation")
-				endpoints := strings.Split(output, ",")
-				g.Expect(len(endpoints)).To(Equal(replicas), "Should have %d endpoints", replicas)
-			}
-			Eventually(verifyEndpoints, 30*time.Second).Should(Succeed())
-
-			By("verifying BatchSandbox status is as expected")
-			verifyBatchSandboxStatus := func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace,
-					"-o", "jsonpath={.status}")
-				statusOutput, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-				g.Expect(statusOutput).To(ContainSubstring(fmt.Sprintf(`"replicas":%d`, replicas)))
-				g.Expect(statusOutput).To(ContainSubstring(fmt.Sprintf(`"allocated":%d`, replicas)))
-				g.Expect(statusOutput).To(ContainSubstring(fmt.Sprintf(`"ready":%d`, replicas)))
-			}
-			Eventually(verifyBatchSandboxStatus, 30*time.Second).Should(Succeed())
-
-			By("verifying all tasks are successfully scheduled and succeeded")
-			verifyTasksSucceeded := func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace,
-					"-o", "jsonpath={.status.taskSucceed}")
-				output, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-				g.Expect(output).To(Equal(fmt.Sprintf("%d", replicas)), "All tasks should succeed")
-
-				cmd = exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace,
-					"-o", "jsonpath={.status.taskFailed}")
-				output, err = utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-				g.Expect(output).To(Equal("0"), "No tasks should fail")
-			}
-			Eventually(verifyTasksSucceeded, 2*time.Minute).Should(Succeed())
-
-			By("recording Pool status before deletion")
-			cmd = exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace,
-				"-o", "jsonpath={.status.allocated}")
-			poolAllocatedBefore, err := utils.Run(cmd)
-			Expect(err).NotTo(HaveOccurred())
-
-			By("deleting the BatchSandbox")
-			cmd = exec.Command("kubectl", "delete", "batchsandbox", batchSandboxName, "-n", testNamespace)
-			_, err = utils.Run(cmd)
-			Expect(err).NotTo(HaveOccurred(), "Failed to delete BatchSandbox")
-
-			By("verifying all tasks are unloaded and BatchSandbox is deleted")
-			verifyBatchSandboxDeleted := func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "batchsandbox", batchSandboxName, "-n", testNamespace)
-				_, err := utils.Run(cmd)
-				g.Expect(err).To(HaveOccurred(), "BatchSandbox should be deleted")
-				g.Expect(err.Error()).To(ContainSubstring("not found"))
-			}
-			Eventually(verifyBatchSandboxDeleted, 2*time.Minute).Should(Succeed())
-
-			By("verifying pods are returned to the Pool")
-			verifyPodsReturnedToPool := func(g Gomega) {
-				cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace,
-					"-o", "jsonpath={.status.allocated}")
-				poolAllocatedAfter, err := utils.Run(cmd)
-				g.Expect(err).NotTo(HaveOccurred())
-
-				beforeCount := 0
-				if poolAllocatedBefore != "" {
-					fmt.Sscanf(poolAllocatedBefore, "%d", &beforeCount)
-				}
-				afterCount := 0
-				if poolAllocatedAfter != "" {
-					fmt.Sscanf(poolAllocatedAfter, "%d", &afterCount)
-				}
-				g.Expect(afterCount).To(BeNumerically("<=", beforeCount),
-					"Pool allocated count should decrease or stay same after BatchSandbox deletion")
-			}
-			Eventually(verifyPodsReturnedToPool, 30*time.Second).Should(Succeed())
-
-			By("cleaning up the Pool")
-			cmd = exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace)
-			_, err = utils.Run(cmd)
-			Expect(err).NotTo(HaveOccurred(), "Failed to delete Pool")
-
-			By("cleaning up temporary files")
-			os.Remove(poolFile)
-			os.Remove(batchSandboxFile)
-		})
-	})
-
-})
+var _ = Describe("Manager", Ordered, func() {})
 
 // renderTemplate renders a YAML template file with the given data.
 func renderTemplate(templateFile string, data map[string]interface{}) (string, error) {
diff --git a/kubernetes/test/e2e/pod_recycle_policy_test.go b/kubernetes/test/e2e/pod_recycle_policy_test.go
new file mode 100644
index 000000000..b9f2e599c
--- /dev/null
+++ b/kubernetes/test/e2e/pod_recycle_policy_test.go
@@ -0,0 +1,879 @@
+// Copyright 2025 Alibaba Group Holding Ltd.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package e2e
+
+import (
+	"encoding/json"
+	"fmt"
+	"os/exec"
+	"strings"
+	"time"
+
+	"github.com/alibaba/OpenSandbox/sandbox-k8s/internal/controller"
+	. "github.com/onsi/ginkgo/v2"
+	. "github.com/onsi/gomega"
+
+	"github.com/alibaba/OpenSandbox/sandbox-k8s/test/utils"
+)
+
+// Pod Recycle Policy E2E Tests
+var _ = Describe("Pod Recycle Policy", Ordered, func() {
+	const testNamespace = "default"
+
+	BeforeAll(func() {
+		By("creating manager namespace")
+		cmd := exec.Command("kubectl", "create", "ns", namespace)
+		_, err := utils.Run(cmd)
+		Expect(err).NotTo(HaveOccurred(), "Failed to create namespace")
+
+		By("labeling the namespace to enforce the restricted security policy")
+		cmd = exec.Command("kubectl", "label", "--overwrite", "ns", namespace,
+			"pod-security.kubernetes.io/enforce=restricted")
+		_, err = utils.Run(cmd)
+		Expect(err).NotTo(HaveOccurred(), "Failed to label namespace with restricted policy")
+
+		By("installing CRDs")
+		cmd = exec.Command("make", "install")
+		_, err = utils.Run(cmd)
+		Expect(err).NotTo(HaveOccurred(), "Failed to install CRDs")
+
+		By("deploying the controller-manager")
+		cmd = exec.Command("make", "deploy", fmt.Sprintf("CONTROLLER_IMG=%s", utils.ControllerImage))
+		_, err = utils.Run(cmd)
+		Expect(err).NotTo(HaveOccurred(), "Failed to deploy the controller-manager")
+
+		By("patching controller deployment with restart-timeout for testing")
+		cmd = exec.Command("kubectl", "patch", "deployment", "opensandbox-controller-manager", "-n", namespace,
+			"--type", "json", "-p",
+			`[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--restart-timeout=60s"}]`)
+		_, err = utils.Run(cmd)
+		Expect(err).NotTo(HaveOccurred(), "Failed to patch controller deployment")
+
+		By("waiting for controller rollout to complete")
+		cmd = exec.Command("kubectl", "rollout", "status", "deployment/opensandbox-controller-manager", "-n", namespace, "--timeout=60s")
+		_, err = utils.Run(cmd)
+		Expect(err).NotTo(HaveOccurred(), "Failed to wait for controller rollout")
+
+		By("waiting for controller to be ready")
+		Eventually(func(g Gomega) {
+			cmd := exec.Command("kubectl", "get", "pods", "-l", "control-plane=controller-manager",
+				"-n", namespace, "-o", "jsonpath={.items[0].status.phase}")
+			output, err := utils.Run(cmd)
+			g.Expect(err).NotTo(HaveOccurred())
+			g.Expect(output).To(Equal("Running"))
+		}, 2*time.Minute).Should(Succeed())
+	})
+
+	AfterAll(func() {
+		By("undeploying the controller-manager")
+		cmd := exec.Command("make", "undeploy")
+		_, _ = utils.Run(cmd)
+
+		By("uninstalling CRDs")
+		cmd = exec.Command("make", "uninstall")
+		_, _ = utils.Run(cmd)
+
+		By("removing manager namespace")
+		cmd = exec.Command("kubectl", "delete", "ns", namespace)
+		_, _ = utils.Run(cmd)
+	})
+
+	SetDefaultEventuallyTimeout(3 * time.Minute)
+	SetDefaultEventuallyPollingInterval(2 * time.Second)
+
+	Context("Delete Policy", func() {
+		It("should delete pod when BatchSandbox is deleted with Delete policy", func() {
+			poolName := "delete-policy-pool"
+			bsbxName := "delete-policy-bsbx"
+
+			By("creating Pool with Delete policy")
+			poolYAML := fmt.Sprintf(`
+apiVersion: sandbox.opensandbox.io/v1alpha1
+kind: Pool
+metadata:
+  name: %s
+  namespace: %s
+spec:
+  podRecyclePolicy: Delete
+  template:
+    spec:
+      containers:
+      - name: sandbox-container
+        image: task-executor:dev
+        command: ["/bin/sh", "-c", "trap 'exit 0' TERM; while true; do sleep 1; done"]
+  capacitySpec:
+    bufferMax: 1
+    bufferMin: 1
+    poolMax: 1
+    poolMin: 1
+`, poolName, testNamespace)
+			cmd := exec.Command("kubectl", "apply", "-f", "-")
+			cmd.Stdin = strings.NewReader(poolYAML)
+			_, err := utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred(), "Failed to create Pool")
+
+			By("waiting for Pool to have available pods")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace,
+					"-o", "jsonpath={.status.available}")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).To(Equal("1"))
+			}).Should(Succeed())
+
+			By("creating BatchSandbox")
+			bsbxYAML := fmt.Sprintf(`
+apiVersion: sandbox.opensandbox.io/v1alpha1
+kind: BatchSandbox
+metadata:
+  name: %s
+  namespace: %s
+spec:
+  replicas: 1
+  poolRef: %s
+`, bsbxName, testNamespace, poolName)
+			cmd = exec.Command("kubectl", "apply", "-f", "-")
+			cmd.Stdin = strings.NewReader(bsbxYAML)
+			_, err = utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred(), "Failed to create BatchSandbox")
+
+			By("waiting for BatchSandbox to be allocated")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxName, "-n", testNamespace,
+					"-o", "jsonpath={.status.allocated}")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).To(Equal("1"))
+			}).Should(Succeed())
+
+			By("getting the allocated pod name")
+			cmd = exec.Command("kubectl", "get", "batchsandbox", bsbxName, "-n", testNamespace,
+				"-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}")
+			output, err := utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred())
+			var alloc controller.SandboxAllocation
+			Expect(json.Unmarshal([]byte(output), &alloc)).To(Succeed())
+			Expect(alloc.Pods).To(HaveLen(1))
+			podName := alloc.Pods[0]
+
+			By("deleting BatchSandbox")
+			cmd = exec.Command("kubectl", "delete", "batchsandbox", bsbxName, "-n", testNamespace)
+			_, err = utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred(), "Failed to delete BatchSandbox")
+
+			By("verifying pod is deleted")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "pod", podName, "-n", testNamespace, "--ignore-not-found")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).To(BeEmpty(), "Pod should be deleted with Delete policy")
+			}).Should(Succeed())
+
+			By("cleaning up Pool")
+			cmd = exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace, "--timeout=30s")
+			_, _ = utils.Run(cmd)
+		})
+	})
+
+	Context("Restart Policy - Success", func() {
+		It("should restart and reuse pod when BatchSandbox is deleted with Restart policy", func() {
+			poolName := "restart-policy-pool"
+			bsbxName := "restart-policy-bsbx"
+
+			By("creating Pool with Restart policy")
+			poolYAML := fmt.Sprintf(`
+apiVersion: sandbox.opensandbox.io/v1alpha1
+kind: Pool
+metadata:
+  name: %s
+  namespace: %s
+spec:
+  podRecyclePolicy: Restart
+  template:
+    spec:
+      containers:
+      - name: sandbox-container
+        image: task-executor:dev
+        command: ["/bin/sh", "-c", "trap 'exit 0' TERM; while true; do sleep 1; done"]
+  capacitySpec:
+    bufferMax: 1
+    bufferMin: 1
+    poolMax: 1
+    poolMin: 1
+`, poolName, testNamespace)
+			cmd := exec.Command("kubectl", "apply", "-f", "-")
+			cmd.Stdin = strings.NewReader(poolYAML)
+			_, err := utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred(), "Failed to create Pool")
+
+			By("waiting for Pool to have available pods")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace,
+					"-o", "jsonpath={.status.available}")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).To(Equal("1"))
+			}).Should(Succeed())
+
+			By("creating BatchSandbox")
+			bsbxYAML := fmt.Sprintf(`
+apiVersion: sandbox.opensandbox.io/v1alpha1
+kind: BatchSandbox
+metadata:
+  name: %s
+  namespace: %s
+spec:
+  replicas: 1
+  poolRef: %s
+`, bsbxName, testNamespace, poolName)
+			cmd = exec.Command("kubectl", "apply", "-f", "-")
+			cmd.Stdin = strings.NewReader(bsbxYAML)
+			_, err = utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred(), "Failed to create BatchSandbox")
+
+			By("waiting for BatchSandbox to be allocated")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxName, "-n", testNamespace,
+					"-o", "jsonpath={.status.allocated}")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).To(Equal("1"))
+			}).Should(Succeed())
+
+			By("getting the allocated pod name and initial restart count")
+			cmd = exec.Command("kubectl", "get", "batchsandbox", bsbxName, "-n", testNamespace,
+				"-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}")
+			output, err := utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred())
+			var alloc controller.SandboxAllocation
+			Expect(json.Unmarshal([]byte(output), &alloc)).To(Succeed())
+			Expect(alloc.Pods).To(HaveLen(1))
+			podName := alloc.Pods[0]
+
+			cmd = exec.Command("kubectl", "get", "pod", podName, "-n", testNamespace,
+				"-o", "jsonpath={.status.containerStatuses[0].restartCount}")
+			output, err = utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred())
+			initialRestartCount := output
+
+			By("deleting BatchSandbox")
+			cmd = exec.Command("kubectl", "delete", "batchsandbox", bsbxName, "-n", testNamespace)
+			_, err = utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred(), "Failed to delete BatchSandbox")
+
+			By("verifying pod is NOT deleted")
+			Consistently(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "pod", podName, "-n", testNamespace, "--ignore-not-found", "-o", "name")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).To(ContainSubstring(podName), "Pod should NOT be deleted with Restart policy")
+			}, 30*time.Second).Should(Succeed())
+
+			By("waiting for pod restart count to increase")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "pod", podName, "-n", testNamespace,
+					"-o", "jsonpath={.status.containerStatuses[0].restartCount}")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).ToNot(Equal(initialRestartCount), "Restart count should increase")
+			}).Should(Succeed())
+
+			By("waiting for recycle-meta annotation to be cleared (restart completed)")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "pod", podName, "-n", testNamespace,
+					"-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/recycle-meta}")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).To(BeEmpty(), "recycle-meta annotation should be cleared after restart completes")
+			}).Should(Succeed())
+
+			By("waiting for pod to be Ready again")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "pod", podName, "-n", testNamespace,
+					"-o", "jsonpath={.status.conditions[?(@.type=='Ready')].status}")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).To(Equal("True"), "Pod should be Ready after restart")
+			}).Should(Succeed())
+
+			By("verifying pod is available for reuse (deallocated-from label cleared)")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "pod", podName, "-n", testNamespace,
+					"-o", "jsonpath={.metadata.labels.pool\\.opensandbox\\.io/deallocated-from}")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).To(BeEmpty(), "deallocated-from label should be cleared for reuse")
+			}).Should(Succeed())
+
+			By("creating new BatchSandbox to verify pod can be reused")
+			bsbxName2 := "restart-policy-bsbx-2"
+			bsbxYAML2 := fmt.Sprintf(`
+apiVersion: sandbox.opensandbox.io/v1alpha1
+kind: BatchSandbox
+metadata:
+  name: %s
+  namespace: %s
+spec:
+  replicas: 1
+  poolRef: %s
+`, bsbxName2, testNamespace, poolName)
+			cmd = exec.Command("kubectl", "apply", "-f", "-")
+			GinkgoWriter.Printf("Creating second BatchSandbox %s\n", bsbxYAML2)
+			cmd.Stdin = strings.NewReader(bsbxYAML2)
+			_, err = utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred(), "Failed to create second BatchSandbox")
+
+			By("verifying the same pod is reused")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxName2, "-n", testNamespace,
+					"-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				var alloc2 controller.SandboxAllocation
+				g.Expect(json.Unmarshal([]byte(output), &alloc2)).To(Succeed())
+				g.Expect(alloc2.Pods).To(ContainElement(podName), "Same pod should be reused")
+			}).Should(Succeed())
+
+			By("cleaning up")
+			cmd = exec.Command("kubectl", "delete", "batchsandbox", bsbxName2, "-n", testNamespace, "--timeout=60s")
+			_, _ = utils.Run(cmd)
+			cmd = exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace, "--timeout=30s")
+			_, _ = utils.Run(cmd)
+		})
+	})
+
+	Context("Restart Policy - Failure", func() {
+		It("should delete pod when restart times out", func() {
+			poolName := "restart-timeout-pool"
+			bsbxName := "restart-timeout-bsbx"
+
+			By("creating Pool with Restart policy and a container that ignores SIGTERM")
+			poolYAML := fmt.Sprintf(`
+apiVersion: sandbox.opensandbox.io/v1alpha1
+kind: Pool
+metadata:
+  name: %s
+  namespace: %s
+spec:
+  podRecyclePolicy: Restart
+  template:
+    spec:
+      containers:
+      - name: sandbox-container
+        image: task-executor:dev
+        command: ["/bin/sh", "-c", "sleep infinity"]
+  capacitySpec:
+    bufferMax: 1
+    bufferMin: 1
+    poolMax: 1
+    poolMin: 1
+`, poolName, testNamespace)
+			cmd := exec.Command("kubectl", "apply", "-f", "-")
+			cmd.Stdin = strings.NewReader(poolYAML)
+			_, err := utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred(), "Failed to create Pool")
+
+			By("waiting for Pool to have pods created")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace,
+					"-o", "jsonpath={.status.total}")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).To(Equal("1"))
+			}).Should(Succeed())
+
+			By("creating BatchSandbox")
+			bsbxYAML := fmt.Sprintf(`
+apiVersion: sandbox.opensandbox.io/v1alpha1
+kind: BatchSandbox
+metadata:
+  name: %s
+  namespace: %s
+spec:
+  replicas: 1
+  poolRef: %s
+`, bsbxName, testNamespace, poolName)
+			cmd = exec.Command("kubectl", "apply", "-f", "-")
+			cmd.Stdin = strings.NewReader(bsbxYAML)
+			_, err = utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred(), "Failed to create BatchSandbox")
+
+			By("getting the pod name")
+			time.Sleep(3 * time.Second)
+			cmd = exec.Command("kubectl", "get", "pods", "-n", testNamespace,
+				"-l", "sandbox.opensandbox.io/pool-name="+poolName,
+				"-o", "jsonpath={.items[0].metadata.name}")
+			output, err := utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred())
+			podName := output
+			Expect(podName).NotTo(BeEmpty())
+
+			By("deleting BatchSandbox to trigger restart")
+			cmd = exec.Command("kubectl", "delete", "batchsandbox", bsbxName, "-n", testNamespace, "--timeout=60s")
+			_, err = utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred(), "Failed to delete BatchSandbox")
+
+			By("waiting for restart timeout - pod should be marked for deletion or already deleted")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "pod", podName, "-n", testNamespace,
+					"-o", "jsonpath={.metadata.deletionTimestamp}")
+				output, err := utils.Run(cmd)
+				success := (err == nil && output != "") || (err != nil && strings.Contains(err.Error(), "not found"))
+				g.Expect(success).To(BeTrue(), "Pod %s should have deletionTimestamp or be deleted", podName)
+			}, 60*time.Second).Should(Succeed())
+
+			By("cleaning up Pool")
+			cmd = exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace, "--timeout=30s")
+			_, _ = utils.Run(cmd)
+		})
+	})
+
+	Context("Batch Operations", func() {
+		It("should handle multiple BatchSandbox deletions with Restart policy", func() {
+			poolName := "batch-ops-pool"
+
+			By("creating Pool with Restart policy")
+			poolYAML := fmt.Sprintf(`
+apiVersion: sandbox.opensandbox.io/v1alpha1
+kind: Pool
+metadata:
+  name: %s
+  namespace: %s
+spec:
+  podRecyclePolicy: Restart
+  template:
+    spec:
+      containers:
+      - name: sandbox-container
+        image: task-executor:dev
+        command: ["/bin/sh", "-c", "trap 'exit 0' TERM; while true; do sleep 1; done"]
+  capacitySpec:
+    bufferMax: 0
+    bufferMin: 0
+    poolMax: 3
+    poolMin: 3
+`, poolName, testNamespace)
+			cmd := exec.Command("kubectl", "apply", "-f", "-")
+			cmd.Stdin = strings.NewReader(poolYAML)
+			_, err := utils.Run(cmd)
+			Expect(err).NotTo(HaveOccurred(), "Failed to create Pool")
+
+			By("waiting for Pool to have available pods")
+			Eventually(func(g Gomega) {
+				cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace,
+					"-o", "jsonpath={.status.available}")
+				output, err := utils.Run(cmd)
+				g.Expect(err).NotTo(HaveOccurred())
+				g.Expect(output).To(Equal("3"))
+			}).Should(Succeed())
+
+			By("creating multiple BatchSandboxes")
+			bsbxNames := []string{"batch-ops-bsbx-1", "batch-ops-bsbx-2", "batch-ops-bsbx-3"}
+			for _, bsbxName := range bsbxNames {
+				bsbxYAML := fmt.Sprintf(`
+apiVersion: sandbox.opensandbox.io/v1alpha1
+kind: BatchSandbox
+metadata:
+  name: %s
+  namespace: %s
+spec:
+  replicas: 1
+  poolRef: %s
+`, bsbxName, testNamespace, poolName)
+				cmd := exec.Command("kubectl", "apply", "-f", "-")
+				cmd.Stdin = strings.NewReader(bsbxYAML)
+				_, err := utils.Run(cmd)
+				Expect(err).NotTo(HaveOccurred(), "Failed to create BatchSandbox "+bsbxName)
+			}
+
+			By("waiting for all BatchSandboxes to be allocated")
+			for _, bsbxName := range bsbxNames {
+				Eventually(func(g Gomega) {
+					cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxName, "-n", testNamespace,
+						"-o", "jsonpath={.status.allocated}")
+					output, err := utils.Run(cmd)
+					g.Expect(err).NotTo(HaveOccurred())
+					g.Expect(output).To(Equal("1"))
+				}).Should(Succeed())
+			}
+
+			By("recording pod names before deletion")
+			podNames := make([]string, 0)
+			for _, bsbxName := range bsbxNames {
+				cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxName, "-n", testNamespace,
+					"-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}")
+				output, err := utils.Run(cmd)
+				Expect(err).NotTo(HaveOccurred())
+				var alloc controller.SandboxAllocation
+				Expect(json.Unmarshal([]byte(output), &alloc)).To(Succeed())
+				podNames = append(podNames, alloc.Pods...)
+ } + Expect(podNames).To(HaveLen(3)) + + By("deleting all BatchSandboxes") + for _, bsbxName := range bsbxNames { + cmd := exec.Command("kubectl", "delete", "batchsandbox", bsbxName, "-n", testNamespace, "--timeout=60s") + _, err := utils.Run(cmd) + Expect(err).NotTo(HaveOccurred(), "Failed to delete BatchSandbox "+bsbxName) + } + + By("waiting for all pods to complete restart and be available") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, + "-o", "jsonpath={.status.available}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(Equal("3"), "All pods should be available after restart") + }).Should(Succeed()) + + By("verifying all original pods are still present (not deleted)") + for _, podName := range podNames { + cmd := exec.Command("kubectl", "get", "pod", podName, "-n", testNamespace, "--ignore-not-found", "-o", "name") + output, err := utils.Run(cmd) + Expect(err).NotTo(HaveOccurred()) + Expect(output).To(ContainSubstring(podName), "Pod %s should still exist", podName) + } + + By("cleaning up") + cmd = exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace, "--timeout=30s") + _, _ = utils.Run(cmd) + }) + }) + + Context("Pool Recycle Finalizer", func() { + It("should block BatchSandbox deletion until pods are recycled", func() { + poolName := "finalizer-pool" + bsbxName := "finalizer-bsbx" + + By("creating Pool with Restart policy") + poolYAML := fmt.Sprintf(` +apiVersion: sandbox.opensandbox.io/v1alpha1 +kind: Pool +metadata: + name: %s + namespace: %s +spec: + podRecyclePolicy: Restart + template: + spec: + containers: + - name: sandbox-container + image: task-executor:dev + command: ["/bin/sh", "-c", "trap 'exit 0' TERM; while true; do sleep 1; done"] + capacitySpec: + bufferMax: 1 + bufferMin: 1 + poolMax: 1 + poolMin: 1 +`, poolName, testNamespace) + cmd := exec.Command("kubectl", "apply", "-f", "-") + cmd.Stdin = 
strings.NewReader(poolYAML) + _, err := utils.Run(cmd) + Expect(err).NotTo(HaveOccurred(), "Failed to create Pool") + + By("waiting for Pool to have available pods") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, + "-o", "jsonpath={.status.available}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(Equal("1")) + }).Should(Succeed()) + + By("creating BatchSandbox") + bsbxYAML := fmt.Sprintf(` +apiVersion: sandbox.opensandbox.io/v1alpha1 +kind: BatchSandbox +metadata: + name: %s + namespace: %s +spec: + replicas: 1 + poolRef: %s +`, bsbxName, testNamespace, poolName) + cmd = exec.Command("kubectl", "apply", "-f", "-") + cmd.Stdin = strings.NewReader(bsbxYAML) + _, err = utils.Run(cmd) + Expect(err).NotTo(HaveOccurred(), "Failed to create BatchSandbox") + + By("waiting for BatchSandbox to be allocated") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxName, "-n", testNamespace, + "-o", "jsonpath={.status.allocated}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(Equal("1")) + }).Should(Succeed()) + + By("verifying pool-recycle finalizer is present") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxName, "-n", testNamespace, + "-o", "jsonpath={.metadata.finalizers}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(ContainSubstring("batch-sandbox.sandbox.opensandbox.io/pool-recycle")) + }).Should(Succeed()) + + By("deleting BatchSandbox") + cmd = exec.Command("kubectl", "delete", "batchsandbox", bsbxName, "-n", testNamespace, "--timeout=60s") + _, err = utils.Run(cmd) + Expect(err).NotTo(HaveOccurred(), "Failed to delete BatchSandbox") + + By("verifying BatchSandbox is deleted (finalizer removed after recycle)") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "batchsandbox", 
bsbxName, "-n", testNamespace, "--ignore-not-found") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(BeEmpty(), "BatchSandbox should be deleted after finalizer is removed") + }).Should(Succeed()) + + By("cleaning up Pool") + cmd = exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace, "--timeout=30s") + _, _ = utils.Run(cmd) + }) + }) + + Context("Release Pod Allocation - Reallocating to Another BatchSandbox", func() { + It("should not affect pod already allocated to another BatchSandbox when original is deleted", func() { + poolName := "release-realloc-pool" + bsbxNameA := "release-realloc-bsbx-a" + bsbxNameB := "release-realloc-bsbx-b" + + By("creating Pool with Restart policy") + poolYAML := fmt.Sprintf(` +apiVersion: sandbox.opensandbox.io/v1alpha1 +kind: Pool +metadata: + name: %s + namespace: %s +spec: + podRecyclePolicy: Restart + template: + spec: + containers: + - name: sandbox-container + image: task-executor:dev + capacitySpec: + bufferMax: 0 + bufferMin: 0 + poolMax: 2 + poolMin: 2 +`, poolName, testNamespace) + cmd := exec.Command("kubectl", "apply", "-f", "-") + cmd.Stdin = strings.NewReader(poolYAML) + _, err := utils.Run(cmd) + Expect(err).NotTo(HaveOccurred(), "Failed to create Pool") + + By("waiting for Pool to have available pods") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "pool", poolName, "-n", testNamespace, + "-o", "jsonpath={.status.available}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(Equal("2")) + }).Should(Succeed()) + + // Step 1: Create BatchSandbox A with a task that will release pod on completion + By("creating BatchSandbox A with task that releases pod on completion") + bsbxYAMLA := fmt.Sprintf(` +apiVersion: sandbox.opensandbox.io/v1alpha1 +kind: BatchSandbox +metadata: + name: %s + namespace: %s +spec: + replicas: 1 + poolRef: %s + taskResourcePolicyWhenCompleted: Release + taskTemplate: + 
spec: + process: + command: ["/bin/sh", "-c"] + args: ["echo hello && sleep 1"] +`, bsbxNameA, testNamespace, poolName) + cmd = exec.Command("kubectl", "apply", "-f", "-") + cmd.Stdin = strings.NewReader(bsbxYAMLA) + _, err = utils.Run(cmd) + Expect(err).NotTo(HaveOccurred(), "Failed to create BatchSandbox A") + + By("waiting for BatchSandbox A to be allocated") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxNameA, "-n", testNamespace, + "-o", "jsonpath={.status.allocated}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(Equal("1")) + }).Should(Succeed()) + + By("getting the allocated pod name from BatchSandbox A") + cmd = exec.Command("kubectl", "get", "batchsandbox", bsbxNameA, "-n", testNamespace, + "-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}") + output, err := utils.Run(cmd) + Expect(err).NotTo(HaveOccurred()) + var allocA controller.SandboxAllocation + Expect(json.Unmarshal([]byte(output), &allocA)).To(Succeed()) + Expect(allocA.Pods).To(HaveLen(1)) + podNameA := allocA.Pods[0] + + // Step 2: Wait for task to complete and pod to be released + By("waiting for task to complete (succeed) and pod to be released") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxNameA, "-n", testNamespace, + "-o", "jsonpath={.status.taskSucceed}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(Equal("1"), "Task should succeed") + }).Should(Succeed()) + + By("verifying pod has deallocated-from label after release") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "pod", podNameA, "-n", testNamespace, + "-o", "jsonpath={.metadata.labels.pool\\.opensandbox\\.io/deallocated-from}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).NotTo(BeEmpty(), "Pod should have deallocated-from label after release") + 
}).Should(Succeed()) + + By("verifying released pod is recorded in BatchSandbox A") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxNameA, "-n", testNamespace, + "-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-release}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(ContainSubstring(podNameA), "Released pod should be recorded") + }).Should(Succeed()) + + // Step 3: Wait for pod recycle to complete (restart finished) + By("waiting for pod recycle-meta annotation to be cleared (restart in progress)") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "pod", podNameA, "-n", testNamespace, + "-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/recycle-meta}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(BeEmpty(), "recycle-meta annotation should be cleared after restart completes") + }).Should(Succeed()) + + By("waiting for pod to be Ready again after restart") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "pod", podNameA, "-n", testNamespace, + "-o", "jsonpath={.status.conditions[?(@.type=='Ready')].status}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(Equal("True"), "Pod should be Ready after restart") + }).Should(Succeed()) + + By("waiting for deallocated-from label to be cleared (pod ready for reuse)") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "pod", podNameA, "-n", testNamespace, + "-o", "jsonpath={.metadata.labels.pool\\.opensandbox\\.io/deallocated-from}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(BeEmpty(), "deallocated-from label should be cleared for reuse") + }).Should(Succeed()) + + By("waiting for Pool available count to be restored") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "pool", poolName, 
"-n", testNamespace, + "-o", "jsonpath={.status.available}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(Equal("2"), "Pool should have 2 available pods after recycle") + }).Should(Succeed()) + + // Step 4: Create BatchSandbox B to allocate the recycled pod + By("creating BatchSandbox B to allocate the recycled pod") + bsbxYAMLB := fmt.Sprintf(` +apiVersion: sandbox.opensandbox.io/v1alpha1 +kind: BatchSandbox +metadata: + name: %s + namespace: %s +spec: + replicas: 1 + poolRef: %s +`, bsbxNameB, testNamespace, poolName) + cmd = exec.Command("kubectl", "apply", "-f", "-") + cmd.Stdin = strings.NewReader(bsbxYAMLB) + _, err = utils.Run(cmd) + Expect(err).NotTo(HaveOccurred(), "Failed to create BatchSandbox B") + + By("waiting for BatchSandbox B to be allocated") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxNameB, "-n", testNamespace, + "-o", "jsonpath={.status.allocated}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(Equal("1")) + }).Should(Succeed()) + + By("verifying the same pod is allocated to BatchSandbox B") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxNameB, "-n", testNamespace, + "-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + var allocB controller.SandboxAllocation + g.Expect(json.Unmarshal([]byte(output), &allocB)).To(Succeed()) + g.Expect(allocB.Pods).To(ContainElement(podNameA), "The same pod should be allocated to BatchSandbox B") + }).Should(Succeed()) + + // Step 5: Delete BatchSandbox A (the one that released the pod) + By("deleting BatchSandbox A") + cmd = exec.Command("kubectl", "delete", "batchsandbox", bsbxNameA, "-n", testNamespace, "--timeout=60s") + _, err = utils.Run(cmd) + Expect(err).NotTo(HaveOccurred(), "Failed to delete BatchSandbox A") + + By("verifying 
BatchSandbox A is deleted") + Eventually(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "batchsandbox", bsbxNameA, "-n", testNamespace, "--ignore-not-found") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(BeEmpty(), "BatchSandbox A should be deleted") + }).Should(Succeed()) + + // Step 6: Verify the pod is NOT affected (not deleted, not labeled with deallocated-from) + By("verifying the pod is NOT deleted after BatchSandbox A deletion") + Consistently(func(g Gomega) { + cmd := exec.Command("kubectl", "get", "pod", podNameA, "-n", testNamespace, "--ignore-not-found", "-o", "name") + output, err := utils.Run(cmd) + g.Expect(err).NotTo(HaveOccurred()) + g.Expect(output).To(ContainSubstring(podNameA), "Pod should NOT be deleted when original BatchSandbox A is deleted") + }, 10*time.Second).Should(Succeed()) + + By("verifying the pod does NOT have deallocated-from label from BatchSandbox A") + cmd = exec.Command("kubectl", "get", "pod", podNameA, "-n", testNamespace, + "-o", "jsonpath={.metadata.labels.pool\\.opensandbox\\.io/deallocated-from}") + output, err = utils.Run(cmd) + Expect(err).NotTo(HaveOccurred()) + Expect(output).To(BeEmpty(), "Pod should NOT have deallocated-from label after BatchSandbox A deletion") + + By("verifying BatchSandbox B still has the pod allocated") + cmd = exec.Command("kubectl", "get", "batchsandbox", bsbxNameB, "-n", testNamespace, + "-o", "jsonpath={.metadata.annotations.sandbox\\.opensandbox\\.io/alloc-status}") + output, err = utils.Run(cmd) + Expect(err).NotTo(HaveOccurred()) + var allocBCheck controller.SandboxAllocation + Expect(json.Unmarshal([]byte(output), &allocBCheck)).To(Succeed()) + Expect(allocBCheck.Pods).To(ContainElement(podNameA), "BatchSandbox B should still have the pod allocated") + + // Cleanup + By("cleaning up") + cmd = exec.Command("kubectl", "delete", "batchsandbox", bsbxNameB, "-n", testNamespace, "--timeout=60s") + _, _ = utils.Run(cmd) + cmd = 
exec.Command("kubectl", "delete", "pool", poolName, "-n", testNamespace, "--timeout=30s") + _, _ = utils.Run(cmd) + }) + }) +}) diff --git a/kubernetes/test/e2e/testdata/batchsandbox-pooled.yaml b/kubernetes/test/e2e/testdata/batchsandbox-pooled.yaml index a434145c7..62f2f86d8 100644 --- a/kubernetes/test/e2e/testdata/batchsandbox-pooled.yaml +++ b/kubernetes/test/e2e/testdata/batchsandbox-pooled.yaml @@ -6,4 +6,6 @@ metadata: spec: replicas: {{.Replicas}} poolRef: {{.PoolName}} - expireTime: "{{.ExpireTime}}" \ No newline at end of file +{{- if .ExpireTime }} + expireTime: "{{.ExpireTime}}" +{{- end }} \ No newline at end of file