
Commit ec94252

andreyvelich and ahg-g authored
KEP-672: Implement the DependsOn API (#740)
* Add DependsOn API
* Add JobSet controller changes
* Add integration tests for DependsOn CEL validation
* Add unit tests for DependsOn
* Add controller integration tests for DependsOn
* Fix go lint
* Add test case with DependsOn and StartupPolicy: AnyOrder
  Improve API docs
* Test case when job-2 depends on job-1 and job-3 depends on job-2
* Add manifests to the make generate
* Add E2E test for the DependsOn API
* Rename var to DependencyReady and DependencyComplete
  Rename func to dependencyReachedStatus
* Update docs and add DependsOn example
* Use startupProbe for launcher
* Remove DependsOn rules from the docs
* Add E2Es for Kubeflow usecases with DependsOn
* Refactor dependencyReachedStatus to accept rJob and rJobReplicas
  Add info for Suspended Job
* Don't check idx in webhook
  Improve docs
* Add comment for e2e
* Improve integration tests
* Run generate
* Update test/integration/controller/jobset_controller_test.go

---------

Co-authored-by: Abdullah Gharaibeh <[email protected]>
1 parent 7bbc954 commit ec94252


46 files changed: +1644 -168 lines changed

Makefile

+1-1
@@ -90,7 +90,7 @@ manifests: controller-gen ## Generate WebhookConfiguration, ClusterRole and Cust
 		paths="./pkg/..."
 
 .PHONY: generate
-generate: controller-gen code-generator openapi-gen ## Generate code containing DeepCopy, DeepCopyInto, and DeepCopyObject method implementations and client-go libraries.
+generate: manifests controller-gen code-generator openapi-gen ## Generate code containing DeepCopy, DeepCopyInto, and DeepCopyObject method implementations and client-go libraries.
 	$(CONTROLLER_GEN) object:headerFile="hack/boilerplate.go.txt" paths="./api/..."
 	./hack/update-codegen.sh $(GO_CMD) $(PROJECT_DIR)/bin
 	./hack/python-sdk/gen-sdk.sh

api/jobset/v1alpha2/jobset_types.go

+41
@@ -79,6 +79,8 @@ const (
 )
 
 // JobSetSpec defines the desired state of JobSet
+// +kubebuilder:validation:XValidation:rule="!(has(self.startupPolicy) && self.startupPolicy.startupPolicyOrder == 'InOrder' && self.replicatedJobs.exists(x, has(x.dependsOn)))",message="StartupPolicy and DependsOn APIs are mutually exclusive"
+// +kubebuilder:validation:XValidation:rule="!(has(self.replicatedJobs[0].dependsOn))",message="DependsOn can't be set for the first ReplicatedJob"
 type JobSetSpec struct {
 	// ReplicatedJobs is the group of jobs that will form the set.
 	// +listType=map
@@ -105,6 +107,7 @@ type JobSetSpec struct {
 	FailurePolicy *FailurePolicy `json:"failurePolicy,omitempty"`
 
 	// StartupPolicy, if set, configures in what order jobs must be started
+	// Deprecated: StartupPolicy is deprecated, please use the DependsOn API.
 	// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="Value is immutable"
 	StartupPolicy *StartupPolicy `json:"startupPolicy,omitempty"`
 
@@ -230,8 +233,46 @@ type ReplicatedJob struct {
 	// Jobs names will be in the format: <jobSet.name>-<spec.replicatedJob.name>-<job-index>
 	// +kubebuilder:default=1
 	Replicas int32 `json:"replicas,omitempty"`
+
+	// DependsOn is an optional list that specifies the preceding ReplicatedJobs upon which
+	// the current ReplicatedJob depends. If specified, the ReplicatedJob will be created
+	// only after the referenced ReplicatedJobs reach their desired state.
+	// The Order of ReplicatedJobs is defined by their enumeration in the slice.
+	// Note, that the first ReplicatedJob in the slice cannot use the DependsOn API.
+	// Currently, only a single item is supported in the DependsOn list.
+	// If JobSet is suspended the all active ReplicatedJobs will be suspended. When JobSet is
+	// resumed the Job sequence starts again.
+	// This API is mutually exclusive with the StartupPolicy API.
+	// +kubebuilder:validation:XValidation:rule="self == oldSelf",message="Value is immutable"
+	// +kubebuilder:validation:MaxItems=1
+	// +optional
+	// +listType=map
+	// +listMapKey=name
+	DependsOn []DependsOn `json:"dependsOn,omitempty"`
+}
+
+// DependsOn defines the dependency on the previous ReplicatedJob status.
+type DependsOn struct {
+	// Name of the previous ReplicatedJob.
+	Name string `json:"name"`
+
+	// Status defines the condition for the ReplicatedJob. Only Ready or Complete status can be set.
+	// +kubebuilder:validation:Enum=Ready;Complete
+	Status DependsOnStatus `json:"status"`
 }
 
+type DependsOnStatus string
+
+const (
+	// DependencyReady means the Ready + Succeeded + Failed counter
+	// equals the number of child Jobs of the dependant ReplicatedJob.
+	DependencyReady DependsOnStatus = "Ready"
+
+	// DependencyComplete means the Succeeded counter
+	// equals the number of child Jobs of the dependant ReplicatedJob.
+	DependencyComplete DependsOnStatus = "Complete"
+)
+
 type Network struct {
 	// EnableDNSHostnames allows pods to be reached via their hostnames.
 	// Pods will be reachable using the fully qualified pod hostname:
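
The doc comments on DependencyReady and DependencyComplete describe two counter comparisons. The following is a minimal sketch of that rule as plain Go, not the controller code from this commit: the `jobCounts` type and the `reached` function are hypothetical names, and the counters are assumed to be per-ReplicatedJob totals.

```go
package main

import "fmt"

// DependsOnStatus and its two values mirror the constants added in this commit.
type DependsOnStatus string

const (
	DependencyReady    DependsOnStatus = "Ready"
	DependencyComplete DependsOnStatus = "Complete"
)

// jobCounts is a hypothetical stand-in for the status counters of one
// ReplicatedJob; it is not a type from the JobSet API.
type jobCounts struct {
	Ready, Succeeded, Failed int32
}

// reached is a hypothetical helper that applies the rule from the doc comments:
// Ready    -> Ready + Succeeded + Failed equals the number of child Jobs
// Complete -> Succeeded equals the number of child Jobs
func reached(want DependsOnStatus, c jobCounts, replicas int32) bool {
	switch want {
	case DependencyReady:
		return c.Ready+c.Succeeded+c.Failed == replicas
	case DependencyComplete:
		return c.Succeeded == replicas
	default:
		// Any other value is rejected by the Enum validation, so treat it as unmet.
		return false
	}
}

func main() {
	// Two child Jobs: one still running and ready, one already succeeded.
	c := jobCounts{Ready: 1, Succeeded: 1}
	fmt.Println(reached(DependencyReady, c, 2))    // true
	fmt.Println(reached(DependencyComplete, c, 2)) // false
}
```

Under this reading, a dependency on Ready is also satisfied once the child Jobs have finished, because succeeded and failed Jobs count toward the total.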

api/jobset/v1alpha2/openapi_generated.go

+55-2
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

api/jobset/v1alpha2/zz_generated.deepcopy.go

+20

client-go/applyconfiguration/jobset/v1alpha2/dependson.go

+48

client-go/applyconfiguration/jobset/v1alpha2/replicatedjob.go

+17-3

client-go/applyconfiguration/utils.go

+2

config/components/crd/bases/jobset.x-k8s.io_jobsets.yaml

+46-2
@@ -200,6 +200,43 @@ spec:
                   set.
                 items:
                   properties:
+                    dependsOn:
+                      description: |-
+                        DependsOn is an optional list that specifies the preceding ReplicatedJobs upon which
+                        the current ReplicatedJob depends. If specified, the ReplicatedJob will be created
+                        only after the referenced ReplicatedJobs reach their desired state.
+                        The Order of ReplicatedJobs is defined by their enumeration in the slice.
+                        Note, that the first ReplicatedJob in the slice cannot use the DependsOn API.
+                        Currently, only a single item is supported in the DependsOn list.
+                        If JobSet is suspended the all active ReplicatedJobs will be suspended. When JobSet is
+                        resumed the Job sequence starts again.
+                        This API is mutually exclusive with the StartupPolicy API.
+                      items:
+                        description: DependsOn defines the dependency on the previous
+                          ReplicatedJob status.
+                        properties:
+                          name:
+                            description: Name of the previous ReplicatedJob.
+                            type: string
+                          status:
+                            description: Status defines the condition for the ReplicatedJob.
+                              Only Ready or Complete status can be set.
+                            enum:
+                            - Ready
+                            - Complete
+                            type: string
+                        required:
+                        - name
+                        - status
+                        type: object
+                      maxItems: 1
+                      type: array
+                      x-kubernetes-list-map-keys:
+                      - name
+                      x-kubernetes-list-type: map
+                      x-kubernetes-validations:
+                      - message: Value is immutable
+                        rule: self == oldSelf
                     name:
                       description: |-
                         Name is the name of the entry and will be used as a suffix
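
The schema above requires both `name` and `status` on each `dependsOn` item. As an illustration of the resulting wire format, here is a small sketch that serializes the Go types from api/jobset/v1alpha2/jobset_types.go with `encoding/json`. The types are copied locally so the example is self-contained; the `encode` helper and the ReplicatedJob name `initializer` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the types added in this commit, with the same JSON tags.
type DependsOnStatus string

const (
	DependencyReady    DependsOnStatus = "Ready"
	DependencyComplete DependsOnStatus = "Complete"
)

type DependsOn struct {
	Name   string          `json:"name"`
	Status DependsOnStatus `json:"status"`
}

// encode is a hypothetical helper that renders a dependsOn list as JSON.
func encode(deps []DependsOn) (string, error) {
	b, err := json.Marshal(deps)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// "initializer" is an illustrative ReplicatedJob name, not one from the commit.
	out, err := encode([]DependsOn{{Name: "initializer", Status: DependencyComplete}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // [{"name":"initializer","status":"Complete"}]
}
```

The list holds a single entry here because the schema sets `maxItems: 1`.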
@@ -8976,8 +9013,9 @@ spec:
                 - name
                 x-kubernetes-list-type: map
               startupPolicy:
-                description: StartupPolicy, if set, configures in what order jobs
-                  must be started
+                description: |-
+                  StartupPolicy, if set, configures in what order jobs must be started
+                  Deprecated: StartupPolicy is deprecated, please use the DependsOn API.
                 properties:
                   startupPolicyOrder:
                     description: |-
@@ -9039,6 +9077,12 @@ spec:
                 minimum: 0
                 type: integer
             type: object
+            x-kubernetes-validations:
+            - message: StartupPolicy and DependsOn APIs are mutually exclusive
+              rule: '!(has(self.startupPolicy) && self.startupPolicy.startupPolicyOrder
+                == ''InOrder'' && self.replicatedJobs.exists(x, has(x.dependsOn)))'
+            - message: DependsOn can't be set for the first ReplicatedJob
+              rule: '!(has(self.replicatedJobs[0].dependsOn))'
           status:
             description: JobSetStatus defines the observed state of JobSet
             properties:
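
The two spec-level CEL rules above can be read as ordinary predicates. The sketch below restates them in Go for illustration only; in the commit they are enforced by the API server through the CRD schema. All type and function names are hypothetical, and two simplifications are assumed: CEL's `has(x.dependsOn)` is approximated by a non-empty slice, and an empty `ReplicatedJobs` list is treated as valid rather than as an evaluation error.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, trimmed-down stand-ins for the JobSet API types.
type dependsOn struct{ Name, Status string }

type replicatedJob struct {
	Name      string
	DependsOn []dependsOn
}

type jobSetSpec struct {
	// StartupPolicyOrder is "" when startupPolicy is not set.
	StartupPolicyOrder string
	ReplicatedJobs     []replicatedJob
}

// validate mirrors the two x-kubernetes-validations rules on the JobSet spec.
func validate(s jobSetSpec) error {
	anyDeps := false
	for _, rj := range s.ReplicatedJobs {
		if len(rj.DependsOn) > 0 {
			anyDeps = true
			break
		}
	}
	// Rule 1: an InOrder StartupPolicy cannot be combined with DependsOn.
	if s.StartupPolicyOrder == "InOrder" && anyDeps {
		return errors.New("StartupPolicy and DependsOn APIs are mutually exclusive")
	}
	// Rule 2: the first ReplicatedJob has nothing before it to depend on.
	if len(s.ReplicatedJobs) > 0 && len(s.ReplicatedJobs[0].DependsOn) > 0 {
		return errors.New("DependsOn can't be set for the first ReplicatedJob")
	}
	return nil
}

func main() {
	jobs := []replicatedJob{
		{Name: "job-1"},
		{Name: "job-2", DependsOn: []dependsOn{{Name: "job-1", Status: "Ready"}}},
	}
	fmt.Println(validate(jobSetSpec{ReplicatedJobs: jobs}))                                // <nil>
	fmt.Println(validate(jobSetSpec{StartupPolicyOrder: "AnyOrder", ReplicatedJobs: jobs})) // <nil>
	fmt.Println(validate(jobSetSpec{StartupPolicyOrder: "InOrder", ReplicatedJobs: jobs}))
}
```

Only the `InOrder` value of `startupPolicyOrder` conflicts with DependsOn under the first rule, which matches the commit's test case combining DependsOn with StartupPolicy: AnyOrder.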
