Sdk tests with papermill #2448
base: master
/rerun-all |
@yehudit1987 Can you please fix these CI errors? |
@yehudit1987 Can you sign your commits with |
FYI, you can check this reference: https://github.com/kubeflow/katib/pull/2448/checks?check_run_id=32215445282 |
963d367 to 6633aa5
/rerun-all |
Thanks for your great contributions @yehudit1987! 🎉
I left some reviews for you, excluding notebooks. Will soon review other files :)
Btw, @andreyvelich @tenzen-y are busy with other projects now and will be back in the middle of November. Your PR will be merged then.
@@ -0,0 +1,28 @@ | |||
name: Run e2e sdk tests with papermill |
name: Run e2e sdk tests with papermill | |
name: E2E Tests with Notebooks |
I think it would be better to make the test's name consistent with the others :)
cancel-in-progress: true | ||
|
||
jobs: | ||
create-katib-notebooks-test: |
create-katib-notebooks-test: | |
e2e: |
# Loop through each algorithm in the array
for algorithm_name in "${ALGORITHM_ARRAY[@]}"; do
  suggestion_image_name="$(algorithm_name=$algorithm_name yq eval '.runtime.suggestions.[] | select(.algorithmName == env(algorithm_name)) | .image' \
    manifests/v1beta1/installs/katib-standalone/katib-config.yaml | cut -d: -f1)"
  suggestion_name="$(basename "$suggestion_image_name")"
  suggestions+=("$suggestion_name")
done
I think this loop is redundant with the loop in front of it:
katib/test/e2e/v1beta1/scripts/gh-actions/build-load.sh
Lines 77 to 89 in 706a6f2
# Search for Suggestion Images required for Trial.
for exp_name in "${EXPERIMENT_ARRAY[@]}"; do
  exp_path=$(find examples/v1beta1 -name "${exp_name}.yaml")
  algorithm_name="$(yq eval '.spec.algorithm.algorithmName' "$exp_path")"
  suggestion_image_name="$(algorithm_name=$algorithm_name yq eval '.runtime.suggestions.[] | select(.algorithmName == env(algorithm_name)) | .image' \
    manifests/v1beta1/installs/katib-standalone/katib-config.yaml | cut -d: -f1)"
  suggestion_name="$(basename "$suggestion_image_name")"
  suggestions+=("$suggestion_name")
done
Can we combine these two loops into a single one by also using the ALGORITHM parameter in the other e2e tests?
WDYT👀 @yehudit1987 @kubeflow/wg-automl-leads
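One possible shape for a unified loop, as a rough sketch only: gather the algorithm names from both sources into one de-duplicated list, then resolve the suggestion images in a single pass. The `merge_unique` helper, the sample lists, and the `suggestion-for-...` placeholder are all hypothetical; the real yq lookup against katib-config.yaml would replace the placeholder line.

```shell
#!/usr/bin/env bash
# Sketch only: merge_unique and the sample lists are hypothetical, not from the PR.

# Print each distinct item of two comma-separated lists once, in first-seen order.
merge_unique() {
  local item seen=","
  local -a items=()
  IFS=',' read -r -a items <<< "$1,$2"
  for item in "${items[@]}"; do
    [ -z "$item" ] && continue
    case "$seen" in
      *",$item,"*) ;;                                   # already emitted, skip
      *) seen="$seen$item,"; printf '%s\n' "$item" ;;   # first occurrence
    esac
  done
}

# Assumed values: names read from the experiment YAMLs, plus the ALGORITHMS argument.
from_experiments="random,cmaes"
ALGORITHMS="cmaes,tpe"

suggestions=()
while IFS= read -r algorithm_name; do
  # Placeholder: in build-load.sh the single yq lookup would go here.
  suggestions+=("suggestion-for-$algorithm_name")
done < <(merge_unique "$from_experiments" "$ALGORITHMS")

printf '%s\n' "${suggestions[@]}"
```

With this layout each suggestion image is resolved once even when an experiment and the explicit list name the same algorithm.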
echo "Papermill failed for notebook: $NOTEBOOK" | ||
exit 1 | ||
} | ||
done |
done | |
done | |
A newline is missing here.
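For reference, a check along these lines could catch a missing trailing newline before review. This is a minimal sketch; the `ends_with_newline` helper name is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch: report whether a file ends with a newline. The helper name is hypothetical.
ends_with_newline() {
  # An empty file is accepted; otherwise the last byte must be a newline.
  # Command substitution strips a trailing newline, so an empty result means "newline".
  [ ! -s "$1" ] || [ -z "$(tail -c 1 "$1")" ]
}

tmp="$(mktemp)"
printf 'done' > "$tmp"
ends_with_newline "$tmp" || echo "missing trailing newline"
printf 'done\n' > "$tmp"
ends_with_newline "$tmp" && echo "trailing newline present"
rm -f "$tmp"
```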
@@ -172,4 +182,4 @@ fi | |||
echo -e "\nCleanup Build Cache...\n" | |||
docker buildx prune -f | |||
|
|||
echo -e "\nAll Katib images with ${TAG} tag have been built successfully!\n" | |||
echo -e "\nAll Katib images with ${TAG} tag have been built successfully!\n" |
echo -e "\nAll Katib images with ${TAG} tag have been built successfully!\n" | |
echo -e "\nAll Katib images with ${TAG} tag have been built successfully!\n" | |
kubectl create namespace kubeflow-user-example-com | ||
fi | ||
|
||
exit 0 |
exit 0 | |
exit 0 | |
|
||
echo "Start to setup Minikube Kubernetes Cluster" | ||
kubectl version | ||
kubectl cluster-info | ||
kubectl get nodes | ||
|
||
echo "Build and Load container images" | ||
./build-load.sh "$DEPLOY_KATIB_UI" "$TUNE_API" "$TRIAL_IMAGES" "$EXPERIMENTS" | ||
./build-load.sh "$DEPLOY_KATIB_UI" "$TUNE_API" "$TRIAL_IMAGES" "$EXPERIMENTS" "$ALGORITHMS" |
./build-load.sh "$DEPLOY_KATIB_UI" "$TUNE_API" "$TRIAL_IMAGES" "$EXPERIMENTS" "$ALGORITHMS" | |
./build-load.sh "$DEPLOY_KATIB_UI" "$TUNE_API" "$TRIAL_IMAGES" "$EXPERIMENTS" "$ALGORITHMS" | |
- name: Setup Minikube Cluster | ||
shell: bash | ||
run: ./test/e2e/v1beta1/scripts/gh-actions/setup-minikube.sh true true "" "" "cmaes" |
Could we reuse template-setup-e2e-test? I think it would be better to make full use of the existing template :)
Waiting for your thoughts👀 @yehudit1987 @kubeflow/wg-automl-leads
That template is used as a preliminary step before template-e2e-test. We use the second one to run YAML experiments by calling a shell script that in turn calls a Python script. In our case, we just need to add a step to the job that runs the notebook directly with papermill. I guess we can use template-setup-e2e-test, but we would still need the new one.
SGTM:)
Sorry for the late response @yehudit1987. I left a few comments for you.
I'm busy with my own work right now, so I'll review the notebooks later :)
if [ -x "$(command -v apt-get)" ]; then
  echo "Upgrading Podman using apt-get..."
  sudo apt-get update
  sudo apt-get install -y podman
elif [ -x "$(command -v dnf)" ]; then
  echo "Upgrading Podman using dnf..."
  sudo dnf upgrade podman -y
else
  echo "Package manager not found. Skipping upgrade."
fi
Could you please tell me why we need to use podman?
It would be better to rename the directory from template-notebook-test to template-e2e-notebook-test, to be consistent with the other directories :)
/rerun-all |
/rerun-all |
Hi @yehudit1987, do you have time to finalize this PR ? |
Hi @andreyvelich, yes, I have been waiting for your approval of the script decisions mentioned above. I will finalize this PR.
2c0ce60 to 59af784
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: Yehudit Kerido <[email protected]>
Signed-off-by: Yehudit Kerido <[email protected]>
70d149a to 683608f
Thank you for doing this, I left initial comments.
/assign @Electronic-Waste @helenxie-bit @shashank-iitbhu @kubeflow/wg-training-leads @saileshd1402 Please help with review
@@ -0,0 +1,28 @@ | |||
name: E2E Tests with Notebooks |
Let's call this file e2e-test-notebooks.yaml.
on: | ||
push: {} | ||
pull_request: {} | ||
workflow_dispatch: {} |
on: | |
push: {} | |
pull_request: {} | |
workflow_dispatch: {} | |
on: | |
- pull_request |
@@ -0,0 +1,54 @@ | |||
name: Notebook test template |
You don't need a template if we keep all notebook test jobs in a single file.
- name: Setup Minikube Cluster | ||
uses: medyagh/[email protected] | ||
with: | ||
network-plugin: cni | ||
cni: flannel | ||
driver: none | ||
kubernetes-version: v1.29.2 | ||
minikube-version: 1.34.0 | ||
start-args: --wait-timeout=120s |
You can re-use setup-e2e-test template for that, similar to this:
katib/.github/workflows/e2e-test-pytorch-mnist.yaml
Lines 20 to 25 in 683608f
- name: Setup Test Env | |
uses: ./.github/workflows/template-setup-e2e-test | |
with: | |
kubernetes-version: ${{ matrix.kubernetes-version }} | |
python-version: "3.10" | |
|
||
- name: Setup Minikube | ||
shell: bash | ||
run: ./test/e2e/v1beta1/scripts/gh-actions/setup-minikube.sh true true "" "" "cmaes" |
Can we re-use this template for this action by passing the appropriate values to it?
https://github.com/kubeflow/katib/blob/683608f6a61e7b10218b7084f310af42334d8e65/.github/workflows/template-e2e-test/action.yaml
We can add one more input for notebook tests that triggers run-notebook.sh.
echo "Options:" | ||
echo " -i Input notebook (required)" | ||
echo " -o Output notebook (required)" | ||
echo " -k Kubeflow Training Operator Python SDK (optional)" |
Katib SDK
case "$opt" in | ||
i) NOTEBOOK_INPUT="$OPTARG" ;; # -i for notebook input path | ||
o) NOTEBOOK_OUTPUT="$OPTARG" ;; # -o for notebook output path | ||
k) TRAINING_PYTHON_SDK="$OPTARG" ;; # -k for training operator python sdk |
Katib SDK
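A sketch of how the option parsing might look with Katib naming, assuming the script keeps its `-i`, `-o`, and `-k` flags. The `KATIB_PYTHON_SDK` variable name, the `-n` namespace flag with its `default` value, and the `parse_args` wrapper are assumptions made for illustration, not the script's actual contents.

```shell
#!/usr/bin/env bash
# Sketch of the option parsing with Katib naming.
# KATIB_PYTHON_SDK, the -n flag, and parse_args itself are assumptions.
parse_args() {
  local opt OPTIND=1
  NOTEBOOK_INPUT="" NOTEBOOK_OUTPUT="" KATIB_PYTHON_SDK="" NAMESPACE="default"
  while getopts ":i:o:k:n:" opt; do
    case "$opt" in
      i) NOTEBOOK_INPUT="$OPTARG" ;;    # -i for notebook input path
      o) NOTEBOOK_OUTPUT="$OPTARG" ;;   # -o for notebook output path
      k) KATIB_PYTHON_SDK="$OPTARG" ;;  # -k for Katib Python SDK
      n) NAMESPACE="$OPTARG" ;;         # -n for namespace (assumed flag)
      :) echo "Option -$OPTARG requires a value" >&2; return 1 ;;
      *) echo "Unknown option -$OPTARG" >&2; return 1 ;;
    esac
  done
  # Both notebook paths are required, as the usage text states.
  if [ -z "$NOTEBOOK_INPUT" ] || [ -z "$NOTEBOOK_OUTPUT" ]; then
    echo "Both -i and -o are required" >&2
    return 1
  fi
}

parse_args -i input.ipynb -o output.ipynb -k ./sdk/python/v1beta1 -n kubeflow \
  && echo "input=$NOTEBOOK_INPUT sdk=$KATIB_PYTHON_SDK namespace=$NAMESPACE"
```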
exit 1 | ||
fi | ||
|
||
papermill_cmd="papermill $NOTEBOOK_INPUT $NOTEBOOK_OUTPUT -p training_python_sdk $TRAINING_PYTHON_SDK -p namespace $NAMESPACE" |
We should install Katib SDK as a first cell in the Notebooks, similar to this:
https://github.com/kubeflow/trainer/blob/release-1.9/examples/pytorch/image-classification/create-pytorchjob.ipynb
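If the SDK location is still passed into the notebook as a parameter so that the first cell can install it, the command could be built as an array rather than a string, which keeps paths with spaces intact. This is a sketch under assumptions: the `katib_python_sdk` parameter name and the `build_papermill_cmd` helper are hypothetical, while `-p NAME VALUE` is papermill's existing flag for setting a notebook parameter.

```shell
#!/usr/bin/env bash
# Sketch: build the papermill invocation as an array so quoting survives.
# The parameter name katib_python_sdk and this helper are assumptions.
build_papermill_cmd() {
  local input="$1" output="$2" sdk="$3" namespace="$4"
  papermill_cmd=(papermill "$input" "$output" -p namespace "$namespace")
  # Only pass the SDK parameter when one was given.
  if [ -n "$sdk" ]; then
    papermill_cmd+=(-p katib_python_sdk "$sdk")
  fi
}

build_papermill_cmd "examples/my notebook.ipynb" "out.ipynb" "" "default"
# Print the command in a shell-quoted form instead of running it.
printf '%q ' "${papermill_cmd[@]}"; echo
# To execute it for real: "${papermill_cmd[@]}"
```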
exit 1 | ||
fi | ||
|
||
echo "Notebook execution completed successfully" |
echo "Notebook execution completed successfully" | |
echo "Notebook execution completed successfully" | |
shell: bash | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install papermill kubeflow-katib jupyter ipykernel |
The kubeflow-katib SDK should be installed from source.
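As a sketch of what installing from source could look like in the workflow step: resolve the SDK directory inside the checkout and hand that path to pip instead of the package name. The `sdk/python/v1beta1` path is an assumption about the repository layout, and `katib_sdk_install_args` is a hypothetical helper; `GITHUB_WORKSPACE` is the checkout directory that GitHub Actions sets.

```shell
#!/usr/bin/env bash
# Sketch: list pip targets with the Katib SDK taken from the checked-out source.
# The sdk/python/v1beta1 path is an assumed layout; the helper name is hypothetical.
katib_sdk_install_args() {
  local sdk_dir="$1/sdk/python/v1beta1"
  if [ ! -d "$sdk_dir" ]; then
    echo "Katib SDK source not found at $sdk_dir" >&2
    return 1
  fi
  printf '%s\n' papermill jupyter ipykernel "$sdk_dir"
}

repo_root="${GITHUB_WORKSPACE:-$PWD}"
if mapfile -t deps < <(katib_sdk_install_args "$repo_root") && [ "${#deps[@]}" -gt 0 ]; then
  # Printed rather than executed, so the sketch has no side effects.
  echo "would run: pip install ${deps[*]}"
fi
```

Installing from the checkout means the notebooks exercise the SDK changes in the PR rather than the last published release.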
Hey @andreyvelich, thanks for the review!
Great points, I’ve fixed them. Let me know if anything else is needed.
Signed-off-by: Yehudit Kerido <[email protected]>
Signed-off-by: Yehudit Kerido <[email protected]>
/ok-to-test |
What this PR does / why we need it:
This PR creates E2E tests that run the Katib examples with papermill.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2417
Checklist: