Add save_to_oci_registry Python client method #800

Open
wants to merge 15 commits into
base: main
Changes from 4 commits
6 changes: 5 additions & 1 deletion clients/python/Makefile
@@ -18,8 +18,12 @@ deploy-latest-mr:
	cd ../../ && IMG_VERSION=${IMG_VERSION} make image/build && LOCAL=1 ./scripts/deploy_on_kind.sh
	kubectl port-forward -n kubeflow services/model-registry-service 8080:8080 &

.PHONY: deploy-local-registry
deploy-local-registry:
	cd ../../ && ./scripts/deploy_local_kind_registry.sh

.PHONY: test-e2e
test-e2e: deploy-latest-mr
test-e2e: deploy-latest-mr deploy-local-registry
	poetry run pytest --e2e -s -rA

.PHONY: test
8 changes: 8 additions & 0 deletions clients/python/README.md
@@ -200,4 +200,12 @@ You can use `make test` to execute `pytest`.

Check out our [recommendations on setting up your docker engine](https://github.com/kubeflow/model-registry/blob/main/CONTRIBUTING.md#docker-engine) on an ARM processor.

### Extras

Depending on your development flow, you may need to install extra dependencies:

```
poetry install -E "olot"
```

<!-- github-only -->
120 changes: 117 additions & 3 deletions clients/python/poetry.lock

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions clients/python/pyproject.toml
@@ -27,9 +27,11 @@ nest-asyncio = "^1.6.0"
eval-type-backport = "^0.2.0"

huggingface-hub = { version = ">=0.20.1,<0.29.0", optional = true }
olot = { version = "^0.1.2", optional = true }
Contributor Author

This follows the same pattern as huggingface-hub above. The dependency is optional because anyone who never calls the new "save" method never needs it. The README has been updated to reflect this.

Member

+1
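
Because `olot` is an optional extra, calling code may want to check for it before offering the OCI save path. A minimal sketch using only the standard library; `extra_is_installed` is a hypothetical helper, not part of `model_registry`.

```python
import importlib.util


def extra_is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it.

    Hypothetical helper: pass a top-level name such as "olot", not a dotted path.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if extra_is_installed("olot"):
        print("olot is available; save_to_oci_registry can be used")
    else:
        print('olot is missing; install it with: poetry install -E "olot"')
```

The function in this PR performs an equivalent check by attempting the import and raising `StoreError` on failure, so a probe like this is only useful when the caller wants to decide up front.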


[tool.poetry.extras]
hf = ["huggingface-hub"]
olot = ["olot"]

[tool.poetry.group.docs]
optional = true
72 changes: 70 additions & 2 deletions clients/python/src/model_registry/utils.py
@@ -3,11 +3,13 @@
from __future__ import annotations

import os

import pathlib
from typing import List, Mapping, Union
from typing_extensions import overload

from ._utils import required_args
from .exceptions import MissingMetadata
from .exceptions import MissingMetadata, StoreError
from .types import SupportedTypes


@overload
@@ -90,3 +92,69 @@ def s3_uri_from(
# https://alexwlchan.net/2020/s3-keys-are-not-file-paths/ nor do they resolve to valid URLs
# FIXME: is this safe?
return f"s3://{bucket}/{path}?endpoint={endpoint}&defaultRegion={region}"

def save_to_oci_registry(
    base_image: str,
    dest_dir: Union[str, os.PathLike],
    oci_ref: str,
    model_files: List[os.PathLike],
    backend: str = 'skopeo',
    modelcard: Union[os.PathLike, None] = None,
):
    """Appends a list of files to an OCI-based image.

    Args:
        base_image: The image to append model files to. This image will be downloaded to the location at `dest_dir`.
        dest_dir: The location to save the downloaded and extracted base image to.
        oci_ref: Destination to push the newly layered image to.
        model_files: List of files to add to the base_image as layers.
        backend: The CLI tool to use to perform the OCI image pull/push. One of: "skopeo", "oras".
        modelcard: Optional, path to the modelcard to additionally include as a layer.

    Raises:
        ValueError: If the chosen backend is not installed on the host.
        StoreError: If the chosen backend is an invalid option.
        StoreError: If `olot` is not installed as a Python package.

    Returns:
        None.
    """
    try:
        from olot.basics import oci_layers_on_top
    except ImportError as e:
        msg = """Package `olot` is not installed.
To save models to OCI compatible storage, start by installing the `olot` package, either directly or as an
extra (available as `model-registry[olot]`), e.g.:
```sh
!pip install --pre model-registry[olot]
```
or
```sh
!pip install olot
```
"""
        raise StoreError(msg) from e

    local_image_path = pathlib.Path(dest_dir)

    if backend == 'skopeo':
        from olot.backend.skopeo import is_skopeo, skopeo_pull, skopeo_push

        if not is_skopeo():
            raise ValueError('skopeo is selected, but it is not present on the machine. Please validate the skopeo cli is installed and available in the PATH')

        skopeo_pull(base_image, local_image_path)
        oci_layers_on_top(local_image_path, model_files, modelcard)
        skopeo_push(dest_dir, oci_ref)

    elif backend == 'oras':
        from olot.backend.oras_cp import is_oras, oras_pull, oras_push
        if not is_oras():
            raise ValueError('oras is selected, but it is not present on the machine. Please validate the oras cli is installed and available in the PATH')

        oras_pull(base_image, local_image_path)
        oci_layers_on_top(local_image_path, model_files, modelcard)
        oras_push(local_image_path, oci_ref)
Contributor Author

I chose not to abstract this away into an interface/facade because the steps are very straightforward, and I think it would over-engineer what is effectively three separate function calls.

If olot supports more backends, this will probably need to be refactored to include an interface of some kind.

Member

Yeah, but as in #800 (comment), I would consider passing all params to a monkeypatched method, so that we at least steer users toward a provider mechanism of sorts.
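
One possible shape for such a provider mechanism is sketched below. It is not olot's or model-registry's API: `BackendOps`, `register_backend`, `resolve_backend`, and `save_with_backend` are hypothetical names, `StoreError` is a local stand-in for `model_registry.exceptions.StoreError`, and the layering step is passed in as a callable so the sketch runs without `olot` installed.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Union

PathLike = Union[str, os.PathLike]


class StoreError(Exception):
    """Local stand-in for model_registry.exceptions.StoreError."""


@dataclass(frozen=True)
class BackendOps:
    """Hypothetical bundle of the three callables a backend has to provide."""

    is_available: Callable[[], bool]
    pull: Callable[[str, PathLike], None]
    push: Callable[[PathLike, str], None]


# Hypothetical registry of backend factories, keyed by backend name.
# Real entries would lazily import and wrap olot's skopeo/oras helpers.
_BACKENDS: Dict[str, Callable[[], BackendOps]] = {}


def register_backend(name: str, factory: Callable[[], BackendOps]) -> None:
    """Make a backend selectable by name; users can add their own."""
    _BACKENDS[name] = factory


def resolve_backend(name: str) -> BackendOps:
    """Look up a backend and check that its CLI tool is usable."""
    if name not in _BACKENDS:
        raise StoreError(f"Invalid backend chosen: '{name}'")
    ops = _BACKENDS[name]()
    if not ops.is_available():
        raise ValueError(f"{name} is selected, but it is not present on the machine")
    return ops


def save_with_backend(
    base_image: str,
    dest_dir: PathLike,
    oci_ref: str,
    model_files: Sequence[PathLike],
    backend: str,
    add_layers: Callable[[PathLike, Sequence[PathLike]], None],
) -> None:
    """Pull, add layers, push: the same three steps for every backend."""
    ops = resolve_backend(backend)
    ops.pull(base_image, dest_dir)
    add_layers(dest_dir, model_files)
    ops.push(dest_dir, oci_ref)
```

With this shape the `if`/`elif` chain collapses into a dictionary lookup, and a third backend becomes one `register_backend` call rather than another branch. Whether that is worth the indirection for two backends is the trade-off discussed in this thread.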


    else:
        msg = f"Invalid backend chosen: '{backend}'"
        raise StoreError(msg)
1 change: 1 addition & 0 deletions clients/python/tests/.gitignore
@@ -0,0 +1 @@
data/
21 changes: 20 additions & 1 deletion clients/python/tests/test_utils.py
@@ -1,9 +1,10 @@
import os
import pathlib

import pytest

from model_registry.exceptions import MissingMetadata
from model_registry.utils import s3_uri_from
from model_registry.utils import s3_uri_from, save_to_oci_registry


def test_s3_uri_builder():
@@ -71,3 +72,21 @@ def test_s3_uri_builder_with_complete_env():
os.environ["AWS_S3_ENDPOINT"] = "test-endpoint"
os.environ["AWS_DEFAULT_REGION"] = "test-region"
assert s3_uri_from("test-path") == s3_uri_from("test-path", "test-bucket")

@pytest.mark.e2e(type="oci")
def test_save_to_oci_registry_with_skopeo():
    # TODO: We need a good source registry which is oci-compliant and very small in size
    base_image = 'quay.io/mmortari/hello-world-wait:latest'
    dest_dir = 'tests/data'
    oci_ref = 'localhost:5001/foo/bar:latest'

    # Create a sample file named README.md to be added to the registry
    pathlib.Path(dest_dir).mkdir(parents=True, exist_ok=True)
    readme_file_path = os.path.join(dest_dir, "README.md")
    with open(readme_file_path, "w") as f:
        f.write("")

    model_files = [readme_file_path]
    backend = 'skopeo'

    save_to_oci_registry(base_image, dest_dir, oci_ref, model_files, backend)
13 changes: 13 additions & 0 deletions scripts/deploy_local_kind_registry.sh
Contributor Author

This was added so that the e2e tests cover image download and upload. The upload step needs a registry to push to, and a local registry should be good enough for this test.
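
To confirm that the push in the e2e test actually landed in the local registry, one option is to query the registry's HTTP API for the repository's tags. A minimal sketch, assuming the registry is listening on plain HTTP at `localhost:5001` and that references have the simple `host:port/repository:tag` shape (no digests, no implicit default registry); `split_oci_ref`, `tags_list_url`, and `tag_was_pushed` are hypothetical helpers, not part of this PR.

```python
import json
import urllib.request
from typing import Tuple


def split_oci_ref(oci_ref: str) -> Tuple[str, str, str]:
    """Split 'host:port/repo/path:tag' into (registry, repository, tag).

    Simplified on purpose: assumes the registry host is present and
    does not handle '@sha256:...' digest references.
    """
    registry, _, remainder = oci_ref.partition("/")
    repository, sep, tag = remainder.rpartition(":")
    if not sep:
        # No tag given; fall back to the conventional default.
        repository, tag = remainder, "latest"
    return registry, repository, tag


def tags_list_url(oci_ref: str, scheme: str = "http") -> str:
    """Build the tags-list URL of the registry HTTP API for this reference."""
    registry, repository, _ = split_oci_ref(oci_ref)
    return f"{scheme}://{registry}/v2/{repository}/tags/list"


def tag_was_pushed(oci_ref: str, scheme: str = "http") -> bool:
    """Ask the registry whether the tag in oci_ref exists (needs network access)."""
    _, _, tag = split_oci_ref(oci_ref)
    with urllib.request.urlopen(tags_list_url(oci_ref, scheme), timeout=5) as resp:
        payload = json.load(resp)
    # "tags" can be null for a repository without tags.
    return tag in (payload.get("tags") or [])
```

In the e2e test this could follow the `save_to_oci_registry` call as `assert tag_was_pushed(oci_ref)`, which would turn the test from "did not raise" into a check on the pushed result.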

@@ -0,0 +1,13 @@
#!/bin/sh
set -o errexit

# Some of this is copy-pasted from https://kind.sigs.k8s.io/docs/user/local-registry/

# 1. Create registry container unless it already exists
reg_name='local-mr-registry'
reg_port='5001'
if [ "$(docker inspect -f '{{.State.Running}}' "${reg_name}" 2>/dev/null || true)" != 'true' ]; then
  docker run \
    -d --restart=always -p "127.0.0.1:${reg_port}:5000" --network bridge --name "${reg_name}" \
    registry:2
fi
2 changes: 1 addition & 1 deletion scripts/deploy_on_kind.sh
@@ -10,7 +10,7 @@ source ./${DIR}/utils.sh

# modularity to allow re-use this script against a remote k8s cluster
if [[ -n "$LOCAL" ]]; then
CLUSTER_NAME="${CLUSTER_NAME:-kind}"
CLUSTER_NAME="${CLUSTER_NAME:-mr-e2e}"

echo 'Creating local Kind cluster and loading image'
