Add save_to_oci_registry Python client method #800

Open
wants to merge 15 commits into
base: main
Changes from 8 commits
5 changes: 5 additions & 0 deletions .github/workflows/python-tests.yml
Expand Up @@ -165,6 +165,11 @@ jobs:
          pipx install --pip-args=--constraint=${{ github.workspace }}/.github/workflows/constraints.txt nox
          pipx inject --pip-args=--constraint=${{ github.workspace }}/.github/workflows/constraints.txt nox nox-poetry
          nox --version
      - name: Install Storage Clients
        # Oras should be available in the GH Action CI by default
        # See: https://oras.land/docs/installation/#runner-machine-of-azure-devops-and-github-actions
        run: |
          sudo apt-get -y install skopeo
      - name: Nox test
        working-directory: clients/python
        run: |
6 changes: 5 additions & 1 deletion clients/python/Makefile
Expand Up @@ -18,8 +18,12 @@ deploy-latest-mr:
	cd ../../ && IMG_VERSION=${IMG_VERSION} make image/build && LOCAL=1 ./scripts/deploy_on_kind.sh
	kubectl port-forward -n kubeflow services/model-registry-service 8080:8080 &

.PHONY: deploy-local-registry
deploy-local-registry:
	cd ../../ && ./scripts/deploy_local_kind_registry.sh

.PHONY: test-e2e
-test-e2e: deploy-latest-mr
+test-e2e: deploy-latest-mr deploy-local-registry
	poetry run pytest --e2e -s -rA

.PHONY: test
8 changes: 8 additions & 0 deletions clients/python/README.md
Expand Up @@ -200,4 +200,12 @@ You can use `make test` to execute `pytest`.

Check out our [recommendations on setting up your docker engine](https://github.com/kubeflow/model-registry/blob/main/CONTRIBUTING.md#docker-engine) on an ARM processor.

### Extras

Depending on your development flow, you may need to install extra dependencies:

```
poetry install -E "olot"
```

<!-- github-only -->
1 change: 1 addition & 0 deletions clients/python/noxfile.py
Expand Up @@ -77,6 +77,7 @@ def e2e_tests(session: Session) -> None:
"coverage[toml]",
"pytest-cov",
"huggingface-hub",
"olot",
)
try:
session.run(
120 changes: 117 additions & 3 deletions clients/python/poetry.lock

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions clients/python/pyproject.toml
Expand Up @@ -27,9 +27,11 @@ nest-asyncio = "^1.6.0"
eval-type-backport = "^0.2.0"

huggingface-hub = { version = ">=0.20.1,<0.29.0", optional = true }
olot = { version = "^0.1.2", optional = true }
Contributor Author

This follows the same pattern as huggingface-hub above. The dependency is optional because you never need it unless you use the new "save" method. The README was updated to reflect this.

Member

+1
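
The optional-dependency pattern discussed in this thread can be sketched as a lazy import that fails with an actionable message. This is a minimal illustration, not the PR's code: `require_optional` is a hypothetical helper, and `StoreError` here is a local stand-in for `model_registry.exceptions.StoreError`.

```python
import importlib


class StoreError(Exception):
    """Local stand-in for model_registry.exceptions.StoreError (assumption)."""


def require_optional(module_name: str, extra: str):
    """Import an optional module, or explain which extra provides it.

    Hypothetical helper: the PR inlines this logic in save_to_oci_registry.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        msg = (
            f"Package `{module_name}` is not installed. Install it directly "
            f"or through the `model-registry[{extra}]` extra."
        )
        raise StoreError(msg) from e


# A module that is always present imports normally.
json_module = require_optional("json", "none")
```

Users who never call the OCI save path never trigger the import, so the base install stays small.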


[tool.poetry.extras]
hf = ["huggingface-hub"]
olot = ["olot"]

[tool.poetry.group.docs]
optional = true
109 changes: 108 additions & 1 deletion clients/python/src/model_registry/utils.py
Expand Up @@ -3,11 +3,13 @@
from __future__ import annotations

import os
from pathlib import Path
from typing import Callable, TypedDict

from typing_extensions import overload

from ._utils import required_args
-from .exceptions import MissingMetadata
+from .exceptions import MissingMetadata, StoreError


@overload
Expand Down Expand Up @@ -90,3 +92,108 @@ def s3_uri_from(
    # https://alexwlchan.net/2020/s3-keys-are-not-file-paths/ nor do they resolve to valid URLs
    # FIXME: is this safe?
    return f"s3://{bucket}/{path}?endpoint={endpoint}&defaultRegion={region}"


class BackendDefinition(TypedDict):
    """Holds the 3 core callables for a backend:
    - is_available() -> bool
    - pull(base_image: str, dest_dir: Path) -> None
    - push(local_image_path: Path, oci_ref: str) -> None.
    """

    # Key name matches the dicts returned by the backend factories below
    is_available: Callable[[], bool]
    pull: Callable[[str, Path], None]
    push: Callable[[Path, str], None]


# A dict mapping backend names to factories that return their definitions
BackendDict = dict[str, Callable[[], BackendDefinition]]


def get_skopeo_backend() -> BackendDefinition:
    try:
        from olot.backend.skopeo import is_skopeo, skopeo_pull, skopeo_push
    except ImportError as e:
        msg = "Could not import 'olot.backend.skopeo'. Ensure that 'olot' is installed if you want to use the 'skopeo' backend."
        raise ImportError(msg) from e

    return {
        "is_available": is_skopeo,
        "pull": skopeo_pull,
        "push": skopeo_push,
    }

def get_oras_backend() -> BackendDefinition:
    try:
        from olot.backend.oras_cp import is_oras, oras_pull, oras_push
    except ImportError as e:
        # The backend is registered under the name "oras"; the module is 'olot.backend.oras_cp'
        msg = "Could not import 'olot.backend.oras_cp'. Ensure that 'olot' is installed if you want to use the 'oras' backend."
        raise ImportError(msg) from e

    return {
        "is_available": is_oras,
        "pull": oras_pull,
        "push": oras_push,
    }

DEFAULT_BACKENDS = {
    "skopeo": get_skopeo_backend,
    "oras": get_oras_backend,
}

def save_to_oci_registry(
    base_image: str,
    dest_dir: str | os.PathLike,
    oci_ref: str,
    model_files: list[os.PathLike],
    backend: str = "skopeo",
    modelcard: os.PathLike | None = None,
    backend_registry: BackendDict | None = DEFAULT_BACKENDS,
) -> None:
    """Appends a list of files to an OCI-based image.

    Args:
        base_image: The image to append model files to. This image will be downloaded to the location at `dest_dir`
        dest_dir: The location to save the downloaded and extracted base image to.
        oci_ref: Destination to push the newly layered image to
        model_files: List of files to add to the base_image as layers
        backend: The CLI tool to use to perform the oci image pull/push. One of: "skopeo", "oras"
        modelcard: Optional, path to the modelcard to additionally include as a layer
        backend_registry: Optional, mapping of backend names to factories returning a `BackendDefinition`.
            Defaults to `DEFAULT_BACKENDS`

    Raises:
        ValueError: If the chosen backend is not a valid option
        ValueError: If the chosen backend is not installed on the host
        StoreError: If `olot` is not installed as a python package
    Returns:
        None.
    """
    try:
        from olot.basics import oci_layers_on_top
    except ImportError as e:
        msg = """Package `olot` is not installed.
        To save models to OCI compatible storage, start by installing the `olot` package, either directly or as an
        extra (available as `model-registry[olot]`), e.g.:
        ```sh
        !pip install --pre model-registry[olot]
        ```
        or
        ```sh
        !pip install olot
        ```
        """
        raise StoreError(msg) from e

    # The signature allows None, so fall back to the defaults instead of failing on the membership test
    if backend_registry is None:
        backend_registry = DEFAULT_BACKENDS

    if backend not in backend_registry:
        msg = f"'{backend}' is not an available backend to use. Available backends: {', '.join(backend_registry)}"
        raise ValueError(msg)

    # Fetching the backend definition can throw an error, but it should bubble up as it has the appropriate messaging
    backend_def = backend_registry[backend]()

    # The backend factories return the availability check under the "is_available" key
    if not backend_def["is_available"]():
        msg = f"Backend '{backend}' is selected, but not available on the system. Ensure the dependencies for '{backend}' are installed in your environment."
        raise ValueError(msg)

    local_image_path = Path(dest_dir)
    backend_def["pull"](base_image, local_image_path)
    oci_layers_on_top(local_image_path, model_files, modelcard)
    backend_def["push"](local_image_path, oci_ref)
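
Because `backend_registry` is injectable, the lookup, availability check, pull, and push sequence can be exercised without `skopeo`, `oras`, or a registry. The sketch below is illustrative only: `make_recording_backend` and `run_with_backend` are hypothetical names, the `olot` layering step is left out, and the image references are placeholders.

```python
from pathlib import Path
from typing import Callable, Dict, List, TypedDict


class BackendDefinition(TypedDict):
    is_available: Callable[[], bool]
    pull: Callable[[str, Path], None]
    push: Callable[[Path, str], None]


def make_recording_backend(calls: List[str]) -> BackendDefinition:
    """Hypothetical backend that records calls instead of running a CLI."""

    def pull(base_image: str, dest_dir: Path) -> None:
        calls.append(f"pull {base_image}")

    def push(local_image_path: Path, oci_ref: str) -> None:
        calls.append(f"push {oci_ref}")

    return {"is_available": lambda: True, "pull": pull, "push": push}


def run_with_backend(
    registry: Dict[str, Callable[[], BackendDefinition]],
    backend: str,
    base_image: str,
    dest_dir: str,
    oci_ref: str,
) -> None:
    """Simplified stand-in for the backend flow; the layering step is omitted."""
    if backend not in registry:
        raise ValueError(f"'{backend}' is not an available backend")
    backend_def = registry[backend]()
    if not backend_def["is_available"]():
        raise ValueError(f"Backend '{backend}' is not available on the system")
    local_image_path = Path(dest_dir)
    backend_def["pull"](base_image, local_image_path)
    backend_def["push"](local_image_path, oci_ref)


calls: List[str] = []
registry = {"fake": lambda: make_recording_backend(calls)}
run_with_backend(registry, "fake", "example.com/base:latest", "tests/data", "localhost:5001/foo/bar:latest")
```

Passing a registry like this one as `backend_registry` would presumably let unit tests cover the error paths (unknown backend, unavailable backend) without the e2e setup.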
1 change: 1 addition & 0 deletions clients/python/tests/.gitignore
@@ -0,0 +1 @@
data/
21 changes: 20 additions & 1 deletion clients/python/tests/test_utils.py
@@ -1,9 +1,10 @@
import os
import pathlib

import pytest

from model_registry.exceptions import MissingMetadata
-from model_registry.utils import s3_uri_from
+from model_registry.utils import s3_uri_from, save_to_oci_registry


def test_s3_uri_builder():
Expand Down Expand Up @@ -71,3 +72,21 @@ def test_s3_uri_builder_with_complete_env():
    os.environ["AWS_S3_ENDPOINT"] = "test-endpoint"
    os.environ["AWS_DEFAULT_REGION"] = "test-region"
    assert s3_uri_from("test-path") == s3_uri_from("test-path", "test-bucket")

@pytest.mark.e2e(type="oci")
def test_save_to_oci_registry_with_skopeo():
    # TODO: We need a good source registry which is oci-compliant and very small in size
    base_image = "quay.io/mmortari/hello-world-wait:latest"
    dest_dir = "tests/data"
    oci_ref = "localhost:5001/foo/bar:latest"

    # Create a sample file named README.md to be added to the registry
    pathlib.Path(dest_dir).mkdir(parents=True, exist_ok=True)
    readme_file_path = os.path.join(dest_dir, "README.md")
    with open(readme_file_path, "w") as f:
        f.write("")

    model_files = [readme_file_path]
    backend = "skopeo"

    save_to_oci_registry(base_image, dest_dir, oci_ref, model_files, backend)
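
The test above writes its sample file under `tests/data`, which is why that directory is git-ignored. As an alternative sketch (not what the PR does), the fixture files could live in a temporary directory that cleans itself up. `make_sample_model_files` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import List


def make_sample_model_files(root: Path, names: List[str]) -> List[Path]:
    """Create empty placeholder files under root and return their paths."""
    root.mkdir(parents=True, exist_ok=True)
    paths = []
    for name in names:
        path = root / name
        path.write_text("")  # empty content, like the README.md in the test
        paths.append(path)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    files = make_sample_model_files(Path(tmp) / "data", ["README.md"])
    # Record what exists while the temporary directory is still alive
    created = [f.name for f in files if f.exists()]
```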
13 changes: 13 additions & 0 deletions scripts/deploy_local_kind_registry.sh
Contributor Author

This was added as a way to have the e2e tests include image download/upload. A registry is needed to perform the upload step, and so a local registry should be good enough to perform this test.
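
Before running the OCI e2e test, it can help to confirm the local registry is answering. The sketch below assumes the registry is published on `localhost:5001`, as in the script in this PR, and probes the `/v2/` base endpoint that OCI-compliant registries expose. Both function names are hypothetical.

```python
import urllib.error
import urllib.request


def registry_base_url(host: str = "localhost", port: int = 5001) -> str:
    """Build the base API URL of a registry; the defaults mirror the local script."""
    return f"http://{host}:{port}/v2/"


def registry_is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP at url, False on connection errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, even if with an error status such as 401
        return True
    except OSError:
        # Connection refused, timeout, DNS failure, and similar
        return False
```

A call such as `registry_is_up(registry_base_url())` could gate the test with a skip rather than a hard failure when the registry container is not running.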

@@ -0,0 +1,13 @@
#!/bin/sh
set -o errexit

# Some of this is copy-pasted from https://kind.sigs.k8s.io/docs/user/local-registry/

# 1. Create registry container unless it already exists
reg_name='local-mr-registry'
reg_port='5001'
if [ "$(docker inspect -f '{{.State.Running}}' "${reg_name}" 2>/dev/null || true)" != 'true' ]; then
  docker run \
    -d --restart=always -p "127.0.0.1:${reg_port}:5000" --network bridge --name "${reg_name}" \
    registry:2
fi
2 changes: 1 addition & 1 deletion scripts/deploy_on_kind.sh
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ source ./${DIR}/utils.sh

# modularity to allow re-use this script against a remote k8s cluster
if [[ -n "$LOCAL" ]]; then
-  CLUSTER_NAME="${CLUSTER_NAME:-kind}"
+  CLUSTER_NAME="${CLUSTER_NAME:-mr-e2e}"

  echo 'Creating local Kind cluster and loading image'
