USHIFT-5287: RHOAI Model Serving On MicroShift #1737
base: master
Conversation
@pmtk: This pull request references USHIFT-5287 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/uncc @jhjaggars @LalatenduMohanty
> - If you want to use graphics processing units (GPUs) with your model server, you have enabled GPU support in OpenShift AI.
> - To use the vLLM runtime, you have enabled GPU support in OpenShift AI and have installed and configured the Node Feature Discovery operator on your cluster.

Neither the GPU Operator, NFD, nor the Intel Gaudi AI Accelerator Operator is supported on MicroShift.
I thought we had enabled specific GPU support for MicroShift, e.g. https://docs.nvidia.com/datacenter/cloud-native/edge/latest/nvidia-gpu-with-device-edge.html
Well, that should help then. At least with NVIDIA.
So one piece would be part of the procedure you linked; the other part would be using NVIDIA Triton.
Neither the GPU Operator, NFD, nor the Intel Gaudi AI Accelerator Operator is supported on MicroShift.

Should users be instructed to use upstream releases of these components and configure them on their own?
Rather not; I would prefer everything to be fully supported. I hate it when we point to upstream community stuff. I would rather work with partners to get them to support MicroShift (like we do with NVIDIA).
The link in your previous comment suggests we should be good to go with NVIDIA, so I don't think it's a problem (not sure about Intel).
/cc @lburgazzoli
@pmtk: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Thanks for the document, very well detailed.
I have some comments to clarify the scope, but I don't see any particular issue with the proposal.
- Provide RHOAI-supported ServingRuntimes CRs so that users can use them.
- [List of supported model-serving runtimes](https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/2.16/html/serving_models/serving-large-models_serving-large-models#supported-model-serving-runtimes_serving-large-models)
  (not all might be suitable for MicroShift, e.g. some are intended for multi-model serving)
I would start with vLLM-CUDA (there is an image per accelerator) to cover LLMs, with OpenVINO as a second priority.
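For concreteness, here is a minimal sketch of what an RPM-shipped ServingRuntime CR for OpenVINO Model Server could look like. The image reference, arguments, and ports are illustrative assumptions, not the values RHOAI actually ships:

```yaml
# Sketch of a ServingRuntime CR (KServe v1alpha1 API); image and args are
# illustrative placeholders, not the RHOAI-shipped values.
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: ovms-runtime
spec:
  supportedModelFormats:
    - name: openvino_ir
      version: opset13
      autoSelect: true
  containers:
    - name: kserve-container
      image: quay.io/example/openvino-model-server:latest  # placeholder image
      args:
        - --model_name={{.Name}}
        - --model_path=/mnt/models
        - --rest_port=8888
      ports:
        - containerPort: 8888
          protocol: TCP
```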
### Non-Goals

- Deploying full RHOAI on MicroShift.
Does it mean "bare KServe" without RHOAI operator at all?
Yes. MicroShift is intended for edge deployments, so we have neither the resources nor the need for the whole suite of tools. Just serving the models.
At the time of writing this document, kserve's Raw Deployment mode is not fully supported by RHOAI. For that reason, this feature will start as Tech Preview and only advance to GA when RHOAI starts supporting Raw Deployment mode.
This is going to change in about one release, so I think we can already assume here that it will be fully supported.
we shall work with partners to achieve that. We want to avoid directing users to generic upstream information without any support.

### Do we need ODH Model Controller?
It is something to double-check: for example, this controller has some logic for converting an "ODH Connection" (an opinionated secret) into model storage credentials, among many other features (see the different reconcilers as a reference).
- Include ServingRuntimes in an RPM as a kustomization manifest (might be `microshift-kserve`, `microshift-rhoai-runtimes`, or something else) using a specific predefined namespace.
- This will force users to either use that namespace for serving, or to copy the SRs to their namespace (either at runtime, or by including them in their manifests).
I think this is the easiest way to start with.
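As an illustration of the copy option, a user could re-publish the shipped runtime into their own namespace with a small kustomization; the namespace and file names here are assumptions:

```yaml
# kustomization.yaml in the user's manifests (names are hypothetical):
# re-creates a copy of the RPM-shipped ServingRuntime in the serving namespace.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: my-models
resources:
  - servingruntime-ovms.yaml   # local copy of the shipped ServingRuntime CR
```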
- Change ServingRuntimes to ClusterServingRuntimes and include them in an RPM, so they're accessible from any namespace (MicroShift is intended for single-user operation anyway).
From a MicroShift user perspective this is probably the best option, and we are considering adopting ClusterServingRuntime in the future, but not sure when.
Yup, I also think it'd be the most convenient.
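For comparison, the cluster-scoped variant would only differ in kind and the lack of a namespace; this sketch assumes the upstream KServe ClusterServingRuntime API, which carries the same spec:

```yaml
# Sketch of the same runtime as a cluster-scoped resource; spec fields
# mirror the namespaced ServingRuntime sketch above.
apiVersion: serving.kserve.io/v1alpha1
kind: ClusterServingRuntime
metadata:
  name: ovms-runtime   # no metadata.namespace: visible from any namespace
spec:
  supportedModelFormats:
    - name: openvino_ir
      autoSelect: true
  containers:
    - name: kserve-container
      image: quay.io/example/openvino-model-server:latest  # placeholder
```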
While it might be best to version the RPM with the RHOAI version, RHOAI does not follow OpenShift's release schedule and the RPM will live in the MicroShift repository, so it will be versioned together with MicroShift.
What is the release frequency of MicroShift?
MicroShift releases together with OpenShift, so I think around every 4 months.
First, a smoke test in MicroShift's test harness:
- Stand up a MicroShift device/cluster
- Install kserve
- Create an InferenceService using some basic model that can be easily handled by a CPU
OK, if this test case is CPU-only, I would cover only OpenVINO.
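A CPU-only smoke test along those lines might create an InferenceService like the following; the runtime name, model format, and storage URI are placeholders carried over from the sketches above:

```yaml
# Hypothetical smoke-test InferenceService; RawDeployment mode is assumed
# since MicroShift has no Serverless/Service Mesh stack.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: smoke-test-model
  annotations:
    serving.kserve.io/deploymentMode: RawDeployment
spec:
  predictor:
    model:
      modelFormat:
        name: openvino_ir
      runtime: ovms-runtime                      # assumed runtime name
      storageUri: oci://quay.io/example/smoke-test-model:latest  # placeholder
```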
For the most part, RHOAI's and/or kserve's support procedures are to be followed.

There might be some cases, though, where debugging MicroShift is required. One example is Ingress and Routes, as this is the element that kserve will
What about endpoint security?
In RHOAI we use Service Mesh + Authorino in the Serverless configuration, while for RawDeployment we expect to use oauth-proxy (it will be replaced by something else that has not been defined yet).