[Enhancement] Enhance validation for create connector API - Part 3 #3967


Open
wants to merge 3 commits into
base: main

Conversation

akolarkunnu
Contributor

@akolarkunnu akolarkunnu commented Jul 8, 2025

Description

This change addresses the third part of the validation work: validation of the pre- and post-processing functions for embeddings.
If the text "embed" appears in the URL, the check verifies that the pre- and post-processing functions are embedding functions. If they are not, a warning message is written to the log.
Whether the pre- and post-processing functions correspond to the LLM service is already validated as part of #3579.
Added a test case for the negative scenario; the positive scenarios are already covered by the existing tests in ConnectorActionTest.
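
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `EmbeddingFunctionCheck` and the method `warnIfNotEmbedding` are hypothetical names, the case-insensitive match and the "function name contains `embedding`" rule are assumptions, and the function names in `main` are only sample strings. It also assumes the caller has already established that the value is a built-in function name rather than a script.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper for illustration; not the PR's actual implementation. */
final class EmbeddingFunctionCheck {

    private EmbeddingFunctionCheck() {}

    /**
     * Returns a warning message when the endpoint URL looks like an embedding
     * endpoint but the process function name does not look like an embedding
     * function. Returns empty when there is nothing to warn about.
     */
    static Optional<String> warnIfNotEmbedding(String endpointUrl, String processFunction, String funcNameForWarnText) {
        if (endpointUrl == null || processFunction == null) {
            return Optional.empty(); // process functions are optional fields
        }
        // Assumption: match case-insensitively on both sides.
        boolean embeddingEndpoint = endpointUrl.toLowerCase(Locale.ROOT).contains("embed");
        boolean embeddingFunction = processFunction.toLowerCase(Locale.ROOT).contains("embedding");
        if (embeddingEndpoint && !embeddingFunction) {
            return Optional.of(String.format(
                "URL %s looks like an embedding endpoint, but %s '%s' is not an embedding function",
                endpointUrl, funcNameForWarnText, processFunction));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Embedding endpoint paired with a non-embedding function: a warning is produced.
        warnIfNotEmbedding("https://api.openai.com/v1/embeddings",
                "connector.pre_process.cohere.rerank", "pre_process_function")
            .ifPresent(System.out::println);
    }
}
```

Returning the message instead of logging it directly keeps the sketch easy to test; a caller would pass the message to its logger at warn level and continue with connector creation.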

Related Issues

Resolves #2993

Check List

  • New functionality includes testing.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

This change will address the third part of validation "embeddings pre and post processing function validation".

Resolves opensearch-project#2993

Signed-off-by: Abdul Muneer Kolarkunnu <[email protected]>
@akolarkunnu akolarkunnu requested a deployment to ml-commons-cicd-env-require-approval July 8, 2025 07:32 — with GitHub Actions Waiting
@akolarkunnu akolarkunnu temporarily deployed to ml-commons-cicd-env-require-approval July 8, 2025 07:43 — with GitHub Actions Inactive
@akolarkunnu akolarkunnu had a problem deploying to ml-commons-cicd-env-require-approval July 8, 2025 07:43 — with GitHub Actions Error
@akolarkunnu akolarkunnu had a problem deploying to ml-commons-cicd-env-require-approval July 8, 2025 07:43 — with GitHub Actions Failure
@@ -232,7 +234,7 @@ public static String getRemoteServerFromURL(String url) {
         return SUPPORTED_REMOTE_SERVERS_FOR_DEFAULT_ACTION_TYPES.stream().filter(url::contains).findFirst().orElse("");
     }

-    private void validateProcessFunctions(String remoteServer, String processFunction, String funcNameForWarnText) {
+    private void validateProcessFunctions(String endPointUrl, String remoteServer, String processFunction, String funcNameForWarnText) {
         if (isInBuiltProcessFunction(processFunction)) {
Collaborator

Based on the issue, I believe we should block connector creation if the input process function doesn't exist. I didn't find such validation in your PRs. Am I missing something?

Contributor Author

Thanks @zane-neo for the review.
Two PRs have already been done toward fixing this bug, and this is the third part of it:
#3260
#3579

Validation of the required parameters for connector creation was done as part of #3260. Among the parameters, the process functions are optional fields according to https://docs.opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/#configuration-parameters, so we didn't add any validation for them. All other parameters were also discussed in that PR.

Collaborator

I understand it's an optional field, but the point is: if it shows up, it's either a Painless script or a built-in process function, not any other value. So I believe the validation should try to compile the value as a Painless script and, if an exception occurs, then check whether it's a valid value in Pre/PostProcessFunction. Without this validation, users can still pass any value to the process_function field and get no error message telling them it's incorrect; at runtime they find the function not working at all, without knowing the root cause.
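
A minimal sketch of the validation order proposed above, under stated assumptions. `ProcessFunctionValidator` and the `ScriptCompiler` interface are hypothetical stand-ins invented for illustration; `ScriptCompiler` is not an OpenSearch API, and the set of built-in names is supplied by the caller rather than read from Pre/PostProcessFunction.

```java
import java.util.Set;

/** Hypothetical sketch: compile first, then fall back to the built-in names. */
final class ProcessFunctionValidator {

    /** Stand-in for a script compiler; NOT an OpenSearch interface. */
    @FunctionalInterface
    interface ScriptCompiler {
        void compile(String source) throws Exception;
    }

    private final ScriptCompiler compiler;
    private final Set<String> builtInFunctions;

    ProcessFunctionValidator(ScriptCompiler compiler, Set<String> builtInFunctions) {
        this.compiler = compiler;
        this.builtInFunctions = Set.copyOf(builtInFunctions);
    }

    /** Throws IllegalArgumentException if the value is neither a script nor a built-in name. */
    void validate(String fieldName, String value) {
        if (value == null) {
            return; // the field is optional, so absence is valid
        }
        try {
            compiler.compile(value);
            return; // compiles as a script: accept
        } catch (Exception compileError) {
            // Not a compilable script; fall through to the built-in check.
        }
        if (!builtInFunctions.contains(value)) {
            throw new IllegalArgumentException(
                fieldName + " is neither a compilable script nor a built-in process function: " + value);
        }
    }

    public static void main(String[] args) {
        // Fake compiler for the demo: treats anything without ';' as a compile failure.
        ScriptCompiler fake = source -> {
            if (!source.contains(";")) {
                throw new IllegalArgumentException("not a script");
            }
        };
        ProcessFunctionValidator validator =
            new ProcessFunctionValidator(fake, Set.of("connector.pre_process.openai.embedding"));
        validator.validate("pre_process_function", "connector.pre_process.openai.embedding");
        System.out.println("built-in name accepted");
    }
}
```

Taking the compiler as a constructor argument keeps the validator independent of where a real compiler instance would come from, which is the open question in this thread.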

Contributor Author

@akolarkunnu akolarkunnu Jul 17, 2025

@zane-neo I agree; that would be the ideal validation here. But there is a technical limitation in implementing it. From my code walkthrough, a ScriptService instance is not available in the connector creation code flow. It only becomes available during the actual processing of the pre- or post-process functions (inside the class DefaultPreProcessFunction). Is there any way I can get a ScriptService instance in the ConnectorActions class?

Development

Successfully merging this pull request may close these issues.

[Enhancement] Enhance validation for create connector API
2 participants