[Enhancement] Enhance validation for create connector API - Part 3 #3967
This change will address the third part of validation "embeddings pre and post processing function validation". Resolves opensearch-project#2993 Signed-off-by: Abdul Muneer Kolarkunnu <[email protected]>
@@ -232,7 +234,7 @@ public static String getRemoteServerFromURL(String url) {
         return SUPPORTED_REMOTE_SERVERS_FOR_DEFAULT_ACTION_TYPES.stream().filter(url::contains).findFirst().orElse("");
     }

-    private void validateProcessFunctions(String remoteServer, String processFunction, String funcNameForWarnText) {
+    private void validateProcessFunctions(String endPointUrl, String remoteServer, String processFunction, String funcNameForWarnText) {
         if (isInBuiltProcessFunction(processFunction)) {
Based on the issue, I believe we should block the connector creation if the input process function doesn't exist. But based on your PRs I didn't find such validation, am I missing something?
Thanks @zane-neo for the review.
Two PRs have already been done toward fixing this bug, and this is the third part of it:
#3260
#3579.
Validation of the required parameters for connector creation was done as part of #3260. Process functions, by contrast, are optional fields according to https://docs.opensearch.org/docs/latest/ml-commons-plugin/remote-models/blueprints/#configuration-parameters, so we didn't add any validation for them. All other parameters were also discussed in that PR.
I understand it's an optional field, but the point is: if it is present, it is either a Painless script or a built-in process function, not any other value. So I believe the validation should try to compile the value as a Painless script and, if an exception occurs, check whether it is a valid value in Pre/PostProcessFunction. Without this validation, a user can still pass any value to the process_function field and get no error message telling them it is incorrect; at runtime they find the function not working at all, without knowing the root cause.
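The order of checks proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ProcessFunctionValidator`, `ScriptCompiler`, and the `builtIns` set are hypothetical names, and `ScriptCompiler` merely stands in for whatever would compile a Painless script; it is not the real ScriptService API.

```java
import java.util.Set;

final class ProcessFunctionValidator {

    // Hypothetical stand-in for script compilation (not the real ScriptService API).
    // Implementations throw if the source is not a valid script.
    interface ScriptCompiler {
        void compile(String source) throws Exception;
    }

    // Accepts the value if it is absent (optional field), compiles as a script,
    // or matches a known built-in function name; rejects anything else.
    static void validate(String value, ScriptCompiler compiler, Set<String> builtIns) {
        if (value == null) {
            return; // process functions are optional
        }
        try {
            compiler.compile(value);
        } catch (Exception e) {
            // Not a valid script: fall back to the built-in function names.
            if (!builtIns.contains(value)) {
                throw new IllegalArgumentException(
                    "process function is neither a valid script nor a built-in function: " + value, e);
            }
        }
    }
}
```

With this shape, an unknown value fails at connector creation time with an explicit message instead of silently misbehaving at runtime.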
@zane-neo I agree, that would be the ideal validation here. But there is a technical limitation to implementing it. From my code walkthrough, a ScriptService instance is not available during the connector creation code flow. It only becomes available during the actual processing of pre or post process functions (inside the class DefaultPreProcessFunction). Is there any way I can get a ScriptService instance in the ConnectorActions class?
Description
This change addresses the third part of the validation: "validation of pre and post embeddings processing functions".
If the text "embed" exists in the URL, make sure the pre and post processing functions are embedding functions. If the check fails, a warning message is shown in the log.
Pre and post processing functions corresponding to the LLM service are already validated as part of #3579.
Added a test case for the negative scenario; positive scenarios are already covered by the existing tests in ConnectorActionTest.
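As a rough illustration of the check described above (not the code in this PR), the rule could look like the sketch below. The class name `EmbeddingFunctionCheck`, the `connector.` prefix used to recognize built-in functions, and the `embedding` substring test are all assumptions made for the example; the sketch returns the warning text instead of writing to a logger so it stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

final class EmbeddingFunctionCheck {

    // Returns the warnings that would be logged; an empty list means nothing to report.
    static List<String> check(String endPointUrl, String preProcessFunction, String postProcessFunction) {
        List<String> warnings = new ArrayList<>();
        // The rule only applies when the endpoint URL contains the text "embed".
        if (endPointUrl == null || !endPointUrl.contains("embed")) {
            return warnings;
        }
        addWarningIfNotEmbedding(warnings, preProcessFunction, "pre_process_function");
        addWarningIfNotEmbedding(warnings, postProcessFunction, "post_process_function");
        return warnings;
    }

    private static void addWarningIfNotEmbedding(List<String> warnings, String function, String fieldName) {
        // Assumption: built-in functions start with "connector." and
        // embedding functions contain "embedding" in their name.
        boolean isBuiltIn = function != null && function.startsWith("connector.");
        if (isBuiltIn && !function.contains("embedding")) {
            warnings.add("The URL looks like an embedding endpoint, but " + fieldName
                + " is not an embedding function: " + function);
        }
    }
}
```

Custom scripts and missing values pass through untouched here, which matches the description's intent of warning rather than blocking connector creation.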
Related Issues
Resolves #2993
Check List
--signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.