Python: Fixes to Cosmos DB NoSQL query syntax generation. #10373

Merged on Feb 6, 2025 (6 commits)


davidatorres

Motivation and Context

Review information:

  1. Why is this change required?

There were several SQL query syntax generation issues in both the text_search() and vectorized_search() methods.

  1. What problem does it solve?

a) In _build_where_clauses_from_filter(), the WHERE clause creation did not quote string values and did not properly handle list[] data model attributes.
b) In the _build_vector_query() method, the WHERE clause was placed after the ORDER BY clause, causing query syntax errors.
c) In the _build_search_text_query() method, the data_model_definition items were not being interrogated; the fix adds CONTAINS() to the WHERE clause.

  1. What scenario does it contribute to?

The ability to use Azure Cosmos DB NoSQL for text_search() and vectorized_search().

  1. Issue resolution:
    Python: Bug: AzureCosmosDBNoSQLCollection text_search() and vectorized_search() produce incorrect results due to malformed query strings. #10368
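
To illustrate problem (a), here is a minimal sketch of WHERE clause generation that quotes string values and treats list attributes differently from scalar ones. It is not the code from this PR: `quote_literal`, `build_where_clause`, and the `list_fields` parameter are hypothetical names, and the sketch assumes equality-only filters, the `c` container alias, and that `ARRAY_CONTAINS` is the desired check for list attributes.

```python
import json
from typing import Any


def quote_literal(value: Any) -> str:
    """Render a Python value as a query literal (hypothetical helper)."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # json.dumps yields a double-quoted string with JSON-style escapes,
    # so embedded quotes cannot terminate the literal early.
    return json.dumps(str(value))


def build_where_clause(filters: dict[str, Any], list_fields: set[str]) -> str:
    """Join equality filters with AND; list-typed fields use ARRAY_CONTAINS."""
    clauses = []
    for field, value in filters.items():
        literal = quote_literal(value)
        if field in list_fields:
            # A list attribute matches when it contains the value.
            clauses.append(f"ARRAY_CONTAINS(c.{field}, {literal})")
        else:
            clauses.append(f"c.{field} = {literal}")
    return " AND ".join(clauses)


where = build_where_clause(
    {"category": "news", "tags": "python", "year": 2025},
    list_fields={"tags"},
)
print(where)
```

Without the quoting step, the first condition would come out as `c.category = news`, which the query engine would not read as a string comparison. In real code, query parameters are generally a safer choice than inlined literals.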

Description

The bug fixes noted above address the errors described in the issue: they correct the SQL query syntax generation for the text_search() and vectorized_search() methods, which now produce accurate results for both types of queries, with and without filters (that is, with and without a WHERE clause).
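
The clause ordering described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the connector's implementation: `build_vector_query`, `build_text_query`, the `@vector` and `@search_text` parameter names, and the `c` alias are all hypothetical. The only points it tries to show are that the optional filter goes in a WHERE clause before ORDER BY, and that a text search can add one CONTAINS() condition per text field.

```python
def build_vector_query(top: int, vector_field: str, where_clause: str = "") -> str:
    """Assemble a vector search query with WHERE placed before ORDER BY."""
    parts = [f"SELECT TOP {top} *", "FROM c"]
    if where_clause:
        # The filter must come before ORDER BY to be valid syntax.
        parts.append(f"WHERE {where_clause}")
    parts.append(f"ORDER BY VectorDistance(c.{vector_field}, @vector)")
    return " ".join(parts)


def build_text_query(top: int, text_fields: list[str], where_clause: str = "") -> str:
    """Assemble a text search query with one CONTAINS() per text field."""
    contains = " OR ".join(
        f"CONTAINS(c.{name}, @search_text)" for name in text_fields
    )
    conditions = []
    if contains:
        conditions.append(f"({contains})")
    if where_clause:
        # Parenthesize so the filter combines cleanly with the OR group.
        conditions.append(f"({where_clause})")
    parts = [f"SELECT TOP {top} *", "FROM c"]
    if conditions:
        parts.append("WHERE " + " AND ".join(conditions))
    return " ".join(parts)


print(build_vector_query(3, "embedding", 'c.category = "news"'))
print(build_text_query(5, ["title", "body"], "c.year = 2025"))
```

Building the query from an ordered list of parts keeps the clause order fixed in one place, so an optional filter cannot end up after ORDER BY.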

Contribution Checklist

davidatorres requested a review from a team as a code owner February 3, 2025 10:37
markwallace-microsoft added the python (Pull requests for the Python Semantic Kernel) and memory labels Feb 3, 2025
github-actions bot changed the title from "Fixes to Cosmos DB NoSQL query syntax generation." to "Python: Fixes to Cosmos DB NoSQL query syntax generation." Feb 3, 2025

markwallace-microsoft commented Feb 3, 2025

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| semantic_kernel/connectors/memory/azure_cosmos_db | | | | |
| azure_cosmos_db_no_sql_collection.py | 171 | 67 | 61% | 98–99, 140–147, 158–172, 178–186, 192–196, 219–243, 247, 251, 275, 303–304, 308–313 |
| TOTAL | 17838 | 2015 | 89% | |

Python Unit Test Overview

Tests Skipped Failures Errors Time
3135 4 💤 0 ❌ 0 🔥 1m 26s ⏱️


moonbox3 commented Feb 3, 2025

Hi @davidatorres, thanks for the PR! Have you installed the pre-commit locally? There's a ruff error that needs to be fixed on your machine before pushing upstream. If you haven't installed the pre-commit, you can do so by running:

uv run pre-commit install -c python/.pre-commit-config.yaml

Now, when you run git commit -m "Your message", it will run the required steps and alert you that a ruff formatting fix is required.


moonbox3 commented Feb 5, 2025

Are there any further unit tests we need to exercise the new code?

davidatorres commented

No new tests needed. Updates correct the generation of the SQL queries to ensure that they are populated and formatted correctly.

moonbox3 enabled auto-merge February 6, 2025 04:01
moonbox3 added this pull request to the merge queue Feb 6, 2025
github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 6, 2025
moonbox3 added this pull request to the merge queue Feb 6, 2025
Merged via the queue into microsoft:main with commit 3461ed6 Feb 6, 2025
27 checks passed