
for same query_text refresh just execution once #7295

Open: wants to merge 7 commits into base: master

Conversation

@gaecoli (Member) commented Jan 24, 2025

What type of PR is this?

  • Refactor
  • Feature
  • Bug Fix
  • New Query Runner (Data Source)
  • New Alert Destination
  • Other

Description

How is this tested?

  • Unit tests (pytest, jest)
  • E2E Tests (Cypress)
  • Manually
  • N/A

Related Tickets & Documents

Mobile & Desktop Screenshots/Recordings (if there are UI changes)

@gaecoli requested review from arikfr and justinclift on January 24, 2025 03:46
@gaecoli (Member Author) commented Jan 24, 2025

This PR introduces a mechanism to prevent duplicate execution of the same SQL query in a distributed environment. By implementing a distributed locking mechanism using Redis, we ensure that only one process or thread can execute a specific SQL query at a given time, avoiding unnecessary load on the database and ensuring consistent query results.
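For readers following along, here is a minimal sketch of that idea (not the PR's actual code; the helper name and the standalone client are illustrative). A lock key derived from the query hash is taken with an atomic SET NX EX together with a random per-acquirer identifier, and only the worker that wins the lock runs the query:

import uuid

import redis

# Illustrative client; the PR itself uses redash's shared redis_connection.
r = redis.Redis(decode_responses=True)


def acquire_lock(lock_name, timeout=60):
    """Try to take the lock; return our identifier on success, None otherwise."""
    identifier = str(uuid.uuid4())
    # SET with nx=True / ex=timeout is atomic: the key is created only if it
    # does not already exist, and it expires automatically so a crashed
    # worker cannot hold the lock forever.
    if r.set(lock_name, identifier, nx=True, ex=timeout):
        return identifier
    return None


# Example key shape, taken from a log line quoted later in this thread:
# lock:query_hash_job:1:d6338e9508dd103771b69483fb17d4a5

The identifier returned here is what lets the owner release the lock safely; that point comes up again further down the thread.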

@gaecoli requested a review from eradman on January 24, 2025 04:13
@@ -3,7 +3,7 @@ on:
   push:
     branches:
       - master
-  pull_request_target:
+  pull_request:
Member (review comment on the workflow diff above):

@gaecoli I made this change to match the changes on master. Not sure why it shows up in the diff, but it's fine.

@arikfr (Member) commented Feb 4, 2025

@gaecoli I wonder if we really need this? I mean, the chance of the exact same query being executed at the same time is usually low compared to the complexity and potential issues this might add.

Do you have this happen frequently?

@gaecoli (Member Author) commented Feb 5, 2025

@gaecoli I wonder if we really need this? I mean, the chance of the exact same query being executed at the same time is usually low compared to the complexity and potential issues this might add.

Do you have this happen frequently?

OK, I will consider that! Thank you, @arikfr!

@gaecoli (Member Author) commented Feb 5, 2025

@gaecoli I wonder if we really need this? I mean, the chance of the exact same query being executed at the same time is usually low compared to the complexity and potential issues this might add.

Do you have this happen frequently?

When I add the same query to a dashboard and that query has a lot of visualizations (table, chart, ...), refreshing the dashboard makes the same query text execute many times in the query engine (Presto, MySQL, Doris, ...).

At my company, 1000+ people may refresh the same dashboard, so you can see what I mean.

@eradman (Collaborator) commented Feb 6, 2025

Unfortunately this does happen more often than I would have guessed. Any time multiple visualizations of the same query are included on a dashboard, there is a high probability of duplicate queries:

SELECT regexp_matches(query, 'Query Hash: [a-z0-9]+') FROM pg_stat_activity WHERE state='active';
\watch 1
...
                  regexp_matches
--------------------------------------------------
 {"Query Hash: 5aa874345926f2b18ecf197d3200a602"}
 {"Query Hash: 5aa874345926f2b18ecf197d3200a602"}
(2 rows)

For very long queries (not uncommon for my users!) this will cause unnecessary load.

@arikfr (Member) commented Feb 6, 2025

Maybe the right thing would be to make dashboard refreshes smarter and reuse the same query invocation for the different visualizations that depend on it? I think that would be more robust and address the core issue instead of trying to address it in an indirect way.

@eradman (Collaborator) commented Feb 6, 2025

Maybe the right thing would be to make dashboard refreshes smarter and reuse the same query invocation for the different visualizations that depend on it

That seems like a good approach, unless this also becomes complex. My guess is that a dashboard is the only API user that would normally hit this race condition.

@gaecoli (Member Author) commented Feb 7, 2025

Maybe the right thing would be to make dashboard refreshes smarter and reuse the same query invocation for the different visualizations that depend on it? I think that would be more robust and address the core issue instead of trying to address it in an indirect way.

This is a good approach, but I think such a change would make the code more complex.

@eradman (Collaborator) commented Feb 7, 2025

Tested this change manually; it does seem to work. I also see the log messages:

server-1 | [2025-02-07 13:56:32,160][PID:19][INFO][redash.utils.locks] Lock released successfully, lock_name=[lock:query_hash_job:1:d6338e9508dd103771b69483fb17d4a5], identifier=[73683a39-6aea-4f29-b0d3-02e4c0385a0b]

@gaecoli (Member Author) commented Feb 8, 2025

Tested this change manually; it does seem to work. I also see the log messages:

server-1 | [2025-02-07 13:56:32,160][PID:19][INFO][redash.utils.locks] Lock released successfully, lock_name=[lock:query_hash_job:1:d6338e9508dd103771b69483fb17d4a5], identifier=[73683a39-6aea-4f29-b0d3-02e4c0385a0b]

Yes, it works well at my company's Redash!

@yoshiokatsuneo (Contributor) commented Feb 22, 2025

@gaecoli

I think the code does not need to use/check the "identifier", as the lock and release are always done synchronously by the same process while it holds the lock. (I removed this comment for the reason mentioned in the next comment.)

@yoshiokatsuneo (Contributor) commented:

@gaecoli

I think the code does not need to use/check the "identifier", as the lock and release are always done synchronously by the same process while it holds the lock.

I read the following article and now understand that the "identifier" is needed for safety:
https://redis.io/docs/latest/develop/use/patterns/distributed-locks/
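To spell out why the identifier matters: a release that blindly deleted the key could remove a lock that had already expired and been re-acquired by another worker, so the owner must check that the key still holds its own identifier before deleting it. Here is a sketch of such a release using the WATCH/MULTI pattern, consistent with the "from redis import WatchError" line visible in the locks.py diff later in this thread (the function name and client handling are illustrative, not necessarily the PR's exact code):

from redis import WatchError


def release_lock(redis_client, lock_name, identifier):
    """Delete the lock key only if it still holds our identifier."""
    with redis_client.pipeline() as pipe:
        try:
            # WATCH aborts the transaction if the key changes before EXEC.
            pipe.watch(lock_name)
            value = pipe.get(lock_name)
            # Compare against both str and bytes, depending on client settings.
            if value is not None and value in (identifier, identifier.encode()):
                pipe.multi()
                pipe.delete(lock_name)
                pipe.execute()
                return True
            # The lock expired or belongs to someone else; leave it alone.
            pipe.unwatch()
            return False
        except WatchError:
            # The key changed between GET and EXEC, so we no longer own it.
            return False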

@gaecoli (Member Author) commented Mar 8, 2025

@gaecoli
I think the code does not need to use/check the "identifier", as the lock and release are always done synchronously by the same process while it holds the lock.

I read the following article and now understand that the "identifier" is needed for safety: https://redis.io/docs/latest/develop/use/patterns/distributed-locks/

Yeah!

@yoshiokatsuneo (Contributor) commented:

It would be nice if the backend-lint error were fixed by running "ruff check .".

https://github.com/getredash/redash/actions/runs/13733542335/job/38414390891?pr=7295


@gaecoli (Member Author) commented Mar 10, 2025

It would be nice if the backend-lint error were fixed by running "ruff check .".

https://github.com/getredash/redash/actions/runs/13733542335/job/38414390891?pr=7295

I tried it, but it failed!

@yoshiokatsuneo (Contributor) commented:

It would be nice if the backend-lint error were fixed by running "ruff check .".

I tried it, but it failed!

I could fix the issue by running "ruff check . --fix", using ruff installed via Homebrew, without any problem, as shown below:

 (redis-lock)$ ruff --version
ruff 0.8.3
 (redis-lock)$ which ruff
/opt/homebrew/bin/ruff
 (redis-lock)$ ruff check . --fix
warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
  - 'ignore' -> 'lint.ignore'
  - 'select' -> 'lint.select'
  - 'mccabe' -> 'lint.mccabe'
  - 'per-file-ignores' -> 'lint.per-file-ignores'
Found 1 error (1 fixed, 0 remaining).
 (redis-lock)$ git diff
diff --git a/redash/utils/locks.py b/redash/utils/locks.py
index 126496096..78ad2693c 100644
--- a/redash/utils/locks.py
+++ b/redash/utils/locks.py
@@ -1,10 +1,12 @@
+import logging
 import random
 import time
 import uuid
-import logging
-from redash import redis_connection
+
 from redis import WatchError
 
+from redash import redis_connection
+
 logger = logging.getLogger(__name__)
 
 
 (redis-lock)$ ruff check .
warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
  - 'ignore' -> 'lint.ignore'
  - 'select' -> 'lint.select'
  - 'mccabe' -> 'lint.mccabe'
  - 'per-file-ignores' -> 'lint.per-file-ignores'
All checks passed!

@gaecoli (Member Author) commented Mar 10, 2025

It would be nice if the backend-lint error were fixed by running "ruff check .".

I tried it, but it failed!

I could fix the issue by running "ruff check . --fix", using ruff installed via Homebrew, without any problem, as shown below: [...]

done

@yoshiokatsuneo (Contributor) commented:

Thanks, great!

@gaecoli (Member Author) commented Mar 11, 2025

Maybe the right thing would be to make dashboard refreshes smarter and reuse the same query invocation for the different visualizations that depend on it? I think that would be more robust and address the core issue instead of trying to address it in an indirect way.

@yoshiokatsuneo I think this is good advice!

@yoshiokatsuneo (Contributor) commented:

@gaecoli

I'm not sure which is the "smarter" place to handle the race condition, on the server side or the frontend side. For me, your PR is reasonable and smart enough.

If we want to make the PR shorter, I think one option is to use a Lua script to handle the atomic transaction. But I'm not sure whether that is better or not.
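For reference, one possible shape of that Lua option: it is essentially the release script from the Redis distributed-locks article linked earlier in this thread (a sketch, not code from this PR; the client setup is illustrative). The check-and-delete runs atomically inside Redis, so no WATCH/retry logic is needed:

import redis

# Delete the key only if it still holds our identifier, as one atomic step.
# KEYS[1] is the lock key, ARGV[1] is the identifier stored at acquire time.
RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
"""

r = redis.Redis(decode_responses=True)
release_script = r.register_script(RELEASE_SCRIPT)


def release_lock(lock_name, identifier):
    # Returns True if the key was deleted, False if we no longer own the lock.
    return release_script(keys=[lock_name], args=[identifier]) == 1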

@gaecoli (Member Author) commented Mar 12, 2025

@gaecoli

I'm not sure which is the "smarter" place to handle the race condition, on the server side or the frontend side. For me, your PR is reasonable and smart enough.

If we want to make the PR shorter, I think one option is to use a Lua script to handle the atomic transaction. But I'm not sure whether that is better or not.

OK, I hope this PR will help you!
