
for same query_text refresh just execution once #7295

Open: wants to merge 7 commits into base: master

Conversation

@gaecoli (Member) commented Jan 24, 2025

What type of PR is this?

  • Refactor
  • Feature
  • Bug Fix
  • New Query Runner (Data Source)
  • New Alert Destination
  • Other

Description

How is this tested?

  • Unit tests (pytest, jest)
  • E2E Tests (Cypress)
  • Manually
  • N/A

Related Tickets & Documents

Mobile & Desktop Screenshots/Recordings (if there are UI changes)

@gaecoli requested review from arikfr and justinclift on January 24, 2025 03:46
@gaecoli (Member Author) commented Jan 24, 2025

This PR introduces a mechanism to prevent duplicate execution of the same SQL query in a distributed environment. By implementing a distributed locking mechanism using Redis, we ensure that only one process or thread can execute a specific SQL query at a given time, avoiding unnecessary load on the database and ensuring consistent query results.
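For readers following along, here is a minimal sketch of that idea (not the PR's actual code; the helper name and the standalone client are illustrative). A lock key derived from the query hash is taken with an atomic SET NX EX together with a random per-acquirer identifier, and only the worker that wins the lock runs the query:

import uuid

import redis

# Illustrative client; the PR itself uses redash's shared redis_connection.
r = redis.Redis(decode_responses=True)


def acquire_lock(lock_name, timeout=60):
    """Try to take the lock; return our identifier on success, None otherwise."""
    identifier = str(uuid.uuid4())
    # SET with nx=True / ex=timeout is atomic: the key is created only if it
    # does not already exist, and it expires automatically so a crashed
    # worker cannot hold the lock forever.
    if r.set(lock_name, identifier, nx=True, ex=timeout):
        return identifier
    return None


# Example key shape, taken from a log line quoted later in this thread:
# lock:query_hash_job:1:d6338e9508dd103771b69483fb17d4a5

The identifier returned here is what lets the owner release the lock safely; that point comes up again further down the thread.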

@gaecoli requested a review from eradman on January 24, 2025 04:13
@@ -3,7 +3,7 @@ on:
   push:
     branches:
       - master
-  pull_request_target:
+  pull_request:
Member (review comment on the workflow diff above):

@gaecoli I made this change to match the changes on master. Not sure why it shows up in the diff, but it's fine.

@arikfr (Member) commented Feb 4, 2025

@gaecoli I wonder if we really need this? I mean, the chance of the exact same query being executed at the same time is usually low compared to the complexity and potential issues this might add.

Do you have this happen frequently?

@gaecoli (Member Author) commented Feb 5, 2025

@gaecoli I wonder if we really need this? I mean, the chance of the exact same query being executed at the same time is usually low compared to the complexity and potential issues this might add.

Do you have this happen frequently?

OK, I will consider that! Thank you, @arikfr!

@gaecoli (Member Author) commented Feb 5, 2025

@gaecoli I wonder if we really need this? I mean, the chance of the exact same query being executed at the same time is usually low compared to the complexity and potential issues this might add.

Do you have this happen frequently?

When I add the same query to a dashboard and that query has a lot of visualizations (table, chart, ...), refreshing the dashboard makes the same query text execute many times in the query engine (Presto, MySQL, Doris, ...).

At my company, 1000+ people may refresh the same dashboard, so you can see what I mean.

@eradman (Collaborator) commented Feb 6, 2025

Unfortunately this does happen more often than I would have guessed. Any time multiple visualizations of the same query are included on a dashboard, there is a high probability of duplicate queries:

SELECT regexp_matches(query, 'Query Hash: [a-z0-9]+') FROM pg_stat_activity WHERE state='active';
\watch 1
...
                  regexp_matches
--------------------------------------------------
 {"Query Hash: 5aa874345926f2b18ecf197d3200a602"}
 {"Query Hash: 5aa874345926f2b18ecf197d3200a602"}
(2 rows)

For very long queries (not uncommon for my users!) this will cause unnecessary load.

@arikfr (Member) commented Feb 6, 2025

Maybe the right thing would be to make dashboard refreshes smarter and reuse the same query invocation for the different visualizations that depend on it? I think that would be more robust and address the core issue instead of trying to address it in an indirect way.

@eradman (Collaborator) commented Feb 6, 2025

Maybe the right thing would be to make dashboard refreshes smarter and reuse the same query invocation for the different visualizations that depend on it

That seems like a good approach, unless this also becomes complex. My guess is that a dashboard is the only API user that would normally hit this race condition.

@gaecoli (Member Author) commented Feb 7, 2025

Maybe the right thing would be to make dashboard refreshes smarter and reuse the same query invocation for the different visualizations that depend on it? I think that would be more robust and address the core issue instead of trying to address it in an indirect way.

This is a good approach, but I think such a change would make the code more complex.

@eradman (Collaborator) commented Feb 7, 2025

Tested this change manually; it does seem to work. I also see the log messages:

server-1 | [2025-02-07 13:56:32,160][PID:19][INFO][redash.utils.locks] Lock released successfully, lock_name=[lock:query_hash_job:1:d6338e9508dd103771b69483fb17d4a5], identifier=[73683a39-6aea-4f29-b0d3-02e4c0385a0b]

@gaecoli (Member Author) commented Feb 8, 2025

Tested this change manually; it does seem to work. I also see the log messages:

server-1 | [2025-02-07 13:56:32,160][PID:19][INFO][redash.utils.locks] Lock released successfully, lock_name=[lock:query_hash_job:1:d6338e9508dd103771b69483fb17d4a5], identifier=[73683a39-6aea-4f29-b0d3-02e4c0385a0b]

Yes, it works well at my company's Redash!

@yoshiokatsuneo (Contributor) commented Feb 22, 2025

@gaecoli

I think the code does not need to use/check the "identifier", as the lock and release are always done synchronously by the same process while it holds the lock. (I removed this comment for the reason mentioned in the next comment.)

@yoshiokatsuneo (Contributor) commented:

@gaecoli

I think the code does not need to use/check the "identifier", as the lock and release are always done synchronously by the same process while it holds the lock.

I read the following article and now understand that the "identifier" is needed for safety:
https://redis.io/docs/latest/develop/use/patterns/distributed-locks/
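To spell out why the identifier matters: a release that blindly deleted the key could remove a lock that had already expired and been re-acquired by another worker, so the owner must check that the key still holds its own identifier before deleting it. Here is a sketch of such a release using the WATCH/MULTI pattern, consistent with the "from redis import WatchError" line visible in the locks.py diff later in this thread (the function name and client handling are illustrative, not necessarily the PR's exact code):

from redis import WatchError


def release_lock(redis_client, lock_name, identifier):
    """Delete the lock key only if it still holds our identifier."""
    with redis_client.pipeline() as pipe:
        try:
            # WATCH aborts the transaction if the key changes before EXEC.
            pipe.watch(lock_name)
            value = pipe.get(lock_name)
            # Compare against both str and bytes, depending on client settings.
            if value is not None and value in (identifier, identifier.encode()):
                pipe.multi()
                pipe.delete(lock_name)
                pipe.execute()
                return True
            # The lock expired or belongs to someone else; leave it alone.
            pipe.unwatch()
            return False
        except WatchError:
            # The key changed between GET and EXEC, so we no longer own it.
            return False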

@gaecoli (Member Author) commented Mar 8, 2025

@gaecoli
I think the code does not need to use/check the "identifier", as the lock and release are always done synchronously by the same process while it holds the lock.

I read the following article and now understand that the "identifier" is needed for safety: https://redis.io/docs/latest/develop/use/patterns/distributed-locks/

Yeah!

@yoshiokatsuneo (Contributor) commented:

It would be nice if the backend-lint error were fixed by running "ruff check .".

https://github.com/getredash/redash/actions/runs/13733542335/job/38414390891?pr=7295


@gaecoli (Member Author) commented Mar 10, 2025

It would be nice if the backend-lint error were fixed by running "ruff check .".

https://github.com/getredash/redash/actions/runs/13733542335/job/38414390891?pr=7295

I tried it, but it failed!

@yoshiokatsuneo (Contributor) commented:

It would be nice if the backend-lint error were fixed by running "ruff check .".

I tried it, but it failed!

I could fix the issue by running "ruff check . --fix", using ruff installed via Homebrew, without any problem, as shown below:

 (redis-lock)$ ruff --version
ruff 0.8.3
 (redis-lock)$ which ruff
/opt/homebrew/bin/ruff
 (redis-lock)$ ruff check . --fix
warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
  - 'ignore' -> 'lint.ignore'
  - 'select' -> 'lint.select'
  - 'mccabe' -> 'lint.mccabe'
  - 'per-file-ignores' -> 'lint.per-file-ignores'
Found 1 error (1 fixed, 0 remaining).
 (redis-lock)$ git diff
diff --git a/redash/utils/locks.py b/redash/utils/locks.py
index 126496096..78ad2693c 100644
--- a/redash/utils/locks.py
+++ b/redash/utils/locks.py
@@ -1,10 +1,12 @@
+import logging
 import random
 import time
 import uuid
-import logging
-from redash import redis_connection
+
 from redis import WatchError
 
+from redash import redis_connection
+
 logger = logging.getLogger(__name__)
 
 
 (redis-lock)$ ruff check .
warning: The top-level linter settings are deprecated in favour of their counterparts in the `lint` section. Please update the following options in `pyproject.toml`:
  - 'ignore' -> 'lint.ignore'
  - 'select' -> 'lint.select'
  - 'mccabe' -> 'lint.mccabe'
  - 'per-file-ignores' -> 'lint.per-file-ignores'
All checks passed!

@gaecoli (Member Author) commented Mar 10, 2025

It would be nice if the backend-lint error were fixed by running "ruff check .".

I tried it, but it failed!

I could fix the issue by running "ruff check . --fix", using ruff installed via Homebrew, without any problem, as shown below: [...]

done

@yoshiokatsuneo (Contributor) commented:

Thanks, great!

@gaecoli (Member Author) commented Mar 11, 2025

Maybe the right thing would be to make dashboard refreshes smarter and reuse the same query invocation for the different visualizations that depend on it? I think that would be more robust and address the core issue instead of trying to address it in an indirect way.

@yoshiokatsuneo I think this is good advice!

@yoshiokatsuneo (Contributor) commented:

@gaecoli

I'm not sure which is the "smarter" place to handle the race condition, on the server side or the frontend side. For me, your PR is reasonable and smart enough.

If we want to make the PR shorter, I think one option is to use a Lua script to handle the atomic transaction. But I'm not sure whether that is better or not.
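For reference, one possible shape of that Lua option: it is essentially the release script from the Redis distributed-locks article linked earlier in this thread (a sketch, not code from this PR; the client setup is illustrative). The check-and-delete runs atomically inside Redis, so no WATCH/retry logic is needed:

import redis

# Delete the key only if it still holds our identifier, as one atomic step.
# KEYS[1] is the lock key, ARGV[1] is the identifier stored at acquire time.
RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
"""

r = redis.Redis(decode_responses=True)
release_script = r.register_script(RELEASE_SCRIPT)


def release_lock(lock_name, identifier):
    # Returns True if the key was deleted, False if we no longer own the lock.
    return release_script(keys=[lock_name], args=[identifier]) == 1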

@gaecoli (Member Author) commented Mar 12, 2025

@gaecoli

I'm not sure which is the "smarter" place to handle the race condition, on the server side or the frontend side. For me, your PR is reasonable and smart enough.

If we want to make the PR shorter, I think one option is to use a Lua script to handle the atomic transaction. But I'm not sure whether that is better or not.

OK, I hope this PR will help you!
