for same query_text refresh just execution once #7295
This PR introduces a mechanism to prevent duplicate execution of the same SQL query in a distributed environment. By implementing a distributed locking mechanism using Redis, we ensure that only one process or thread can execute a specific SQL query at a given time, avoiding unnecessary load on the database and ensuring consistent query results.
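For readers unfamiliar with the pattern, here is a minimal sketch of a per-query lock built on Redis `SET` with `NX` and an expiry. It is not the code from this PR: the key format, the helper names (`lock_key`, `try_acquire`, `release`), and the `InMemoryRedis` stand-in are all assumptions made so the example runs without a Redis server. With redis-py, the same `set(name, value, nx=True, ex=...)` call would go to a real client.

```python
import hashlib
import uuid


class InMemoryRedis:
    """Hypothetical stand-in for a Redis client so the sketch runs offline.

    It only mimics SET with NX, GET, and DELETE, and it ignores the expiry.
    """

    def __init__(self):
        self._data = {}

    def set(self, name, value, nx=False, ex=None):
        # Like redis-py, return True on success and None when NX blocks the write.
        if nx and name in self._data:
            return None
        self._data[name] = value
        return True

    def get(self, name):
        return self._data.get(name)

    def delete(self, name):
        return 1 if self._data.pop(name, None) is not None else 0


def lock_key(query_text, data_source_id):
    # Assumed key format: one lock per (data source, query text) pair.
    digest = hashlib.md5(query_text.encode("utf-8")).hexdigest()
    return f"query_lock:{data_source_id}:{digest}"


def try_acquire(client, key, ttl_seconds=60):
    # A random token identifies the holder; the TTL guards against crashed workers.
    token = uuid.uuid4().hex
    if client.set(key, token, nx=True, ex=ttl_seconds):
        return token
    return None


def release(client, key, token):
    # Only the holder may release. Against a real Redis server this check and
    # delete should be a single atomic step (for example a Lua script), and
    # GET returns bytes unless the client decodes responses.
    if client.get(key) == token:
        client.delete(key)
        return True
    return False
```

A worker that gets `None` back from `try_acquire` would skip execution and wait for the result produced by the lock holder. Choosing the TTL is a trade-off: too short and a slow query loses its lock, too long and a crashed worker blocks re-execution.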
.github/workflows/ci.yml
@@ -3,7 +3,7 @@ on:
  push:
    branches:
      - master
  pull_request_target:
  pull_request:
@gaecoli I made this change to match changes on master. Not sure why it shows up in the diff, but it's fine.
@gaecoli I wonder if we really need this? I mean, the chance of the exact same query being executed at the same time is usually low compared to the complexity and potential issues this might add. Does this happen frequently for you?
When I add a query with many visualizations (tables, charts, ...) to a dashboard and then refresh the dashboard, the same query text is executed many times in the query engine (Presto, MySQL, Doris, ...). At my company, 1000+ people may refresh the same dashboard, so you can see the problem.
Unfortunately, this happens more often than I would have guessed. Any time multiple visualizations of the same query are included on a dashboard, there is a high probability of duplicate queries. To check for them:
SELECT regexp_matches(query, 'Query Hash: [a-z0-9]+') FROM pg_stat_activity WHERE state='active';
For very long queries (not uncommon for my users!) this will cause unnecessary load.
Maybe the right thing would be to make dashboard refreshes smarter and reuse the same query invocation for the different visualizations that depend on it? I think that would be more robust and would address the core issue instead of addressing it indirectly.
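A minimal sketch of that idea, assuming a dashboard can be reduced to a list of `(widget_id, query_id)` pairs and that `run_query` is whatever callable triggers one execution. Both names and the `refresh_dashboard` function are hypothetical and are not Redash APIs.

```python
from collections import defaultdict


def refresh_dashboard(widgets, run_query):
    """Run each distinct query once and share the result across its widgets.

    widgets: iterable of (widget_id, query_id) pairs (assumed shape).
    run_query: callable taking a query_id and returning its result.
    """
    # Group widgets by the query they depend on.
    widgets_by_query = defaultdict(list)
    for widget_id, query_id in widgets:
        widgets_by_query[query_id].append(widget_id)

    results = {}
    for query_id, widget_ids in widgets_by_query.items():
        result = run_query(query_id)  # executed once per distinct query
        for widget_id in widget_ids:
            results[widget_id] = result
    return results
```

This removes duplicates within a single refresh only. Many users refreshing the same dashboard at once would still trigger one execution each, which is the case the Redis lock is aimed at.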
That seems like a good approach, unless this also becomes complex. My guess is that a dashboard is the only API user that would normally hit this race condition.
This is a good approach, but I think such a change would make the code more complex.
Tested this change manually, it does seem to work. Also I see the log messages
Yes, because it works well on my company's Redash!