Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't wait for remaining speculative retries if plan is exhausted and there is no running fibers #1177

Open
muzarski opened this issue Jan 22, 2025 · 1 comment

Comments

@muzarski
Copy link
Contributor

This issue is related to speculative_execution::execute function - especially this part in current version of the driver:

res = async_tasks.select_next_some() => {
if let Some(r) = res {
if !can_be_ignored(&r) {
return r;
} else {
last_error = Some(r)
}
}
if async_tasks.is_empty() && retries_remaining == 0 {
return last_error.unwrap_or({
Err(EMPTY_PLAN_ERROR)
});
}
}

There is a small issue with this code. If fiber returns None (i.e., plan is exhausted) there still may be some remaining retries. The driver will wait for remaining retries. It is expected behaviour for most scenarios - there may be some other fibers still running, thus we do not return from this function prematurely. However, we could safely return from the function if both of the following conditions are met:

  • fiber returns None -> plan is empty/exhausted
  • there are no other running fibers in the meantime. If there are some, we should wait for their completion.

Second condition could be checked by introducing additional variable that keeps track of number of currently running fibers. From our experience, speculative_execution::execute's logic seems to be error-prone. This is why, in addition to the change mentioned in the issue, we should think of a way to rewrite the logic in a safer manner.

@Lorak-mmk
Copy link
Collaborator

Second condition could be checked by introducing additional variable that keeps track of number of currently running fibers.

No need for new variable, this is just async_tasks.len()

If we do this, a bit more tricky scenario is where we first exhaust the plan, but some executions are still pending.
In such case:

  • We should not start new executions
  • When running executions finish we should return.

I think all of this can be achieved just by setting retries_remaining to 0.

            res = async_tasks.select_next_some() => {
                if let Some(r) = res {
                    if !can_be_ignored(&r) {
                        return r;
                    } else {
                        last_error = Some(r)
                    }
                } else {
                    retries_remaining = 0
                }
                if async_tasks.is_empty() && retries_remaining == 0 {
                    return last_error.unwrap_or({
                        Err(EMPTY_PLAN_ERROR)
                    });
                }
            }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants