Early termination of jobs to avoid pulling too much data from workers
Check whether a job has been cancelled in rproc::InfileMerger while pulling and processing data from a worker. This is meant to reinforce the existing termination logic in qdisp::QueryRequest.

NOTE: Further experiments with this change showed that it doesn't seem to make any difference. Apparently, one problem is that only one job at a time is able to ingest into the result table. The other jobs (there are 300 threads at USDF) block on the corresponding mutex, waiting for their turn to ingest rows into the table.

The second issue is the (potentially) huge amount of result data that may accumulate at workers, since workers keep processing queries. Czar can't prevent this because there is no back pressure mechanism like the memory-based one of the SSI streaming. Hence, a possible solution would be to introduce a result-size-based "brake" at workers that would delay processing jobs of a query when there is a substantial amount (say, more than 5 GB) of unclaimed files at the same worker.

Another improvement would be to add a result processing pool at Czar, a queue for worker responses, and a (priority + queryId)-based algorithm for feeding responses into the pool. The algorithm would give higher priority to interactive queries, and among scan queries it would give higher priority to the older (smaller queryId) ones. However, that should be done in a separate JIRA ticket.
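
For illustration only, the (priority + queryId)-based ordering proposed above could be sketched roughly as below. This is not code from this commit. The names QueryClass, WorkerResponse, ServedLater, ResponseQueue and drainOrder are hypothetical, and the sketch assumes that a response carries the class of its query (interactive or scan) and its queryId.

```cpp
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical types: none of these names come from the Qserv code base.
enum class QueryClass : int { Interactive = 0, Scan = 1 };

struct WorkerResponse {
    QueryClass queryClass;  // interactive responses are served first
    std::uint64_t queryId;  // a smaller id means an older query
    std::uint64_t jobId;    // carried along only to identify the response
};

// Returns true when 'a' should be served AFTER 'b'. std::priority_queue
// keeps at the top an element that no other element should precede.
struct ServedLater {
    bool operator()(WorkerResponse const& a, WorkerResponse const& b) const {
        if (a.queryClass != b.queryClass) {
            return static_cast<int>(a.queryClass) > static_cast<int>(b.queryClass);
        }
        return a.queryId > b.queryId;  // within a class, older queries go first
    }
};

using ResponseQueue =
        std::priority_queue<WorkerResponse, std::vector<WorkerResponse>, ServedLater>;

// Drains a copy of the queue and returns the job ids in the order in which
// a processing pool would receive them.
std::vector<std::uint64_t> drainOrder(ResponseQueue queue) {
    std::vector<std::uint64_t> order;
    while (!queue.empty()) {
        order.push_back(queue.top().jobId);
        queue.pop();
    }
    return order;
}
```

In this sketch a response of an interactive query always overtakes responses of scan queries, and within one class the responses of the query with the smaller queryId are fed to the pool first. A real implementation would also need locking around the queue and a policy for avoiding starvation of newer scan queries, both of which are left out here.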