
Replace DispatchQueue and BookingQueue with HealthyThreadPool #1035

Merged

Conversation

DiegoTavares
Collaborator

Replace DispatchQueue and BookingQueue with HealthyThreadPool

Queues will no longer inherit from ThreadPoolExecutor; instead, they will manage an instance of HealthyThreadPool, a ThreadPoolExecutor that handles health checks, termination, and repeated tasks. With this, the BookingQueue should be able to self-heal when locked threads occur.
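
For reference, a minimal sketch of the delegation pattern described above, in plain Java. The class, fields, and health heuristic here are illustrative assumptions, not the actual Cuebot implementation:

import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: the queue wraps (rather than extends) an executor and
// runs a periodic health check against it.
public class HealthyThreadPoolSketch {
    private final ThreadPoolExecutor pool;
    private final ScheduledExecutorService monitor =
            Executors.newSingleThreadScheduledExecutor();
    private final int healthThreshold;
    private long lastCompleted = 0;

    public HealthyThreadPoolSketch(int coreSize, int maxSize, int queueCapacity,
                                   int healthThreshold) {
        this.pool = new ThreadPoolExecutor(coreSize, maxSize,
                10, TimeUnit.SECONDS, new LinkedBlockingQueue<>(queueCapacity));
        this.healthThreshold = healthThreshold;
        // Periodic health check; a real implementation could also rebuild the
        // pool or cancel stuck tasks when it is found unhealthy.
        monitor.scheduleAtFixedRate(this::healthCheck, 1, 1, TimeUnit.MINUTES);
    }

    public void execute(Runnable task) {
        pool.execute(task);
    }

    private void healthCheck() {
        long completed = pool.getCompletedTaskCount();
        long progress = completed - lastCompleted;
        lastCompleted = completed;
        // Heuristic: work is queued but almost nothing completed since the
        // last check, so the worker threads are probably stuck.
        if (!pool.getQueue().isEmpty() && progress < healthThreshold) {
            System.err.println("Pool looks unhealthy: queued=" + pool.getQueue().size()
                    + ", completed since last check=" + progress);
        }
    }

    public void shutdown() {
        monitor.shutdownNow();
        pool.shutdown();
    }
}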

DiegoTavares and others added 9 commits September 30, 2020 09:51
Sorting jobs only by priority causes a situation where low-priority jobs can get starved by a constant flow of high-priority jobs.
The new formula adds a modifier to the sorting rank that takes into account the number of cores the job is requesting and the number of days the job has been waiting in the queue.
Priority values over 200 will mostly override the formula and behave as priority-only scheduling.
sort = priority + (100 * (1 - (job.cores/job.int_min_cores))) + (age in days)

Besides that, also take layer_int_cores_min into account when filtering folder_resourse limitations, to avoid allocating more cores than the folder limits allow.

(cherry picked from commit 566411aeeddc60983a30eabe121fd03263d05525)
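
For illustration, a worked instance of the formula above, assuming job.cores is the number of cores currently allocated to the job and job.int_min_cores is its requested minimum (all numbers are hypothetical):

// sort = priority + (100 * (1 - (job.cores / job.int_min_cores))) + (age in days)
// A job with priority 50, running 20 of its 80 minimum cores, waiting 2 days:
double sort = 50 + (100 * (1 - 20.0 / 80.0)) + 2;  // 50 + 75 + 2 = 127.0
// A new high-priority job (priority 90, 0 of 80 cores, 0 days) ranks 90 + 100 + 0 = 190
// and still wins, but the starved job keeps gaining roughly one point per waiting day.
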
Queues will no longer inherit from ThreadPoolExecutor; instead, they will manage an instance of HealthyThreadPool, a ThreadPoolExecutor that handles health checks, termination, and repeated tasks. With this, the BookingQueue should be able to self-heal when locked threads occur.
@DiegoTavares
Collaborator Author

I still have some merge issues causing test errors that I need to work on.

Use a guava cache to store only the last version of a HostReport per host.
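
A minimal sketch of that approach with a Guava cache, keeping at most one pending report per host so a newer report simply replaces the older one. The class name and the raw byte[] payload are illustrative, not the actual HostReportQueue code:

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

// Illustrative only: one pending entry per host, newest report wins.
public class HostReportCacheSketch {
    private final Cache<String, byte[]> reports = CacheBuilder.newBuilder()
            .expireAfterWrite(10, TimeUnit.MINUTES)  // drop stale entries (example TTL)
            .build();

    public void offer(String hostname, byte[] serializedReport) {
        // put() overwrites any previous report queued for the same host.
        reports.put(hostname, serializedReport);
    }

    public byte[] poll(String hostname) {
        byte[] report = reports.getIfPresent(hostname);
        if (report != null) {
            reports.invalidate(hostname);
        }
        return report;
    }
}
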
@splhack
Contributor

splhack commented Nov 4, 2021

I'd like to clarify three things.

Handle Repeated Tasks

This PR will handle repeated host reports by always keeping the latest version.
However, the host report is sent every 30 seconds (the interval is randomized, but the average should be around 30 seconds). This means Cuebot received a new host report while the host-report queue was still holding the previous report from the same host for 30 seconds; in other words, Cuebot was unable to process the previous host report for 30 seconds and counting. IMHO, this should not happen.

The root cause of this situation is the host-report-queue contention. Quoting myself from #1008,

  • HostReportHandler ThreadPoolExecutor(reportQueue) only has 8 threads.
  • The default number of PostgreSQL connections is only 10 (HikariCP default).

With the default settings, if Cuebot is connected to several hundred to 1,000 RQDs,

  • The reportQueue fills up easily due to the very limited number of threads (8) and PostgreSQL connections (10).
  • Thus, Cuebot cannot handle host reports for over 30 seconds, or even over 5 minutes, and marks many RQDs as down because it detects that they have not sent a host report in over 5 minutes.

So the solution is obvious: prevent host-report-queue contention altogether.

  • Increase the number of host-report queue threads
  • Increase the number of HikariCP PostgreSQL connections

I posted the example settings in #1008.

If that is still not enough, the system needs an extra Cuebot instance to handle the number of RQDs. Eventually PostgreSQL would become the bottleneck; in that case, the system would need to adjust the RQD host-report interval to reduce the number of host reports.
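
For reference, raising both limits together programmatically might look like the sketch below; the numbers are purely illustrative and are not the settings posted in #1008:

import com.zaxxer.hikari.HikariConfig;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch: scale the report-handling threads and the database
// connection pool together so report handlers are not starved of connections.
public class ContentionTuningSketch {
    public static void main(String[] args) {
        // More worker threads than the default 8 (example value).
        ThreadPoolExecutor reportQueue = new ThreadPoolExecutor(
                32, 32, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(1000));

        // More PostgreSQL connections than the HikariCP default of 10 (example value).
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:postgresql://localhost/cuebot");  // placeholder URL
        config.setMaximumPoolSize(40);
        // A new HikariDataSource(config) would then back the JDBC layer.

        reportQueue.shutdown();
    }
}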

Health check

Handling repeated tasks and the health check cannot fix the root cause, queue contention. They might reduce some of the repeated host reports, or discard a filled queue, but only temporarily, for a very short time. If the host-report queue fills up once, it will keep happening in the system, because of the combination of the number of host-report-queue threads, the number of HikariCP PostgreSQL connections, and the number of RQDs.

Could you try #1008 with the example settings?

locked threads?

Could you elaborate on what the locked threads are? I'm not aware of anything like that with #1008.

should be able to self-heal when locked threads occur

@DiegoTavares
Collaborator Author

I'd like to clarify three things.

Handle Repeated Tasks

This PR will handle repeated host reports by always keeping the latest version. However, the host report is sent every 30 seconds (the interval is randomized, but the average should be around 30 seconds). This means Cuebot received a new host report while the host-report queue was still holding the previous report from the same host for 30 seconds; in other words, Cuebot was unable to process the previous host report for 30 seconds and counting. IMHO, this should not happen.

The root cause of this situation is the host-report-queue contention. Quoting myself from #1008,

  • HostReportHandler ThreadPoolExecutor(reportQueue) only has 8 threads.
  • The default number of PostgreSQL connections is only 10 (HikariCP default).

With the default settings, if Cuebot is connected to several hundred to 1,000 RQDs,

  • The reportQueue fills up easily due to the very limited number of threads (8) and PostgreSQL connections (10).
  • Thus, Cuebot cannot handle host reports for over 30 seconds, or even over 5 minutes, and marks many RQDs as down because it detects that they have not sent a host report in over 5 minutes.

So the solution is obvious: prevent host-report-queue contention altogether.

  • Increase the number of host-report queue threads
  • Increase the number of HikariCP PostgreSQL connections

This solution is only "obvious" if we assume tasks will always finish in a timely fashion, which was not the case in our situation (see the root cause explained below).

I posted the example settings in #1008.

If that is still not enough, the system needs an extra Cuebot instance to handle the number of RQDs. Eventually PostgreSQL would become the bottleneck; in that case, the system would need to adjust the RQD host-report interval to reduce the number of host reports.

Health check

Handling repeated tasks and the health check cannot fix the root cause, queue contention. They might reduce some of the repeated host reports, or discard a filled queue, but only temporarily, for a very short time. If the host-report queue fills up once, it will keep happening in the system, because of the combination of the number of host-report-queue threads, the number of HikariCP PostgreSQL connections, and the number of RQDs.

Could you try #1008 with the example settings?

Just to be clear, I don't think this PR should be used instead of #1008; in my view both are complementary. We're currently running more than 5k RQDs with this configuration:

booking_queue.threadpool.health_threshold=10
booking_queue.threadpool.core_pool_size=10
booking_queue.threadpool.max_pool_size=14
booking_queue.threadpool.queue_capacity=2000
dispatch.threadpool.core_pool_size=6
dispatch.threadpool.max_pool_size=8
dispatch.threadpool.queue_capacity=2000
healthy_threadpool.health_threshold=6
healthy_threadpool.min_unhealthy_period_min=3
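
These are plain opencue.properties entries. A minimal, illustrative way to read such values in standalone Java is sketched below; the real Cuebot wiring presumably goes through its Spring configuration rather than code like this:

import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

// Illustrative only: load thread-pool sizing values from opencue.properties.
public class PoolConfigSketch {
    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        try (FileInputStream in = new FileInputStream("opencue.properties")) {
            props.load(in);
        }
        int coreSize = Integer.parseInt(
                props.getProperty("booking_queue.threadpool.core_pool_size", "10"));
        int maxSize = Integer.parseInt(
                props.getProperty("booking_queue.threadpool.max_pool_size", "14"));
        int queueCapacity = Integer.parseInt(
                props.getProperty("booking_queue.threadpool.queue_capacity", "2000"));
        System.out.println("booking pool: core=" + coreSize
                + " max=" + maxSize + " queue=" + queueCapacity);
    }
}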

locked threads?

Could you elaborate on what the locked threads are? I'm not aware of anything like that with #1008.

should be able to self-heal when locked threads occur

My bad for mentioning locked threads without context. There was a condition in the server->RQD communication where threads would get locked waiting for a gRPC response without a timeout; it was fixed by #994. The changes in this PR, besides reorganizing the thread pools, worked as a temporary fix while we searched for the root cause, which in our case wasn't contention, since the number of threads and the Hikari limits had already been tuned to our needs, but the thread-starvation condition just described.
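
For context, the kind of fix #994 refers to, giving the blocking gRPC call a deadline so the calling thread cannot wait forever, looks roughly like the sketch below; the helper and its name are hypothetical, not the actual change:

import io.grpc.stub.AbstractBlockingStub;
import java.util.concurrent.TimeUnit;

// Illustrative only: any blocking RQD call made through the returned stub fails
// with DEADLINE_EXCEEDED instead of locking its thread indefinitely.
public final class GrpcDeadlineSketch {
    private GrpcDeadlineSketch() {}

    public static <T extends AbstractBlockingStub<T>> T withTimeout(T stub, long seconds) {
        return stub.withDeadlineAfter(seconds, TimeUnit.SECONDS);
    }
}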

The proposal of this PR is not only to have a "self-healing" thread pool as protection against future starvation conditions, but also to reorganize the thread pools to facilitate maintenance.

@splhack
Contributor

splhack commented Nov 12, 2021

We're currently running more than 5k RQDs

I'm very happy to hear that! That is pretty much the same as ours. 5 Cuebots + 5.5K RQDs, and counting.

@DiegoTavares DiegoTavares reopened this Dec 1, 2021
@DiegoTavares DiegoTavares self-assigned this Dec 8, 2021
@DiegoTavares
Collaborator Author

@bcipriano Can I get your review on this?

Collaborator

@bcipriano bcipriano left a comment


Looking good here -- a few minor comments, and this needs a merge/rebase from master.

The test doesn't make sense with the new thread pool and will also
cause problems whenever a user changes a config property.
@DiegoTavares
Collaborator Author

Alright, unit tests are passing now. @bcipriano Still waiting for your approval.

@DiegoTavares
Collaborator Author

@bcipriano Can we consider this approved?

@DiegoTavares
Collaborator Author

@bcipriano Can we have your final approval on this? This PR is causing several conflicts on our end, and it would be good to have it merged ASAP.

Collaborator

@bcipriano bcipriano left a comment


LGTM. Sorry for the delay.

@DiegoTavares DiegoTavares merged commit 027d853 into AcademySoftwareFoundation:master Mar 29, 2022
akim-ruslanov pushed a commit to akim-ruslanov/OpenCue that referenced this pull request Apr 1, 2022
…ySoftwareFoundation#1035)

* Update dispatchQuery to use min_cores

Sorting jobs only by priority causes a situation where low-priority jobs can get starved by a constant flow of high-priority jobs.
The new formula adds a modifier to the sorting rank that takes into account the number of cores the job is requesting and the number of days the job has been waiting in the queue.
Priority values over 200 will mostly override the formula and behave as priority-only scheduling.
sort = priority + (100 * (1 - (job.cores/job.int_min_cores))) + (age in days)

Besides that, also take layer_int_cores_min into account when filtering folder_resourse limitations, to avoid allocating more cores than the folder limits allow.

(cherry picked from commit 566411aeeddc60983a30eabe121fd03263d05525)

* Revert "Update dispatchQuery to use min_cores"

This reverts commit 2eb4936

* Replace DispatchQueue and BookingQueue with HealthyThreadPool

Queues will no longer inherit from ThreadPoolExecutor; instead, they will manage an instance of HealthyThreadPool, a ThreadPoolExecutor that handles health checks, termination, and repeated tasks. With this, the BookingQueue should be able to self-heal when locked threads occur.

* Remove trackit reference

* Refactor HostReportQueue to use guava Cache

Use a guava cache to store only the last version of a HostReport per host.

* Configure HostReportQueue on opencue.properties

* Fix unit tests

* Fix unit tests

* This unit test is not actually testing anything useful

The test doesn't make sense with the new thread pool and will also
cause problems whenever a user changes a config property.

Co-authored-by: Roula O'Regan <[email protected]>