Allow for more flexible retry logic with Cloud Controller Manager #88902

Closed
@kamaln7

What would you like to be added:
Currently, CCM uses exponential backoff when it fails to process a service change. It would be great if this behavior itself, not only the minimum and maximum delay values, could be adjusted if needed.

// How long to wait before retrying the processing of a service change.
// If this changes, the sleep in hack/jenkins/e2e.sh before downing a cluster
// should be changed appropriately.
minRetryDelay = 5 * time.Second
maxRetryDelay = 300 * time.Second

queue: workqueue.NewNamedRateLimitingQueue(workqueue.NewItemExponentialFailureRateLimiter(minRetryDelay, maxRetryDelay), "service"),
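
To make the resulting schedule concrete, here is a minimal standard-library sketch of the delays these two constants produce. It assumes the limiter starts at the minimum delay, doubles it on each consecutive failure of the same item, and caps it at the maximum; `retryDelay` is a hypothetical helper written for illustration and is not part of client-go.

```go
package main

import (
	"fmt"
	"time"
)

const (
	minRetryDelay = 5 * time.Second
	maxRetryDelay = 300 * time.Second
)

// retryDelay is a hypothetical helper (not a client-go function). It returns
// the wait before the next retry after `failures` earlier consecutive
// failures, doubling from minRetryDelay and capping at maxRetryDelay.
func retryDelay(failures int) time.Duration {
	delay := minRetryDelay
	for i := 0; i < failures; i++ {
		delay *= 2
		if delay >= maxRetryDelay {
			return maxRetryDelay
		}
	}
	return delay
}

func main() {
	// Print the wait before each of the first eight retries.
	for failures := 0; failures < 8; failures++ {
		fmt.Printf("retry %d: wait %v\n", failures+1, retryDelay(failures))
	}
}
```

Under these assumptions the waits are 5, 10, 20, 40, 80, and 160 seconds, and then 300 seconds for every later retry, regardless of what kind of service change failed.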

Why is this needed:
It is possible to change the minimum and maximum delay values, but in some cases that's not enough. Changing those values affects every kind of service change, and it would be great if the retry behavior could be adjusted for specific circumstances.

Take, for example, a new load balancer being created. We might want to reduce the max retry delay to just a few seconds so that it is ready within the cluster with minimal delay, while keeping the max retry delay at 5 minutes for other types of sync failures so as not to overload the API endpoint that is being polled.
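
One way to picture the request is a limiter that picks its delay bounds from the kind of change being retried. The sketch below is only an illustration using the standard library: `changeKind`, `backoffPolicy`, `kindBackoff`, and `newExampleBackoff` are hypothetical names, the 1 second base and 5 second cap for load balancer creation are assumed values standing in for "a few seconds", and none of this is an existing CCM or client-go API.

```go
package main

import (
	"fmt"
	"time"
)

// changeKind is a hypothetical classification of service changes.
type changeKind int

const (
	kindOther changeKind = iota
	kindLoadBalancerCreate
)

// backoffPolicy holds the delay bounds for one kind of change.
type backoffPolicy struct {
	base, max time.Duration
}

// kindBackoff counts consecutive failures per item key and applies the
// policy registered for the item's kind. Not safe for concurrent use.
type kindBackoff struct {
	policies map[changeKind]backoffPolicy
	fallback backoffPolicy
	failures map[string]int
}

// When records a failure for key and returns how long to wait before retrying.
func (b *kindBackoff) When(key string, kind changeKind) time.Duration {
	policy, ok := b.policies[kind]
	if !ok {
		policy = b.fallback
	}
	n := b.failures[key]
	b.failures[key] = n + 1

	// Double once per earlier failure, stopping as soon as the cap is reached.
	delay := policy.base
	for i := 0; i < n && delay < policy.max; i++ {
		delay *= 2
	}
	if delay > policy.max {
		delay = policy.max
	}
	return delay
}

// Forget resets the failure count for key, e.g. after a successful sync.
func (b *kindBackoff) Forget(key string) { delete(b.failures, key) }

// newExampleBackoff keeps the 5s/300s bounds as the default and uses assumed,
// much shorter bounds for load balancer creation.
func newExampleBackoff() *kindBackoff {
	return &kindBackoff{
		fallback: backoffPolicy{base: 5 * time.Second, max: 300 * time.Second},
		policies: map[changeKind]backoffPolicy{
			kindLoadBalancerCreate: {base: 1 * time.Second, max: 5 * time.Second},
		},
		failures: map[string]int{},
	}
}

func main() {
	b := newExampleBackoff()
	for i := 0; i < 5; i++ {
		fmt.Println(
			"new LB:", b.When("default/new-lb", kindLoadBalancerCreate),
			"other:", b.When("default/existing", kindOther),
		)
	}
}
```

With these assumed values, a failing load balancer creation is retried at most 5 seconds apart, while every other failure keeps backing off toward the 5 minute cap. How a change would be classified in practice is left open here.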

/sig cloud-provider

/cc @timoreimann — Timo, could you please take a look and make sure this is in line with what we discussed?

Labels

kind/feature: Categorizes issue or PR as related to a new feature.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
sig/cloud-provider: Categorizes an issue or PR as relevant to SIG Cloud Provider.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.
