feat: make CLE timers configurable #132433

Merged: 1 commit into kubernetes:master on Jun 30, 2025

Conversation

@michaelasp
Contributor

michaelasp commented Jun 20, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

Adds configurable timers to coordinated leader election, cleaning up TODOs.

Which issue(s) this PR is related to:

Special notes for your reviewer:

Adds flags to the apiserver. I wasn't entirely sure where to place them, so I'm looking for advice here.

Does this PR introduce a user-facing change?

Add configurable flags to kube-apiserver for coordinated leader election.

@k8s-ci-robot added the release-note (Denotes a PR that will be considered when it comes time to generate release notes), kind/feature (Categorizes issue or PR as related to a new feature), and size/M (Denotes a PR that changes 30-99 lines, ignoring generated files) labels Jun 20, 2025
@k8s-ci-robot
Contributor

@michaelasp: The label(s) sig/####, sig/what, sig/this, sig/pr, sig/does, sig//, sig/why, sig/we, sig/need, sig/it: cannot be applied, because the repository doesn't have them.

In response to this:

What type of PR is this?

/kind feature
/sig

What this PR does / why we need it:

Adds configurable timers to coordinated leader election, cleaning up TODOs.

Which issue(s) this PR is related to:

Special notes for your reviewer:

Adds flags to the apiserver. I wasn't entirely sure where to place them, so I'm looking for advice here.

Does this PR introduce a user-facing change?

Add configurable flags to kube-apiserver for coordinated leader election.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA), do-not-merge/needs-sig (Indicates an issue or PR lacks a `sig/foo` label and requires one), and needs-triage (Indicates an issue or PR lacks a `triage/foo` label and requires one) labels Jun 20, 2025
@k8s-ci-robot
Contributor

Hi @michaelasp. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

@k8s-ci-robot added the needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test) and needs-priority (Indicates a PR lacks a `priority/foo` label and requires one) labels Jun 20, 2025
@michaelasp
Contributor Author

/assign @Jefftree

@k8s-ci-robot added the sig/api-machinery (Categorizes an issue or PR as relevant to SIG API Machinery) label and removed the do-not-merge/needs-sig label Jun 20, 2025
@k8s-ci-robot
Contributor

@michaelasp: The label(s) sig/apimachinery cannot be applied, because the repository doesn't have them.

In response to this:

/sig apimachinery

@pacoxu
Member

pacoxu commented Jun 23, 2025

/cc @Jefftree @jpbetz

@k8s-ci-robot requested review from Jefftree and jpbetz June 23, 2025 07:28
@Jefftree
Member

/ok-to-test

@k8s-ci-robot added the ok-to-test (Indicates a non-member PR verified by an org member that is safe to test) label and removed the needs-ok-to-test label Jun 23, 2025
@@ -102,6 +102,11 @@ type Extra struct {
SystemNamespaces []string

VersionedInformers clientgoinformers.SharedInformerFactory

// Coordinated Leader Election timers
LeaseDuration time.Duration
Member

nit: The apiserver has many other leases, so these variable names are probably a bit confusing. Perhaps prepend a CLE or CoordinatedLeaderElection prefix?

Contributor Author

Yeah, I wasn't sure exactly what to name them, but I agree that this is too generic. I'll add CLE as a prefix so the variable names don't get too long.

@@ -29,12 +29,11 @@ import (
"k8s.io/klog/v2"
)

var (
// TODO: Eventually these should be configurable
LeaseDuration = 15 * time.Second
Member

These are also defined in a way that can be overridden in integration tests:
https://github.com/kubernetes/kubernetes/blob/master/test/integration/apiserver/coordinatedleaderelection/leaderelection_test.go#L50-L54

I'd imagine these tests will fail at the moment, can you update the references to set these up properly?

Contributor Author

Yep, let me update those.

@k8s-ci-robot added the area/test and sig/testing (Categorizes an issue or PR as relevant to SIG Testing) labels Jun 23, 2025
Comment on lines 133 to 135
"--cle-lease-duration=10s",
"--cle-renew-deadline=5s",
"--cle-retry-period=1s",
Contributor

nit: We typically avoid using acronyms in flag names. Can we find some prefix that's not horribly verbose as an alternative to "cle" here?

Contributor Author

Makes sense, how about coordinated-leadership as the prefix?

Contributor

SGTM
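
To illustrate the naming agreed on in this thread, here is a minimal, self-contained sketch that registers three duration flags under the `coordinated-leadership` prefix. It uses Go's standard `flag` package rather than the `cliflag` flag sets that kube-apiserver uses, so it is only an approximation. Only `coordinated-leadership-lease-duration` and the 15s lease default appear in this PR's diff; the other two flag names are assumed by analogy with the `--cle-*` flags, and `timerOptions`, `parseTimerFlags`, and the 10s and 2s defaults are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// timerOptions is a hypothetical stand-in for the apiserver options struct.
type timerOptions struct {
	LeaseDuration time.Duration
	RenewDeadline time.Duration
	RetryPeriod   time.Duration
}

// parseTimerFlags registers the three duration flags on a fresh FlagSet,
// parses args, and returns the resulting options.
func parseTimerFlags(args []string) (timerOptions, error) {
	// Assumed defaults: only the 15s lease duration is visible in the diff.
	opts := timerOptions{
		LeaseDuration: 15 * time.Second,
		RenewDeadline: 10 * time.Second,
		RetryPeriod:   2 * time.Second,
	}

	fs := flag.NewFlagSet("apiserver-sketch", flag.ContinueOnError)
	fs.DurationVar(&opts.LeaseDuration, "coordinated-leadership-lease-duration",
		opts.LeaseDuration, "Duration of the leader lease.")
	// The next two flag names are assumptions made by analogy.
	fs.DurationVar(&opts.RenewDeadline, "coordinated-leadership-renew-deadline",
		opts.RenewDeadline, "Deadline for renewing the lease.")
	fs.DurationVar(&opts.RetryPeriod, "coordinated-leadership-retry-period",
		opts.RetryPeriod, "Wait between lease acquisition attempts.")

	if err := fs.Parse(args); err != nil {
		return timerOptions{}, err
	}
	return opts, nil
}

func main() {
	opts, err := parseTimerFlags([]string{
		"--coordinated-leadership-lease-duration=10s",
		"--coordinated-leadership-renew-deadline=5s",
		"--coordinated-leadership-retry-period=1s",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(opts.LeaseDuration, opts.RenewDeadline, opts.RetryPeriod)
}
```

Flags that are not passed keep their defaults, which is why the options struct is populated before registration.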

@k8s-ci-robot added the do-not-merge/invalid-commit-message (Indicates that a PR should not merge because it has an invalid commit message) label Jun 23, 2025
@k8s-ci-robot removed the do-not-merge/invalid-commit-message label Jun 23, 2025
@@ -202,6 +209,13 @@ func (s *Options) AddFlags(fss *cliflag.NamedFlagSets) {

fs.StringVar(&s.ServiceAccountSigningEndpoint, "service-account-signing-endpoint", s.ServiceAccountSigningEndpoint, ""+
"Path to socket where a external JWT signer is listening. This flag is mutually exclusive with --service-account-signing-key-file and --service-account-key-file. Requires enabling feature gate (ExternalServiceAccountTokenSigner)")

fs.DurationVar(&s.CoordinatedLeadershipLeaseDuration, "coordinated-leadership-lease-duration", s.CoordinatedLeadershipLeaseDuration,
Contributor

You could also add validation for these flags under https://github.com/kubernetes/kubernetes/blob/master/pkg/controlplane/apiserver/options/validation.go and have corresponding tests in pkg/controlplane/apiserver/options/validation_test.go if it makes sense

Contributor Author

This is a good point. I think we have certain invariants, such as needing to renew the lease at a shorter interval than the lease duration. Let me add those validations.

I don't think we can check whether the feature gate is enabled here, though: the timers have default values, so it's difficult to tell from the values alone whether the flags were explicitly set. IMO this should be fine since it's all feature gated regardless.

Contributor Author

Added, PTAL!
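
The validation added to the PR is not shown in this conversation. As a rough sketch of the kind of invariant described above (the lease must be renewed at a shorter interval than its duration), the following self-contained example checks three timers. The type `cleTimers`, the functions `defaultTimers` and `validateTimers`, the 10s and 2s defaults, and the renew-deadline versus retry-period check are all assumptions for illustration, not the code in `validation.go`.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// cleTimers is a hypothetical container for the three timers.
type cleTimers struct {
	LeaseDuration time.Duration
	RenewDeadline time.Duration
	RetryPeriod   time.Duration
}

// defaultTimers returns assumed defaults; only the 15s lease duration
// is visible in the PR's diff.
func defaultTimers() cleTimers {
	return cleTimers{
		LeaseDuration: 15 * time.Second,
		RenewDeadline: 10 * time.Second,
		RetryPeriod:   2 * time.Second,
	}
}

// validateTimers collects every violated invariant instead of stopping
// at the first one, so a user sees all problems at once.
func validateTimers(t cleTimers) []error {
	var errs []error
	if t.LeaseDuration <= 0 {
		errs = append(errs, errors.New("lease duration must be greater than zero"))
	}
	if t.RenewDeadline <= 0 {
		errs = append(errs, errors.New("renew deadline must be greater than zero"))
	}
	if t.RetryPeriod <= 0 {
		errs = append(errs, errors.New("retry period must be greater than zero"))
	}
	// Invariant from the discussion: renew sooner than the lease expires.
	if t.LeaseDuration <= t.RenewDeadline {
		errs = append(errs, errors.New("lease duration must be greater than renew deadline"))
	}
	// Assumed additional invariant: retries must fit inside the renew deadline.
	if t.RenewDeadline <= t.RetryPeriod {
		errs = append(errs, errors.New("renew deadline must be greater than retry period"))
	}
	return errs
}

func main() {
	bad := defaultTimers()
	bad.RenewDeadline = bad.LeaseDuration // violates the lease/renew invariant
	for _, err := range validateTimers(bad) {
		fmt.Println("invalid:", err)
	}
}
```

Because the check only looks at the final values, it applies equally to defaults and to user-supplied flags, which sidesteps the question of whether a flag was explicitly set.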

@k8s-ci-robot added the size/L (Denotes a PR that changes 100-499 lines, ignoring generated files) label and removed the size/M label Jun 24, 2025
Member

@Jefftree left a comment

Thank you!

/lgtm
/approve

defaultLeaseDuration := leaderelection.LeaseDuration
defaultRenewDeadline := leaderelection.RenewDeadline
defaultRetryPeriod := leaderelection.RetryPeriod
defer func() {
Member

Love that we can get rid of the defer since this isn't shared anymore :)
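
A sketch of why the `defer` becomes unnecessary, under the assumption that the timers moved from shared package-level variables to per-server configuration: with a shared variable, a test that overrides it must restore the original afterwards, whereas a value passed into each server instance leaves nothing to restore. The names `sharedLeaseDuration`, `withShortLease`, `server`, and `newServer` are hypothetical and not taken from the Kubernetes code.

```go
package main

import (
	"fmt"
	"time"
)

// Old pattern: a package-level variable shared by every caller.
var sharedLeaseDuration = 15 * time.Second

// withShortLease overrides the shared value for the duration of run and
// restores it afterwards so that other tests are not affected.
func withShortLease(run func()) {
	original := sharedLeaseDuration
	defer func() { sharedLeaseDuration = original }()
	sharedLeaseDuration = 10 * time.Second
	run()
}

// New pattern: each server carries its own timer value.
type server struct {
	leaseDuration time.Duration
}

// newServer builds a server with its own lease duration; no global state
// is touched, so there is nothing to restore afterwards.
func newServer(lease time.Duration) *server {
	return &server{leaseDuration: lease}
}

func main() {
	withShortLease(func() {
		fmt.Println("inside override:", sharedLeaseDuration)
	})
	fmt.Println("after override:", sharedLeaseDuration)

	a := newServer(10 * time.Second)
	b := newServer(15 * time.Second)
	fmt.Println("per-server values:", a.leaseDuration, b.leaseDuration)
}
```

The per-instance form also lets two servers in the same test process run with different timers, which a shared variable cannot express.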

@k8s-ci-robot added the lgtm ("Looks good to me", indicates that a PR is ready to be merged) label Jun 25, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2a74087f548d4994f4f7d3a8f7cd3222de21da24

@richabanker
Contributor

/lgtm

Do the commits need to be squashed?

@yongruilin
Contributor

/triage accepted

@k8s-ci-robot added the triage/accepted (Indicates an issue or PR is ready to be actively worked on) label and removed the needs-triage label Jun 26, 2025
@michaelasp
Contributor Author

/assign @jpbetz for final LGTM

Contributor

@jpbetz left a comment

/approve
/lgtm

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Jefftree, jpbetz, michaelasp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved (Indicates a PR has been approved by an approver from all required OWNERS files) label Jun 30, 2025
@michaelasp
Contributor Author

michaelasp commented Jun 30, 2025

/retest

The failure is unrelated; let me check whether a flake has been filed for it. #132506, filed last week, seems to be the same issue.

@k8s-ci-robot merged commit 201325e into kubernetes:master Jun 30, 2025
12 of 13 checks passed
@k8s-ci-robot added this to the v1.34 milestone Jun 30, 2025
Labels
approved, area/apiserver, area/test, cncf-cla: yes, kind/feature, lgtm, needs-priority, ok-to-test, release-note, sig/api-machinery, sig/testing, size/L, triage/accepted
7 participants