[KCCM]: have providerID trigger re-sync, but not be required for load balancer syncs #117602

Merged

Conversation

alexanderConstantinescu
Member

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

#117388 left some open-ended questions around "are we sure all cloud providers even set a providerID at all?" We are not. If a cloud provider does not set the providerID but still expects the service controller to provision LBs, this won't work for them, because #117388 requires the providerID to be defined on every node that we pass to the cloud provider when configuring load balancers.

This PR revises that logic: it triggers a resync when any node gets a providerID assigned, but it does not filter nodes based on it.
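
To illustrate the split between "which nodes become backends" and "which node updates trigger a resync", here is a minimal sketch. It is not the controller's actual code: `Node` is a hypothetical stand-in for `v1.Node`, the `Ready` field is a placeholder for whatever other criteria the controller applies, and the function names are made up for the example.

```go
package main

import "fmt"

// Node is a hypothetical, trimmed-down stand-in for v1.Node.
type Node struct {
	Name       string
	ProviderID string
	Ready      bool // placeholder for the controller's other node criteria
}

// nodesForLoadBalancer selects the nodes handed to the cloud provider.
// ProviderID is deliberately not consulted here.
func nodesForLoadBalancer(nodes []Node) []Node {
	var out []Node
	for _, n := range nodes {
		if n.Ready {
			out = append(out, n)
		}
	}
	return out
}

// providerIDAssigned reports whether an update set a previously empty
// ProviderID. It only decides whether to resync, never which nodes to use.
func providerIDAssigned(oldNode, newNode Node) bool {
	return oldNode.ProviderID == "" && newNode.ProviderID != ""
}

func main() {
	nodes := []Node{
		{Name: "a", ProviderID: "", Ready: true},
		{Name: "b", ProviderID: "example://b", Ready: true},
		{Name: "c", ProviderID: "example://c", Ready: false},
	}
	for _, n := range nodesForLoadBalancer(nodes) {
		fmt.Println("backend:", n.Name)
	}
	updated := Node{Name: "a", ProviderID: "example://a", Ready: true}
	fmt.Println("resync:", providerIDAssigned(nodes[0], updated))
}
```

In this sketch node `a` stays in the backend list even though it has no providerID, which is the behavior cloud providers that never set the field would rely on.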

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

[KCCM] Drop filtering of nodes by providerID when syncing load balancers, but have changes to the field trigger a re-sync of load balancers. This should ensure that cloud providers which don't specify a providerID can still use the service controller implementation to provision load balancers.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


/assign @thockin

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 25, 2023
@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Apr 25, 2023
@alexanderConstantinescu
Member Author

/sig network
/sig cloud-provider

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Apr 25, 2023
// This predicate just validates if the providerID has been set and triggers a
// node sync. It is _not_ used when determining which nodes to use when
// configuring the load balancer's backend pool.
func nodeHasProviderIDPredicate(node *v1.Node) bool {
Member

We only use this in shouldSyncUpdatedNode() now, and we test whether the predicate changed. In other words, we handle "" to "value" and "value" to "", but not "foo" to "bar". Should we just compare the values directly?

Member Author

Can the providerID even change? I was under the impression it can't.

EDIT: it technically can, obviously. And yes, we cover more ground by changing the logic to react to that. I'll update.
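
The difference between the two checks discussed in this thread can be sketched as follows. This is a simplified illustration only: `Node` is a hypothetical stand-in for `v1.Node`, and `predicateChanged` and `providerIDChanged` are names invented for the example, not functions from the service controller.

```go
package main

import "fmt"

// Node is a hypothetical, trimmed-down stand-in for v1.Node.
type Node struct {
	Name       string
	ProviderID string
}

// hasProviderID mirrors the idea of a "providerID is set" predicate.
func hasProviderID(n Node) bool {
	return n.ProviderID != ""
}

// predicateChanged only notices transitions between empty and non-empty.
func predicateChanged(oldNode, newNode Node) bool {
	return hasProviderID(oldNode) != hasProviderID(newNode)
}

// providerIDChanged compares the values directly, so it also notices
// a change from one non-empty value to another.
func providerIDChanged(oldNode, newNode Node) bool {
	return oldNode.ProviderID != newNode.ProviderID
}

func main() {
	cases := [][2]string{{"", "foo"}, {"foo", ""}, {"foo", "bar"}, {"foo", "foo"}}
	for _, c := range cases {
		o, n := Node{ProviderID: c[0]}, Node{ProviderID: c[1]}
		fmt.Printf("%q -> %q: predicate=%v direct=%v\n",
			c[0], c[1], predicateChanged(o, n), providerIDChanged(o, n))
	}
}
```

The two checks agree on every case except "foo" to "bar", which only the direct comparison reports as a change.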

@aojea
Member

aojea commented Apr 29, 2023

/assign @nckturner @andrewsykim
It seems we need to understand what the providerID contract with cloud providers is.

@thockin
Member

thockin commented May 30, 2023

Thanks!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 30, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 0a158f33766466206d4d0aceebd7d32e16b27c6f

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alexanderConstantinescu, thockin

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 30, 2023
@k8s-ci-robot k8s-ci-robot merged commit 12d3f5c into kubernetes:master May 30, 2023
12 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.28 milestone May 30, 2023
@alexanderKhaustov

@alexanderConstantinescu
Hi,
There seems to be a bug here.
Please see the attached log. It demonstrates the following scenario:

  • A new node is added.
  • The first updateLoadBalancerHosts runs, finishes successfully, and updates lastSyncedNodes with a list that includes the new node.
  • The new node gets its providerID.
  • A second updateLoadBalancerHosts runs, but because lastSyncedNodes already contains the new node, the cloud provider's UpdateLoadBalancer is not triggered.

So in the end, UpdateLoadBalancer is never called after the node's providerID is set.
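
The scenario above can be modeled with a small sketch. This is a toy model under stated assumptions, not the controller's code: it assumes the skip decision compares the previously synced nodes to the current ones by node name only, and the `controller` type, its `sync` method, and the `updateCalls` counter (standing in for calls to the cloud provider's UpdateLoadBalancer) are all hypothetical.

```go
package main

import "fmt"

// Node is a hypothetical, trimmed-down stand-in for v1.Node.
type Node struct {
	Name       string
	ProviderID string
}

// controller is a toy model of the relevant service controller state.
type controller struct {
	lastSyncedNodes map[string]bool // node names seen in the last sync
	updateCalls     int             // stands in for UpdateLoadBalancer calls
}

// sync models updateLoadBalancerHosts under the assumption that the
// cloud provider is only called when the set of node names has changed.
func (c *controller) sync(nodes []Node) {
	current := make(map[string]bool, len(nodes))
	for _, n := range nodes {
		current[n.Name] = true
	}
	if sameKeys(c.lastSyncedNodes, current) {
		return // node set unchanged: the provider call is skipped
	}
	c.updateCalls++
	c.lastSyncedNodes = current
}

// sameKeys reports whether two sets contain exactly the same names.
func sameKeys(a, b map[string]bool) bool {
	if len(a) != len(b) {
		return false
	}
	for k := range a {
		if !b[k] {
			return false
		}
	}
	return true
}

func main() {
	c := &controller{}
	old := Node{Name: "old", ProviderID: "example://old"}

	c.sync([]Node{old})                                             // initial sync
	c.sync([]Node{old, {Name: "new"}})                              // new node added, no providerID yet
	c.sync([]Node{old, {Name: "new", ProviderID: "example://new"}}) // providerID assigned

	fmt.Println("UpdateLoadBalancer calls:", c.updateCalls)
}
```

In this model the third sync is skipped because the set of node names did not change, so the provider never sees the node with its providerID set.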

log1.txt

@alexanderConstantinescu
Member Author

Hi @alexanderKhaustov

Which version of Kube is this? You might be using a version which is missing: #120943

@alexanderKhaustov

> Which version of Kube is this? You might be using a version which is missing: #120943

Indeed, mine predates 1.28.4, thanks.
Quite a dramatic story, split across 3 issues :)

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cloudprovider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/network Categorizes an issue or PR as relevant to SIG Network. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

7 participants