-
Notifications
You must be signed in to change notification settings - Fork 40.9k
Fail Kubelet at startup if swap is configured with cgroup v1 #119327
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fail Kubelet at startup if swap is configured with cgroup v1 #119327
Conversation
Hi @iholder101. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
@@ -225,6 +225,10 @@ func NewContainerManager(mountUtil mount.Interface, cadvisorInterface cadvisor.I | |||
return nil, fmt.Errorf("running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /proc/swaps contained: %v", swapLines) | |||
} | |||
} | |||
} else { | |||
if utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) && !cgroups.IsCgroup2UnifiedMode() { | |||
return nil, fmt.Errorf("running swap with cgroups v1 is not supported. please either disable %s feature gate", kubefeatures.NodeSwap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil, fmt.Errorf("running swap with cgroups v1 is not supported. please either disable %s feature gate", kubefeatures.NodeSwap) | |
return nil, fmt.Errorf("running swap with cgroups v1 is not supported. Please either disable %s feature gate or start using cgroup v2.", kubefeatures.NodeSwap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks :) Probably my mind had a context switch and didn't switch back..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not addressed yet but outdated for rebasing.
@@ -225,6 +225,10 @@ func NewContainerManager(mountUtil mount.Interface, cadvisorInterface cadvisor.I | |||
return nil, fmt.Errorf("running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /proc/swaps contained: %v", swapLines) | |||
} | |||
} | |||
} else { | |||
if utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) && !cgroups.IsCgroup2UnifiedMode() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't thinking about it before, but now I realize that the problem with this is that all tests running today with --fail-on-swap=false
will stop working as the feature gate will be enabled by default.
Even though it is a desirable behavior to fail fast long term, breaking many scenarios is not ideal.
One option is to change the default value for the feature gate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- @deads2k (PRR reviewer for the SWAP KEP) for the opinion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't thinking about it before, but now I realize that the problem with this is that all tests running today with --fail-on-swap=false will stop working as the feature gate will be enabled by default.
Yes this is exactly right, and was one of the reasons I asked to do so in a follow-up :)
I think you're right, and that the right approach moving forward is:
- Introduce a PR that would adapt the tests.
- Merge this PR after making sure it won't break any lanes.
One option is to change the default value for the feature gate.
What do you mean by that? Isn't the default behavior for the feature gate is to be disabled?
IIUC all we need to do is remove all jobs that are defined with cgroup v1 + NodeSwap enabled. We should keep (and maybe add more? but outside of this PR's scope) the jobs with swap + v2.
WDYT?
/test all Let's see exactly which tests fail to discuss our next steps. |
/test pull-kubernetes-cos-cgroupv1-containerd-node-e2e |
/test pull-kubernetes-node-swap-ubuntu-serial |
/test pull-kubernetes-node-kubelet-serial-crio-cgroupv2 |
/test pull-kubernetes-e2e-kind-ipv6 |
@SergeyKanzhelev Can you PTAL? I've also changed the way the functional test checks whether a feature gate is on. It seem to work nice, as the tests pass for several times now. note: |
/retest |
8c5e84f
to
afb6bab
Compare
afb6bab
to
2fa0a64
Compare
@SergeyKanzhelev I've moved the commits that were dealing with the feature gate check into another PR: #120139. This should be simpler to review now. PTAL. |
2fa0a64
to
dd3a825
Compare
/test pull-kubernetes-node-kubelet-serial-crio-cgroupv2 /test pull-kubernetes-node-swap-ubuntu-serial /test pull-kubernetes-node-swap-fedora |
@iholder101: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Signed-off-by: Itamar Holder <[email protected]>
Signed-off-by: Itamar Holder <[email protected]>
dd3a825
to
59fd923
Compare
@SergeyKanzhelev ping |
/cc |
@@ -224,6 +224,8 @@ func NewContainerManager(mountUtil mount.Interface, cadvisorInterface cadvisor.I | |||
return nil, fmt.Errorf("running with swap on is not supported, please disable swap! or set --fail-swap-on flag to false. /proc/swaps contained: %v", swapLines) | |||
} | |||
} | |||
} else if utilfeature.DefaultFeatureGate.Enabled(kubefeatures.NodeSwap) && !cgroups.IsCgroup2UnifiedMode() { | |||
return nil, fmt.Errorf("running swap with cgroups v1 is not supported. please either disable %s feature gate", kubefeatures.NodeSwap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil, fmt.Errorf("running swap with cgroups v1 is not supported. please either disable %s feature gate", kubefeatures.NodeSwap) | |
return nil, fmt.Errorf("running swap with cgroups v1 is not supported. please disable %s feature gate", kubefeatures.NodeSwap) |
@iholder101 thank you for this change. I moved the code to #122241 so we can continue this work. /close |
@kannon92: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Follow-up to #118764 and #119726.
Since swap is no longer supported for cgroup v1, fail kubelet if it starts with swap and cgroup v1. This way, in future versions of k8s, we could turn NodeSwap on by default.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: