Skip to content

[KEP-2400] [Bugfix]: Ensure container-level swap metrics are collected #129486

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

iholder101
Copy link
Contributor

@iholder101 iholder101 commented Jan 6, 2025

What type of PR is this?

/kind bug
/kind feature
/sig node

What this PR does / why we need it:

This PR is a follow-up to #118865.
In the above PR, swap stats were added, but wrongly only to the pod-level metrics and not to the container-level metrics. This was found out during testing the swap evictions POC [1].

This PR ensures container-level swap metrics are collected.
It also adds unit tests for container-level metrics, as well as missing unit tests for ensuing swap stats are included in cadvisor's CPU and memory stats.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Bugfix: Ensure container-level swap metrics are collected

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 6, 2025
@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 6, 2025
@kannon92
Copy link
Contributor

kannon92 commented Jan 6, 2025

LGTM. Please fix the linter hints.

Should we consider adding a e2e test for swap tests that covers this?

@kannon92
Copy link
Contributor

kannon92 commented Jan 6, 2025

/priority important-soon
/triage accepted

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 6, 2025
@iholder101 iholder101 force-pushed the bugfix/swap-container-cri-stats branch 2 times, most recently from f7577e7 to c7fa1cb Compare January 7, 2025 09:14
@iholder101
Copy link
Contributor Author

iholder101 commented Jan 7, 2025

Thanks @kannon92 for your review!

LGTM. Please fix the linter hints.

Done!

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 7, 2025
@iholder101
Copy link
Contributor Author

Should we consider adding a e2e test for swap tests that covers this?

Done as well :)

/test pull-kubernetes-node-swap-conformance-fedora-serial
/test pull-kubernetes-node-swap-conformance-ubuntu-serial

@iholder101 iholder101 force-pushed the bugfix/swap-container-cri-stats branch from d94f981 to fbfd48e Compare January 7, 2025 12:49
@iholder101
Copy link
Contributor Author

Should we consider adding a e2e test for swap tests that covers this?

Done as well :)

/test pull-kubernetes-node-swap-conformance-fedora-serial /test pull-kubernetes-node-swap-conformance-ubuntu-serial

Fedora lane failed for not using enough swap, so bumped the stress size by a bit.

/test pull-kubernetes-node-swap-conformance-fedora-serial
/test pull-kubernetes-node-swap-conformance-ubuntu-serial

iholder101 added a commit to iholder101/kubernetes that referenced this pull request Jan 12, 2025
This fix needs to be removed before merge
after the following PR gets in:
kubernetes#129486

Signed-off-by: Itamar Holder <[email protected]>
@iholder101 iholder101 force-pushed the bugfix/swap-container-cri-stats branch from 808c2f6 to 5e1edb0 Compare January 27, 2025 11:13
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 27, 2025
@iholder101
Copy link
Contributor Author

Rebased.

/test pull-kubernetes-node-swap-conformance-fedora-serial
/test pull-kubernetes-node-swap-conformance-ubuntu-serial

@iholder101
Copy link
Contributor Author

@kannon92 can you PTAL? Anything else from your side?

@iholder101 iholder101 force-pushed the bugfix/swap-container-cri-stats branch 2 times, most recently from 8358d27 to 051490f Compare January 27, 2025 12:51
Signed-off-by: Itamar Holder <[email protected]>
@iholder101 iholder101 force-pushed the bugfix/swap-container-cri-stats branch from 051490f to 617c094 Compare January 27, 2025 13:44
@iholder101
Copy link
Contributor Author

/test pull-kubernetes-node-swap-conformance-fedora-serial
/test pull-kubernetes-node-swap-conformance-ubuntu-serial

@kannon92
Copy link
Contributor

/retest

/lgtm
/assign @mrunalp

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 27, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 687790e2b41fcc53fd177dd1bf7eb621a7294c21

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: iholder101, mrunalp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 4, 2025
@k8s-ci-robot k8s-ci-robot merged commit f82439f into kubernetes:master Feb 4, 2025
15 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.33 milestone Feb 4, 2025
HirazawaUi pushed a commit to HirazawaUi/kubernetes that referenced this pull request Feb 7, 2025
…wap-container-cri-stats"

This reverts commit f82439f, reversing
changes made to 7f9fdd6.
HirazawaUi pushed a commit to HirazawaUi/kubernetes that referenced this pull request Feb 7, 2025
…wap-container-cri-stats"

This reverts commit f82439f, reversing
changes made to 7f9fdd6.
HirazawaUi pushed a commit to HirazawaUi/kubernetes that referenced this pull request Feb 7, 2025
…wap-container-cri-stats"

This reverts commit f82439f, reversing
changes made to 7f9fdd6.
HirazawaUi pushed a commit to HirazawaUi/kubernetes that referenced this pull request Feb 7, 2025
…wap-container-cri-stats"

This reverts commit f82439f, reversing
changes made to 7f9fdd6.
HirazawaUi added a commit to HirazawaUi/kubernetes that referenced this pull request Feb 7, 2025
…wap-container-cri-stats"

This reverts commit f82439f, reversing
changes made to 7f9fdd6.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants