Skip to content

[KEP-2400] kubectl top: add a --show-swap option #129458

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jun 4, 2025

Conversation

iholder101
Copy link
Contributor

@iholder101 iholder101 commented Jan 2, 2025

What type of PR is this?

/kind feature

Background

A while ago, swap stats were added to summary and metric APIs as part of #118865. Although these stats are extremely valuable, they're mostly designed to be consumed by programs (like HPAs or metrics solutions), not by humans. This raised the request for a better way of understanding which nodes have swap, the swap capacity and usage, which pods consume swap space and how much, etc.

I've tried to address this need in two past attempts.
In the first one, #123963, I've tried to add swap as another ReasourceName, hence to display it as part of kubectl describe node's status. This attempt was eventually denied with the main concern being that adding swap as a ResourceName is confusing since it cannot be directly/explicitly consumed (since the swap limit is auto-calculated).

In the second one, #125278, I've tried to add a swap condition to nodes. This PR was also denied with the main concern being that a condition is not suited to report such data, and that a field is recommended in this case, although it wasn't easy thinking of a good field that will benefit the API.

Eventually we agreed to defer this to the future. However, while attempting to GA the swap feature there was a hard push-back against GA without addressing these needs.

Therefore, this is my third attempt to address it. My hope is that it will suffice the need of debugability and will pave the way for the upcoming GA.

What this PR does / why we need it:

This PR adds adds the --show-swap option to the kubectl top subcommands.

Behind the scenes, this purely relies on consuming the stats added in #118865, making them human-readable.

Example output:

    > kubectl top node --show-swap
    NAME     CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)   SWAP(bytes)   SWAP(%)   SWAP CAPACITY(bytes)
    node01   500m         8%       2836Mi          60%         0Mi           0%        0Mi
    node02   260m         5%       2206Mi          47%         102Mi         10%       1023Mi
    > kubectl top pod -n kube-system --show-swap
    NAME                                      CPU(cores)   MEMORY(bytes)   SWAP(bytes)
    coredns-58d5bc5cdb-5nbk4                  2m           19Mi            0Mi
    coredns-58d5bc5cdb-jsh26                  3m           37Mi            0Mi
    etcd-node01                               51m          143Mi           0Mi
    kube-apiserver-node01                     98m          824Mi           0Mi
    kube-controller-manager-node01            20m          135Mi           0Mi
    kube-proxy-ffgs2                          1m           24Mi            0Mi
    kube-proxy-fhvwx                          1m           39Mi            0Mi
    kube-scheduler-node01                     13m          69Mi            0Mi
    metrics-server-8598789fdb-d2kcj           5m           26Mi            0Mi

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

I recommend to review per-commit.

Does this PR introduce a user-facing change?

Added a `--show-swap` option to `kubectl top` subcommands

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jan 2, 2025
@k8s-ci-robot k8s-ci-robot added area/kubectl sig/cli Categorizes an issue or PR as relevant to SIG CLI. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 2, 2025
@iholder101 iholder101 changed the title kubectl top: add a --show-swap option [KEP-2400] kubectl top: add a --show-swap option Jan 2, 2025
@iholder101 iholder101 marked this pull request as ready for review January 2, 2025 17:25
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 2, 2025
@sftim
Copy link
Contributor

sftim commented Jan 7, 2025

Changelog suggestion

-Add a "--show-swap" option to kubectl top subcommands
+Added a `--show-swap` option to `kubectl top` subcommands

@aojea
Copy link
Member

aojea commented Jan 17, 2025

LGTM

@iholder101
Copy link
Contributor Author

iholder101 commented Mar 30, 2025

If the swap feature gate is off, we should probably not display swap column.

@kannon92 I tend to disagree. I don't see a reason to disallow using the explicit --show-swap flag if the feature gate is turned off. I even think that it's beneficial, especially after turning the FG on and off (or switching between swap behaviors), to validate that indeed workloads aren't using swap. It can also be used to inspect node's swap capacity even if it is not accessible to k8s workloads.

@iholder101
Copy link
Contributor Author

iholder101 commented Mar 31, 2025

currently you're always reporting 0 no matter if swap data is available or not, which might be misleading.

@soltysh Done. It will now look like the following:
With swap:

NAME    CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)   SWAP(bytes)   SWAP(%)   
node1   1m           10%      2Mi             10%         1Mi           0%        
node2   5m           10%      6Mi             10%         2Mi           0%        
node3   3m           10%      4Mi             10%         0Mi           0%        

Withous swap:

NAME    CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)   SWAP(bytes)   SWAP(%)       
node1   1m           10%      2Mi             10%         unavailable   unavailable   
node2   5m           10%      6Mi             10%         unavailable   unavailable   
node3   3m           10%      4Mi             10%         unavailable   unavailable   

I prefer to leave the top pod as-is TBH. It could be a useful way of ensuring pods aren't using swap after switching swap on/off, as the node will set swap capacity to nil if the feature gate is dropped:

if utilfeature.DefaultFeatureGate.Enabled(features.NodeSwap) && info.SwapCapacity != 0 {
node.Status.NodeInfo.Swap = &v1.NodeSwapStatus{
Capacity: ptr.To(int64(info.SwapCapacity)),
}
}

In addition, currently the pod printer relies solely on metrics, which only say that the pod is not using any swap. To gather information about nodes we'd need to add an addition list node API call.

@iholder101 iholder101 force-pushed the feature/top-swap branch 2 times, most recently from 51d6881 to 88b66ab Compare March 31, 2025 15:15
@iholder101
Copy link
Contributor Author

currently you're always reporting 0 no matter if swap data is available or not, which might be misleading.

@soltysh Done. It will now look like the following: With swap:

NAME    CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)   SWAP(bytes)   SWAP(%)   
node1   1m           10%      2Mi             10%         1Mi           0%        
node2   5m           10%      6Mi             10%         2Mi           0%        
node3   3m           10%      4Mi             10%         0Mi           0%        

Withous swap:

NAME    CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)   SWAP(bytes)   SWAP(%)       
node1   1m           10%      2Mi             10%         unavailable   unavailable   
node2   5m           10%      6Mi             10%         unavailable   unavailable   
node3   3m           10%      4Mi             10%         unavailable   unavailable   

I prefer to leave the top pod as-is TBH. It could be a useful way of ensuring pods aren't using swap after switching swap on/off, as the node will set swap capacity to nil if the feature gate is dropped:

if utilfeature.DefaultFeatureGate.Enabled(features.NodeSwap) && info.SwapCapacity != 0 {
node.Status.NodeInfo.Swap = &v1.NodeSwapStatus{
Capacity: ptr.To(int64(info.SwapCapacity)),
}
}

In addition, currently the pod printer relies solely on metrics, which only say that the pod is not using any swap. To gather information about nodes we'd need to add an addition list node API call.

@soltysh WDYT? :)

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two final comments, and lets land this. @iholder101 ping me on slack when you get that updated and I'll tag.

@@ -198,6 +200,13 @@ func (o TopNodeOptions) RunTopNode() error {
} else {
availableResources[n.Name] = n.Status.Capacity
}

swapCapacity := int64(0)
if n.Status.NodeInfo.Swap != nil && n.Status.NodeInfo.Swap.Capacity != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Itamar is correct here.

fmt.Fprintf(out, "%vMi", quantity.Value()/(1024*1024))
default:
fmt.Fprintf(out, "%v", quantity.Value())
}
}

func printPodResourcesSum(out io.Writer, total v1.ResourceList, columnWidth int) {
func printSingleMissingResource(out io.Writer) {
const unavailableStr = "unavailable"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const unavailableStr = "unavailable"
const unavailableStr = "<unkown>"

is what we're mostly using, in kubectl decribe output, for example. So let's be consistent with that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! 👍

Example output:
> kubectl top node --show-swap
NAME     CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)   SWAP(bytes)   SWAP(%)
node01   500m         8%       2836Mi          60%         0Mi           0%
node02   260m         5%       2206Mi          47%         512Mi         50%

Signed-off-by: Itamar Holder <[email protected]>
Example output:
> kubectl top pod -n kube-system --show-swap
NAME                                      CPU(cores)   MEMORY(bytes)   SWAP(bytes)
coredns-58d5bc5cdb-5nbk4                  2m           19Mi            0Mi
coredns-58d5bc5cdb-jsh26                  3m           37Mi            0Mi
etcd-node01                               51m          143Mi           0Mi
kube-apiserver-node01                     98m          824Mi           0Mi
kube-controller-manager-node01            20m          135Mi           0Mi
kube-proxy-ffgs2                          1m           24Mi            0Mi
kube-proxy-fhvwx                          1m           39Mi            0Mi
kube-scheduler-node01                     13m          69Mi            0Mi
metrics-server-8598789fdb-d2kcj           5m           26Mi            0Mi

Signed-off-by: Itamar Holder <[email protected]>
@iholder101
Copy link
Contributor Author

Thanks @soltysh!
I've addressed your feedback and rebased. PTAL :)

@iholder101
Copy link
Contributor Author

/test pull-kubernetes-unit

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 4, 2025
@k8s-ci-robot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 3b79e3a41287c21dd673ee47fc3ffd7d72a2f40a

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: iholder101, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 8d3fb9e into kubernetes:master Jun 4, 2025
14 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.34 milestone Jun 4, 2025
@github-project-automation github-project-automation bot moved this from In Progress to Done in SIG CLI Jun 4, 2025
ania-borowiec pushed a commit to ania-borowiec/kubernetes that referenced this pull request Jun 9, 2025
* top, refactor: turn package-exposed variables to unexpose struct fields

Signed-off-by: Itamar Holder <[email protected]>

* kubectl top node: add the --show-swap option

Example output:
> kubectl top node --show-swap
NAME     CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)   SWAP(bytes)   SWAP(%)
node01   500m         8%       2836Mi          60%         0Mi           0%
node02   260m         5%       2206Mi          47%         512Mi         50%

Signed-off-by: Itamar Holder <[email protected]>

* kubectl top pod: add the --show-swap option

Example output:
> kubectl top pod -n kube-system --show-swap
NAME                                      CPU(cores)   MEMORY(bytes)   SWAP(bytes)
coredns-58d5bc5cdb-5nbk4                  2m           19Mi            0Mi
coredns-58d5bc5cdb-jsh26                  3m           37Mi            0Mi
etcd-node01                               51m          143Mi           0Mi
kube-apiserver-node01                     98m          824Mi           0Mi
kube-controller-manager-node01            20m          135Mi           0Mi
kube-proxy-ffgs2                          1m           24Mi            0Mi
kube-proxy-fhvwx                          1m           39Mi            0Mi
kube-scheduler-node01                     13m          69Mi            0Mi
metrics-server-8598789fdb-d2kcj           5m           26Mi            0Mi

Signed-off-by: Itamar Holder <[email protected]>

* kubectl top node --show-swap: add unit tests

Signed-off-by: Itamar Holder <[email protected]>

* kubectl top pod --show-swap: Add unit tests

Signed-off-by: Itamar Holder <[email protected]>

* Explicitly mark swap as unavailable when necessary

Signed-off-by: Itamar Holder <[email protected]>

---------

Signed-off-by: Itamar Holder <[email protected]>
ania-borowiec pushed a commit to ania-borowiec/kubernetes that referenced this pull request Jun 11, 2025
* top, refactor: turn package-exposed variables to unexpose struct fields

Signed-off-by: Itamar Holder <[email protected]>

* kubectl top node: add the --show-swap option

Example output:
> kubectl top node --show-swap
NAME     CPU(cores)   CPU(%)   MEMORY(bytes)   MEMORY(%)   SWAP(bytes)   SWAP(%)
node01   500m         8%       2836Mi          60%         0Mi           0%
node02   260m         5%       2206Mi          47%         512Mi         50%

Signed-off-by: Itamar Holder <[email protected]>

* kubectl top pod: add the --show-swap option

Example output:
> kubectl top pod -n kube-system --show-swap
NAME                                      CPU(cores)   MEMORY(bytes)   SWAP(bytes)
coredns-58d5bc5cdb-5nbk4                  2m           19Mi            0Mi
coredns-58d5bc5cdb-jsh26                  3m           37Mi            0Mi
etcd-node01                               51m          143Mi           0Mi
kube-apiserver-node01                     98m          824Mi           0Mi
kube-controller-manager-node01            20m          135Mi           0Mi
kube-proxy-ffgs2                          1m           24Mi            0Mi
kube-proxy-fhvwx                          1m           39Mi            0Mi
kube-scheduler-node01                     13m          69Mi            0Mi
metrics-server-8598789fdb-d2kcj           5m           26Mi            0Mi

Signed-off-by: Itamar Holder <[email protected]>

* kubectl top node --show-swap: add unit tests

Signed-off-by: Itamar Holder <[email protected]>

* kubectl top pod --show-swap: Add unit tests

Signed-off-by: Itamar Holder <[email protected]>

* Explicitly mark swap as unavailable when necessary

Signed-off-by: Itamar Holder <[email protected]>

---------

Signed-off-by: Itamar Holder <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubectl cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cli Categorizes an issue or PR as relevant to SIG CLI. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

6 participants