Description
What happened?
The NodeResourcesFit plugin gives an incorrect score for pods because it computes the node's requested resources incorrectly. More suspicious still, the NodeResourcesBalancedAllocation plugin computes them correctly for the same pod and node.
This issue has already been reported on Slack.
Example:
Node spec:
$ kubectl describe node node1
...
Allocatable:
cpu: 7910m
memory: 29077440Ki
pods: 110
...
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests        Limits
  --------  --------        ------
  cpu       335m (4%)       2 (25%)
  memory    397860096 (1%)  4285048320 (14%)
...
Creating a pod with resources:
requests:
  cpu: 200m
  memory: 1Gi
kube-scheduler logs:
I0226 11:28:40.956301 13 resource_allocation.go:76] "Listed internal info for allocatable resources, requested resources and score" logger="Score.NodeResourcesFit" pod="default/pod-jlqv8" node="node1" resourceAllocationScorer="LeastAllocated" allocatableResource=[7910,29775298560] requestedResource=[630,1855032320] resourceScore=92
I0226 11:28:40.956328 13 resource_allocation.go:76] "Listed internal info for allocatable resources, requested resources and score" logger="Score.NodeResourcesBalancedAllocation" pod="default/pod-jlqv8" node="node1" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=[7910,29775298560] requestedResource=[535,1471601920] resourceScore=99
As we can see, requestedResource in the second log (for the NodeResourcesBalancedAllocation plugin) is correctly computed as allocated + requested, but for NodeResourcesFit the numbers are different.
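The expected values can be re-derived from the `kubectl describe node` output and the pod spec above. This is a minimal sketch, assuming CPU is tracked in millicores and memory in bytes (which matches the logged allocatableResource); the variable names are illustrative only and do not come from the scheduler code:

```python
KI = 1024        # bytes per Ki
GI = 1024 ** 3   # bytes per Gi

# Values copied from `kubectl describe node node1` and the pod spec.
allocatable = (7910, 29077440 * KI)   # (cpu millicores, memory bytes)
allocated = (335, 397860096)          # "Allocated resources" requests
pod_request = (200, 1 * GI)           # requests of the new pod

# Expected requestedResource: already allocated + incoming pod requests.
expected = tuple(a + p for a, p in zip(allocated, pod_request))

# Values copied from the kube-scheduler logs.
logged_fit = (630, 1855032320)        # Score.NodeResourcesFit
logged_balanced = (535, 1471601920)   # Score.NodeResourcesBalancedAllocation

# How far NodeResourcesFit is from the expected values.
surplus = tuple(l - e for l, e in zip(logged_fit, expected))

print("allocatable:", allocatable)
print("expected:   ", expected)
print("surplus:    ", surplus)
```

Under these assumptions the expected pair matches the NodeResourcesBalancedAllocation log exactly, while NodeResourcesFit reports an extra 95m of CPU and 383430400 bytes of memory.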
Score for this node should be:
100 * ((7910 - 535) / 7910 + (29775298560 - 1471601920) / 29775298560) / 2 = 94
But it is 92.
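The calculation above can be reproduced for both sets of logged values. This is a sketch of the formula as written in this issue, not the scheduler's actual code; it assumes equal weights for CPU and memory and integer (truncating) division at each step, and `least_allocated_score` is a hypothetical helper name:

```python
MAX_NODE_SCORE = 100


def least_allocated_score(allocatable, requested):
    """Average of per-resource free fractions, scaled to 0..100.

    Assumes equal weights and truncating integer division.
    """
    per_resource = []
    for capacity, used in zip(allocatable, requested):
        if capacity == 0 or used > capacity:
            per_resource.append(0)
        else:
            per_resource.append((capacity - used) * MAX_NODE_SCORE // capacity)
    return sum(per_resource) // len(per_resource)


allocatable = (7910, 29775298560)

# allocated + requested, as logged by NodeResourcesBalancedAllocation
expected_score = least_allocated_score(allocatable, (535, 1471601920))

# requestedResource as logged by NodeResourcesFit
observed_score = least_allocated_score(allocatable, (630, 1855032320))

print(expected_score, observed_score)
```

With these inputs the sketch yields 94 for the expected requested resources and 92 for the values NodeResourcesFit logged, so the lower score follows directly from the inflated requestedResource rather than from the scoring formula.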
/sig scheduling
What did you expect to happen?
requestedResource and the resulting score should both be computed correctly.
How can we reproduce it (as minimally and precisely as possible)?
Schedule pods in a cluster and check the detailed kube-scheduler logs (at log verbosity 10).
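To compare the two plugins across many pods, the relevant fields can be pulled out of the log lines. This is a rough sketch that assumes the log format shown in the example above; the regular expression and the `parse_score_lines` helper are illustrative and not part of any Kubernetes tooling:

```python
import re

# Matches the plugin name, requestedResource list and resourceScore
# in the "Listed internal info ..." lines shown in this issue.
PATTERN = re.compile(
    r'logger="Score\.(?P<plugin>\w+)".*?'
    r'requestedResource=\[(?P<requested>[\d,]+)\]\s+'
    r'resourceScore=(?P<score>\d+)'
)


def parse_score_lines(lines):
    """Return {plugin: (requested_list, score)} for matching lines."""
    results = {}
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # not a scoring line
        requested = [int(v) for v in match.group("requested").split(",")]
        results[match.group("plugin")] = (requested, int(match.group("score")))
    return results


# The two log lines from the example in this issue.
sample = [
    'I0226 11:28:40.956301 13 resource_allocation.go:76] "Listed internal info for allocatable resources, requested resources and score" logger="Score.NodeResourcesFit" pod="default/pod-jlqv8" node="node1" resourceAllocationScorer="LeastAllocated" allocatableResource=[7910,29775298560] requestedResource=[630,1855032320] resourceScore=92',
    'I0226 11:28:40.956328 13 resource_allocation.go:76] "Listed internal info for allocatable resources, requested resources and score" logger="Score.NodeResourcesBalancedAllocation" pod="default/pod-jlqv8" node="node1" resourceAllocationScorer="NodeResourcesBalancedAllocation" allocatableResource=[7910,29775298560] requestedResource=[535,1471601920] resourceScore=99',
]

parsed = parse_score_lines(sample)
for plugin, (requested, score) in parsed.items():
    print(plugin, requested, score)
```

A mismatch between the requested lists of the two plugins for the same pod and node is the symptom described here.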
Anything else we need to know?
No response
Kubernetes version
$ kubectl version
Server Version: v1.32.1-gke.1489001
Cloud provider
Tested on GKE