bugfix(hpa): introduce buildQuantity helper for consistent resource quantity #132351

googs1025 · 2025-06-17T11:40:38Z

What type of PR is this?

/kind bug

What this PR does / why we need it:

This PR introduces a new helper function, buildQuantity, in the HPA controller.

Previously, quantities were created inline using NewMilliQuantity directly, which made it harder to ensure consistency between CPU and memory handling.

Now, all resource quantity creation goes through buildQuantity:

For memory: raw bytes are converted to KiB and use BinarySI
For CPU or other resources: milli-units are used with DecimalSI

Which issue(s) this PR is related to:

Fix: #130584

Special notes for your reviewer:

Does this PR introduce a user-facing change?

HPA status now displays memory metrics using Ki

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

None

googs1025 · 2025-06-17T11:42:06Z

before change:

➜  ~ kubectl get hpa
NAME       REFERENCE                    TARGETS                              MINPODS   MAXPODS   REPLICAS   AGE
test-hpa   Deployment/test-deployment   cpu: 25%/50%, memory: 71421952/1Gi   1         10        1          4h
➜  ~ kubectl get hpa -oyaml
apiVersion: v1
items:
- apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"autoscaling/v2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"test-hpa","namespace":"default"},"spec":{"maxReplicas":10,"metrics":[{"resource":{"name":"cpu","target":{"averageUtilization":50,"type":"Utilization"}},"type":"Resource"},{"resource":{"name":"memory","target":{"averageValue":"1Gi","type":"AverageValue"}},"type":"Resource"}],"minReplicas":1,"scaleTargetRef":{"apiVersion":"apps/v1","kind":"Deployment","name":"test-deployment"}}}
    creationTimestamp: "2025-06-17T07:16:03Z"
    name: test-hpa
    namespace: default
    resourceVersion: "3841389"
    uid: b8b9a7d2-32e2-4b29-861c-7a489ecc5c5f
  spec:
    maxReplicas: 10
    metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 50
          type: Utilization
      type: Resource
    - resource:
        name: memory
        target:
          averageValue: 1Gi
          type: AverageValue
      type: Resource
    minReplicas: 1
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: test-deployment
  status:
    conditions:
    - lastTransitionTime: "2025-06-17T10:55:21Z"
      message: recommended size matches current size
      reason: ReadyForNewScale
      status: "True"
      type: AbleToScale
    - lastTransitionTime: "2025-06-17T11:06:24Z"
      message: the HPA was able to successfully calculate a replica count from cpu
        resource utilization (percentage of request)
      reason: ValidMetricFound
      status: "True"
      type: ScalingActive
    - lastTransitionTime: "2025-06-17T07:16:33Z"
      message: the desired count is within the acceptable range
      reason: DesiredWithinRange
      status: "False"
      type: ScalingLimited
    currentMetrics:
    - resource:
        current:
          averageUtilization: 25
          averageValue: 38m
        name: cpu
      type: Resource
    - resource:
        current:
          averageValue: "71421952"
        name: memory
      type: Resource
    currentReplicas: 1
    desiredReplicas: 1
    lastScaleTime: "2025-06-17T10:11:17Z"
kind: List
metadata:
  resourceVersion: ""

after change:

➜  ~ kubectl get hpa
NAME       REFERENCE                    TARGETS                              MINPODS   MAXPODS   REPLICAS   AGE
test-hpa   Deployment/test-deployment   cpu: 36%/50%, memory: 108236Ki/1Gi   1         10        1          4h2m
➜  ~ kubectl get hpa -oyaml
apiVersion: v1
items:
- apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"autoscaling/v2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"test-hpa","namespace":"default"},"spec":{"maxReplicas":10,"metrics":[{"resource":{"name":"cpu","target":{"averageUtilization":50,"type":"Utilization"}},"type":"Resource"},{"resource":{"name":"memory","target":{"averageValue":"1Gi","type":"AverageValue"}},"type":"Resource"}],"minReplicas":1,"scaleTargetRef":{"apiVersion":"apps/v1","kind":"Deployment","name":"test-deployment"}}}
    creationTimestamp: "2025-06-17T07:16:03Z"
    name: test-hpa
    namespace: default
    resourceVersion: "3841487"
    uid: b8b9a7d2-32e2-4b29-861c-7a489ecc5c5f
  spec:
    maxReplicas: 10
    metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 50
          type: Utilization
      type: Resource
    - resource:
        name: memory
        target:
          averageValue: 1Gi
          type: AverageValue
      type: Resource
    minReplicas: 1
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: test-deployment
  status:
    conditions:
    - lastTransitionTime: "2025-06-17T10:55:21Z"
      message: recommended size matches current size
      reason: ReadyForNewScale
      status: "True"
      type: AbleToScale
    - lastTransitionTime: "2025-06-17T11:06:24Z"
      message: the HPA was able to successfully calculate a replica count from cpu
        resource utilization (percentage of request)
      reason: ValidMetricFound
      status: "True"
      type: ScalingActive
    - lastTransitionTime: "2025-06-17T07:16:33Z"
      message: the desired count is within the acceptable range
      reason: DesiredWithinRange
      status: "False"
      type: ScalingLimited
    currentMetrics:
    - resource:
        current:
          averageUtilization: 36
          averageValue: 55m
        name: cpu
      type: Resource
    - resource:
        current:
          averageValue: 108236Ki
        name: memory
      type: Resource
    currentReplicas: 1
    desiredReplicas: 1
    lastScaleTime: "2025-06-17T10:11:17Z"
kind: List
metadata:
  resourceVersion: ""

googs1025 · 2025-06-17T12:00:20Z

/test pull-kubernetes-e2e-autoscaling-hpa-cm

googs1025 · 2025-06-17T12:00:59Z

/retest

googs1025 · 2025-06-17T13:50:53Z

/retest

pkg/controller/podautoscaler/horizontal_test.go

omerap12 · 2025-06-17T14:35:07Z

pkg/controller/podautoscaler/horizontal.go

+func buildQuantity(resourceName v1.ResourceName, rawProposal int64) resource.Quantity {
+	if resourceName == v1.ResourceMemory {
+		// Convert bytes to KiB
+		kib := rawProposal / 1000


I think this should be divided by 1024, not 1000

This is a very interesting question. 🤔 I looked into it for a while and found that the metrics are all obtained using MilliValue(), so the divisor here is the milli scale factor (1000), not the binary factor (1024). In addition, after debugging locally, I confirmed that /1000 produces the correct value.

kubernetes/pkg/controller/podautoscaler/horizontal.go

Lines 602 to 613 in ced19ff

if target.AverageValue != nil {
	var rawProposal int64
	replicaCountProposal, rawProposal, timestampProposal, err := a.replicaCalc.GetRawResourceReplicas(ctx, currentReplicas, target.AverageValue.MilliValue(), resourceName, tolerances, namespace, selector, container)
	if err != nil {
		return 0, nil, time.Time{}, "", condition, fmt.Errorf("failed to get %s usage: %v", resourceName, err)
	}
	metricNameProposal = fmt.Sprintf("%s resource", resourceName.String())
	status := autoscalingv2.MetricValueStatus{
		AverageValue: resource.NewMilliQuantity(rawProposal, resource.DecimalSI),
	}
	return replicaCountProposal, &status, timestampProposal, metricNameProposal, autoscalingv2.HorizontalPodAutoscalerCondition{}, nil
}

kubernetes/pkg/controller/podautoscaler/metrics/client.go

Lines 110 to 131 in ced19ff

func getPodMetrics(ctx context.Context, rawMetrics []metricsapi.PodMetrics, resource v1.ResourceName) PodMetricsInfo {
	res := make(PodMetricsInfo, len(rawMetrics))
	for _, m := range rawMetrics {
		podSum := int64(0)
		missing := len(m.Containers) == 0
		for _, c := range m.Containers {
			resValue, found := c.Usage[resource]
			if !found {
				missing = true
				klog.FromContext(ctx).V(2).Info("Missing resource metric", "resourceMetric", resource, "pod", klog.KRef(m.Namespace, m.Name))
				break
			}
			podSum += resValue.MilliValue()
		}
		if !missing {
			res[m.Name] = PodMetric{
				Timestamp: m.Timestamp.Time,
				Window:    m.Window.Duration,
				Value:     podSum,
			}
		}
	}

➜  ~ kubectl get PodMetrics
NAME                               CPU         MEMORY    WINDOW
test-deployment-6d56d679c5-4zmtl   28395987n   69600Ki   15.023s

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"autoscaling/v2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"test-hpa","namespace":"default"},"spec":{"maxReplicas":10,"metrics":[{"resource":{"name":"cpu","target":{"averageUtilization":50,"type":"Utilization"}},"type":"Resource"},{"resource":{"name":"memory","target":{"averageUtilization":50,"type":"Utilization"}},"type":"Resource"}],"minReplicas":1,"scaleTargetRef":{"apiVersion":"apps/v1","kind":"Deployment","name":"test-deployment"}}}
  creationTimestamp: "2025-06-17T14:45:26Z"
  name: test-hpa
  namespace: default
  resourceVersion: "3849312"
  uid: d3cd047c-54af-40fd-af10-2cc15011658e
spec:
  maxReplicas: 10
  metrics:
  - resource:
      name: cpu
      target:
        averageUtilization: 50
        type: Utilization
    type: Resource
  - resource:
      name: memory
      target:
        averageUtilization: 50
        type: Utilization
    type: Resource
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-deployment
status:
  conditions:
  - lastTransitionTime: "2025-06-17T14:46:37Z"
    message: the HPA controller was able to update the target scale to 3
    reason: SucceededRescale
    status: "True"
    type: AbleToScale
  - lastTransitionTime: "2025-06-17T15:07:38Z"
    message: the HPA was able to successfully calculate a replica count from memory resource utilization (percentage of request)
    reason: ValidMetricFound
    status: "True"
    type: ScalingActive
  - lastTransitionTime: "2025-06-17T14:46:52Z"
    message: the desired count is within the acceptable range
    reason: DesiredWithinRange
    status: "False"
    type: ScalingLimited
  currentMetrics:
  - resource:
      current:
        averageUtilization: 19
        averageValue: 29m
      name: cpu
    type: Resource
  - resource:
      current:
        averageUtilization: 106
        averageValue: 69600Ki
      name: memory
    type: Resource
  currentReplicas: 1
  desiredReplicas: 3
  lastScaleTime: "2025-06-17T15:07:38Z"
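
The 1000-versus-1024 question above comes down to two different scale factors: the milli prefix is decimal (1000), while Ki is binary (1024). The following is a minimal sketch of that arithmetic using only the Go standard library and the 69600Ki figure from the PodMetrics output above. The helper names milliToBytes and bytesToKi are hypothetical and do not exist in the HPA controller.

```go
package main

import "fmt"

// milliToBytes is a hypothetical helper: it undoes the scaling applied by
// MilliValue(), which multiplies the base unit by 1000 (a decimal prefix).
func milliToBytes(milli int64) int64 {
	return milli / 1000
}

// bytesToKi is a hypothetical helper: Ki is a binary prefix, so the
// divisor is 1024. This step is separate from the milli conversion.
func bytesToKi(bytes int64) int64 {
	return bytes / 1024
}

func main() {
	// 69600Ki is the memory usage reported by `kubectl get PodMetrics` above.
	bytes := int64(69600) * 1024 // base unit: bytes
	milli := bytes * 1000        // what MilliValue() would report for this usage

	fmt.Println(milli)                          // 71270400000
	fmt.Println(milliToBytes(milli))            // 71270400
	fmt.Println(bytesToKi(milliToBytes(milli))) // 69600
}
```

Dividing the milli value by 1000 recovers bytes, not KiB, which is why the inline comment "Convert bytes to KiB" next to a division by 1000 prompted the question.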

Thanks for the explanation. Yeah I think you are right :)

If the value returned from the metrics client is a MilliValue(), as the above example by @googs1025 clearly shows, I don't think that dividing by 1000 is necessary here. You should just invoke resource.NewMilliQuantity..., just as was done previously.

The one change I do agree with is the differentiation between memory (being BinarySI) and cpu (being DecimalSI), which matches what the metrics server returns here: https://github.com/kubernetes-sigs/metrics-server/blob/55b4961bc1eceffd0a37809dc271e9ae38de9deb/pkg/storage/types.go#L63-L64.

IOW, I believe this method should look like this:

func buildQuantity(resourceName v1.ResourceName, rawProposal int64) resource.Quantity {
	format := resource.DecimalSI
	// to match what we return in the metrics server, see https://github.com/kubernetes-sigs/metrics-server/blob/55b4961bc1eceffd0a37809dc271e9ae38de9deb/pkg/storage/types.go#L63-L64
	if resourceName == v1.ResourceMemory {
		format = resource.BinarySI
	}
	return *resource.NewMilliQuantity(rawProposal, format)
}

Thank you for the explanation. I tested it locally and it returns the correct value. 😄
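
To illustrate why only the format needs to change, here is a simplified, standard-library-only model of how the same milli-scaled value renders under the two formats. The type format, the constants, and the functions pickFormat and render are hypothetical stand-ins written for this sketch. They are not the apimachinery implementation and cover only the values that appear in this thread.

```go
package main

import (
	"fmt"
	"strconv"
)

// format is a hypothetical stand-in for resource.Format.
type format string

const (
	decimalSI format = "DecimalSI"
	binarySI  format = "BinarySI"
)

// pickFormat mirrors the branch in the suggested buildQuantity:
// memory uses the binary format, everything else the decimal one.
func pickFormat(resourceName string) format {
	if resourceName == "memory" {
		return binarySI
	}
	return decimalSI
}

// render is a simplified model of how a milli-scaled value prints.
// Assumption: it ignores decimal suffixes such as k or M and other
// canonicalization rules of the real library.
func render(milli int64, f format) string {
	if milli%1000 != 0 {
		// Fractional base units are shown in milli-units, e.g. "38m".
		return strconv.FormatInt(milli, 10) + "m"
	}
	whole := milli / 1000
	if f == binarySI {
		suffixes := []string{"", "Ki", "Mi", "Gi"}
		i := 0
		// Use the largest binary suffix that keeps the number an integer.
		for whole != 0 && whole%1024 == 0 && i < len(suffixes)-1 {
			whole /= 1024
			i++
		}
		return strconv.FormatInt(whole, 10) + suffixes[i]
	}
	return strconv.FormatInt(whole, 10)
}

func main() {
	const memMilli = 71270400000 // 69600Ki expressed in milli-bytes

	fmt.Println(render(memMilli, decimalSI))            // 71270400
	fmt.Println(render(memMilli, pickFormat("memory"))) // 69600Ki
	fmt.Println(render(38, pickFormat("cpu")))          // 38m
}
```

The numeric value is identical in the first two calls; only the presentation differs, which matches the before and after kubectl output earlier in this conversation.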

soltysh · 2025-06-24T14:15:17Z

pkg/controller/podautoscaler/horizontal_test.go

+				t.Errorf("expected quantity %v, got %v", tt.expected.String(), q.String())
+			}
+		})
+	}


This test isn't sufficient, since it passes with and without your change.

Modify it as follows:

if !q.Equal(tt.expected) || (q.Format != tt.expected.Format) {
	t.Errorf("expected quantity %v (Format: %v), got %v (Format: %v)", tt.expected.String(), tt.expected.Format, q.String(), q.Format)
}

But I'm not sure if there are other ways to test 🤔

Yup, this is better now.
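
The reason the original assertion passed with and without the change can be modeled with a small standard-library sketch. It assumes, as the review comment implies, that the equality check compares numeric value only and ignores the display format. The quantity type and its methods here are hypothetical stand-ins, not the apimachinery types.

```go
package main

import "fmt"

// quantity is a hypothetical stand-in for resource.Quantity:
// a numeric value plus a display format.
type quantity struct {
	milli  int64
	format string
}

// equal compares the numeric value only (assumption for this model),
// so a format change alone is invisible to it.
func (q quantity) equal(other quantity) bool {
	return q.milli == other.milli
}

// sameValueAndFormat is the stricter check used in the revised test:
// both the value and the format must match.
func sameValueAndFormat(got, want quantity) bool {
	return got.equal(want) && got.format == want.format
}

func main() {
	want := quantity{milli: 71270400000, format: "BinarySI"}
	// Same value, but built with the old DecimalSI format.
	got := quantity{milli: 71270400000, format: "DecimalSI"}

	fmt.Println(got.equal(want))               // true
	fmt.Println(sameValueAndFormat(got, want)) // false
}
```

With the value-only comparison the regression goes unnoticed; adding the format comparison makes the test fail on the old behavior.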

googs1025 · 2025-06-25T03:03:33Z

/test pull-kubernetes-e2e-autoscaling-hpa-cpu

soltysh

/approve

@omerap12 has the final tag

soltysh · 2025-06-26T10:48:07Z

/triage accepted
/priority backlog

omerap12

Thanks!
/lgtm

k8s-ci-robot · 2025-06-26T16:52:28Z

LGTM label has been added.

Git tree hash: 208371e2590f632142d1453f9e45b61af0af6fe0

k8s-ci-robot · 2025-06-26T16:52:35Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: googs1025, omerap12, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~pkg/controller/podautoscaler/OWNERS~~ [soltysh]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot requested review from mwielgus and omerap12 June 17, 2025 11:41

k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 17, 2025

github-project-automation bot added this to SIG Apps Jun 17, 2025

github-project-automation bot moved this to Needs Triage in SIG Apps Jun 17, 2025

omerap12 reviewed Jun 17, 2025

googs1025 force-pushed the fix/hpa_memory branch from 59e4fc9 to 4cc1dc4 Compare June 18, 2025 00:51

googs1025 changed the title ~~bugfix(hpa): introduce buildQuantity helper for consistent resource quantity creation~~ bugfix(hpa): introduce buildQuantity helper for consistent resource quantity Jun 19, 2025

googs1025 mentioned this pull request Jun 24, 2025

The resources.limits.memory unit is automatically converted. Why cannot the unit be displayed as configured? kubernetes/kubectl#1729

Open

soltysh requested changes Jun 24, 2025

github-project-automation bot moved this from Needs Triage to In Progress in SIG Apps Jun 24, 2025

googs1025 force-pushed the fix/hpa_memory branch from 4cc1dc4 to 513d4ee Compare June 25, 2025 00:05

bugfix(hpa): introduce buildQuantity helper for consistent resource quantity creation

Signed-off-by: googs1025 <[email protected]>

b50d508

googs1025 force-pushed the fix/hpa_memory branch from 513d4ee to b50d508 Compare June 25, 2025 00:24

soltysh approved these changes Jun 26, 2025

omerap12 approved these changes Jun 26, 2025

k8s-ci-robot assigned omerap12 Jun 26, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 26, 2025

k8s-ci-robot merged commit efd2a0d into kubernetes:master Jun 26, 2025
15 checks passed

k8s-ci-robot added this to the v1.34 milestone Jun 26, 2025

github-project-automation bot moved this from In Progress to Done in SIG Apps Jun 26, 2025
