-
Notifications
You must be signed in to change notification settings - Fork 40.9k
feature(kubectl): support --cpu, --memory flag to kubectl autoscale #129373
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature(kubectl): support --cpu, --memory flag to kubectl autoscale #129373
Conversation
Skipping CI for Draft Pull Request. |
a3b9c5d
to
b926312
Compare
@@ -224,6 +275,11 @@ func (o *AutoscaleOptions) Run() error { | |||
if err := o.handleHPA(hpaV2); err != nil { | |||
klog.V(1).Infof("Encountered an error with the v2 HorizontalPodAutoscaler: %v. "+ | |||
"Falling back to try the v1 HorizontalPodAutoscaler", err) | |||
// check if the HPA can be created using v1 API. | |||
if !o.canCreateHPAV1() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we need to determine whether it can be created in V1, because there are different between v1 and v2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once you apply my other suggestion to only add --cpu-percent
this check will be simpler, and will only check that one flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
ping @soltysh please help this :) |
@@ -129,6 +135,15 @@ func NewCmdAutoscale(f cmdutil.Factory, ioStreams genericiooptions.IOStreams) *c | |||
cmd.Flags().Int32Var(&o.Max, "max", -1, "The upper limit for the number of pods that can be set by the autoscaler. Required.") | |||
cmd.MarkFlagRequired("max") | |||
cmd.Flags().Int32Var(&o.CPUPercent, "cpu-percent", -1, "The target average CPU utilization (represented as a percent of requested CPU) over all the pods. If it's not specified or negative, a default autoscaling policy will be used.") | |||
cmd.Flags().Int32Var(&o.MemoryPercent, "mem-percent", -1, "The target average memory utilization (represented as a percent of requested memory) over all the pods. If it's not specified or negative, a default autoscaling policy will be used.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that in comment kubernetes/kubectl#1481 (comment) the OP mentioned all the possible values, but I'm definitely against such an approach. I haven't seen any particular preference between the percent and quantity so I'm leaning towards the --mem-percent
to match the pre-existing --cpu-percent
flag. So please drop all else and we'll only support that one addition. If users are interested in tweaking the generated HPA they are free to do it outside this command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds reasonable to me. done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for not thinking more about this, but I was considering a better approach for the future.
The original implementation utilizing --cpu-percent
was based entirely on autoscaling/v1 types.
But now that most of our users have the ability to use autoscaling/v2 where we can present both types I'm inclined to deprecate the current --cpu-percent
flag, and replace it with just --cpu
where it would accept either a percent value (10%
, or just a number 10
which would mean it's a percentage), or users could specify a quantity (in case of cpu that'd be 5m
). Now, in a similar fashion we'd add --memory
flag which would also accept a number (or with %
), or a quantity. For parsing the first number you can use intstr.IntOrString
type, and if that fails, it means you were given a quantity, which also have parsing utilities.
Sorry once again for the changes, but I believe that will better match all the possible use cases. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. This is more flexible on the flag, but we need to do more job of verifying the input params. I will adjust it and ping you again when I am done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking about the design, maybe we can do this: we provide --cpu, --memory flags. Users can specify ":" to specify the metric type and support multiple units. For cases where units are not specified, the default configuration is used. 🤔 @soltysh PTAL
CPU Units:
- m,milli: milliCPU, e.g., 500m means 0.5 cores.
- core: core count, e.g., 1core or 1 means 1 core.
- Default to m if no unit is provided.
Memory Units:
- "Mi", "Mib", "MiB": mebibytes, e.g., 512Mi means 512 megabytes.
- "Gi", "Gib", "GiB": gibibytes, e.g., 1Gi means 1 gigabyte.
- Default to Mi if no unit is provided.
Percentage Format: If the value ends with %, it defaults to Utilization.
Quantity Format: If the value is a number or has a unit:
For --cpu, default to m if no unit is provided; parse directly if m is specified.
For --memory, default to Mi if no unit is provided; parse directly if Mi or Gi is specified.
Explicit Metric Type: Users can specify :utilization, :value, or :averagevalue explicitly.
Possible Cases
For --cpu
- Percentage with Default Type (Utilization):
- Example: --cpu=80%
- Description: CPU average utilization of 80%.
- Quantity with Default Type (Value):
- Example: --cpu=500 (defaults to 500m)
- Description: 500 milliCPU.
- Percentage with Explicit Utilization:
- Example: --cpu=80%:utilization
- Description: Explicitly specifies CPU average utilization of 80%.
- Quantity with Explicit Value:
- Example: --cpu=500m:value
- Description: Explicitly specifies 500 milliCPU.
- Example: --cpu=1core:value
- Description: Explicitly specifies 1 CPU core.
- Quantity with Explicit AverageValue:
- Example: --cpu=500m:averagevalue
- Description: Explicitly specifies an average of 500 milliCPU.
For --memory
- Percentage with Default Type (Utilization):
- Example: --memory=80%
- Description: Memory average utilization of 80%.
- Quantity with Default Type (Value):
- Example: --memory=512 (defaults to 512Mi)
- Description: 512 mebibytes of memory.
- Percentage with Explicit Utilization:
- Example: --memory=80%:utilization
- Description: Explicitly specifies memory average utilization of 80%.
- Quantity with Explicit Value:
- Example: --memory=512Mi:value
- Description: Explicitly specifies 512 mebibytes of memory.
- Example: --memory=1Gi:value
- Description: Explicitly specifies 1 gibibyte of memory.
- Quantity with Explicit AverageValue:
- Example: --memory=512Mi:averagevalue
- Description: Explicitly specifies an average of 512 mebibytes of memory.
- Example: --memory=1Gi:averagevalue
- Description: Explicitly specifies an average of 1 gibibyte of memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you wrote above is close to what I've meant, although introducing :
in the middle is not expected. I'm worried that it might overly complicate things, and I prefer something a bit simpler.
So:
- For both cpu and memory, a percentage number would automatically mean average value in autoscale's metric type (
.target.averageUtilization
). - For both cpu and memory passing a unit-based value (ie. with milicores or MiB, etc) or just a number will automatically means the user wants to specify a quantity, so that's the average value (
.target.averageValue
) in autoscale's metric type. This would match what we have in quantity API.
For no. 2 you don't need to invent any logic for parsing the quantities, since we already have k8s.io/apimachinery/pkg/api/resource#ParseQuantity
to handle that for you.
So looking at your examples:
1. Percentage with Default Type (Utilization): * Example: --cpu=80% * Description: CPU average utilization of 80%.
👍
2. Quantity with Default Type (Value): * Example: --cpu=500 (defaults to 500m) * Description: 500 milliCPU.
👍
3. Percentage with Explicit Utilization: * Example: --cpu=80%:utilization * Description: Explicitly specifies CPU average utilization of 80%.
--cpu=80%:utilization
no need to specify it's utilization. Just --cpu=80%
should be sufficient.
4. Quantity with Explicit Value: * Example: --cpu=500m:value * Description: Explicitly specifies 500 milliCPU. * Example: --cpu=1core:value * Description: Explicitly specifies 1 CPU core.
In both cases no need for :value
, --cpu=500m
is correct and ParseQuantity
will handle that. --cpu=1core
is not correct. That's not supported by our quantity API.
5. Quantity with Explicit AverageValue: * Example: --cpu=500m:averagevalue * Description: Explicitly specifies an average of 500 milliCPU.
Again, no need for :averagevalue
, just --cpu=500m
will be sufficient to express 500 milicores.
1. Percentage with Default Type (Utilization): * Example: --memory=80% * Description: Memory average utilization of 80%.
👍
2. Quantity with Default Type (Value): * Example: --memory=512 (defaults to 512Mi) * Description: 512 mebibytes of memory.
👍
3. Percentage with Explicit Utilization: * Example: --memory=80%:utilization * Description: Explicitly specifies memory average utilization of 80%.
No need for :utilization
, just --memory=80%
.
4. Quantity with Explicit Value: * Example: --memory=512Mi:value * Description: Explicitly specifies 512 mebibytes of memory. * Example: --memory=1Gi:value * Description: Explicitly specifies 1 gibibyte of memory.
No need for :value
, just --memory=512Mi
or --memory=1Gi
is sufficient.
5. Quantity with Explicit AverageValue: * Example: --memory=512Mi:averagevalue * Description: Explicitly specifies an average of 512 mebibytes of memory. * Example: --memory=1Gi:averagevalue * Description: Explicitly specifies an average of 1 gibibyte of memory.
No need for :averagevalue
, just --memory=512Mi
or --memory=1Gi
is sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for feedback. I will handle these days 😄
@@ -224,6 +275,11 @@ func (o *AutoscaleOptions) Run() error { | |||
if err := o.handleHPA(hpaV2); err != nil { | |||
klog.V(1).Infof("Encountered an error with the v2 HorizontalPodAutoscaler: %v. "+ | |||
"Falling back to try the v1 HorizontalPodAutoscaler", err) | |||
// check if the HPA can be created using v1 API. | |||
if !o.canCreateHPAV1() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once you apply my other suggestion to only add --cpu-percent
this check will be simpler, and will only check that one flag.
/triage accepted |
d5262b0
to
0acbf25
Compare
@@ -129,6 +135,8 @@ func NewCmdAutoscale(f cmdutil.Factory, ioStreams genericiooptions.IOStreams) *c | |||
cmd.Flags().Int32Var(&o.Max, "max", -1, "The upper limit for the number of pods that can be set by the autoscaler. Required.") | |||
cmd.MarkFlagRequired("max") | |||
cmd.Flags().Int32Var(&o.CPUPercent, "cpu-percent", -1, "The target average CPU utilization (represented as a percent of requested CPU) over all the pods. If it's not specified or negative, a default autoscaling policy will be used.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
--cpu-percent
should be marked as deprecated, we can't remove without a proper deprecation period, which lasts a year.
This still holds, I don't see that flag marked deprecated. Also while doing so make sure to update the example
kubernetes/staging/src/k8s.io/kubectl/pkg/cmd/autoscale/autoscale.go
Lines 59 to 60 in d23407b
# Auto scale a replication controller "foo", with the number of pods between 1 and 5, target CPU utilization at 80% | |
kubectl autoscale rc foo --max=5 --cpu-percent=80`)) |
Use: "autoscale (-f FILENAME | TYPE NAME | TYPE/NAME) [--min=MINPODS] --max=MAXPODS [--cpu-percent=CPU]", |
--cpu
instead.
unit = "Mi" | ||
default: | ||
return fmt.Errorf("unsupported resource type: %v", resourceType) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic here is not quite correct, what if I pass --memory=5Gi
, the above code will not work correctly, and will fail in the next step when parsing int, because 5Gi
is not the correct value. Why you want to manually parse this value, rather than just trust that ParseQuantity
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general our API accepts both value with units and without them. You can see that pattern being used even in our docs. So I'd drop the part where you're adding units by hand entirely, and just parse the part with %
, to handle the percentage utilization, and pass the other value as is straight into API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lastly, please add a unit tests covering all those cases to make sure we won't break anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code correctly handles various input formats:
- percentages like 70%
- numbers like 500 (auto-converted to 500m or 500Mi)
- Full quantities with units like 500m, 2Gi, and — even 5Gi is handled properly by Kubernetes' quantity parser.
The reason we need case 2 is that we want to distinguish and fill in the correct default unit.
// Case 1: Percentage
if strings.HasSuffix(input, "%") {
trimmed := strings.TrimSuffix(input, "%")
value, err := strconv.ParseInt(trimmed, 10, 64)
if err != nil || value < 0 {
return fmt.Errorf("invalid percentage value: %s", trimmed)
}
return nil
}
// Case 2: Try parse as pure integer and apply default unit
var unit string
switch resourceType {
case corev1.ResourceCPU:
unit = "m"
case corev1.ResourceMemory:
unit = "Mi"
default:
return fmt.Errorf("unsupported resource type: %v", resourceType)
}
if _, err := strconv.ParseFloat(input, 64); err == nil {
input += unit
}
// Case 3: Parse full quantity
if _, err := apiresource.ParseQuantity(input); err != nil {
return fmt.Errorf("invalid %s target: %w", strings.ToLower(string(resourceType)), err)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... maybe I was wrong when testing last time, but I remember it didn't work the way you're describing, but rather I run into the described issue. Anyway, currently it's working as expected.
@@ -215,15 +282,22 @@ func (o *AutoscaleOptions) Run() error { | |||
mapping := info.ResourceMapping() | |||
gvr := mapping.GroupVersionKind.GroupVersion().WithResource(mapping.Resource.Resource) | |||
if _, err := o.scaleKindResolver.ScaleForResource(gvr); err != nil { | |||
return fmt.Errorf("cannot autoscale a %v: %v", mapping.GroupVersionKind.Kind, err) | |||
return fmt.Errorf("cannot autoscale a %v: %w", mapping.GroupVersionKind.Kind, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
return fmt.Errorf("cannot autoscale a %v: %w", mapping.GroupVersionKind.Kind, err) | |
return fmt.Errorf("cannot autoscale a %s: %w", mapping.GroupVersionKind.Kind, err) |
.Kind
is a string, so it's better to match the print type with that 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
105f884
to
a0a8e7a
Compare
2226572
to
413bb80
Compare
/retest |
@@ -128,8 +139,12 @@ func NewCmdAutoscale(f cmdutil.Factory, ioStreams genericiooptions.IOStreams) *c | |||
cmd.Flags().Int32Var(&o.Min, "min", -1, "The lower limit for the number of pods that can be set by the autoscaler. If it's not specified or negative, the server will apply a default value.") | |||
cmd.Flags().Int32Var(&o.Max, "max", -1, "The upper limit for the number of pods that can be set by the autoscaler. Required.") | |||
cmd.MarkFlagRequired("max") | |||
cmd.Flags().Int32Var(&o.CPUPercent, "cpu-percent", -1, "The target average CPU utilization (represented as a percent of requested CPU) over all the pods. If it's not specified or negative, a default autoscaling policy will be used.") | |||
cmd.Flags().Int32Var(&o.CPUPercent, "cpu-percent", -1, "Flag deprecated: use --cpu instead with percentage or milliCPU value. The target average CPU utilization (represented as a percent of requested CPU) over all the pods. If it's not specified or negative, a default autoscaling policy will be used.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't change this line, the MarkDeprecated
is sufficient because it will hide this flag from --help
output and produce the necessary information for users still relying on that flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
cmd.Flags().StringVar(&o.Name, "name", "", i18n.T("The name for the newly created object. If not specified, the name of the input resource will be used.")) | ||
_ = cmd.Flags().MarkDeprecated("cpu-percent", | ||
"Flag deprecated. Use --cpu with percentage or resource quantity format (e.g., '70%' for utilization or '500m' for milliCPU).") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Flag deprecated. Use --cpu with percentage or resource quantity format (e.g., '70%' for utilization or '500m' for milliCPU).") | |
"Use --cpu with percentage or resource quantity format (e.g., '70%' for utilization or '500m' for milliCPU).") |
What you currently have will produce:
Flag --cpu-percent has been deprecated, Flag deprecated. Use --cpu with percentage or resource quantity format (e.g., '70%' for utilization or '500m' for milliCPU).
The suggested change will give you:
Flag --cpu-percent has been deprecated, Use --cpu with percentage or resource quantity format (e.g., '70%' for utilization or '500m' for milliCPU).
So no need to repeat the information about deprecation 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
got this. 😄 thanks
unit = "Mi" | ||
default: | ||
return fmt.Errorf("unsupported resource type: %v", resourceType) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... maybe I was wrong when testing last time, but I remember it didn't work the way you're describing, but rather I run into the described issue. Anyway, currently it's working as expected.
// | ||
// Note: In the validateResourceTarget method, we've already validated the resource target configurations, | ||
// Therefore, we do not return the error but proceed with fallback logic. | ||
hpaV2, _ := o.createHorizontalPodAutoscalerV2(info.Name, mapping) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Either don't introduce that error returned from createHorizontalPodAutoscalerV2
or make sure to act on it. Swallowing errors is bad.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modified to handle this error.
if err := o.handleHPA(hpaV2); err != nil { | ||
klog.V(1).Infof("Encountered an error with the v2 HorizontalPodAutoscaler: %v. "+ | ||
"Falling back to try the v1 HorizontalPodAutoscaler", err) | ||
// check if the HPA can be created using v1 API. | ||
if ok, err := o.canCreateHPAV1(); !ok { | ||
return fmt.Errorf("failed to create v2 HPA and the configuration is incompatible with v1: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return fmt.Errorf("failed to create v2 HPA and the configuration is incompatible with v1: %w", err) | |
return fmt.Errorf("failed to create autoscaling/v2 HPA and the configuration is incompatible with autoscaling/v1: %w", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
if metricsType == autoscalingv2.UtilizationMetricType { | ||
isUtilizationMetricType = true | ||
} | ||
return (o.CPUPercent >= 0 && o.Memory == "") || (o.Memory == "" && isUtilizationMetricType), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return (o.CPUPercent >= 0 && o.Memory == "") || (o.Memory == "" && isUtilizationMetricType), nil | |
return (o.CPUPercent >= 0 && o.Memory == "") || (o.Memory == "" && metricsType == autoscalingv2.UtilizationMetricType), nil |
and you can drop that variable, it's simple enough to read 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// parseResourceInput parses the input string and resource type | ||
// to return a Quantity, value, MetricTargetType, and error. | ||
// The input string should be in the format of "<value>[<unit>]". | ||
func parseResourceInput(input string, resourceType corev1.ResourceName) (apiresource.Quantity, int32, autoscalingv2.MetricTargetType, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused, you have two functions doing very similar checks, iow. you're duplicating code here and inside validateResourceTarget
. Can you combine the shared bits of code?
50a7d42
to
f731431
Compare
336d47b
to
9506e97
Compare
@soltysh /PTAL 😄 |
9506e97
to
d6e8048
Compare
…-value,mem-average-value flag to kubectl autoscale Signed-off-by: googs1025 <[email protected]>
d6e8048
to
6795d53
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
LGTM label has been added. Git tree hash: d760bc84d374eed9924b60e7b7b3f4559221ca46
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: googs1025, soltysh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@googs1025: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass. This bot retests PRs for certain kubernetes repos according to the following rules:
You can:
/retest |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes kubernetes/kubectl#1481
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: