🐛 Fix incorrect calculation for ResourceQuota with PriorityClass as its scope #117677

Huang-Wei · 2023-04-28T19:49:21Z

What type of PR is this?

/kind bug
/sig api-machinery

What this PR does / why we need it:

Fix incorrect calculation for ResourceQuota with PriorityClass as its scope. See #117676 for repro steps.

Which issue(s) this PR fixes:

Fixes #117676

Special notes for your reviewer:

The root cause is at

kubernetes/staging/src/k8s.io/apiserver/pkg/quota/v1/generic/evaluator.go

Line 202 in 4ca7bce

 innerMatch, err := scopeFunc(corev1.ScopedResourceSelectorRequirement{ScopeName: scope}, item) 

It would always return an error, because the SelectorRequirement (which carries no Operator) is invalid for a PriorityClass-scoped quota, and then the err is swallowed and the result is returned with a nil error:

kubernetes/staging/src/k8s.io/apiserver/pkg/quota/v1/generic/evaluator.go

Lines 203 to 205 in 4ca7bce

	if err != nil {
		return result, nil
	}
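For illustration only, here is a minimal self-contained sketch of that failure mode. It does not use the real Kubernetes types: `requirement`, `pod`, `matchScope` and `countUsage` are hypothetical stand-ins, and the assumption that a requirement without an operator is rejected is modeled directly in `matchScope`.

```go
package main

import (
	"errors"
	"fmt"
)

// requirement is a hypothetical, simplified stand-in for
// corev1.ScopedResourceSelectorRequirement.
type requirement struct {
	ScopeName string
	Operator  string
}

// pod is a hypothetical stand-in that keeps only the field relevant here.
type pod struct {
	Name              string
	PriorityClassName string
}

// matchScope models a scope function that rejects a requirement
// with no operator (an assumption made for this sketch).
func matchScope(req requirement, p pod) (bool, error) {
	if req.Operator == "" {
		return false, errors.New("selector requirement has no operator")
	}
	return p.PriorityClassName != "", nil
}

// countUsage mirrors the shape of the quoted loop: the requirement is
// built from the scope name alone, and a match error ends the
// calculation early with a nil error.
func countUsage(scopes []string, pods []pod) (int, error) {
	used := 0
	for _, p := range pods {
		matches := true
		for _, scope := range scopes {
			ok, err := matchScope(requirement{ScopeName: scope}, p)
			if err != nil {
				return used, nil // error swallowed, usage stays as it was
			}
			matches = matches && ok
		}
		if matches {
			used++
		}
	}
	return used, nil
}

func main() {
	pods := []pod{{"a", "high"}, {"b", ""}, {"c", "high"}}
	used, err := countUsage([]string{"PriorityClass"}, pods)
	fmt.Println(used, err) // prints "0 <nil>" although two pods have a priority class
}
```

In this model the caller sees zero usage and no error, which is why the miscalculation is silent.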

Not sure we need UT here as we don't seem to have UT for CalculateUsageStats at all.

Does this PR introduce a user-facing change?

Fix incorrect calculation for ResourceQuota with PriorityClass as its scope.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

Huang-Wei · 2023-04-28T21:30:42Z

/retest

aojea · 2023-04-29T10:14:34Z

can we have a regression test covering this bug?

liggitt · 2023-04-29T13:39:46Z

This definitely needs tests demonstrating the issue and the fix

Huang-Wei · 2023-04-30T22:37:15Z

@aojea @liggitt Added a UT to repro the issue. Please check the sub-test "partial pods matching quotaScopeSelector - w/ scopeName specified" in test pkg/quota/v1/evaluator/core/pods_test.go#TestPodEvaluatorUsageStats()

staging/src/k8s.io/apiserver/pkg/quota/v1/generic/evaluator.go

cici37 · 2023-05-02T20:12:33Z

/triage accepted

liggitt · 2023-05-04T14:24:54Z

pkg/quota/v1/evaluator/core/pods.go

+	if len(selector.Operator) == 0 && selector.Values == nil {
+		// this is just checking for existence of a priorityClass on the pod
+		return len(pod.Spec.PriorityClassName) != 0, nil
+	}

digging deeper, I think we can even avoid this special case by making CalculateUsageStats consistent with getScopeSelectorsFromQuota:

kubernetes/staging/src/k8s.io/apiserver/pkg/quota/v1/generic/evaluator.go

Lines 171 to 182 in bbbf7fd

	func getScopeSelectorsFromQuota(quota *corev1.ResourceQuota) []corev1.ScopedResourceSelectorRequirement {
		selectors := []corev1.ScopedResourceSelectorRequirement{}
		for _, scope := range quota.Spec.Scopes {
			selectors = append(selectors, corev1.ScopedResourceSelectorRequirement{
				ScopeName: scope,
				Operator:  corev1.ScopeSelectorOpExists})
		}
		if quota.Spec.ScopeSelector != nil {
			selectors = append(selectors, quota.Spec.ScopeSelector.MatchExpressions...)
		}
		return selectors
	}

Things in Scopes should be treated as an Exists selector requirement check:

diff --git a/staging/src/k8s.io/apiserver/pkg/quota/v1/generic/evaluator.go b/staging/src/k8s.io/apiserver/pkg/quota/v1/generic/evaluator.go
index 55b31a745a0..e122248f861 100644
--- a/staging/src/k8s.io/apiserver/pkg/quota/v1/generic/evaluator.go
+++ b/staging/src/k8s.io/apiserver/pkg/quota/v1/generic/evaluator.go
@@ -199,7 +199,7 @@ func CalculateUsageStats(options quota.UsageStatsOptions,
 		// need to verify that the item matches the set of scopes
 		matchesScopes := true
 		for _, scope := range options.Scopes {
-			innerMatch, err := scopeFunc(corev1.ScopedResourceSelectorRequirement{ScopeName: scope}, item)
+			innerMatch, err := scopeFunc(corev1.ScopedResourceSelectorRequirement{ScopeName: scope, Operator: corev1.ScopeSelectorOpExists}, item)
 			if err != nil {
 				return result, nil
 			}

which would already work properly with podMatchesSelector
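As a rough, self-contained model of why setting the operator is enough, the sketch below builds an Exists requirement for each bare scope before matching. None of these names come from the Kubernetes code base: `requirement`, `pod`, `existsRequirement`, `matchScope` and `countUsage` are hypothetical, and the error handling deliberately mirrors the unchanged `return result, nil` in the diff.

```go
package main

import (
	"errors"
	"fmt"
)

// requirement is a hypothetical, simplified stand-in for
// corev1.ScopedResourceSelectorRequirement.
type requirement struct {
	ScopeName string
	Operator  string
}

// pod is a hypothetical stand-in that keeps only the field relevant here.
type pod struct {
	Name              string
	PriorityClassName string
}

// existsRequirement is a hypothetical helper: a bare scope name is
// treated as an Exists check on that scope.
func existsRequirement(scope string) requirement {
	return requirement{ScopeName: scope, Operator: "Exists"}
}

// matchScope models a scope function that only understands Exists
// and rejects anything else (an assumption made for this sketch).
func matchScope(req requirement, p pod) (bool, error) {
	if req.Operator != "Exists" {
		return false, errors.New("unsupported operator: " + req.Operator)
	}
	return p.PriorityClassName != "", nil
}

// countUsage counts the pods that match every scope.
func countUsage(scopes []string, pods []pod) (int, error) {
	used := 0
	for _, p := range pods {
		matches := true
		for _, scope := range scopes {
			ok, err := matchScope(existsRequirement(scope), p)
			if err != nil {
				return used, nil // unchanged from the original loop
			}
			matches = matches && ok
		}
		if matches {
			used++
		}
	}
	return used, nil
}

func main() {
	pods := []pod{{"a", "high"}, {"b", ""}, {"c", "high"}}
	used, err := countUsage([]string{"PriorityClass"}, pods)
	fmt.Println(used, err) // prints "2 <nil>": pods a and c carry a priority class
}
```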

One question is: will this impose extra overhead on the underlying selector check (labelSelector.Matches)? If not, this solution is neater.

I think we should change the selector constructed in CalculateUsageStats either way

I guess we can still optimize the evaluation inside podMatchesScopeFunc if we want, but do it in a more principled way by actually looking to see if the operator is the Exists operator:

 	case corev1.ResourceQuotaScopePriorityClass:
+		if selector.Operator == corev1.ScopeSelectorOpExists {
+			// this is just checking for existence of a priorityClass on the pod, no need to take the overhead of selector parsing/evaluation
+			return len(pod.Spec.PriorityClassName) != 0, nil
+		}
 		return podMatchesSelector(pod, selector)
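The early return is only safe if it gives the same answer as full selector evaluation for the Exists operator. A minimal sketch of that equivalence, under the assumption that label-selector semantics apply (Exists means the key is present, NotIn also matches when the key is absent); `requirement`, `matchGeneral` and `matchPriorityClassScope` are hypothetical names, not the real functions:

```go
package main

import "fmt"

// requirement is a hypothetical, simplified selector requirement.
type requirement struct {
	Operator string
	Values   []string
}

// matchGeneral is a simplified model of full selector evaluation
// against a pod's priority class name.
func matchGeneral(req requirement, priorityClassName string) (bool, error) {
	has := priorityClassName != ""
	switch req.Operator {
	case "Exists":
		return has, nil
	case "DoesNotExist":
		return !has, nil
	case "In", "NotIn":
		found := false
		for _, v := range req.Values {
			if has && v == priorityClassName {
				found = true
			}
		}
		if req.Operator == "In" {
			return found, nil
		}
		return !found, nil
	default:
		return false, fmt.Errorf("unsupported operator %q", req.Operator)
	}
}

// matchPriorityClassScope adds the fast path: Exists is answered
// directly, everything else goes through the general evaluation.
func matchPriorityClassScope(req requirement, priorityClassName string) (bool, error) {
	if req.Operator == "Exists" {
		return priorityClassName != "", nil
	}
	return matchGeneral(req, priorityClassName)
}

func main() {
	exists := requirement{Operator: "Exists"}
	for _, pc := range []string{"", "high"} {
		fast, _ := matchPriorityClassScope(exists, pc)
		slow, _ := matchGeneral(exists, pc)
		fmt.Printf("priorityClass=%q fast=%v general=%v\n", pc, fast, slow)
	}
}
```

Both paths agree for Exists in this model; the fast path simply skips the selector construction that the general path would need.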

> I guess we can still optimize the evaluation inside podMatchesScopeFunc if we want

We really need to. Without this optimization, if we impose the Exists operator, the benchmark result looks like this:

⇒ go test ./pkg/quota/v1/evaluator/core/... -bench BenchmarkPodMatchesScopeFunc -run ^$
goos: darwin
goarch: arm64
pkg: k8s.io/kubernetes/pkg/quota/v1/evaluator/core
BenchmarkPodMatchesScopeFunc/PriorityClass_selector_w/o_operator-10            2401466    493.9 ns/op
BenchmarkPodMatchesScopeFunc/PriorityClass_selector_w/_'Exists'_operator-10    1378538    872.0 ns/op
BenchmarkPodMatchesScopeFunc/BestEfforts_selector_w/o_operator-10              2486842    472.4 ns/op
BenchmarkPodMatchesScopeFunc/BestEfforts_selector_w/_'Exists'_operator-10      2300052    473.3 ns/op

And with this optimization, since it returns early, the result is:

⇒ go test ./pkg/quota/v1/evaluator/core/... -bench BenchmarkPodMatchesScopeFunc -run ^$
goos: darwin
goarch: arm64
pkg: k8s.io/kubernetes/pkg/quota/v1/evaluator/core
BenchmarkPodMatchesScopeFunc/PriorityClass_selector_w/o_operator-10            2424650    493.7 ns/op
BenchmarkPodMatchesScopeFunc/PriorityClass_selector_w/_'Exists'_operator-10    5513385    215.7 ns/op
BenchmarkPodMatchesScopeFunc/BestEfforts_selector_w/o_operator-10              2347708    484.5 ns/op
BenchmarkPodMatchesScopeFunc/BestEfforts_selector_w/_'Exists'_operator-10      2467605    471.6 ns/op
PASS
ok  	k8s.io/kubernetes/pkg/quota/v1/evaluator/core	6.815s

liggitt · 2023-05-05T00:17:53Z

/lgtm
/approve

k8s-ci-robot · 2023-05-05T00:18:00Z

LGTM label has been added.

Git tree hash: e9770c60854199a1952038d14f05ba14384cbeb5

k8s-ci-robot · 2023-05-05T00:18:14Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Huang-Wei, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~pkg/quota/v1/OWNERS~~ [liggitt]
~~staging/src/k8s.io/apiserver/pkg/quota/v1/OWNERS~~ [liggitt]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Huang-Wei · 2023-05-05T01:12:12Z

@liggitt do you think we need to backport this? (if not, it can be worked around though)

liggitt · 2023-05-05T14:42:47Z

> @liggitt do you think we need to backport this? (if not, it can be worked around though)

I would backport this... it's very contained, performance impact is negligible, and makes two different parts of quota work consistently

Huang-Wei · 2023-05-05T22:13:11Z

> I would backport this... it's very contained, performance impact is negligible, and makes two different parts of quota work consistently

@liggitt cherry-pick PRs are created. PTAL when you get a chance.

k8s-ci-robot requested review from derekwaynecarr and smarterclayton April 28, 2023 19:50

Huang-Wei force-pushed the fix-quota-priorityclass branch from c435776 to 184d8c3 Compare April 30, 2023 22:35

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Apr 30, 2023

liggitt reviewed May 1, 2023

staging/src/k8s.io/apiserver/pkg/quota/v1/generic/evaluator.go (outdated)

Huang-Wei force-pushed the fix-quota-priorityclass branch from 184d8c3 to c108bcc Compare May 1, 2023 18:22

k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 2, 2023

liggitt mentioned this pull request May 3, 2023

ResourceQuota with "PriortyClass" as its scope is not calculated properly #117676

Closed

liggitt reviewed May 4, 2023

Huang-Wei added 2 commits May 4, 2023 17:02

Fix incorrect calculation for ResourceQuota with PriorityClass as its…

edd032e

… scope

benchmark test to evaluate the overhead of podMatchesScopeFunc

359bcec

Huang-Wei force-pushed the fix-quota-priorityclass branch from c108bcc to 359bcec Compare May 5, 2023 00:02

k8s-ci-robot assigned liggitt May 5, 2023

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 5, 2023

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 5, 2023

k8s-ci-robot merged commit dea1312 into kubernetes:master May 5, 2023
12 checks passed

k8s-ci-robot added this to the v1.28 milestone May 5, 2023

Huang-Wei deleted the fix-quota-priorityclass branch May 5, 2023 01:11

Huang-Wei mentioned this pull request May 9, 2023

Automated cherry pick of #117677: Fix incorrect calculation for ResourceQuota with #117891

Merged

k8s-ci-robot added a commit that referenced this pull request May 11, 2023

Merge pull request #117826 from Huang-Wei/automated-cherry-pick-of-#1…

0dd9c08

…17677-upstream-release-1.26 Automated cherry pick of #117677: Fix incorrect calculation for ResourceQuota with

k8s-ci-robot added a commit that referenced this pull request May 11, 2023

Merge pull request #117828 from Huang-Wei/automated-cherry-pick-of-#1…

4623aa4

…17677-upstream-release-1.25 Automated cherry pick of #117677: Fix incorrect calculation for ResourceQuota with

k8s-ci-robot added a commit that referenced this pull request May 11, 2023

Merge pull request #117891 from Huang-Wei/automated-cherry-pick-of-#1…

f8b3cc4

…17677-upstream-release-1.24 Automated cherry pick of #117677: Fix incorrect calculation for ResourceQuota with

k8s-ci-robot added a commit that referenced this pull request May 11, 2023

Merge pull request #117825 from Huang-Wei/automated-cherry-pick-of-#1…

c605065

…17677-upstream-release-1.27 Automated cherry pick of #117677: Fix incorrect calculation for ResourceQuota with


