Skip to content

e2e: serial: node cpumanager parity with the old suite #132498

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ffromani
Copy link
Contributor

@ffromani ffromani commented Jun 24, 2025

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

This PR aims completes (to the best to the author's knowledge) the feature parity of the new cpumanager testsuite with the old one. Most notably:

  • restore the full coverage of the cpu quota e2e tests; previously only the minimal set was ported out of practicality
  • ordered suites stops the test group at the first failure; this is most certainly not what we want as it masks failures worsening the signal. We can use the ginkgo ContinueOnFailure to avoid this while still keeping the benefits of the Ordered suite (see code comments for details)

Which issue(s) this PR is related to:

N/A

Special notes for your reviewer:

With this PR merged I think the rewrite is complete because all the items I had in my TODO are addressed. We can now have the final go/no-go conversation

Previous work

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/test sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Jun 24, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jun 24, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ffromani

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 24, 2025
@ffromani
Copy link
Contributor Author

@esotsal @wongchar FYI as you are affected most by this go/no-go conversation. Please CC other folks I may have missed

ffromani added 2 commits June 24, 2025 15:46
Initially we added minimal quota disablement e2e tests,
but since the emergence of kubevirt/kubevirt#14965
it becames clear that is better to have full coverage.

This PR restores coverage parity with the old test suite.

Signed-off-by: Francesco Romani <[email protected]>
The `ginkgo.ContinueOnFailure` decorator serves the usecase
of the new cpumanager tests perfectly:

https://onsi.github.io/ginkgo/#failure-handling-in-ordered-containers

"""
You can override this behavior by decorating an Ordered container with
ContinueOnFailure. This is useful in cases where Ordered is being used
to provide shared expensive set up for a collection of specs.
When ContinueOnFailure is set, Ginkgo will continue running specs even
if an earlier spec in the Ordered container has failed.
"""

And this is exactly the case at hand. Previously, without this
decorator, subsequent failures were masked, which is dangerous and not
what we want.

Signed-off-by: Francesco Romani <[email protected]>
@ffromani ffromani force-pushed the e2e-serial-node-cpumanager-fix-ordered branch from 3251144 to 3b0fd32 Compare June 24, 2025 13:46
@ffromani ffromani moved this from Triage to Not Sig Node in SIG Node: code and documentation PRs Jun 24, 2025
@ffromani ffromani moved this from Triage to PRs - Needs Reviewer in SIG Node CI/Test Board Jun 24, 2025
@ffromani
Copy link
Contributor Author

/triage accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 24, 2025
@ffromani
Copy link
Contributor Author

sig-node CI meeting 20250625: agreement to add reduced (not 1:1) coverage for cgroup v1
once this is done, no remaining concerns about proceed and remove the old testsuite and depend on the new one.
I will file a followup PR to add the cgroup v1 coverage, then I'll send the final PR to remove the old tests.

@esotsal
Copy link
Contributor

esotsal commented Jun 26, 2025

Please CC other folks I may have missed

Thanks @ffromani , ping @pravk03 , @natasha41575 , @Chunxia202410 and @AnishShah

@pacoxu
Copy link
Member

pacoxu commented Jul 1, 2025

/cc @swatisehgal @natasha41575

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: PRs - Needs Reviewer
Development

Successfully merging this pull request may close these issues.

4 participants