Fix failing unit tests on systems with cgroup v2 unified mode. #119292

Closed

brianpursley
Member

What type of PR is this?

/kind failing-test

What this PR does / why we need it:

Since 4e20a8f, some of the kuberuntime unit tests fail when run on systems that have cgroup v2 unified mode (Ubuntu 22.04 in my case).

The failures do not occur in the build pipeline, only when the tests are run locally on a system with cgroup v2 unified mode (where `stat -fc %T /sys/fs/cgroup` returns `cgroup2fs`).
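For readers who want to check the host mode programmatically, here is a minimal Linux-only sketch of roughly the same check that the `stat` command performs: it compares the filesystem type of `/sys/fs/cgroup` with the kernel's `CGROUP2_SUPER_MAGIC` value. The function names `isCgroup2Magic` and `hostUsesCgroup2` are made up for this illustration and are not taken from the PR or from libcontainer.

```go
package main

import (
	"fmt"
	"syscall"
)

// cgroup2SuperMagic is CGROUP2_SUPER_MAGIC from the kernel's linux/magic.h.
const cgroup2SuperMagic = 0x63677270

// isCgroup2Magic reports whether a statfs filesystem type is cgroup2fs.
// (Hypothetical helper, named for this example only.)
func isCgroup2Magic(fsType int64) bool {
	return fsType == cgroup2SuperMagic
}

// hostUsesCgroup2 stats the given mount point and reports whether it is
// a cgroup v2 (unified) hierarchy. (Hypothetical helper.)
func hostUsesCgroup2(path string) (bool, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return false, err
	}
	return isCgroup2Magic(int64(st.Type)), nil
}

func main() {
	unified, err := hostUsesCgroup2("/sys/fs/cgroup")
	if err != nil {
		fmt.Println("could not stat /sys/fs/cgroup:", err)
		return
	}
	fmt.Println("cgroup v2 unified mode:", unified)
}
```

Because the result depends on the machine, a unit test that calls a check like this directly will behave differently on CI and on a developer's workstation, which is the situation described above.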

This PR updates the failing unit tests to mock `libcontainercgroups.IsCgroup2UnifiedMode` so they pass regardless of whether cgroup v2 unified mode is actually enabled on the system where the unit tests run. Mocking also makes it possible to write unit test cases that check the output in both scenarios (v1 and v2); I added a couple of new test cases that do so.
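One common way to make such a host check mockable in Go is a package-level function variable that tests can swap out. The sketch below illustrates that general pattern only. It is not the PR's actual diff, and `isCgroup2UnifiedMode`, `unifiedResources`, and `withCgroupMode` are hypothetical names. The default implementation is a stand-in; in real code it would delegate to `libcontainercgroups.IsCgroup2UnifiedMode`.

```go
package main

import "fmt"

// isCgroup2UnifiedMode is a hypothetical seam. Production code would
// delegate to libcontainercgroups.IsCgroup2UnifiedMode(); this stand-in
// returns false so the example has no external dependencies.
var isCgroup2UnifiedMode = func() bool {
	return false
}

// unifiedResources returns the cgroup v2-only settings, or nil on v1.
// (Hypothetical function, simplified for illustration.)
func unifiedResources() map[string]string {
	if isCgroup2UnifiedMode() {
		return map[string]string{"memory.oom.group": "1"}
	}
	return nil
}

// withCgroupMode overrides the seam and returns a function that restores
// the previous implementation. In a real test the restore function would
// typically be registered with t.Cleanup.
func withCgroupMode(v2 bool) (restore func()) {
	orig := isCgroup2UnifiedMode
	isCgroup2UnifiedMode = func() bool { return v2 }
	return func() { isCgroup2UnifiedMode = orig }
}

func main() {
	for _, v2 := range []bool{false, true} {
		restore := withCgroupMode(v2)
		fmt.Printf("cgroup2=%v unified=%v\n", v2, unifiedResources())
		restore()
	}
}
```

With a seam like this, each table-driven test case can carry its own `cgroup2UnifiedMode` flag and set the mode before calling the code under test, so the expected output no longer depends on the host.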

Which issue(s) this PR fixes:

Special notes for your reviewer:

While I did add a couple of new unit test cases for cgroup v2, this PR is not intended to be comprehensive testing for cgroup v2 functionality. My goal is to enable the existing unit tests to pass.

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


Update unit tests to mock `libcontainercgroups.IsCgroup2UnifiedMode` so they can pass regardless of whether cgroup v2 unified mode is actually enabled on the system where the unit tests are running.
@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 13, 2023
@k8s-ci-robot k8s-ci-robot requested review from dims and odinuge July 13, 2023 15:14
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: brianpursley
Once this PR has been reviewed and has the lgtm label, please assign dchen1107 for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@haircommander
Contributor

part of me feels like we should skip the tests if they're not applicable to the host environment.

+1 for the new test cases and the robustness of the tests though.

@brianpursley
Member Author

> part of me feels like we should skip the tests if they're not applicable to the host environment.
>
> +1 for the new test cases and the robustness of the tests though.

Sure, that makes sense too. Are you thinking something like #119329?

Comment on lines +305 to +318

		{
			name:               "Request128MBLimit256MBCgroup2",
			cpuReq:             generateResourceQuantity("1"),
			cpuLim:             generateResourceQuantity("2"),
			memLim:             generateResourceQuantity("128Mi"),
			cgroup2UnifiedMode: true,
			expected: &runtimeapi.LinuxContainerResources{
				CpuPeriod:          100000,
				CpuQuota:           200000,
				CpuShares:          1024,
				MemoryLimitInBytes: 134217728,
				Unified:            map[string]string{"memory.oom.group": "1"},
			},
		},
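
As a reading aid for the expected values in this test case, the sketch below reproduces the arithmetic under simplifying assumptions: 1024 CPU shares per full CPU, a quota proportional to the 100000 µs period, and no minimum-value clamping. The helper names are illustrative and this is not the kubelet's implementation.

```go
package main

import "fmt"

const (
	sharesPerCPU   = 1024   // CPU shares that correspond to one full CPU
	milliCPUPerCPU = 1000   // 1 CPU = 1000 millicores
	quotaPeriod    = 100000 // CFS period in microseconds, as in CpuPeriod above
)

// milliCPUToShares converts a CPU request in millicores to CPU shares.
// (Simplified: real implementations also enforce a minimum.)
func milliCPUToShares(milliCPU int64) int64 {
	return milliCPU * sharesPerCPU / milliCPUPerCPU
}

// milliCPUToQuota converts a CPU limit in millicores to a CFS quota
// for the given period. (Simplified: no minimum quota is applied.)
func milliCPUToQuota(milliCPU, period int64) int64 {
	return milliCPU * period / milliCPUPerCPU
}

func main() {
	fmt.Println("CpuShares:", milliCPUToShares(1000))            // request of 1 CPU
	fmt.Println("CpuQuota:", milliCPUToQuota(2000, quotaPeriod)) // limit of 2 CPUs
	fmt.Println("MemoryLimitInBytes:", int64(128)<<20)           // 128Mi
}
```

Under these assumptions, only the `Unified` map distinguishes the cgroup v2 expectation from its v1 counterpart.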
Contributor

yeah I think #119329 is good, but I think it's missing this section. I think we can also have tests that we skip for v1, and I think this case is a good addition @brianpursley

Member Author

This is the same test case as Request128MBLimit256MB but with the cgroup2-specific map entry (memory.oom.group).

Are you thinking I should duplicate the test cases for cgroup v2 and conditionally skip either the cgroup v1 or the cgroup v2 tests, depending on which version the host system uses?
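
The skip-based alternative discussed here could look roughly like the sketch below: each test case declares which cgroup version it needs, and a helper decides whether it applies to the host. The types and names (`cgroupVersion`, `testCase`, `skipReason`) are hypothetical. In a real Go test, a non-empty reason would be passed to `t.Skip` inside the subtest.

```go
package main

import "fmt"

// cgroupVersion says which cgroup version a test case requires.
type cgroupVersion int

const (
	anyCgroup cgroupVersion = iota // runs on every host
	cgroupV1
	cgroupV2
)

// testCase is a trimmed-down, hypothetical table entry.
type testCase struct {
	name     string
	requires cgroupVersion
}

// skipReason returns a non-empty message when the case does not apply
// to the host, and an empty string when it should run.
func skipReason(tc testCase, hostIsV2 bool) string {
	switch {
	case tc.requires == cgroupV2 && !hostIsV2:
		return "requires cgroup v2 unified mode"
	case tc.requires == cgroupV1 && hostIsV2:
		return "requires cgroup v1"
	}
	return ""
}

func main() {
	cases := []testCase{
		{name: "Request128MBLimit256MB", requires: cgroupV1},
		{name: "Request128MBLimit256MBCgroup2", requires: cgroupV2},
	}
	hostIsV2 := true // assumption for this demo; a real test would detect it
	for _, tc := range cases {
		if reason := skipReason(tc, hostIsV2); reason != "" {
			fmt.Printf("SKIP %s: %s\n", tc.name, reason)
			continue
		}
		fmt.Printf("RUN  %s\n", tc.name)
	}
}
```

The trade-off is that skipped cases never run on hosts of the other kind, whereas mocking the mode check exercises both branches everywhere.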

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 19, 2023
@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@SergeyKanzhelev
Member

/assign @ffromani

@bart0sh
Contributor

bart0sh commented Jul 24, 2023

/triage accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jul 24, 2023
@bart0sh
Contributor

bart0sh commented Jul 24, 2023

@brianpursley please rebase, thanks.

Labels
area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
6 participants