Fix a data race in TopologyCache #117249
Conversation
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the `triage/accepted` label and provide further guidance.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
the CI runs with the race detector; a test like the following reproduces the issue:

diff --git a/pkg/controller/endpointslice/topologycache/topologycache_test.go b/pkg/controller/endpointslice/topologycache/topologycache_test.go
index 8c83f9ec9f6..5716b4cb336 100644
--- a/pkg/controller/endpointslice/topologycache/topologycache_test.go
+++ b/pkg/controller/endpointslice/topologycache/topologycache_test.go
@@ -625,6 +625,75 @@ func TestSetNodes(t *testing.T) {
}
}
+func TestTopologyCacheRace(t *testing.T) {
+ sliceInfo := &SliceInfo{
+ ServiceKey: "ns/svc",
+ AddressType: discovery.AddressTypeIPv4,
+ ToCreate: []*discovery.EndpointSlice{{
+ Endpoints: []discovery.Endpoint{{
+ Addresses: []string{"10.1.2.3"},
+ Zone: pointer.String("zone-a"),
+ Conditions: discovery.EndpointConditions{Ready: pointer.Bool(true)},
+ }, {
+ Addresses: []string{"10.1.2.4"},
+ Zone: pointer.String("zone-b"),
+ Conditions: discovery.EndpointConditions{Ready: pointer.Bool(true)},
+ }},
+ }}}
+ type nodeInfo struct {
+ zone string
+ cpu resource.Quantity
+ ready v1.ConditionStatus
+ labels map[string]string
+ }
+ nodesinfos := []nodeInfo{
+ {zone: "zone-a", cpu: resource.MustParse("1000m"), ready: v1.ConditionTrue},
+ {zone: "zone-a", cpu: resource.MustParse("1000m"), ready: v1.ConditionTrue},
+ {zone: "zone-a", cpu: resource.MustParse("1000m"), ready: v1.ConditionTrue},
+ {zone: "zone-a", cpu: resource.MustParse("2000m"), ready: v1.ConditionTrue},
+ {zone: "zone-b", cpu: resource.MustParse("3000m"), ready: v1.ConditionTrue},
+ {zone: "zone-b", cpu: resource.MustParse("1500m"), ready: v1.ConditionTrue},
+ {zone: "zone-c", cpu: resource.MustParse("500m"), ready: v1.ConditionTrue},
+ }
+
+ cache := NewTopologyCache()
+ nodes := []*v1.Node{}
+ for _, node := range nodesinfos {
+ labels := node.labels
+ if labels == nil {
+ labels = map[string]string{}
+ }
+ if node.zone != "" {
+ labels[v1.LabelTopologyZone] = node.zone
+ }
+ conditions := []v1.NodeCondition{{
+ Type: v1.NodeReady,
+ Status: node.ready,
+ }}
+ allocatable := v1.ResourceList{
+ v1.ResourceCPU: node.cpu,
+ }
+ nodes = append(nodes, &v1.Node{
+ ObjectMeta: metav1.ObjectMeta{
+ Labels: labels,
+ },
+ Status: v1.NodeStatus{
+ Allocatable: allocatable,
+ Conditions: conditions,
+ },
+ })
+ }
+
+ for i := 0; i < 50; i++ {
+ go func() {
+ cache.SetNodes(nodes)
+ }()
+ go func() {
+ cache.AddHints(sliceInfo)
+ }()
+ }
+}
+
The member variable `cpuRatiosByZone` should be accessed with the lock acquired, as it could be updated by `SetNodes` concurrently. Signed-off-by: Quan Tian <[email protected]> Co-authored-by: Antonio Ojea <[email protected]>
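The locking pattern the fix applies can be sketched outside the Kubernetes codebase. Note that `ratioCache`, `SetRatios`, and `HasPopulatedHints` below are hypothetical names chosen for illustration, not the real `TopologyCache` API:

```go
package main

import (
	"fmt"
	"sync"
)

// ratioCache is a hypothetical stand-in for TopologyCache: a map field
// that one goroutine replaces while others read it.
type ratioCache struct {
	lock            sync.Mutex
	cpuRatiosByZone map[string]float64
}

// SetRatios mirrors the writer side (SetNodes in the PR): it replaces the
// map while holding the lock.
func (c *ratioCache) SetRatios(r map[string]float64) {
	c.lock.Lock()
	defer c.lock.Unlock()
	c.cpuRatiosByZone = r
}

// HasPopulatedHints mirrors the fixed reader side: before the fix, code
// like this read cpuRatiosByZone without taking the lock, which the race
// detector flags when the writer runs concurrently.
func (c *ratioCache) HasPopulatedHints() bool {
	c.lock.Lock()
	defer c.lock.Unlock()
	return c.cpuRatiosByZone != nil
}

func main() {
	c := &ratioCache{}
	var wg sync.WaitGroup
	// Hammer the reader and writer concurrently, as the reproducer test does.
	for i := 0; i < 50; i++ {
		wg.Add(2)
		go func() {
			defer wg.Done()
			c.SetRatios(map[string]float64{"zone-a": 0.5, "zone-b": 0.5})
		}()
		go func() {
			defer wg.Done()
			_ = c.HasPopulatedHints()
		}()
	}
	wg.Wait()
	fmt.Println(c.HasPopulatedHints())
}
```

Run under `go run -race` (or `go test -race`): with the unsynchronized read the detector reports a data race on the map field; with every access under the mutex, as above, it stays quiet.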
@aojea thanks for the suggestion. I have added the test with minor adjustments (removed some unused variables and verified it can be reproduced even when executed only once). Added you as co-author if you don't mind.
no need to, it was just a suggestion, but thanks

/lgtm
/test pull-kubernetes-e2e-gce

Kubernetes e2e suite: [It] [sig-cli] Kubectl client Simple pod should contain last line of the log
LGTM label has been added. Git tree hash: 70040d77e102e6182c2f7c3fbcd24aba0a172d7c
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, tnqn

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
What type of PR is this?
/kind bug
What this PR does / why we need it:
The member variable `cpuRatiosByZone` should be accessed with the lock acquired, as it could be updated by `SetNodes` concurrently.

Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: