Pods in statefulsets sometimes have a blank podIP annotation, causing network issues during termination #4710
My guess is that, because this is a stateful set: […]
We have explicit code in place to prevent exactly that scenario, so it'll take a bit of debugging. I checked that v3.17.3 has the above fix (the cni-plugin and node repos are pinned to a version with the above fix). @LeeHampton Can you check that the […]
@fasaxc Yes, […]
@LeeHampton Is it easy to repro? Could you repro the issue and take note of the pod name involved (e.g. pod-A)?
@song-jiang As far as we can tell, the new pod is only spun up after the old pod is torn down; at no point in time are there two pods with the exact same name in play.
We are seeing a similar issue. We hadn't observed the missing podIP annotation but will look for that in testing going forward. We were seeing problems with statefulset graceful shutdown, and noticed that if we logged strace of pid 1 in the container, […]. We are using k8s 1.20.8 and Calico v3.17.4.
Update: Checking the statefulset that I had been testing, the podIP/podIPs annotations are empty. The same is true of the pods of another statefulset in the cluster, but I checked a few pods that aren't in statefulsets and they have the correct annotations.
If we manually reset the empty annotations with the expected podIP/podIPs values, then graceful shutdown works as expected.
That happens with us as well, @hindessm. Our current workaround is to periodically find pods without those annotations and re-annotate them.
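The periodic re-annotation workaround could be sketched roughly as follows. This is a hypothetical helper, not the poster's actual script: it scans pod objects (in the dict shape returned by the Kubernetes API) for an empty `cni.projectcalico.org/podIP` annotation and builds a JSON merge patch restoring it from `status.podIP`. The `/32` suffix is an assumption about how Calico formats the annotation; actually applying the patch (e.g. via `kubectl patch` or a client library) is left out.

```python
import json

# Calico annotation keys, as discussed in this issue.
POD_IP_ANNOTATION = "cni.projectcalico.org/podIP"
POD_IPS_ANNOTATION = "cni.projectcalico.org/podIPs"

def build_reannotation_patches(pods):
    """For each pod whose podIP annotation is present but empty, build a
    JSON merge patch restoring it from status.podIP (hypothetical helper)."""
    patches = {}
    for pod in pods:
        name = pod["metadata"]["name"]
        annotations = pod["metadata"].get("annotations", {})
        status_ip = pod.get("status", {}).get("podIP", "")
        if not status_ip:
            continue  # pod has no IP yet; nothing to restore
        if annotations.get(POD_IP_ANNOTATION) == "":
            # Assumption: Calico stores the pod IP in CIDR form (".../32").
            patches[name] = json.dumps({
                "metadata": {"annotations": {
                    POD_IP_ANNOTATION: status_ip + "/32",
                    POD_IPS_ANNOTATION: status_ip + "/32",
                }}
            })
    return patches
```

Run periodically, this yields one patch per affected pod and leaves healthy pods untouched.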
A colleague has monitored the annotations to see when they get cleared, and the clearing of the annotations on the new pod seems to coincide with […] in the node/host log, /var/log/calico/cni/cni.log.
To give a little more context, the timeline when deleting a pod, foobar-2, that is part of a statefulset looks like: […] At this point, the annotations on the new pod have been cleared, and at the next termination the network is torn down prematurely.
@hindessm sent me a full log via DM. I think I see what's going on; the issue appears to be a bit of missing handling in the kubernetes API datastore. The CNI plugin does this to spot if we're racing between an ADD of a new pod and teardown of an old pod:
However, the kubernetes API datastore doesn't seem to populate the […]
However, with the current code, deletion is not a no-op: it sets podIPs to "" to signal that the IP has been removed from the pod. Hence, we're clobbering the value on the newly-created pod when the old one is torn down. @hindessm I notice in your log that the stateful set pod is deleted successfully once, then kubelet (or CRI-O, if you're using that) retries the DEL some time after the new pod has been created. If you can figure out why it's retrying (probably some error during pod teardown unrelated to CNI), you might be able to fix the immediate problem by stopping the retry.
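The race described above, and the container-ID guard that fixes it, can be modeled in a few lines. This is an illustrative sketch, not Calico's actual code: a DEL that unconditionally writes an empty podIP clobbers the annotation of the replacement pod, whereas a DEL that first checks the stored container ID becomes a no-op when a newer container instance owns the pod.

```python
class PodStore:
    """Toy model of pod annotations keyed by pod name (a sketch, not the
    real Kubernetes API datastore)."""

    def __init__(self):
        self.pods = {}  # name -> {"container_id": str, "podIP": str}

    def add(self, name, container_id, ip):
        # CNI ADD: record which container instance owns the pod's IP.
        self.pods[name] = {"container_id": container_id, "podIP": ip}

    def delete_clobbering(self, name, container_id):
        # Behaviour described in this issue: signal IP removal by writing
        # an empty string, regardless of which container the DEL is for.
        if name in self.pods:
            self.pods[name]["podIP"] = ""

    def delete_guarded(self, name, container_id):
        # Guarded behaviour: only clear the annotation if the DEL is for
        # the container instance that currently owns the pod.
        pod = self.pods.get(name)
        if pod and pod["container_id"] == container_id:
            pod["podIP"] = ""
```

Replaying the statefulset timeline (ADD for the new foobar-2, then a late retried DEL for the old container) leaves the annotation intact only with the guarded delete.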
^^ Three CNI DELs for the same instance of the pod, I think.
This should fix the following issue: projectcalico/calico#4710, because the CNI plugin will now hit the container ID-based guard logic.
This should be fixed in Calico v3.20, which is due to be released this week. If you want to try the release candidate, it's available via this nightly build of our docs: https://2021-07-29-v3-20-governai.docs.eng.tigera.net/ I also backported the fix to the v3.19/v3.18 release branches, so it'll be in the next patch release for those minor releases.
Closing: v3.20 is now released. Please re-open if you see this again after upgrading to v3.20.
Summary
We are seeing a strange issue where pods in a statefulset sometimes lose network connectivity while in a Terminating state (i.e., some pods make TCP calls while shutting down, but those calls time out, leaving the pods hung until the shutdown grace period ends).
We've discovered that this happens when pods have an empty cni.projectcalico.org/podIP annotation. Most pods have an IP for that value, but some just have an empty string where the IP should be. Sometimes restarting pods gets them re-annotated, but not always.
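When checking for affected pods, it helps to distinguish an annotation that is present but empty (the symptom here) from one that is absent entirely. The following is a hypothetical check, assuming pod objects in the dict shape returned by the Kubernetes API:

```python
POD_IP_ANNOTATION = "cni.projectcalico.org/podIP"

def podip_annotation_state(pod):
    """Classify a pod's Calico podIP annotation as 'ok', 'empty', or
    'missing' (illustrative helper, not part of Calico)."""
    annotations = pod.get("metadata", {}).get("annotations", {})
    if POD_IP_ANNOTATION not in annotations:
        return "missing"
    return "empty" if annotations[POD_IP_ANNOTATION] == "" else "ok"
```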
The pods with empty podIP annotations have totally normal network connectivity when they are running, despite the blank annotation. They can be reached by other pods, reached from the public internet via NodePorts, and can communicate with other pods and external services like S3. It is only when they enter Terminating that they lose network connectivity.
Expected Behavior
We expect the pod to retain network connectivity throughout its Terminating state until the pod is fully torn down.
Steps to Reproduce (for bugs)
We have not found a way to reproduce the empty podIP annotation. But once a podIP annotation is empty, reproducing is straightforward.
Your Environment