Description
What happened?
I deployed a CoreDNS server with a 5-second lameduck period (graceful shutdown). During this period, the pod enters the terminating state but can still receive in-flight traffic. However, I observed that clients consistently encounter errors during pod rotation.
After investigation, I identified the root cause as a race condition between pod termination and conntrack table cleanup: kube-proxy only removes conntrack entries once an endpoint reports Serving=false, but during the lameduck period the pod is terminating while still reporting Serving=true. This mismatch prevents timely cleanup of the stale conntrack entries.
As a result, once the pod fully terminates, there is a short window before its conntrack entries are cleaned up. During this window, some traffic is still routed to the terminated pod via the stale entries, causing request failures. This race between the pod lifecycle state and conntrack entry cleanup is the underlying issue.
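To make the race concrete, here is a minimal sketch of the cleanup decision described above. This is a hypothetical simplification for illustration, not kube-proxy's actual source; the `Endpoint` struct and field names are assumptions standing in for the EndpointSlice conditions kube-proxy observes.

```go
package main

import "fmt"

// Endpoint is a simplified stand-in for the endpoint state kube-proxy
// sees via EndpointSlice conditions (hypothetical names, for illustration).
type Endpoint struct {
	IP          string
	Terminating bool
	Serving     bool
}

// shouldFlushConntrack mirrors the behavior described in this report:
// stale conntrack entries are removed only when the endpoint stops
// serving, not when it merely enters the terminating state.
func shouldFlushConntrack(ep Endpoint) bool {
	return !ep.Serving
}

func main() {
	// During the lameduck period: Terminating=true but Serving=true,
	// so entries for this endpoint are NOT flushed and the race window opens.
	lameduck := Endpoint{IP: "10.0.0.5", Terminating: true, Serving: true}
	fmt.Println(shouldFlushConntrack(lameduck)) // false

	// Only after the pod stops serving does cleanup trigger; by then the
	// pod may already be gone while stale entries still route traffic to it.
	gone := Endpoint{IP: "10.0.0.5", Terminating: true, Serving: false}
	fmt.Println(shouldFlushConntrack(gone)) // true
}
```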
What did you expect to happen?
I expected the stale conntrack entries to be cleaned up while the pod is terminating, so that no traffic is routed to it once it shuts down.
How can we reproduce it (as minimally and precisely as possible)?
Deploy CoreDNS with a lameduck period configured, send continuous DNS queries through its Service, and rotate the CoreDNS pods. Clients observe query failures around the time a pod terminates.
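The rotation above can be driven with something like the following (a sketch; it assumes the default CoreDNS deployment in kube-system and uses a throwaway busybox client pod):

```shell
# Run a client that continuously resolves through the cluster DNS Service,
# logging a timestamped FAIL line whenever a lookup errors out.
kubectl run dns-client --image=busybox:1.36 --restart=Never -- \
  sh -c 'while true; do nslookup kubernetes.default.svc.cluster.local \
    || echo "FAIL $(date +%T)"; sleep 0.1; done'

# In another terminal, rotate the CoreDNS pods while the queries are running.
kubectl -n kube-system rollout restart deployment/coredns

# Watch for failures that line up with pod termination.
kubectl logs -f dns-client | grep FAIL
```

Failures clustered around the moment a terminating pod exits (rather than spread across the lameduck period) are consistent with stale conntrack entries still steering UDP queries to the dead pod.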
Anything else we need to know?
No response
Kubernetes version
$ kubectl version
# paste output here
Cloud provider
OS version
# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here
# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here