Description
What happened:
The new k8s.io/kubernetes/cmd/kube-apiserver/app/testing.StartTestServerOrDie function (introduced in #46865) returns a teardown function that doesn't cleanly shut down the test server. This results in an accumulation of goroutines and log spam, which prevents effective use of the test server across multiple test functions within the same process.
What you expected to happen:
Calling the teardown function should gracefully terminate everything that was started when StartTestServerOrDie was called.
How to reproduce it (as minimally and precisely as possible):
Using the following sample integration test code:
import (
	"fmt"
	"runtime"
	"testing"

	apitesting "k8s.io/kubernetes/cmd/kube-apiserver/app/testing"
)

func TestTeardown(t *testing.T) {
	// Start the test server and tear it down immediately.
	_, tearDown := apitesting.StartTestServerOrDie(t)
	tearDown()

	// Dump the stacks of all goroutines, doubling the buffer until the dump fits.
	stack := make([]byte, 8196)
	size := 0
	for {
		size = runtime.Stack(stack, true)
		if size < len(stack) {
			break
		}
		stack = make([]byte, len(stack)*2)
	}
	fmt.Printf("%s\n", string(stack[0:size]))
}
After tearDown() returns, several goroutines linger and keep trying forever to maintain etcd connections to the terminated etcd instance. Here's an example stack dump:
https://gist.github.com/ironcladlou/52b3e3306948db3943b426c70ce7f85b
Among all the etcd connection goroutines, you'll notice lingering Cacher instances (created because of the default EnableWatchCache storage setting), which seem to try to hold watches, and configuration_manager (which may or may not hold connections; I'm not sure yet). This seems to indicate that various components started during apiserver setup aren't actually shutting down.
Anything else we need to know?:
This is important for enabling integration testing of custom resource garbage collection (#47665).
Environment:
- Kubernetes version (use kubectl version): master (as of 088141c)
- Cloud provider or hardware configuration:
- OS (e.g. from /etc/os-release): darwin/amd64
- Kernel (e.g. uname -a):
- Install tools:
- Others:
/cc @sttts @caesarxuchao @deads2k @liggitt @kubernetes/sig-api-machinery-bugs
/kind bug