Description
What happened?
A PostStartHook failed and the apiserver exited with a fatal error. Log output:
E0122 06:54:22.018964 10 writers.go:131] apiserver was unable to write a fallback JSON response: http: Handler timeout
W0122 06:54:22.457480 10 storage_scheduling.go:106] unable to get PriorityClass system-node-critical: Get "https://*.*.*.*:8443/apis/scheduling.k8s.io/v1/priorityclasses/system-node-critical": net/http: TLS handshake timeout. Retrying...
F0122 06:54:22.457615 10 hooks.go:203] PostStartHook "scheduling/bootstrap-system-priority-classes" failed: unable to add default system priority classes: timed out waiting for the condition
What did you expect to happen?
The PostStartHook should not fail.
How can we reproduce it (as minimally and precisely as possible)?
1. The apiserver is deployed as a static Pod.
2. Other services access the apiserver through its service IP.
3. Limit the apiserver's CPU as tightly as possible.
4. Generate as many requests as possible to overload the apiserver's CPU.
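For step 4, a minimal load-generator sketch in Go might look like the following. It is only an illustration: the `flood` helper, the worker counts, and the local `httptest` server standing in for the apiserver's service IP are all assumptions, not part of the original setup. A real reproduction would point the client at the service IP with proper TLS and credentials.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
	"time"
)

// flood (hypothetical helper) starts `workers` goroutines that each call
// do() perWorker times, and returns how many calls succeeded or failed.
func flood(workers, perWorker int, do func() error) (ok, failed int64) {
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				if err := do(); err != nil {
					atomic.AddInt64(&failed, 1)
				} else {
					atomic.AddInt64(&ok, 1)
				}
			}
		}()
	}
	wg.Wait()
	return ok, failed
}

func main() {
	// Stand-in target (assumption): a local test server replaces the
	// apiserver's service IP so the sketch runs anywhere.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	client := &http.Client{Timeout: 2 * time.Second}
	get := func() error {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return err
		}
		return resp.Body.Close()
	}

	ok, failed := flood(8, 25, get)
	fmt.Printf("ok=%d failed=%d\n", ok, failed)
}
```

Raising the worker count while the apiserver's CPU limit is low is what pushes it into the overloaded state described in this issue.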
Anything else we need to know?
All PostStartHooks are called asynchronously in goroutines.
From reading the code, I found that the apiserver's service and endpoint reconciliation is also run in a goroutine (pkg/controlplane/instance.go:508).
This means other PostStartHooks can still be running after the apiserver has already been added to the endpoint backends and started serving traffic. Because the apiserver is already serving at that point, a large number of concurrent requests can push its load too high; the PostStartHook's own requests then time out, and the apiserver eventually kills itself.
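The failure mode can be modelled with a small, self-contained Go sketch. This is not the Kubernetes code: `pollUntil`, `runHook`, and `hookTimeout` are hypothetical names, and the "self-request" is reduced to a function that returns whether the request succeeded. It only shows how a hook that retries against its own server gives up once the server stays overloaded past the timeout.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// hookTimeout is an illustrative value, not the one used by the apiserver.
const hookTimeout = 50 * time.Millisecond

// errTimeout reuses the wording of the error seen in the log above.
var errTimeout = errors.New("timed out waiting for the condition")

// pollUntil retries cond every interval until it returns true
// or the timeout has elapsed.
func pollUntil(interval, timeout time.Duration, cond func() bool) error {
	deadline := time.Now().Add(timeout)
	for {
		if cond() {
			return nil
		}
		if time.Now().After(deadline) {
			return errTimeout
		}
		time.Sleep(interval)
	}
}

// runHook runs the hook in its own goroutine, as hooks run asynchronously,
// and waits for its result. selfRequest models the hook calling back into
// the server it belongs to.
func runHook(selfRequest func() bool, timeout time.Duration) error {
	done := make(chan error, 1)
	go func() {
		done <- pollUntil(time.Millisecond, timeout, selfRequest)
	}()
	return <-done
}

func main() {
	// Idle server: the hook's request succeeds on the first attempt.
	fmt.Println("idle:", runHook(func() bool { return true }, hookTimeout))

	// Overloaded server: every request times out, so the hook fails.
	// In the real apiserver this failure is fatal and the process exits.
	fmt.Println("overloaded:", runHook(func() bool { return false }, hookTimeout))
}
```

In this model the hook only succeeds if the server can answer its own request before the timeout, which is the condition that external traffic arriving through the service IP can violate.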
Kubernetes version
[root@master1 ~]# kubectl version
Client Version: v1.28.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.1
Cloud provider
nil
OS version
nil
Install tools
See "How can we reproduce it (as minimally and precisely as possible)?" above.
Container runtime (CRI) and version (if applicable)
containerd
Related plugins (CNI, CSI, ...) and versions (if applicable)
CNI: Calico