-
Notifications
You must be signed in to change notification settings - Fork 40.9k
etcd output interception #131350
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
etcd output interception #131350
Conversation
Skipping CI for Draft Pull Request. |
Please note that we're already in Test Freeze for the Fast forwards are scheduled to happen every 6 hours, whereas the most recent run was: Thu Apr 17 07:35:58 UTC 2025. |
Previously, etcd wrote to stderr in JSON format: {"level":"warn","ts":"2025-04-11T03:32:06.676527Z","caller":"embed/config.go:689","msg":"Running http and grpc server on single port. This is not recommended for production."} {"level":"warn","ts":"2025-04-11T03:32:06.676707Z","caller":"embed/config.go:689","msg":"Running http and grpc server on single port. This is not recommended for production."} {"level":"warn","ts":"2025-04-11T03:32:06.677056Z","caller":"etcdmain/etcd.go:146","msg":"failed to start etcd","error":"listen tcp 127.0.0.1:37803: bind: address already in use"} {"level":"fatal","ts":"2025-04-11T03:32:06.677104Z","caller":"etcdmain/etcd.go:204","msg":"discovery failed","error":"listen tcp 127.0.0.1:37803: bind: address already in use","stacktrace":"go.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\tgo.etcd.io/etcd/server/v3/etcdmain/etcd.go:204\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\tgo.etcd.io/etcd/server/v3/etcdmain/main.go:40\nmain.main\n\tgo.etcd.io/etcd/server/v3/main.go:31\nruntime.main\n\truntime/proc.go:272"} This has several drawbacks: - Not very readable. - When used in tests which start etcd themselves (for example, scheduler_perf), the output is not associated with the current test. - Contains warnings that are confusing for developers who don't know that they are harmless. Intercepting output, parsing it and reformating makes the output nicer. Instead of a mixture of JSON messages (see above) and normal test output, we now get the etcd output embedded inside the test output. We can also filter out some known harmless messages. Cleaning up more output or avoiding it in the first place might be a good next step. With `go test -v ./test/integration/scheduler_perf/dra`: === RUN TestSchedulerPerf === RUN TestSchedulerPerf/SchedulingWithResourceClaimTemplate === RUN TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast I0411 13:21:03.353458 65212 feature_gate.go:385] feature gates: {map[SchedulerQueueingHints:false]} ... I0411 13:21:10.552975 65212 cidrallocator.go:210] stopping ServiceCIDR Allocator Controller I0411 13:21:10.567327 65212 etcd.go:210] "etcd output" logger="TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast" error="accept tcp 127.0.0.1:42245: use of closed network connection" level="warn" ts="2025-04-11T13:21:10.567045+0200" caller="embed/serve.go:160" msg="stopping insecure grpc server due to error" I0411 13:21:10.567398 65212 etcd.go:210] "etcd output" logger="TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast" ts="2025-04-11T13:21:10.567198+0200" caller="embed/serve.go:162" msg="stopped insecure grpc server due to error" error="accept tcp 127.0.0.1:42245: use of closed network connection" level="warn" I0411 13:21:15.567917 65212 etcd.go:227] "etcd didn't exit in 5 seconds, killing it" logger="TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast" I0411 13:21:15.567964 65212 etcd.go:234] "etcd exited" logger="TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast" err="signal: terminated" With per-test output `go test -v ./test/integration/scheduler_perf/dra -args -use-testing-log`: === RUN TestSchedulerPerf === RUN TestSchedulerPerf/SchedulingWithResourceClaimTemplate === RUN TestSchedulerPerf/SchedulingWithResourceClaimTemplate/fast I0411 13:19:03.540497 28645 feature_gate.go:385] feature gates: {map[DynamicResourceAllocation:true]} ... I0411 13:19:10.519994 28645 cidrallocator.go:210] stopping ServiceCIDR Allocator Controller etcd.go:210: I0411 13:19:10.533131] etcd output msg="stopping insecure grpc server due to error" error="accept tcp 127.0.0.1:46637: use of closed network connection" level="warn" ts="2025-04-11T13:19:10.532900+0200" caller="embed/serve.go:160" etcd.go:210: I0411 13:19:10.533274] etcd output msg="stopped insecure grpc server due to error" error="accept tcp 127.0.0.1:46637: use of closed network connection" level="warn" ts="2025-04-11T13:19:10.533022+0200" caller="embed/serve.go:162" etcd.go:227: I0411 13:19:15.533715] etcd didn't exit in 5 seconds, killing it etcd.go:234: I0411 13:19:15.533803] etcd exited err="signal: terminated"
/assign @serathius @aojea @BenTheElder The other two etcd enhancements landed after code thaw, let's continue with the output improvement. |
/triage accepted |
thanks @pohly this looks good to me, let's try it out! /approve |
LGTM label has been added. Git tree hash: 4e61fea50e7d7932569b4fb792934368c1aa8efd
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dims, pohly The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
etcd: better logging
Previously, etcd wrote to stderr in JSON format:
This has several drawbacks:
the output is not associated with the current test.
are harmless.
Intercepting output, parsing it and reformating makes the output nicer. Instead
of a mixture of JSON messages (see above) and normal test output, we now get
the etcd output embedded inside the test output. We can also filter out some
known harmless messages. Cleaning up more output or avoiding it in the first place might be a good
next step.
With
go test -v ./test/integration/scheduler_perf/dra
:With per-test output
go test -v ./test/integration/scheduler_perf/dra -args -use-testing-log
:Special notes for your reviewer:
Based on #131349 and #131259 which got merged.
Does this PR introduce a user-facing change?