fix sync_proxy_rules_iptables_total metric #119140
Conversation
This required fixing a small bug in the metric, where it had previously been counting the "-X" lines that had been passed to iptables-restore to delete stale chains, rather than only counting the actual rules.
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the `triage/accepted` label.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files.

Approvers can indicate their approval by writing
Historically, IptablesRulesTotal could have been interpreted as either "the total number of iptables rules kube-proxy is responsible for" or "the number of iptables rules kube-proxy rewrote on the last sync". Post-MinimizeIPTablesRestore, these are very different things (and IptablesRulesTotal unintentionally became the latter). Fix IptablesRulesTotal (`sync_proxy_rules_iptables_total`) to be "the total number of iptables rules kube-proxy is responsible for" and add IptablesRulesLastSync (`sync_proxy_rules_iptables_last`) to be "the number of iptables rules kube-proxy rewrote on the last sync".
Force-pushed 4472bbe to 3dbea75
pkg/proxy/iptables/proxier.go
Outdated
@@ -852,6 +852,9 @@ func (proxier *Proxier) syncProxyRules() {
 	proxier.natChains.Reset()
 	proxier.natRules.Reset()

+	skippedNatChains := &proxyutil.LineBuffer{}
+	skippedNatRules := &proxyutil.LineBuffer{}
(yes, this is wasteful; fixed in the next commit)
/retest
LGTM. The only doubt I have is about the solution of switching the pointers to store the lines in a different buffer: it is not straightforward for future developers to realize there is some code doing that in the middle of the loop.

/assign @thockin
Yeah, I didn't love that approach, but it seemed simplest... I guess I could try doing it with separate pointers throughout.

(This would actually make one of the earlier refactorings (#110266) irrelevant; I'd carefully reorganized all of the code in the main sync loop so that the required rules all come first, and then the rules that can be skipped at the bottom. So maybe if we refactored it with
pkg/proxy/iptables/proxier.go
Outdated
@@ -852,8 +852,8 @@ func (proxier *Proxier) syncProxyRules() {
 	proxier.natChains.Reset()
 	proxier.natRules.Reset()

-	skippedNatChains := &proxyutil.LineBuffer{}
-	skippedNatRules := &proxyutil.LineBuffer{}
+	skippedNatChains := proxyutil.NewDummyLineBuffer()
+	skippedNatRules := proxyutil.NewDummyLineBuffer()
A comment here would not hurt
pkg/proxy/util/linebuffer.go
Outdated
+// NewDummyLineBuffer returns a dummy LineBuffer that counts the number of writes but
+// throws away the data.
+func NewDummyLineBuffer() LineBuffer {
Maybe NewLineCounter() or NewDiscardLineBuffer() would make it feel like less of a test-infra thing?
I don't hate the pointer flip. I wonder if we can insure against accidentally writing to `proxier.natRules` by making some of these helper methods into free functions, but that can be a followup, I think.

/lgtm

/hold if you want to change the "dummy" name
LGTM label has been added. Git tree hash: 7c15eef3f6bff22f74285b290ef3f7e262c02a4f
Rather than actually assembling all of the rules we aren't going to use, just count them and throw them away.
Force-pushed 3dbea75 to 883d0c3
updated the name

/lgtm
LGTM label has been added. Git tree hash: d890cccb04781596c4c122d7396c2add48b39b03
@danwinship: The following test failed.

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
What type of PR is this?
/kind bug
What this PR does / why we need it:
Reverts the definition of the `sync_proxy_rules_iptables_total` metric back to the generally-understood pre-MinimizeIPTablesRestore meaning: "the total number of iptables rules that kube-proxy is responsible for". Also adds a new metric, `sync_proxy_rules_iptables_last`, preserving the behavior that `sync_proxy_rules_iptables_total` had accidentally slipped into: "the number of iptables rules that kube-proxy reprogrammed on the last sync".

Also fixes a bug noticed while adding unit tests for this: if syncProxyRules() deleted any stale service/endpoint chains, it would count each of those deletions as a "rule" for purposes of the metric, due to carelessness in how it was counting.
Which issue(s) this PR fixes:
Fixes #118978
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
/sig network
/priority important-soon
/assign @thockin @aojea