Apiserver watch from storage without PrevKV option #131862
base: master
05b79bc to df4c25e
df4c25e to 2c0699d
cc @serathius
@shadowofs If |
test/compatibility_lifecycle/reference/versioned_feature_list.yaml
/retest
staging/src/k8s.io/apiserver/pkg/storage/storagebackend/factory/etcd3.go
staging/src/k8s.io/apiserver/pkg/storage/storagebackend/factory/etcd3.go
Retest will not fix it.
I wasn't sure if Thinking whether before this PR we should structure the keys used in storage.
Fixed it. I also added a WatchWithoutPrevKV flag to storage.ListOptions. The flag only takes effect for etcd storage, where it lets the etcd watch explicitly disable PrevKV.
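To illustrate the comment above, here is a minimal sketch of how such a flag could gate the watch options. It is not the PR's actual code: `ListOptions` and `watchOpts` are simplified, hypothetical stand-ins for `storage.ListOptions` and the etcd watch option set, and `buildWatchOpts` is an invented helper. In the real etcd client, the `PrevKV` field would correspond to passing `clientv3.WithPrevKV()` to the watch call.

```go
package main

import "fmt"

// ListOptions is a hypothetical stand-in for storage.ListOptions.
// Only the fields needed for this illustration are modeled.
type ListOptions struct {
	Recursive          bool
	WatchWithoutPrevKV bool
}

// watchOpts is a hypothetical stand-in for the options passed to an etcd watch.
type watchOpts struct {
	Prefix bool // watch every key under a prefix
	PrevKV bool // ask etcd to attach the previous key-value pair to each event
}

// buildWatchOpts keeps the existing default (PrevKV requested) and drops
// PrevKV only when the caller explicitly opts out.
func buildWatchOpts(o ListOptions) watchOpts {
	return watchOpts{
		Prefix: o.Recursive,
		PrevKV: !o.WatchWithoutPrevKV,
	}
}

func main() {
	fmt.Printf("%+v\n", buildWatchOpts(ListOptions{Recursive: true}))
	fmt.Printf("%+v\n", buildWatchOpts(ListOptions{Recursive: true, WatchWithoutPrevKV: true}))
}
```

Making the flag an opt-out keeps the default behaviour unchanged for callers that never set it, which fits the statement that the flag only matters for the etcd backend.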
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: shadowofs
/test pull-kubernetes-e2e-kind-alpha-beta-features
staging/src/k8s.io/apiserver/pkg/registry/generic/registry/store.go
staging/src/k8s.io/apiserver/pkg/registry/generic/registry/store.go
e3ea326 to 663ad09
663ad09 to d9f51b1
d9f51b1 to b40d017
What type of PR is this?
/kind feature
What this PR does / why we need it:
Currently, the Kubernetes API server uses etcd's `PrevKV` watch option to retrieve both the new and previous values of a key on each watch event. Under the hood, etcd performs a full range lookup to fetch the previous key-value pair, and all such operations contend on a single read-write lock protecting the internal treeIndex. As cluster scale grows, this coupling means that high write or watch load can starve other etcd requests, leading to increased watch-event latencies and overall pressure on etcd.

When `watchCache` is enabled in the API server, the only consumer of a watch event's "previous" value is the `Reflector`, which uses it simply to identify and remove deleted objects from the cache. However, etcd watch events already include the full key of a deletion event, making the extra value lookup superfluous.

We validated this change on a 5,000-node production cluster. Disabling `PrevKV` yielded a ≈ 50 % reduction in MVCC range operations within etcd.

We also ran benchmarks in a test environment. Disabling `PrevKV` showed a ≈ 20 % increase in overall API throughput.
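The claim that the deletion key alone is sufficient can be sketched as follows. This is an illustrative model only, not the watch cache or Reflector implementation: the `event` and `cache` types and the `apply` method are hypothetical, and values are plain strings rather than decoded API objects. It assumes the cache already holds the last value seen for every key, so a delete event carrying only the key can still be resolved.

```go
package main

import "fmt"

// event is a hypothetical, simplified watch event. For a delete received
// without PrevKV, Value is empty and only Key is known.
type event struct {
	Deleted bool
	Key     string
	Value   string
}

// cache maps storage keys to the last value seen for each key.
type cache struct {
	objects map[string]string
}

func newCache() *cache { return &cache{objects: make(map[string]string)} }

// apply updates the cache. On a delete it returns the value that was removed,
// recovered from the cache itself rather than from the event.
func (c *cache) apply(ev event) (removed string, ok bool) {
	if !ev.Deleted {
		c.objects[ev.Key] = ev.Value
		return "", false
	}
	removed, ok = c.objects[ev.Key]
	delete(c.objects, ev.Key)
	return removed, ok
}

func main() {
	c := newCache()
	c.apply(event{Key: "/registry/pods/default/a", Value: "pod-a"})
	removed, ok := c.apply(event{Deleted: true, Key: "/registry/pods/default/a"})
	fmt.Println(removed, ok, len(c.objects)) // prints "pod-a true 0"
}
```

The trade-off in this sketch is that a delete for a key the cache has never seen yields no previous value, so a real implementation would need to decide how to handle that case.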

Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: