-
Notifications
You must be signed in to change notification settings - Fork 40.9k
[client-go #1415] Use transformer from provided store within internal stores in reflector to limit memory usage bursts #131799
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[client-go #1415] Use transformer from provided store within internal stores in reflector to limit memory usage bursts #131799
Conversation
|
Welcome @valerian-roche! |
Hi @valerian-roche. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/assign |
Let's wait for the final decision what we will do with watch-list feature. Once we have the decision, I will review this PR. |
Given we're progressing with the feature (although with slightly adjusted scope), let me review this. /ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few minor comments, but overall this lgtm
@@ -143,6 +144,21 @@ func TestCache(t *testing.T) { | |||
doTestStore(t, NewStore(testStoreKeyFunc)) | |||
} | |||
|
|||
func TestCacheWithTransformer(t *testing.T) { | |||
informerCalled := &atomic.Bool{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please call it transformerCalled
instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to transformerCalled
return fw, nil | ||
}, | ||
// ListFunc should never be used in WatchList mode | ||
ListFunc: nil, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can be used as fallback in some cases.
So to avoid panics, let's actually define a function and just call t.Error()
there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to fail the test and return an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to update to not fail here as this triggers a panic when the reflector shuts down (as some internal goroutine keeps working passed returning and calling this function)
Confirmed that returning an error here is sufficient to have the test failing in this case
} | ||
|
||
r := NewReflector(lw, &v1.Pod{}, store, 0) | ||
r.UseWatchList = ptr.To(true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's being removed in #132453
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to use the featureflag override instead
@@ -161,6 +161,8 @@ type cache struct { | |||
// keyFunc is used to make the key for objects stored in and retrieved from items, and | |||
// should be deterministic. | |||
keyFunc KeyFunc | |||
// Called with every object if non-nil. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: transformer is called for every object that is put into store.
@@ -303,6 +303,11 @@ func (f *DeltaFIFO) KeyOf(obj interface{}) (string, error) { | |||
return f.keyFunc(obj) | |||
} | |||
|
|||
// Transformer implements the TransformingStore interface. | |||
func (f *DeltaFIFO) Transformer() TransformFunc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to update the RealFIFO
too? staging/src/k8s.io/client-go/tools/cache/the_real_fifo.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch - we would regress again after promoting InOrderInformers
again..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for raising this part. I added this interface on it
|
||
clientfeaturestesting.SetFeatureDuringTest(t, clientfeatures.WatchListClient, true) | ||
r := NewReflector(lw, &v1.Pod{}, store, 0) | ||
// r.useWatchList = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: please remove this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry missed this. Working on ensuring it does not panic when not set
doneCh, stopCh := make(chan struct{}), make(chan struct{}) | ||
go func() { | ||
defer close(doneCh) | ||
//nolint:logcheck // Intentionally uses the old API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not do that. You can achieve everything with you need with context too - sth like:
ctx, cancel := context.WithCancel(context.Background())
go func() {
defer close(doneCh)
r.RunWithContext(ctx)
}()
...
cancel()
select {
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to use the new API. Missed it when copying another test
// NewStore returns a Store implemented simply with a map and a lock. | ||
func NewStore(keyFunc KeyFunc) Store { | ||
return &cache{ | ||
func NewStore(keyFunc KeyFunc, opts ...StoreOption) Store { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be wrong, but I do expect this API change to have no impact as it's a free standing function that should not be implementing an interface.
It's failing the API check though, so I can add a release note if desired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - I think this is fine.
@aojea - can you confirm as a second pair of eyes?
/lgtm /hold |
LGTM label has been added. Git tree hash: 2f7353f9dfe0e541715a8cce1f09370ad04f4107
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: valerian-roche, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@valerian-roche - this label "do-not-merge/contains-merge-commits" that was automatically added suggests that you need to somehow rebase this PR to unblock it maybe? |
Yes I will rebase and squash the PR as it's been fully reviewed now. Wanted to avoid confusion during the review |
waiting for the rebase |
…n internal stores in reflector to limit memory usage bursts Signed-off-by: Valerian Roche <[email protected]>
1bcf358
to
585ed0a
Compare
Force-pushed the squash and rebase |
@valerian-roche: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
/hold cancel /lgtm |
What type of PR is this?
/kind bug
What this PR does / why we need it:
This PR updates client-go. It ensures the reflector used in SharedInformers does not keep a full list of objects retrieved through watchList without transformation prior to passing the objects one at a time to the underlying Store (which may have a transformer defined).
Without this change, the informer ends up holding a full copy of unmutated objects in memory prior to being passed to the transformer, greatly limiting its value as a way to avoid large memory allocations. In our example (referring to the issue), we gain more than one OOM in the memory we need to provide a controller running a pod informer.
To not change the semantic of watchList (especially on not altering the backing store if the watch does not conclude), the code still uses a temporary store but applies the same transformer as the provided Store if applicable. This has the side effect of potentially running the transformer twice, but this side effect is already specifically mentioned within the TransformFunc comment, and users will not be impacted if not explicitly activating the WatchList feature flag in the client
Which issue(s) this PR fixes:
Fixes kubernetes/client-go#1415
Special notes for your reviewer:
Does this PR introduce a user-facing change?