Added missing 'time' for a field manager that server-side-applied same configuration #127939

Open
wants to merge 3 commits into
base: master

@waltforme waltforme commented Oct 8, 2024

What type of PR is this?

/kind bug

What this PR does / why we need it:

The 'time' stanza in metadata.managedFields is missing under some circumstances after a server-side apply. This PR tries to get the missing 'time' stanza back. More details are documented in #127938.

Which issue(s) this PR fixes:

Fixes #127938

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Fixed an issue in `metadata.managedFields`. Before this fix, the `time` stanza for a 2nd field manager was not set if that 2nd field manager server-side applied the same configuration as the 1st manager.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 8, 2024
@k8s-ci-robot
Contributor

Welcome @waltforme!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Contributor

Hi @waltforme. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 8, 2024
@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 8, 2024
@Jefftree
Member

Jefftree commented Oct 8, 2024

/assign @seans3
/cc @jpbetz
/triage accepted

@k8s-ci-robot k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Oct 8, 2024
@k8s-ci-robot k8s-ci-robot requested a review from jpbetz October 8, 2024 20:15
@k8s-ci-robot k8s-ci-robot removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Oct 8, 2024
if object != nil {
managed.Times()[fieldManager] = &metav1.Time{Time: time.Now().UTC()}
} else {
if managerInFields && !managerInTimes {
Member

I'm not very familiar with this code, but it looks strange to put this logic in the branch that is calling RemoveObjectManagedFields ... are we sure this is the branch where a new manager is adding co-ownership of some fields?

this definitely needs really good unit tests around the scenario being exercised

Author

@waltforme waltforme Oct 9, 2024

Thanks for the prompt review!

I have the same feeling that the two branches are a little confusing to follow. Let me try my best to clarify.

  • Since managedFieldsUpdater implements the Manager interface, its method (f *managedFieldsUpdater) Apply should return ‘the new object with managedFields removed, and the object's new proposed managedFields separately’:
    // Apply is used when server-side apply is called, as it merges the
    // object and updates the managed fields.
    // * `liveObj` is not mutated by this function
    // * `newObj` may be mutated by this function
    // Returns the new object with managedFields removed, and the object's new
    // proposed managedFields separately.
    Apply(liveObj, appliedObj runtime.Object, managed Managed, fieldManager string, force bool) (runtime.Object, Managed, error)
  • The else branch says: Since the live object is not touched (signaled by object == nil, detailed below), the merged object should be the same as the live object. So first make a deep copy of liveObj, then remove its managedFields to fit the Manager interface.

Details on object == nil:
These lines show that nil signals that the merged object and the current live object are the same.

if !s.returnInputOnNoop && value.EqualsUsing(value.NewFreelistAllocator(), liveObject.AsValue(), newObject.AsValue()) {
	newObject = nil
}

In our use case, the 2nd field manager applies the same configuration as the 1st, so the merged object is the same as the current live object, and nil is returned (and assigned to object) to signal that.

BTW, I believe there was a typo in the godoc comment of the Manager interface's Apply method. The latest push fixes it.

Fully agree that we need unit tests here. Will work on that.
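
The nil-on-no-op convention described above can be sketched in isolation. This is a minimal model, not the structured-merge-diff code: the `object` type and the `mergeApply` function are hypothetical stand-ins, and only the `returnInputOnNoop` name is taken from the snippet quoted above.

```go
package main

import (
	"fmt"
	"reflect"
)

// object is a hypothetical stand-in for a typed Kubernetes object.
type object map[string]string

// mergeApply merges applied into a copy of live. Like the snippet quoted
// above, it returns nil when the merged result equals the live object and
// returnInputOnNoop is false.
func mergeApply(live, applied object, returnInputOnNoop bool) object {
	merged := make(object, len(live))
	for k, v := range live {
		merged[k] = v
	}
	for k, v := range applied {
		merged[k] = v
	}
	if !returnInputOnNoop && reflect.DeepEqual(live, merged) {
		return nil // nil signals "nothing changed"
	}
	return merged
}

func main() {
	live := object{"key": "value"}
	// Same configuration: the caller sees nil and must fall back to the live object.
	fmt.Println(mergeApply(live, object{"key": "value"}, false) == nil)
	// Different configuration: the caller gets the merged object.
	fmt.Println(mergeApply(live, object{"key": "other"}, false) == nil)
}
```

A caller that treats nil as "skip the timestamp" silently loses the time for a manager that only gained co-ownership, which is the situation this PR targets.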

Member

ok... I see object comes back as nil for no-ops

updater: merge.Updater{
	Converter:    newVersionConverter(typeConverter, objectConverter, hub), // This is the converter provided to SMD from k8s
	IgnoreFilter: resetFields,
},

constructs this:

// Updater is the object used to compute updated FieldSets and also
// merge the object on Apply.
type Updater struct {
	// Deprecated: This will eventually become private.
	Converter Converter
	// Deprecated: This will eventually become private.
	IgnoreFilter map[fieldpath.APIVersion]fieldpath.Filter
	returnInputOnNoop bool
}

leaving returnInputOnNoop set to false, so an apply that doesn't change anything returns nil for the object:

if !s.returnInputOnNoop && value.EqualsUsing(value.NewFreelistAllocator(), liveObject.AsValue(), newObject.AsValue()) {
	newObject = nil
}
return newObject, managers, nil

Member

I think the logic in this PR is correct, but needs commenting to explain.

I'd suggest:

	if object != nil {
		// non-nil object means the apply operation modified the object, so update the manager's timestamp
		managed.Times()[fieldManager] = &metav1.Time{Time: time.Now().UTC()}
	} else {
		// nil object means the apply operation did not modify the input object
		// clone the input object and return it without managed fields
		object = liveObj.DeepCopyObject()
		RemoveObjectManagedFields(object)

		if _, managerInFields := managed.Fields()[fieldManager]; managerInFields {
			if _, managerInTimes := managed.Times()[fieldManager]; !managerInTimes {
				// if the manager owns fields, ensure it has an associated time.
				// a no-op apply can add co-ownership of existing field values, so record the time that occurred.
				managed.Times()[fieldManager] = &metav1.Time{Time: time.Now().UTC()}
			}
		}
	}
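
The branch structure suggested above can be exercised with a small stand-alone model. This is only a sketch under stated assumptions: the `managed` struct, `recordTime`, and `scenario` are hypothetical simplifications of the real `Managed` interface, and the `changed` flag plays the role of `object != nil`.

```go
package main

import (
	"fmt"
	"time"
)

// managed is a hypothetical, simplified stand-in for the Managed interface:
// it tracks which managers own fields and when each manager last applied.
type managed struct {
	fields map[string][]string
	times  map[string]time.Time
}

// recordTime mirrors the suggested branches. changed corresponds to
// "object != nil" in the real code.
func recordTime(m *managed, manager string, changed bool, now time.Time) {
	if changed {
		m.times[manager] = now
		return
	}
	// No-op apply: backfill a time only if the manager owns fields but has none.
	if _, owns := m.fields[manager]; owns {
		if _, hasTime := m.times[manager]; !hasTime {
			m.times[manager] = now
		}
	}
}

// scenario replays the case from this PR: m1 already has a time, and m2
// gains co-ownership through a no-op apply.
func scenario() *managed {
	t1 := time.Date(2024, 10, 8, 0, 0, 0, 0, time.UTC)
	t2 := t1.Add(time.Hour)
	m := &managed{
		fields: map[string][]string{"m1": {"data.key"}, "m2": {"data.key"}},
		times:  map[string]time.Time{"m1": t1},
	}
	recordTime(m, "m2", false, t2) // m2 has no time yet, so it gets one
	recordTime(m, "m1", false, t2) // m1 already has a time, so it is kept
	return m
}

func main() {
	m := scenario()
	fmt.Println("m1:", m.times["m1"])
	fmt.Println("m2:", m.times["m2"])
}
```

In this model a repeated no-op apply by a manager that already has a time leaves that time untouched, which matches the intent of the `!managerInTimes` guard.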

Author

Thanks for the suggestions! The comments should save future readers some time. I included them in the latest push.

@sftim
Contributor

sftim commented Oct 10, 2024

Would this PR need a changelog entry?

@MikeSpreitzer
Member

@sftim: how much would be appropriate to put in the change log entry? Just "Fixed Issue 127938", or a description of the behavior change?

@sftim
Contributor

sftim commented Oct 10, 2024

Try something like:

Fixed an issue in the frobnicator. Before this fix, the `time` field was not set correctly if your
time machine was travelling below 141 kilometers per hour.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Oct 10, 2024
@waltforme
Author

Try something like:

Fixed an issue in the frobnicator. Before this fix, the `time` field was not set correctly if your
time machine was travelling below 141 kilometers per hour.

@sftim Added. Thanks for the nice example!

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Oct 11, 2024
@waltforme
Author

waltforme commented Oct 11, 2024

As suggested by @liggitt, the latest push added a unit test. The added test TestCoOwningManagedFieldsByApplyingSameObjResultsInNewManagerTime checks that:

  • The 2nd field manager's server-side apply doesn't change the 1st field manager's managed fields;
  • The 2nd field manager co-owns the managed fields together with the 1st field manager;
  • The 2nd field manager has its own 'time' stanza.

The latest push also made a few small corrections to pre-existing unit tests.
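
For readers who want to see the three checks in code form, here is a rough stand-alone sketch. It is not the test added in this PR: the `entry` type, `checkCoOwnership`, and `sample` are hypothetical, and the field set is treated as an opaque string.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// entry is a hypothetical, trimmed-down view of one managedFields entry.
type entry struct {
	Manager string
	Fields  string     // serialized field set, treated as an opaque string
	Time    *time.Time // nil models a missing 'time' stanza
}

// find returns the entry for the named manager, or nil if there is none.
func find(entries []entry, manager string) *entry {
	for i := range entries {
		if entries[i].Manager == manager {
			return &entries[i]
		}
	}
	return nil
}

// checkCoOwnership verifies the three properties listed above.
func checkCoOwnership(before, after []entry, first, second string) error {
	oldFirst, newFirst, newSecond := find(before, first), find(after, first), find(after, second)
	if oldFirst == nil || newFirst == nil || newSecond == nil {
		return errors.New("expected manager entry is missing")
	}
	if oldFirst.Fields != newFirst.Fields {
		return errors.New("first manager's fields changed")
	}
	if newSecond.Fields != newFirst.Fields {
		return errors.New("second manager does not co-own the same fields")
	}
	if newSecond.Time == nil {
		return errors.New("second manager has no time")
	}
	return nil
}

// sample builds a before/after pair; withTime controls whether the second
// manager's entry carries a time (true models the fixed behavior).
func sample(withTime bool) (before, after []entry) {
	t1 := time.Date(2024, 10, 8, 0, 0, 0, 0, time.UTC)
	fields := `{"f:data":{"f:key":{}}}`
	before = []entry{{Manager: "m1", Fields: fields, Time: &t1}}
	second := entry{Manager: "m2", Fields: fields}
	if withTime {
		t2 := t1.Add(time.Minute)
		second.Time = &t2
	}
	after = []entry{before[0], second}
	return before, after
}

func main() {
	before, fixed := sample(true)
	_, buggy := sample(false)
	fmt.Println("fixed:", checkCoOwnership(before, fixed, "m1", "m2"))
	fmt.Println("buggy:", checkCoOwnership(before, buggy, "m1", "m2"))
}
```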

@@ -335,7 +335,7 @@ func TestTakingOverManagedFieldsDuringUpdateDoesNotModifyPreviousManagerTime(t *
 },
 "data": {
 "key_a": "value",
-"key_b": value"
+"key_b": "value"
Member

@liggitt liggitt Nov 26, 2024

huh... these pre-existing test typos are distressing... did you just notice these and fix them while you were here, or did fixing these reveal that these tests were broken?

Member

non-blocking for this PR, but cc @jpbetz for visibility into how lenient YAML parsing is with malformed JSON input sent as YAML. This was parsing as:

{
  "apiVersion": "v1",
  "data": {
    "key_a": "value",
    "key_b": "value\""
  },
  "kind": "ConfigMap",
  "metadata": {
    "name": "configmap"
  }
}
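
By contrast, a strict JSON parser rejects the typo outright. The sketch below uses only Go's standard `encoding/json` package and a shortened version of the test fixture; the constant and function names are made up for illustration, and it says nothing about the YAML code path beyond what the comment above reports.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// malformed reproduces the typo from the old test: the opening quote of the
// second value is missing.
const malformed = `{"data": {"key_a": "value", "key_b": value"}}`

// corrected is the same fragment with the quote restored.
const corrected = `{"data": {"key_a": "value", "key_b": "value"}}`

// isStrictJSON reports whether s is syntactically valid JSON.
func isStrictJSON(s string) bool {
	return json.Valid([]byte(s))
}

func main() {
	fmt.Println("malformed is valid JSON:", isStrictJSON(malformed))
	fmt.Println("corrected is valid JSON:", isStrictJSON(corrected))
}
```

Validating fixtures this way before handing them to a more permissive parser is one option for catching such typos early in tests.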

Author

The former: I just noticed these and fixed them. I was wondering why the typos didn't break the tests, and your explanation above answers that. Thanks!

@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. labels Nov 27, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: waltforme
Once this PR has been reviewed and has the lgtm label, please assign apelisse for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@waltforme
Author

The force-push to ce32cdc included the suggested changes from @liggitt and squashed those changes into the appropriate commit.

I messed up one unrelated commit in the previous force-push to 6d5c802, and the bot added the area/kubelet and sig/node labels to this PR. I corrected the git history afterwards, but I could not remove the two labels myself. I'm sorry if this causes any confusion. To clarify: this PR is not related to kubelet or sig-node.

@MikeSpreitzer
Member

/test pull-kubernetes-e2e-gce-100-performance
(to get a baseline for comparison with #128087 and #128974)

@MikeSpreitzer
Member

/remove-area kubelet

@MikeSpreitzer
Member

/remove-sig node

@k8s-ci-robot k8s-ci-robot removed the sig/node Categorizes an issue or PR as relevant to SIG Node. label Nov 27, 2024
@MikeSpreitzer
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 27, 2024
@waltforme
Author

Thanks @MikeSpreitzer !

object = liveObj.DeepCopyObject()
RemoveObjectManagedFields(object)

if _, managerInFields := managed.Fields()[fieldManager]; managerInFields {
Member

@jpbetz how does partitioning of field manager among subresources work? does managed.Fields() only return entries relevant to the subresource for this request?

Contributor

@jpbetz jpbetz Dec 2, 2024

The field manager relies on the GetResetFields() (example) of the subresource to indicate which fields the subresource applies to.

'GetResetFieldsFilter()' was also added in 1.32 to make it easier to declare the fields (example) for cases where the exclude set of GetResetFields() is not expressive enough.

Member

The specific test I'm thinking of is:

  1. manager foo writes to pod, gets a managedFields entry for itself
  2. manager bar writes to pods/status, gets a managedFields entry for itself under the status subresource
  3. manager bar does no-op write to pod (e.g. empty patch {}) so that it doesn't actually own any fields

Will the change in this PR make processing of step 3 see and get confused by the managed fields entry from 2?

if _, managerInTimes := managed.Times()[fieldManager]; !managerInTimes {
// if the manager owns fields, ensure it has an associated time.
// a no-op apply can add co-ownership of existing field values, so record the time that occurred.
managed.Times()[fieldManager] = &metav1.Time{Time: time.Now().UTC()}
Member

@jpbetz for existing managed entries that got created without timestamps, once this bug fix releases, the next no-op update will set the time to now on a pre-existing managed field entry. That's a little weird, and probably ok, but I wanted to call it out explicitly as a side-effect of the fix.

Contributor

We fixed a few spurious resourceVersion bump issues this year:

So I'm keeping an eye out for anything that changes an apply request before the object is compared with the stored resource to decide if a write is needed. But this happens after fieldmanager.Apply(), so it appears safe, at least from that perspective.

@liggitt liggitt added this to @liggitt Mar 19, 2025
@MohammadAlavi1986

I believe there is a case that is not covered in this PR. If a manager performs a server-side apply (SSA) with a configuration that does not change the object, and then later performs another SSA with a different configuration that also does not change the object (but changes the set of fields owned by this manager), the time field is not updated for the second SSA.

Here's an example:

k apply --server-side --field-manager m1 --show-managed-fields -o yaml  -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: test
  namespace: default
data:
  key1: value1
  key2: value2
EOF

# a new entry is added for m2 field manager with the correct time
k apply --server-side --field-manager m2 --show-managed-fields -o yaml  -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: test
  namespace: default
data:
  key1: value1
EOF

# list of fields owned by m2 manager has changed, but time is NOT updated for m2 field manager
k apply --server-side --field-manager m2 --show-managed-fields -o yaml  -f - <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: test
  namespace: default
data:
  key1: value1
  key2: value2
EOF
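
One possible way to detect this case is to compare a manager's owned field set before and after the apply, independently of whether the object changed. The sketch below only illustrates that idea; it is not part of this PR, the function names are hypothetical, and field paths are simplified to dotted strings rather than real field sets.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// fieldSetKey returns an order-independent key for a set of owned field paths.
func fieldSetKey(paths []string) string {
	sorted := append([]string(nil), paths...)
	sort.Strings(sorted)
	return strings.Join(sorted, "\x00")
}

// ownershipChanged reports whether a manager's owned field set differs
// between the state before and after an apply.
func ownershipChanged(before, after map[string][]string, manager string) bool {
	return fieldSetKey(before[manager]) != fieldSetKey(after[manager])
}

func main() {
	// State after the second command above: m2 owns only data.key1.
	before := map[string][]string{
		"m1": {"data.key1", "data.key2"},
		"m2": {"data.key1"},
	}
	// State after the third command: the object is unchanged, but m2 now
	// also owns data.key2.
	after := map[string][]string{
		"m1": {"data.key1", "data.key2"},
		"m2": {"data.key1", "data.key2"},
	}
	fmt.Println("m2 ownership changed:", ownershipChanged(before, after, "m2"))
	fmt.Println("m1 ownership changed:", ownershipChanged(before, after, "m1"))
}
```

Under this approach, a time update would be triggered by a change in ownership rather than by the mere absence of an existing time entry.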

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 29, 2025
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

bug: No 'time' added when server-side-applying the same yaml as a 2nd field manager
10 participants