-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: the volume is not detached after the pod and PVC objects are deleted #116138
Conversation
Hi @cvvz. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
cc @andyzhangx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/ok-to-test
@Chaunceyctx Thanks for you comment. After analyzing the logs and code, I do find there could be a TOCTOU bug in unmountDetachDevices process. I will explain more specificly later. Anyway, I think it is worth to take a look at this PR first, since after there is no volume name in the global mount path, |
Here is the time-series chart, the left side is the reconcile loop, the right side is the async UnmountDevice operation, the red box indicates the first time of Hopefully I can make things clear. |
@cvvz thanks a lot for the diagram. Yes, this is the issue I was thinking about. There could be multiple places for fix this issue,
Should we consider add fixes for all those places? |
Thanks for putting together the diagram! What if after checking |
@@ -595,14 +595,13 @@ func (c *csiAttacher) UnmountDevice(deviceMountPath string) error { | |||
driverName = data[volDataKey.driverName] | |||
volID = data[volDataKey.volHandle] | |||
} else { | |||
klog.Error(log("UnmountDevice failed to load volume data file [%s]: %v", dataDir, err)) | |||
|
|||
// The volume might have been mounted by old CSI volume plugin. Fall back to the old behavior: read PV from API server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this logic to handle a scenario where someone didn't drain the node before upgrading? Is this safe to remove now? cc @mattcary @saikat-royc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think vol_data.json
was brought in #64882 5 years ago, so now no need to fall back to the old behavior when it does not exist anymore, it should be safe to remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we move this PR forward since it really solves the problem per our large scale test, there is no disk detach issue any more. @mattcary @saikat-royc @msau42
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#107065 is the PR where the new global path is introduced (1.24). The PR states that the node need to be drained explicitly in its release note. Change node staging path for csi driver to use a PV agnostic path. Nodes must be drained before updating the kubelet with this change.
So I think if vol_data.json does not exist, we can skip the else part of the code to parse the path. [Note: this also means that this PR can be cherry-picked to >= 1.24, since we already have a dependency to drain node when upgrading from < 1.24 to 1.24]. So +1 from my side to remove the fallback mountpath parsing
How about making deviceMountState as a pointer? So we won't get the stale state. |
Thanks for the diagram. Is removing the vol_data.json file last step of unmount? Will there be any failures after that step? |
Yes it is. kubelet will only remove vol_data.json after |
This issue was consistently reproducible after ~5 runs of a 1K disk attach/detach load test. However, with this PR, 100 test runs of the same load test were conducted in k8s 1.25.5 and the issue no longer occurred. |
/priority important-soon |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andyzhangx, cvvz, jsafrane The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@jsafrane shall we merge this PR in milestone v1.27? |
I'd wait for 1.27 GA and then open cherry-picks. |
Can we now proceed with the backports? Thanks in advance! |
@cvvz can we backport to all affected releases? Currently I see only a backport for the release-1.27 branch. |
I will backport to 1.25 and 1.26 |
@cvvz can we have the 1.24 backport too ? |
While I cannot answer if you "have to", I can only second that we see this issue massively on 1.24 clusters and a back port would be super welcomed 😊 |
sure, backport to 1.24 |
Thanks for the info! , I was looking into 1.24 code and was trying to confirm :) Thanks @cvvz for filing it ! 👍 |
…-origin-release-1.24 Automated cherry pick of #116138: fix: the volume is not detached after the pod and PVC objects
…-origin-release-1.26 Automated cherry pick of #116138: fix: the volume is not detached after the pod and PVC objects
…-origin-release-1.25 Automated cherry pick of #116138: fix: the volume is not detached after the pod and PVC objects
…-origin-release-1.27 Automated cherry pick of #116138: fix: the volume is not detached after the pod and PVC objects
FYI. this issue is fixed in 1.24.14, 1.25.10, 1.26.5, and has been fixed in v1.27.0 |
@cvvz can we have the 1.19 backport? Thanks |
It's only possible to backport to active k8s releases. Branches for end-of-life versions have been frozen in my understanding. So we can't backport past 1.24: https://kubernetes.io/releases/. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
When
vol_data.json
file does not exist, skipUnmountDevice
since it must succeed before, no need to doUnmountDevice
multiple times.Which issue(s) this PR fixes:
Fixes #114207
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: