-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix reconstruction of CSI volumes #117804
Fix reconstruction of CSI volumes #117804
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jsafrane The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
1a5d763
to
b6e0f35
Compare
Added unit test and removed WIP: |
/triage accepted |
/assign @msau42 @xing-yang |
return asw.addVolume(volumeName, volumeSpec, devicePath) | ||
|
||
pluginIsAttachable := volumeAttachabilityFalse | ||
if attachablePlugin, err := asw.volumePluginMgr.FindAttachablePluginBySpec(volumeSpec); err == nil && attachablePlugin != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any concern of marking the volume attachability as false if there's a temporary error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, though we always treated temporary here as attachability: false
. We could choose to mark it uncertain I guess, but then we have to worry about secondary side effect.
For the time being, we should keep this as it is and may be make a follow up issue for fixing this.
/assign @gnufied |
return asw.addVolume(volumeName, volumeSpec, devicePath) | ||
|
||
pluginIsAttachable := volumeAttachabilityFalse | ||
if attachablePlugin, err := asw.volumePluginMgr.FindAttachablePluginBySpec(volumeSpec); err == nil && attachablePlugin != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, though we always treated temporary here as attachability: false
. We could choose to mark it uncertain I guess, but then we have to worry about secondary side effect.
For the time being, we should keep this as it is and may be make a follow up issue for fixing this.
reconstructVolume() is called when kubelet may not have connection to the API server yet, therefore it cannot get CSIDriver instances to figure out if a CSI volume is attachable or not. Refactor reconstructVolume(), so it does not need FindAttachablePluginBySpec for CSI volumes, because all of them are deviceMountable (i.e. FindDeviceMountablePluginBySpec always returns the CSI volume plugin).
0bdee5a
to
85b3fda
Compare
@@ -525,6 +525,54 @@ func TestActualStateOfWorld_FoundDuringReconstruction(t *testing.T) { | |||
return nil | |||
}, | |||
}, | |||
{ | |||
name: "uncertain attachability is resolved to attachable", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what case this uncertain attachability is resolved to attachable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So when we first build ASOW during reconsturction without API server connectivity we can't know atttachability of CSI volumes and hence "attachable" property is uncertain.
So reconciler continuously tries to fix that state and assuming once kubelet is able to talk to api-server uncertain attachability will be resolved.
SyncReconstructedVolume(volumeName v1.UniqueVolumeName, podName volumetypes.UniquePodName, outerVolumeSpecName string) | ||
|
||
// Add the specified volume to ASW as uncertainly attached. | ||
AddReconstructedVolume(volumeName v1.UniqueVolumeName, volumeSpec *volume.Spec, nodeName types.NodeName, devicePath string) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this function is only adding volumeAttachabilityUncertain state, how about put this information into the name too?
like AddAttachUncertainReconstructedVolume?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to AddAttachUncertainReconstructedVolume
During CSI volume reconstruction it's not possible to tell, if the volume is attachable or not - CSIDriver instance may not be available, because kubelet may not have connection to the API server at that time. Adding uncertain state during reconstruction + adding a correct state when the API server is available.
node.status.volumesInUse should report only attachable volumes, therefore it needs to wait for the reconciler to update uncertain attachability of volumes from the API server.
85b3fda
to
7cd60df
Compare
To volumesNeedUpdateFromNodeStatus - because both devicePath and uncertain attach-ability needs to be fixed from node status.
to be in sync with volumesNeedUpdateFromNodeStatus.
I also renamed |
/retest |
/lgtm |
LGTM label has been added. Git tree hash: 9c693654e6be61e8059307591c407efe8dddd4c2
|
…7804-upstream-release-1.27 Automated cherry pick of #117804: Refactor FindAttachablePluginBySpec out of CSI code path
What type of PR is this?
/kind bug
What this PR does / why we need it:
During CSI volume reconstruction it's not possible to tell, if the volume is attachable or not - CSIDriver instance may not be available, because kubelet may not have connection to the API server at that time.
Therefore remove all calls to
FindAttachablePluginBySpec
here:reconstructVolume
not to callFindAttachablePluginBySpec
for CSI volumes. It's just a simple refactoring.Which issue(s) this PR fixes:
Fixes: #117745
Special notes for your reviewer:
WIP:PR for discussionmissing unit testsUnit tests addedDoes this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: