Scheduling: Retry on conflict while updating pvc or pv in VolumeBinding plugin #125338

### What would you like to be added?

For the `VolumeBinding` plugin, I would like PVC and PV updates in the `PreBind` stage to be retried on conflict.

Because there is no API rollback when an update fails, a pod with multiple PVCs can end up with only some of its PVCs updated before the pod is re-scheduled. This can leave the pod stuck in Pending forever (see Example for details).

The goal is to help pods bind successfully and to greatly reduce the cases where a pod has been assumed in the scheduler cache but the final binding fails because of a PVC/PV update conflict.
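To make the request concrete, here is a minimal, stdlib-only sketch of the retry-on-conflict pattern. It is not the scheduler's code: `store`, `claim`, `retryOnConflict`, and `setSelectedNode` are hypothetical names for a toy in-memory model of optimistic concurrency. A real change would more likely build on `retry.RetryOnConflict` from `k8s.io/client-go/util/retry`, but how it would be wired into the plugin is an assumption here.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's 409 Conflict response.
var errConflict = errors.New("conflict: object has been modified")

// selectedNodeKey is the annotation used to record the chosen node on a PVC.
const selectedNodeKey = "volume.kubernetes.io/selected-node"

// claim is a hypothetical, stripped-down stand-in for a PVC.
type claim struct {
	Name            string
	ResourceVersion int
	Annotations     map[string]string
}

// store is a toy in-memory API server with optimistic concurrency.
type store struct {
	objs map[string]claim
}

// get returns a copy of the stored object so callers cannot mutate it in place.
func (s *store) get(name string) claim {
	c := s.objs[name]
	ann := make(map[string]string, len(c.Annotations))
	for k, v := range c.Annotations {
		ann[k] = v
	}
	c.Annotations = ann
	return c
}

// update succeeds only if the caller saw the latest resource version.
func (s *store) update(c claim) error {
	if s.objs[c.Name].ResourceVersion != c.ResourceVersion {
		return errConflict
	}
	c.ResourceVersion++
	s.objs[c.Name] = c
	return nil
}

// retryOnConflict re-runs fn up to attempts times while it returns errConflict.
func retryOnConflict(attempts int, fn func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = fn(); !errors.Is(err, errConflict) {
			return err
		}
	}
	return err
}

// setSelectedNode starts from a possibly stale cached copy and, on conflict,
// re-reads the object and re-applies the change before the next attempt.
func setSelectedNode(s *store, cached claim, node string, attempts int) error {
	obj := cached
	return retryOnConflict(attempts, func() error {
		if obj.Annotations == nil {
			obj.Annotations = map[string]string{}
		}
		obj.Annotations[selectedNodeKey] = node
		err := s.update(obj)
		if errors.Is(err, errConflict) {
			obj = s.get(obj.Name) // refresh before retrying
		}
		return err
	})
}

func main() {
	s := &store{objs: map[string]claim{"pvcA2": {Name: "pvcA2", ResourceVersion: 1}}}
	stale := s.get("pvcA2") // what the scheduler cached

	// Another writer updates the claim, so the cached copy goes stale.
	other := s.get("pvcA2")
	other.Annotations["example.com/touched"] = "true"
	if err := s.update(other); err != nil {
		panic(err)
	}

	fmt.Println("single attempt:", setSelectedNode(s, stale, "nodeA", 1))
	fmt.Println("with retry:", setSelectedNode(s, stale, "nodeA", 3))
	fmt.Println("stored:", s.get("pvcA2").Annotations[selectedNodeKey])
}
```

The key design point is that the change is re-applied on top of a freshly read object, so a stale cache entry costs one extra round trip instead of a whole scheduling cycle.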

@Huang-Wei @cofyc @kerthcet PTAL, thanks.

/sig scheduling

### Why is this needed?

Currently, once the cached PVC/PV is out of date, the update fails. The pod then fails scheduling in the `PreBind` stage for this round and goes to backoff to be re-scheduled.

In most situations the current implementation works fine. But when a pod has multiple PVCs, it can in some cases leave the pod stuck in Pending forever.
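The partial-update problem can be sketched with a toy model. Everything below is hypothetical (`apiServer`, `pvc`, `bindAll` are illustrative names, not scheduler code); it only assumes what is described above: claims are written one by one, a stale version causes a conflict, and earlier writes are not rolled back.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the API server's 409 Conflict response.
var errConflict = errors.New("conflict: object has been modified")

// pvc is a hypothetical, stripped-down claim.
type pvc struct {
	ResourceVersion int
	SelectedNode    string // stands in for the selected-node annotation
}

// apiServer is a toy in-memory store keyed by claim name.
type apiServer map[string]pvc

// update writes the node only if the caller saw the current version.
func (a apiServer) update(name string, seenVersion int, node string) error {
	cur := a[name]
	if cur.ResourceVersion != seenVersion {
		return errConflict
	}
	a[name] = pvc{ResourceVersion: cur.ResourceVersion + 1, SelectedNode: node}
	return nil
}

// bindAll mimics a bind loop with no retry and no rollback: it writes the
// claims in order and returns on the first error, leaving earlier writes in place.
func bindAll(a apiServer, names []string, cachedVersions map[string]int, node string) error {
	for _, name := range names {
		if err := a.update(name, cachedVersions[name], node); err != nil {
			return fmt.Errorf("updating %s: %w", name, err)
		}
	}
	return nil
}

func main() {
	api := apiServer{"pvcA1": {ResourceVersion: 1}, "pvcA2": {ResourceVersion: 2}}
	cached := map[string]int{"pvcA1": 1, "pvcA2": 1} // the cached pvcA2 is stale

	err := bindAll(api, []string{"pvcA1", "pvcA2"}, cached, "nodeA")
	fmt.Println("bind error:", err)
	fmt.Printf("pvcA1=%+v pvcA2=%+v\n", api["pvcA1"], api["pvcA2"])
}
```

After the failed call, pvcA1 carries the selected node while pvcA2 does not, which is the half-applied state the Example section starts from.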

### Example
There are only:
- two nodes: nodeA & nodeB,
- two pods: podA & podB, where podA uses pvcA1 & pvcA2 and podB uses pvcB1 & pvcB2.

We want the two pods to be placed on different nodes using TopologySpreadConstraints.
![image](https://github.com/kubernetes/kubernetes/assets/18323315/1fa91b97-c8c6-4f41-9e89-85c8b5e70e23)

Then:
- Scheduling podA: it is assumed to `nodeA` in the cache, but fails in the VolumeBinding plugin's `PreBind` stage. **The annotations on pvcA1 are updated successfully**, while updating pvcA2 fails with a conflict.
- Scheduling podB: it is successfully bound to `nodeA`. Meanwhile, pvcA1 is bound.
- Re-scheduling podA: TopologySpreadConstraints wants podA on nodeB, but the node affinity of the PV for pvcA1 wants podA on nodeA.

Finally, podA is stuck in Pending forever, unless we delete pvcA1.

![image](https://github.com/kubernetes/kubernetes/assets/18323315/c9e228e6-441f-4890-b0cf-18df9224a668)
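The deadlock in the last step can be checked with a small feasibility sketch. This is a deliberately simplified model, not the scheduler's filter code: `feasibleNodes` is a hypothetical helper, the spread check assumes `maxSkew: 1` with each node as its own topology domain, and PV node affinity is reduced to a single required node name.

```go
package main

import "fmt"

// feasibleNodes returns the nodes that pass both toy filters:
//   - spread: placing the pod must keep (pods on node + 1 - global minimum) <= maxSkew
//   - affinity: if pvNode is set, the pod must land on that node
func feasibleNodes(nodes []string, matching map[string]int, maxSkew int, pvNode string) []string {
	if len(nodes) == 0 {
		return nil
	}
	// Find the smallest number of matching pods across all nodes.
	minCount := matching[nodes[0]]
	for _, n := range nodes[1:] {
		if matching[n] < minCount {
			minCount = matching[n]
		}
	}
	var out []string
	for _, n := range nodes {
		spreadOK := matching[n]+1-minCount <= maxSkew
		affinityOK := pvNode == "" || pvNode == n
		if spreadOK && affinityOK {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []string{"nodeA", "nodeB"}

	// After podB is bound to nodeA and the PV for pvcA1 is pinned to nodeA,
	// no node satisfies both constraints for podA.
	stuck := feasibleNodes(nodes, map[string]int{"nodeA": 1}, 1, "nodeA")
	fmt.Println("feasible nodes for podA:", len(stuck))

	// Without the half-applied pvcA1 binding, nodeB would still be available.
	fmt.Println("without the PV constraint:", feasibleNodes(nodes, map[string]int{"nodeA": 1}, 1, ""))
}
```

In this model the two filters have an empty intersection, which is why podA cannot recover by itself through backoff and re-scheduling.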

