k8s.io/dynamic-resource-allocation: fix compatibility with Kubernetes 1.27 #120868
Conversation
@pohly: GitHub didn't allow me to request PR reviews from the following users: elezar. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
// and users can set the DRA_WITH_DETERMINISTIC_RESOURCE_CLAIM_NAMES env
// variable to an arbitrary non-empty value to use the naming from Kubernetes <
// 1.28.
func NewNameLookup(client kubernetes.Interface) *Lookup {
Keeping this function easy and not doing anything which can fail is intentional because it avoids adding additional error checks in code which uses this.
One question: are there other options that we would want to pass to the constructor (something like withDeterministicResourceClaimNames to allow for overriding this behaviour in tests)?
I can't think of any and didn't want to make it more complex (interim solution, package doesn't need a stable API).
// If podClaim.Template is not nil, the caller must check that the
// ResourceClaim is indeed the one that was created for the Pod by calling
// IsUsable.
// Determining the name depends on Kubernetes >= 1.28.
If I'm to understand this correctly, this function will only work in k8s >= 1.28. What happens when this condition is not met? Is this something we want to explicitly check and return some error?
Also, is this superseded by the Lookup.Name function? If so, could we delegate to that -- constructing an applicable Lookup object when this function is called?
In addition, do we want to mark Name as deprecated?
If I'm to understand this correctly, this function will only work in k8s >= 1.28.
Correct.
What happens when this condition is not met?
The DRA driver fails with a "resource claim not created yet" even though the ResourceClaim has been created - it just doesn't have a deterministic name.
Is this something we want to explicitly check and return some error?
We can't. This function has no way to detect the actual behavior of the cluster, nor can the caller.
if *l.usePodStatus {
	return Name(pod, podClaim)
}
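For context, a minimal sketch of how the two naming paths might fit together. The types are simplified stand-ins for the real API objects, and the "<pod>-<claim>" scheme for Kubernetes < 1.28 is an assumption about the deterministic naming, not copied from the actual package:

```go
// Sketch only: simplified stand-ins for the real API types; the
// "<pod>-<claim>" deterministic scheme is an assumption about the
// pre-1.28 behavior, not a quote from the package.
package main

import "fmt"

type pod struct {
	name          string
	claimStatuses map[string]string // pod claim name -> generated ResourceClaim name
}

type lookup struct {
	usePodStatus bool
}

// claimName returns the ResourceClaim name for the given pod claim. With
// pod-status naming (Kubernetes >= 1.28) the name comes from the pod
// status; ok == false models the "resource claim not created yet" case.
func (l *lookup) claimName(p pod, podClaimName string) (name string, ok bool) {
	if l.usePodStatus {
		name, ok = p.claimStatuses[podClaimName]
		return name, ok
	}
	// Deterministic naming from Kubernetes < 1.28: derived purely from
	// the pod and claim names, so it can never report "not created yet".
	return p.name + "-" + podClaimName, true
}

func main() {
	p := pod{name: "mypod", claimStatuses: map[string]string{"gpu": "mypod-gpu-x7k2q"}}
	oldStyle := &lookup{usePodStatus: false}
	newStyle := &lookup{usePodStatus: true}
	fmt.Println(oldStyle.claimName(p, "gpu"))
	fmt.Println(newStyle.claimName(p, "gpu"))
}
```

This also illustrates the failure mode described above: if the deterministic path is used against a >= 1.28 cluster (or vice versa), the name is computed without error but simply never matches any existing ResourceClaim.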
Ah ... ok, I see that the relationship is the inverse of what I was expecting. As a matter of interest, do we expect that Name will be the function used moving forward, or would we rather have clients depend on the Lookup type (or possibly an interface)?
I see NewNameLookup as an interim solution and want to remove it again once 1.27 is no longer supported, upstream or by this package (whichever happens sooner).
I was tempted to clarify that by adding a "Deprecated: use Name instead if you don't need to support Kubernetes 1.27", but it felt odd to add something that is deprecated right out of the gate.
Tests, if there were any, would have to cover the different code paths without such an override. I can't think of anything else either.

Tests are missing because I don't know how to mock the

Such testing is necessary to determine the actual behavior of different server releases. I had previously tested with GKE 1.27 and the version check worked there. But with Kubernetes master I got "28+" (not a number!) and parsing it failed. I just pushed an update which works with both 1.27 and master (tested manually).
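The "28+" problem described here can be illustrated with a small sketch. The helper name parseMinor is hypothetical; the real check would look at the Minor field reported by the server's /version endpoint:

```go
// Sketch only: parsing the server's minor version. Development builds of
// Kubernetes report minor versions like "28+", so a plain strconv.Atoi
// fails on them; stripping a trailing "+" first handles both a release
// ("27") and a development build ("28+").
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMinor is a hypothetical helper for turning the reported minor
// version string into a number.
func parseMinor(minor string) (int, error) {
	return strconv.Atoi(strings.TrimSuffix(minor, "+"))
}

func main() {
	for _, v := range []string{"27", "28+"} {
		n, err := parseMinor(v)
		fmt.Println(v, "->", n, err)
	}
}
```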
/triage accepted
/lgtm
@elezar: changing LGTM is restricted to collaborators.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elezar, pohly

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
/lgtm
LGTM label has been added. Git tree hash: 815c85e2a042328e56017366eae9be4552205b28
What type of PR is this?
/kind bug
/kind failing-test
What this PR does / why we need it:
Kubernetes 1.28 introduced generated resource claim names, with the actual name recorded in the pod status. As part of that change, the helper library for DRA drivers was updated to match that behavior. Since then, driver deployments were either compatible with Kubernetes 1.27 when using k8s.io/dynamic-resource-allocation 0.27 or Kubernetes 1.28 when using 0.28, but never both.
This was okay because this is an alpha feature, but it makes testing of DRA drivers harder (in particular because cloud providers have not all updated to 1.28 yet) and can be fixed fairly easily. Therefore adding back support for 1.27 is worthwhile and reasonable.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
/cc @elezar