-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kubeadm: fix the bug that kubeadm always do CRI detection when --config is passed even if it is not required by the subcommand #120828
kubeadm: fix the bug that kubeadm always do CRI detection when --config is passed even if it is not required by the subcommand #120828
Conversation
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/hold |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: SataQiu The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's no other way to know when to skip it, we have to add the extra argument everywhere?
@@ -251,47 +256,47 @@ func DefaultedInitConfiguration(versionedInitCfg *kubeadmapiv1.InitConfiguration | |||
} | |||
|
|||
// LoadInitConfigurationFromFile loads a supported versioned InitConfiguration from a file, converts it into internal config, defaults it and verifies it. | |||
func LoadInitConfigurationFromFile(cfgPath string) (*kubeadmapi.InitConfiguration, error) { | |||
func LoadInitConfigurationFromFile(cfgPath string, skipCRIDetect bool) (*kubeadmapi.InitConfiguration, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's no other way to know when to skip it, we have to add the extra argument everywhere?
The tricky part is here.
It's hard to know if we can skip the CRI detection without an additional parameter or context for LoadOrDefaultInitConfiguration
-> LoadInitConfigurationFromFile
.
This is different from DefaultedInitConfiguration
, which can avoid CRI detection by pre-filling the criSocket
field.
Any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we spread things everywhere, it tastes bad. However, we have to do that sometimes.
In other cases, they may have several things that need to be options/flags everywhere. They use a xxxOptions
struct with several bools or some other type attributes. Then later, if we have further logic to add to spread everywhere, we can add a new option in this options struct then. But this looks like an over-design if we only have one flag here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to double check my understanding. every time we call a phase with --config, there is no way to know if the user wants the cri detection or not. some phases can go without cri detection completely.
so logically the only way to do this is to pass either the skipcridetection flag or perhaps a phase name and then decide to put the no-op cri socket name.
it seems we are passing the skipcridetection flag in too many locations (too deepnin code paths) and it doesn't feel right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we spread things everywhere, it tastes bad. However, we have to do that sometimes.
In other cases, they may have several things that need to be options/flags everywhere. They use a
xxxOptions
struct with several bools or some other type attributes. Then later, if we have further logic to add to spread everywhere, we can add a new option in this options struct then. But this looks like an over-design if we only have one flag here.
an options struct or a variadic arg list sounds like something we can do at some point.
we can log a k/kubeadm issue for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
every time we call a phase with --config, there is no way to know if the user wants the cri detection or not
We can know if a command(phase) requires CRI information by checking whether it provides --cri-socket
flag.
That's exactly what #114455 did.
But it's hard to get this cobra command information anywhere else.
it seems we are passing the skipcridetection flag in too many locations (too deepnin code paths) and it doesn't feel right.
Agree!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An example here :
kubernetes/pkg/apis/core/validation/validation.go
Lines 3814 to 3830 in 534a094
// PodValidationOptions contains the different settings for pod validation | |
type PodValidationOptions struct { | |
// Allow invalid pod-deletion-cost annotation value for backward compatibility. | |
AllowInvalidPodDeletionCost bool | |
// Allow invalid label-value in LabelSelector | |
AllowInvalidLabelValueInSelector bool | |
// Allow pod spec to use non-integer multiple of huge page unit size | |
AllowIndivisibleHugePagesValues bool | |
// Allow pod spec to use status.hostIPs in downward API if feature is enabled | |
AllowHostIPsField bool | |
// Allow invalid topologySpreadConstraint labelSelector for backward compatibility | |
AllowInvalidTopologySpreadConstraintLabelSelector bool | |
// Allow node selector additions for gated pods. | |
AllowMutableNodeSelectorAndNodeAffinity bool | |
// The top-level resource being validated is a Pod, not just a PodSpec | |
// embedded in some other resource. | |
ResourceIsPod bool |
But not sure if we should use this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo, we can do this with the extra argument that we pass on many places, the way this PR is.
but we should log a ticket in k/kubeadm for refactor with struct ^.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kubeadm ticket: https://kugithub.com/kubernetes/kubeadm/issues/2941
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
I'm sorry for not getting back to you sooner after the holiday.
Let's follow it at kubernetes/kubeadm#2941.
…ig is passed even if it is not required by subcommand
58fa5da
to
1a68195
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/hold
LGTM label has been added. Git tree hash: f538ab11eae6e8bdf2b1fbbae21f93a7add92ba5
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
I saw the discussion here for the possibility of the a new field in a struct, not sure whether that approach is easy to implemented or not.
But I believe this change will fix the issue, I would more happy to see some testcases to guard around the new logic added though.
klog.V(4).Infof("skip CRI socket detection, fill with placeholder %s", kubeadmconstants.UnknownCRISocket) | ||
cfg.CRISocket = kubeadmconstants.UnknownCRISocket // set a value to pass the ValidateSocketPath | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any testcase to check the logic added here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can only happen if the execution of the command or sub-command does not depend on the CRISocket
. The purpose here is just to pass the validation check. It's pretty straightforward.
Previously, we were pre-setting the CRISocket field to prevent CRI detection, for example:
kubernetes/cmd/kubeadm/app/cmd/util/cmdutil.go
Lines 108 to 115 in fc479f4
func DefaultInitConfiguration() *kubeadmapiv1.InitConfiguration { | |
initCfg := &kubeadmapiv1.InitConfiguration{ | |
NodeRegistration: kubeadmapiv1.NodeRegistrationOptions{ | |
CRISocket: kubeadmconstants.UnknownCRISocket, // avoid CRI detection | |
}, | |
} | |
return initCfg | |
} |
kubernetes/cmd/kubeadm/app/cmd/token_test.go
Lines 176 to 178 in fc479f4
NodeRegistration: kubeadmapiv1.NodeRegistrationOptions{ | |
CRISocket: constants.UnknownCRISocket, | |
}, |
Currently, we use this additional parameter
skipCRIDetect
to control whether to do CRI detection. This seems easier to understand than implicit control.
BTW, I think we need to do a subsequent refactoring with the struct option, and that might be a good time to add more test cases.
@@ -110,6 +110,11 @@ func SetNodeRegistrationDynamicDefaults(cfg *kubeadmapi.NodeRegistrationOptions, | |||
} | |||
|
|||
if cfg.CRISocket == "" { | |||
if skipCRIDetect { | |||
klog.V(4).Infof("skip CRI socket detection, fill with placeholder %s", kubeadmconstants.UnknownCRISocket) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls have some info about the phase/command/func name in the log, this will help us to quickly identify where we are since the same log is printed inside of func SetResetDynamicDefaults
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SetXXXDynamicDefaults
is only called by LoadOrDefaultXXXConfiguration
.
So we can tell where we are from the top context.
For example:
I1007 07:24:17.900137 4332 initconfiguration.go:260] loading configuration from "./kubeadm.yaml"
I1007 07:24:17.900851 4332 initconfiguration.go:114] skip CRI socket detection, fill with placeholder unix:///var/run/unknown.sock
I1007 07:24:17.902233 4332 interface.go:432] Looking for default routes with IPv4 addresses
I1007 07:24:17.902269 4332 interface.go:437] Default route transits interface "eth0"
I1007 07:24:17.902387 4332 interface.go:209] Interface eth0 is up
I1007 07:24:17.902482 4332 interface.go:257] Interface "eth0" has 3 addresses :[172.18.0.2/16 fc00:f853:ccd:e793::2/64 fe80::42:acff:fe12:2/64].
I1007 07:24:17.902542 4332 interface.go:224] Checking addr 172.18.0.2/16.
I1007 07:24:17.902557 4332 interface.go:231] IP found 172.18.0.2
I1007 07:24:17.902601 4332 interface.go:263] Found valid IPv4 address 172.18.0.2 for interface "eth0".
I1007 07:24:17.902610 4332 interface.go:443] Found active IP 172.18.0.2
...
} | ||
|
||
// documentMapToResetConfiguration takes a map between GVKs and YAML documents (as returned by SplitYAMLDocuments), | ||
// finds a ResetConfiguration, decodes it, dynamically defaults it and then validates it prior to return. | ||
func documentMapToResetConfiguration(gvkmap kubeadmapi.DocumentMap, allowDeprecated, allowExperimental bool, strictErrors bool) (*kubeadmapi.ResetConfiguration, error) { | ||
func documentMapToResetConfiguration(gvkmap kubeadmapi.DocumentMap, allowDeprecated, allowExperimental bool, strictErrors bool, skipCRIDetect bool) (*kubeadmapi.ResetConfiguration, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the parameters list can be simplified as allowDeprecated, allowExperimental, strictErrors, skipCRIDetect bool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to do this with a follow-up refactor PR (struct option).
/unhold some refactor could be done with a followup: kubernetes/kubeadm#2941 |
What type of PR is this?
/kind bug
What this PR does / why we need it:
followup #114455
We have fixed the scenario of not passing
--config
, this PR is based on #114455, and try to fix the scenario of passing a config file.Which issue(s) this PR fixes:
Ref kubernetes/kubeadm#2937
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: