-
Notifications
You must be signed in to change notification settings - Fork 38.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cluster-bootstrap: address constant-time problems as in NCC-E003660-TTV #120400
cluster-bootstrap: address constant-time problems as in NCC-E003660-TTV #120400
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: neolit123 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold for review |
reviews are welcome. /sig security security are the organizers (?) of the audit that found this. |
} | ||
|
||
token[i] = validBootstrapTokenChars[int(b)%len(validBootstrapTokenChars)] | ||
token[i] = validBootstrapTokenChars[val.Uint64()] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This performs in effect an array lookup based on a secret value, the index of the token character in string validBootstrapTokenChars
. This would in principle leak the index, and therefore the secret token character, via timing side-channels. The lookup based on index x, where x is val.Uint64()
, could be replaced by a constant-time selection e.g. res := x + 48 + (39 & ((9 - x) >> 8))
, then string(rune(res))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the reply.
so we still generate a random index x := val.Uint64() but then use the simple operations to range lock it in the ASCII values for a-z and 0-9?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you keep the random index generation as is (val, err := rand.Int(rand.Reader, max)
). Then, instead of performing an array lookup, you map x (val.Uint64
) to the appropriate ASCII ranges using the operation above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
understood, thank you. i will update this as soon as possible in the next few days (not on a computer right now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated!
The function generates bytes in the x={0-252} range and then applies an y=(x mod 36) to obtain allowed token characters from validBootstrapTokenChars[y]. Instead of using crypto/rand.Reader, use crypto/rand.Int() that operates in the val={0-len(validBootstrapTokenChars))}. Once a random index is generated, use simple operations to obtain a random character in the a-z,0-9 character range. This makes the character generation in constant-time.
06e32d2
to
ae30bb4
Compare
/retest |
notDigit := (c < 48 || c > 57) // Character is not in the 0-9 range | ||
notLetter := (c < 97 || c > 122) // Character is not in the a-z range |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gdncc do you think that it's worth using golang's crypto/subtle package for these comparisons?
https://pkg.go.dev/crypto/subtle#ConstantTimeLessOrEq
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another question,
in the issue description we have:
Attackers may determine Bootstrap Token values based on timing-based side channels.
is that a case where the attacker has tapped into the protected process memory of a the running binary (e.g. kubeadm) run by the cluster administrator and they could intercept the tokens? that would imply that the host OS has a vulnerability.
i am giving kubeadm as an example, because kubeadm is the only core k8s consumer of the cluster-bootstrap library.
the way to generates a token is a one-shot operation of calling the randBytes()
, generating a token and uploading it as a secret in etcd. the token is also validated in the process using IsValidBootstrapToken()
.
the token is also validated when the user specifies it (i.e. pre-generated by the user).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using the crypto/subtle package is an option, but I can’t think of a compelling reason to switch.
To perform a timing side-channel attack, the attacker does not need to have access to the memory of the running binary e.g. access to a debugger.
Common processors use caches to speed up access to resources such as data and code. Attackers who can monitor the cache, for example from the host, or from a VM machine for code running on the host (e.g. a different process/user), or another VM on the same hypervisor than the attacker controlled VM, may observe changes in speed and cache behaviour, when resources that depend on sensitive information are used. This attack can reveal in principle the locations where the victim is accessing data (data flow), or the code the victim is running (control flow). The attacker leverages a signal which is measured through elapsed time. In the case of an array access at a secret address, this exercises caches and their contents, so that the memory access, but also other subsequent memory accesses elsewhere in the system, will be impacted (e.g. the actual array access kicked out of cache another data element, and a later access to that evicted element will be slower). This means that what the attacker actually measures can be other code running at a (slightly) later time.
Timing differences may also be observable from the network (e.g. how long it takes to validate the token). This is probably not exploitable in this case (although recent paper Out of the Box Testing discusses impressive results a few network hops from the target, or better, using the loopback interface).
Noting that this finding was classified as informational. There are also potential attenuating circumstances. Timing side-channel attacks are statistical; the attacker has to be able to do repeated experiments and somehow assemble together the small scraps of information they gathered. It would require additional research to determine whether it is exploitable, or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the context. this is quite interesting!
dropping sig auth (we can consider this library as SIG CL exclusive) |
The function uses BootstrapTokenRegexp.MatchString(token) which is not a recommended practice. Instead, break down the token into its components: ID, secret. The ID is public thus we can use Regexp matching for it. The secret needs constant time comparison. Iterate over every character and make sure it fits the 0-9a-z range and that it has a length of 16.
ae30bb4
to
1d519f1
Compare
/retest |
LGTM label has been added. Git tree hash: 1a1afc62f55460c0cdafce2e5d505121f6f48d31
|
canceling the hold. thanks for the LGTM i also asked @justinsb to have another look. in case there are remarks i can send a cleanup / followup. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
as reported in #119632 there are a couple of problems in the cluster-bootstrap package.
1
the function randBytes() responsible for generating a token uses a modulo to perform range reduction from 0-252 -> 0-36 to obtain a character from the 0-9a-z predefined literal / lookup table.
solution: generate numbers in the 0-36 range using crypto/rand.Int() then instead of accessing a character from
the table with
[]
indexing, use simple operations (constant-time) to obtain the random character.2
the function IsValidBootstrapToken() uses regexp string matching which is not a good practice.
solution: break down the function and use simple comparisons.
Which issue(s) this PR fixes:
Fixes #119632
Special notes for your reviewer:
not an expert on "constant time", have some expertise on how a CPU works...
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: