cluster-bootstrap: address constant-time problems as in NCC-E003660-TTV #120400

neolit123 · 2023-09-04T10:49:10Z

What type of PR is this?

/kind bug

What this PR does / why we need it:

as reported in #119632 there are a couple of problems in the cluster-bootstrap package.

1

the function randBytes(), which generates a token, uses a modulo to reduce random bytes from the 0-252 range to an index in the 0-35 range, and then uses that index to pick a character from the predefined 0-9a-z literal / lookup table.

solution: generate numbers in the 0-35 range using crypto/rand.Int(). then, instead of reading a character from the table with [] indexing, use simple (constant-time) operations to derive the random character.

2

the function IsValidBootstrapToken() uses regexp string matching on the whole token, including the secret part, which is not a good practice.

solution: break down the function and use simple comparisons.

Which issue(s) this PR fixes:

Fixes #119632

Special notes for your reviewer:

not an expert on "constant time", have some expertise on how a CPU works...

Does this PR introduce a user-facing change?

cluster-bootstrap: improve the security of the functions responsible for generation and validation of bootstrap tokens

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

k8s-ci-robot · 2023-09-04T10:49:50Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neolit123

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~staging/src/k8s.io/cluster-bootstrap/token/OWNERS~~ [neolit123]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

neolit123 · 2023-09-04T10:51:32Z

/hold for review
/priority important-longterm
/triage accepted

neolit123 · 2023-09-13T18:09:33Z

reviews are welcome.

/sig security

SIG Security are the organizers (?) of the audit that found this.

gdncc · 2023-09-22T17:41:34Z

staging/src/k8s.io/cluster-bootstrap/token/util/helpers.go

 }
-
- token[i] = validBootstrapTokenChars[int(b)%len(validBootstrapTokenChars)]
+ token[i] = validBootstrapTokenChars[val.Uint64()]


This performs in effect an array lookup based on a secret value, the index of the token character in string validBootstrapTokenChars. This would in principle leak the index, and therefore the secret token character, via timing side-channels. The lookup based on index x, where x is val.Uint64(), could be replaced by a constant-time selection e.g. res := x + 48 + (39 & ((9 - x) >> 8)), then string(rune(res)).

thanks for the reply.
so we still generate a random index x := val.Uint64() but then use the simple operations to range lock it in the ASCII values for a-z and 0-9?

Yes, you keep the random index generation as is (val, err := rand.Int(rand.Reader, max)). Then, instead of performing an array lookup, you map x (val.Uint64) to the appropriate ASCII ranges using the operation above.
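As a quick check of the expression suggested above, the following sketch compares the branchless mapping against the lookup table for every index. The function name `charFromIndex` is hypothetical and used only for illustration; it is not taken from the PR.

```go
package main

import "fmt"

// charFromIndex (hypothetical name) maps an index in 0..35 to a character
// in "0-9a-z" without a table lookup and without branching on the index.
func charFromIndex(x uint64) byte {
	// For x <= 9, (9 - x) >> 8 is 0, so the result is x + 48 ('0'..'9').
	// For x >= 10, 9 - x wraps around in unsigned arithmetic; after the
	// shift the low bits are all ones, so the mask yields 39 and the
	// result is x + 87, that is 'a' + (x - 10).
	return byte(x + 48 + (39 & ((9 - x) >> 8)))
}

func main() {
	out := make([]byte, 0, 36)
	for x := uint64(0); x < 36; x++ {
		out = append(out, charFromIndex(x))
	}
	fmt.Println(string(out)) // prints 0123456789abcdefghijklmnopqrstuvwxyz
}
```

The mapping is only meaningful for inputs in 0..35, which is what rand.Int with an upper bound of 36 produces.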

understood, thank you. i will update this as soon as possible in the next few days (not on a computer right now).

The function generates bytes in the x={0-252} range and then applies y=(x mod 36) to obtain allowed token characters from validBootstrapTokenChars[y]. Instead of reading raw bytes from crypto/rand.Reader and reducing them, use crypto/rand.Int(), which returns a value in the range [0, len(validBootstrapTokenChars)). Once a random index is generated, use simple operations to obtain a random character in the a-z,0-9 character range. This makes the character generation constant-time.
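A minimal sketch of the generation loop described in this commit message, assuming an alphabet size of 36. The helper name `randToken` and the constant `tokenCharCount` are hypothetical; this is not the merged randBytes() implementation.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// tokenCharCount is the size of the "0-9a-z" alphabet (assumed here).
const tokenCharCount = 36

// randToken (hypothetical helper) returns a random string of the given
// length made of characters in the 0-9a-z range.
func randToken(length int) (string, error) {
	limit := big.NewInt(tokenCharCount)
	token := make([]byte, length)
	for i := range token {
		// rand.Int returns a uniform value in [0, limit), so no modulo
		// reduction is applied to the random data here.
		val, err := rand.Int(rand.Reader, limit)
		if err != nil {
			return "", fmt.Errorf("could not generate random integer: %w", err)
		}
		x := val.Uint64()
		// Map the index to its ASCII character without indexing a table.
		token[i] = byte(x + 48 + (39 & ((9 - x) >> 8)))
	}
	return string(token), nil
}

func main() {
	secret, err := randToken(16)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(secret)) // prints 16
}
```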

neolit123 · 2023-09-26T17:34:47Z

/retest

neolit123 · 2023-10-04T09:05:13Z

staging/src/k8s.io/cluster-bootstrap/token/util/helpers.go

+ notDigit := (c < 48 || c > 57) // Character is not in the 0-9 range
+ notLetter := (c < 97 || c > 122) // Character is not in the a-z range


@gdncc do you think that it's worth using golang's crypto/subtle package for these comparisons?
https://pkg.go.dev/crypto/subtle#ConstantTimeLessOrEq

another question,
in the issue description we have:

Attackers may determine Bootstrap Token values based on timing-based side channels.

is that a case where the attacker has tapped into the protected process memory of the running binary (e.g. kubeadm) run by the cluster administrator and could intercept the tokens? that would imply that the host OS has a vulnerability.

i am giving kubeadm as an example, because kubeadm is the only core k8s consumer of the cluster-bootstrap library.
the way it generates a token is a one-shot operation: calling randBytes(), generating a token and uploading it as a secret in etcd. the token is also validated in the process using IsValidBootstrapToken().
the token is also validated when the user specifies it (i.e. pre-generated by the user).

Using the crypto/subtle package is an option, but I can’t think of a compelling reason to switch.
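For reference, a sketch of what the crypto/subtle variant of the range checks could look like. The function name `isTokenChar` is hypothetical; this is not what the PR merged, only an illustration of the option being discussed. `subtle.ConstantTimeLessOrEq(x, y)` returns 1 if x <= y and 0 otherwise.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// isTokenChar (hypothetical name) returns 1 if c is in the 0-9 or a-z
// range and 0 otherwise, combining the results with bitwise operators
// instead of short-circuiting boolean operators.
func isTokenChar(c byte) int {
	v := int(c)
	isDigit := subtle.ConstantTimeLessOrEq(48, v) & subtle.ConstantTimeLessOrEq(v, 57)
	isLetter := subtle.ConstantTimeLessOrEq(97, v) & subtle.ConstantTimeLessOrEq(v, 122)
	return isDigit | isLetter
}

func main() {
	fmt.Println(isTokenChar('7'), isTokenChar('q'), isTokenChar('Q')) // prints 1 1 0
}
```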

To perform a timing side-channel attack, the attacker does not need to have access to the memory of the running binary e.g. access to a debugger.

Common processors use caches to speed up access to resources such as data and code. Attackers who can monitor the cache, for example from the host, from a VM for code running on the host (e.g. a different process/user), or from an attacker-controlled VM on the same hypervisor as the victim's VM, may observe changes in speed and cache behaviour when resources that depend on sensitive information are used. In principle, this attack can reveal the locations where the victim is accessing data (data flow) or the code the victim is running (control flow).

The attacker leverages a signal that is measured through elapsed time. An array access at a secret address exercises the caches and their contents, so that this memory access, and also subsequent memory accesses elsewhere in the system, will be impacted (e.g. the array access evicts another data element from the cache, and a later access to that evicted element will be slower). This means that what the attacker actually measures can be other code running at a (slightly) later time.

Timing differences may also be observable from the network (e.g. how long it takes to validate the token). This is probably not exploitable in this case, although the recent paper Out of the Box Testing discusses impressive results a few network hops from the target, or better, using the loopback interface.

Note that this finding was classified as informational. There are also potential attenuating circumstances. Timing side-channel attacks are statistical: the attacker has to be able to run repeated experiments and somehow piece together the small scraps of information they gather. It would require additional research to determine whether it is exploitable or not.

thanks for the context. this is quite interesting!

neolit123 · 2023-10-04T09:15:43Z

dropping sig auth (we can consider this library as SIG CL exclusive)
/remove-sig auth

The function uses BootstrapTokenRegexp.MatchString(token) which is not a recommended practice. Instead, break down the token into its components: ID, secret. The ID is public thus we can use Regexp matching for it. The secret needs constant time comparison. Iterate over every character and make sure it fits the 0-9a-z range and that it has a length of 16.
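A minimal sketch of the validation flow this commit message describes, under stated assumptions: the ID and secret are separated by a single ".", the ID is 6 characters of 0-9a-z, and the secret length is 16 (the value the PR later replaces with api.BootstrapTokenSecretBytes). The names `isValidToken`, `idRegexp` and `secretLen` are hypothetical. The per-character comparisons mirror the ones shown in the diff; the sketch avoids an early return inside the loop but makes no claim about the machine code the compiler emits.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// secretLen is the secret length stated in the PR text.
const secretLen = 16

// idRegexp matches the public token ID (assumed: 6 characters of 0-9a-z).
var idRegexp = regexp.MustCompile(`\A[a-z0-9]{6}\z`)

// isValidToken (hypothetical name) validates a token of the form ID.secret.
func isValidToken(token string) bool {
	parts := strings.Split(token, ".")
	if len(parts) != 2 {
		return false
	}
	id, secret := parts[0], parts[1]

	// The ID is public, so regexp matching is acceptable for it.
	if !idRegexp.MatchString(id) {
		return false
	}
	if len(secret) != secretLen {
		return false
	}

	// Check every character of the secret; do not stop at the first mismatch.
	valid := true
	for i := 0; i < len(secret); i++ {
		c := secret[i]
		notDigit := c < 48 || c > 57   // Character is not in the 0-9 range
		notLetter := c < 97 || c > 122 // Character is not in the a-z range
		if notDigit && notLetter {
			valid = false
		}
	}
	return valid
}

func main() {
	fmt.Println(isValidToken("abcdef.0123456789abcdef")) // prints true
}
```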

neolit123 · 2023-10-05T10:49:55Z

updated to use the api.BootstrapTokenSecretBytes constant instead of 16

@gdncc could you please add the text LGTM in a comment as a sign-off, in case you agree with the current state of the change. i can ask a k8s org member to double check and add our lgtm label.

thank you

neolit123 · 2023-10-05T11:50:45Z

/retest

reylejano · 2023-10-05T14:57:43Z

Thank you for reviewing @gdncc
@gdncc is from the vendor that reported the finding NCC-E003660-TTV: Timing Side Channel in Bootstrap Tokens Generation and Handling in issue #119632
/lgtm

k8s-ci-robot · 2023-10-05T14:57:51Z

LGTM label has been added.

Git tree hash: 1a1afc62f55460c0cdafce2e5d505121f6f48d31

neolit123 · 2023-10-06T08:43:19Z

canceling the hold.
/hold cancel

thanks for the LGTM

i also asked @justinsb to have another look. in case there are remarks i can send a cleanup / followup.

k8s-ci-robot requested review from CecileRobertMichon and justinsb September 4, 2023 10:50

neolit123 mentioned this pull request Sep 4, 2023

NCC-E003660-TTV: Timing Side Channel in Bootstrap Tokens Generation and Handling #119632

Closed

k8s-ci-robot added sig/security Categorizes an issue or PR as relevant to SIG Security. sig/auth Categorizes an issue or PR as relevant to SIG Auth. labels Sep 13, 2023

gdncc reviewed Sep 22, 2023


neolit123 force-pushed the 1.29-fix-bootstrap-token-constant-time branch from 06e32d2 to ae30bb4 Compare September 23, 2023 15:27

neolit123 commented Oct 4, 2023


k8s-ci-robot removed the sig/auth Categorizes an issue or PR as relevant to SIG Auth. label Oct 4, 2023

neolit123 force-pushed the 1.29-fix-bootstrap-token-constant-time branch from ae30bb4 to 1d519f1 Compare October 5, 2023 10:47

k8s-ci-robot assigned reylejano Oct 5, 2023

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 5, 2023

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 6, 2023

k8s-ci-robot merged commit 5ff7961 into kubernetes:master Oct 6, 2023
14 checks passed

k8s-ci-robot added this to the v1.29 milestone Oct 6, 2023
