Make fromString roughly symmetrical to toString
mathiasbynens committed Dec 22, 2017
1 parent cdbb917 commit a29f339
Showing 2 changed files with 51 additions and 38 deletions.
37 changes: 28 additions & 9 deletions README.md
@@ -15,13 +15,15 @@ This proposal is at stage 0 of [the TC39 process](https://tc39.github.io/process
For `Number` values, there is [`parseInt(string, radix = 10)`](https://tc39.github.io/ecma262/#sec-parseint-string-radix) and [`Number.parseInt(string, radix = 10)`](https://tc39.github.io/ecma262/#sec-number.parseint), but its behavior is suboptimal:

- It returns `NaN` instead of throwing a `SyntaxError` exception when `string` does not represent a number.
- It returns `NaN` instead of throwing a `RangeError` exception when `radix !== 0 && radix < 2` or `radix > 36`.
- It returns `NaN` instead of throwing a `RangeError` exception when `radix` is not valid (i.e. `radix !== 0 && radix < 2` or `radix > 36`).
- It accepts radix `0`, treating it as `10` instead, which does not make sense.
- It supports hexadecimal integer literal prefixes `0x` and `0X` but lacks support for octal integer literal prefixes `0o` and `0O` or binary integer literal prefixes `0b` and `0B`, which is inconsistent.
- It ignores leading whitespace and trailing non-digit characters.
- The fact that `parseInt` has some level of support for integer literal prefixes means that it is not a clear counterpart to `toString`.

## Proposed solution

We propose extending both `BigInt` and `Number` with a new static `fromString(string, radix = 10)` method which acts as the inverse of `{BigInt,Number}.prototype.toString(radix)`.
We propose extending both `BigInt` and `Number` with a new static `fromString(string, radix = 10)` method which acts as the inverse of `{BigInt,Number}.prototype.toString(radix = 10)`. It accepts only ASCII-case-insensitive strings that can be produced by `{BigInt,Number}.prototype.toString(radix = 10)`, and throws an exception for any other input.
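
To make "inverse of `toString`" concrete, here is a small sketch that uses only existing APIs. It round-trips an integer through `toString(radix)` and `Number.parseInt`, then shows two inputs that `parseInt` accepts even though `toString` could never produce them. The variable names are illustrative only.

```js
const value = 0xC0FFEE;
const roundTrips = [2, 8, 10, 16, 36].every(
  (radix) => Number.parseInt(value.toString(radix), radix) === value
);
roundTrips;
// → true

// parseInt also accepts strings that toString never produces:
Number.parseInt('  42px');
// → 42
Number.parseInt('0x2A');
// → 42
```

The second half is the gap the proposal aims to close: a strict inverse would reject both of those strings.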

## High-level API

@@ -48,15 +50,24 @@ Number.parseInt('0xC0FFEE');
// → 12648430
Number.parseInt('0o755');
// → 0
Number.parseInt('0b00110011');
Number.parseInt('0b00101010');
// → 0

Number.fromString('0xC0FFEE');
// → 0
// → SyntaxError
Number.fromString('0o755');
// → 0
Number.fromString('0b00110011');
// → 0
// → SyntaxError
Number.fromString('0b00101010');
// → SyntaxError

Number.fromString('C0FFEE', 16);
// → 12648430 === 0xC0FFEE
Number.fromString('c0ffee', 16);
// → 12648430 === 0xc0ffee
Number.fromString('755', 8);
// → 493 === 0o755
Number.fromString('00101010', 2);
// → 42 === 0b00101010
```

Unlike `parseInt`, `fromString` throws a `SyntaxError` exception when `string` does not represent a number.
@@ -77,14 +88,18 @@ Number.fromString('x');
// → SyntaxError
```
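
Until something like `fromString` exists, callers who want this strictness have to validate input themselves. The following is a minimal sketch of one way to do that for integers only. `parseIntStrict` is a hypothetical helper name, not part of the proposal. It assumes that leading zeros are accepted and that a leading `+` is rejected, which may not match the final grammar.

```js
// Hypothetical helper, not part of the proposal: a strict, integer-only
// wrapper around parseInt that throws instead of returning NaN.
function parseIntStrict(string, radix = 10) {
  const s = `${string}`; // string conversion; throws TypeError for Symbols
  if (!Number.isInteger(radix) || radix < 2 || radix > 36) {
    throw new RangeError(`Invalid radix: ${radix}`);
  }
  // Digits valid for this radix, in both ASCII cases.
  const lower = '0123456789abcdefghijklmnopqrstuvwxyz'.slice(0, radix);
  const allowed = lower + lower.toUpperCase();
  // Optional minus sign, then one or more valid digits, and nothing else.
  const pattern = new RegExp(`^-?[${allowed}]+$`);
  if (!pattern.test(s)) {
    throw new SyntaxError(`Invalid radix-${radix} integer: ${s}`);
  }
  return Number.parseInt(s, radix);
}

parseIntStrict('C0FFEE', 16);
// → 12648430
```

Because the whole string is validated before `parseInt` runs, prefixes such as `0x`, leading whitespace, and trailing garbage all end up as a `SyntaxError` rather than a partial result.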

Unlike `parseInt`, `fromString` throws a `RangeError` exception when `radix !== 0 && radix < 2` or `radix > 36`.
Unlike `parseInt`, `fromString` throws a `RangeError` exception when `radix < 2` or `radix > 36`.

```js
Number.parseInt('1234', 0);
// → 1234
Number.parseInt('1234', 1);
// → NaN
Number.parseInt('1234', 37);
// → NaN

Number.fromString('1234', 0);
// → RangeError
Number.fromString('1234', 1);
// → RangeError
Number.fromString('1234', 37);
@@ -95,7 +110,11 @@ Number.fromString('1234', 37);

#### What about legacy octal integers?

`fromString` intentionally lacks special handling for legacy octal integer literals, i.e. those without the explicit `0o` or `0O` prefix such as `010`. In other words, `Number.fromString('010')` results in `10` (and not `8`).
`fromString` intentionally lacks special handling for legacy octal integer literals, i.e. those without the explicit `0o` or `0O` prefix such as `010`. In other words, `Number.fromString('010')` throws a `SyntaxError` exception.

### What about numeric separators?

`fromString` does not need to support [numeric separators](https://github.com/tc39/proposal-numeric-separator), as they cannot occur in `{BigInt,Number}.prototype.toString(radix)` output. `Number.fromString('1_000_000_000')` throws a `SyntaxError` exception.
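
As a quick illustration using only existing APIs: `toString` output contains no underscores, and today's parsers already treat separator input inconsistently.

```js
(1e9).toString();
// → '1000000000'
(1e9).toString(16);
// → '3b9aca00'

Number('1_000_000_000');
// → NaN
Number.parseInt('1_000_000_000');
// → 1
```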

## Specification

52 changes: 23 additions & 29 deletions spec.html
@@ -20,60 +20,54 @@
</style>
<emu-clause id="sec-bigint-fromstring-string-radix">
<h1>BigInt.fromString ( _string_, _radix_ )</h1>
<p>The `BigInt.fromString` function produces an integer value dictated by interpretation of the contents of the _string_ argument according to the specified _radix_. Leading white space in _string_ is ignored. If _radix_ is *undefined* or 0, it is assumed to be 10.</p>
<p>The `BigInt.fromString` function produces a numeric value dictated by interpretation of the contents of the _string_ argument according to the specified _radix_. If _radix_ is *undefined*, it is assumed to be 10.</p>
<p>When the `BigInt.fromString` function is called, the following steps are taken:</p>
<emu-alg>
1. Let _inputString_ be ? ToString(_string_).
1. Let _S_ be a newly created substring of _inputString_ consisting of the first code unit that is not a |StrWhiteSpaceChar| and all code units following that code unit. (In other words, remove leading white space.) If _inputString_ does not contain any such code unit, let _S_ be the empty string.
1. Let _S_ be ? ToString(_string_).
1. If _S_ is empty, throw a *SyntaxError* exception.
1. Let _sign_ be 1.
1. If _S_ is not empty and the first code unit of _S_ is the code unit 0x002D (HYPHEN-MINUS), let _sign_ be -1.
1. If _S_ is not empty and the first code unit of _S_ is the code unit 0x002B (PLUS SIGN) or the code unit 0x002D (HYPHEN-MINUS), remove the first code unit from _S_.
1. If the first code unit of _S_ is the code unit 0x002D (HYPHEN-MINUS), then
1. Let _sign_ be -1.
1. Remove the first code unit from _S_.
1. If _S_ is empty, throw a *SyntaxError* exception.
1. If _R_ = 0, throw a *RangeError* exception.
1. Let _R_ be ? ToInt32(_radix_).
1. If _R_ = 0, then
1. Let _R_ be 10.
1. Else,
1. Assert: _R_ &ne; 0.
1. If _R_ &lt; 2 or _R_ &gt; 36, throw a *RangeError* exception.
1. If _S_ contains a code unit that is not a radix-_R_ digit, then
1. Let _Z_ be the substring of _S_ consisting of all code units before the first such code unit.
1. Else,
1. Let _Z_ be _S_.
1. If _Z_ is empty, throw a *SyntaxError* exception.
1. Let _mathInt_ be the mathematical integer value that is represented by _Z_ in radix-_R_ notation, using the letters <b>A</b>-<b>Z</b> and <b>a</b>-<b>z</b> for digits with values 10 through 35.
1. If _S_ represents a mathematical integer value in radix-_R_ notation, using the letters <b>A</b>-<b>Z</b> and <b>a</b>-<b>z</b> for digits with values 10 through 35, then
1. Let _mathInt_ be that mathematical integer value.
1. Else, throw a *SyntaxError* exception.
1. Let _number_ be _sign_ &times; _mathInt_.
1. Return the BigInt value for _number_.
</emu-alg>
<emu-note>
<p>`BigInt.fromString` may interpret only a leading portion of _string_ as an integer value; it ignores any code units that cannot be interpreted as part of the notation of an integer, and no indication is given that any such code units were ignored.</p>
</emu-note>
</emu-clause>
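
For readers who find the algorithm easier to follow as code, the revised `BigInt.fromString` steps can be sketched roughly as below. This is an illustration under stated assumptions, not a spec-accurate polyfill: `bigIntFromStringSketch` and `digitValue` are hypothetical names, `radix | 0` only approximates ToInt32, and the default parameter stands in for the "if _radix_ is *undefined*" rule.

```js
// Hypothetical helper: maps an ASCII digit or letter code unit to its
// digit value (0-35), or -1 for any other code unit.
function digitValue(codeUnit) {
  if (codeUnit >= 0x30 && codeUnit <= 0x39) return codeUnit - 0x30;      // 0-9
  if (codeUnit >= 0x41 && codeUnit <= 0x5a) return codeUnit - 0x41 + 10; // A-Z
  if (codeUnit >= 0x61 && codeUnit <= 0x7a) return codeUnit - 0x61 + 10; // a-z
  return -1;
}

// Hypothetical sketch of the revised steps; names are assumptions.
function bigIntFromStringSketch(string, radix = 10) {
  let s = `${string}`;                                 // ToString(string)
  if (s === '') throw new SyntaxError('Empty string');
  let sign = 1n;
  if (s.charCodeAt(0) === 0x2d) {                      // HYPHEN-MINUS
    sign = -1n;
    s = s.slice(1);
  }
  if (s === '') throw new SyntaxError('Missing digits');
  const r = radix | 0;                                 // approximates ToInt32
  if (r < 2 || r > 36) throw new RangeError(`Invalid radix: ${radix}`);
  const base = BigInt(r);
  let mathInt = 0n;
  for (let i = 0; i < s.length; i++) {
    const d = digitValue(s.charCodeAt(i));
    // Any code unit that is not a radix-r digit makes the whole input invalid.
    if (d < 0 || d >= r) throw new SyntaxError(`Invalid digit in: ${s}`);
    mathInt = mathInt * base + BigInt(d);
  }
  return sign * mathInt;
}

bigIntFromStringSketch('9007199254740993');
// → 9007199254740993n
```

Comparing code units directly, instead of lowercasing the input, keeps the matching ASCII-only, so non-ASCII characters that case-map to ASCII letters are rejected.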
<hr>
<emu-clause id="sec-number-fromstring-string-radix">
<h1>Number.fromString ( _string_, _radix_ )</h1>
<p>The `Number.fromString` function produces an integer value dictated by interpretation of the contents of the _string_ argument according to the specified _radix_. Leading white space in _string_ is ignored. If _radix_ is *undefined* or 0, it is assumed to be 10.</p>
<p>The `Number.fromString` function produces an integer value dictated by interpretation of the contents of the _string_ argument according to the specified _radix_. If _radix_ is *undefined*, it is assumed to be 10.</p>
<p>When the `Number.fromString` function is called, the following steps are taken:</p>
<emu-alg>
1. Let _inputString_ be ? ToString(_string_).
1. Let _S_ be a newly created substring of _inputString_ consisting of the first code unit that is not a |StrWhiteSpaceChar| and all code units following that code unit. (In other words, remove leading white space.) If _inputString_ does not contain any such code unit, let _S_ be the empty string.
1. Let _S_ be ? ToString(_string_).
1. If _S_ is empty, throw a *SyntaxError* exception.
1. Let _sign_ be 1.
1. If _S_ is not empty and the first code unit of _S_ is the code unit 0x002D (HYPHEN-MINUS), let _sign_ be -1.
1. If _S_ is not empty and the first code unit of _S_ is the code unit 0x002B (PLUS SIGN) or the code unit 0x002D (HYPHEN-MINUS), remove the first code unit from _S_.
1. If the first code unit of _S_ is the code unit 0x002D (HYPHEN-MINUS), then
1. Let _sign_ be -1.
1. Remove the first code unit from _S_.
1. If _S_ is empty, throw a *SyntaxError* exception.
1. If _R_ = 0, throw a *RangeError* exception.
1. Let _R_ be ? ToInt32(_radix_).
1. If _R_ = 0, then
1. Let _R_ be 10.
1. Else,
1. Assert: _R_ &ne; 0.
1. If _R_ &lt; 2 or _R_ &gt; 36, throw a *RangeError* exception.
1. If _S_ contains a code unit that is not a radix-_R_ digit, then
1. Let _Z_ be the substring of _S_ consisting of all code units before the first such code unit.
1. Else,
1. Let _Z_ be _S_.
1. If _Z_ is empty, throw a *SyntaxError* exception.
1. Let _mathInt_ be the mathematical integer value that is represented by _Z_ in radix-_R_ notation, using the letters <b>A</b>-<b>Z</b> and <b>a</b>-<b>z</b> for digits with values 10 through 35. (However, if _R_ is 10 and _Z_ contains more than 20 significant digits, every significant digit after the 20th may be replaced by a 0 digit, at the option of the implementation; and if _R_ is not 2, 4, 8, 10, 16, or 32, then _mathInt_ may be an implementation-dependent approximation to the mathematical integer value that is represented by _Z_ in radix-_R_ notation.)
1. Let _number_ be _sign_ &times; _mathInt_.
1. If _S_ represents a mathematical number value in radix-_R_ notation, using the letters <b>A</b>-<b>Z</b> and <b>a</b>-<b>z</b> for digits with values 10 through 35, then
1. Let _mathNum_ be that mathematical number value.
1. Else, throw a *SyntaxError* exception.

littledan (Member), Dec 22, 2017:

Minor editorial: I like the semantics here, and specifically listing the relevant code units, but I sort of liked the old wording better, where you'd first ask if there is an errant code unit, and only after that interpret it as a number. As worded now, I'm wondering, "is there any string that contains only the appropriate code units but does not represent an appropriate mathematical number value"? Might also be good to call out digits 0-9 as the lower digit code units.

mathiasbynens (Author, Member), Dec 22, 2017:

The problem is we need to support things like Infinity, 1e3, and 1.234 as well, and it’s not possible to reduce that to just a set of blocklisted code units.

littledan (Member), Dec 22, 2017:

OK, if you're supporting all of that, I think it's better if you write an explicit grammar. I didn't understand that you were supporting all of that kind of notation.

1. Let _number_ be _sign_ &times; _mathNum_.
1. Return the Number value for _number_.

littledan (Member), Dec 22, 2017:

Note that this step does a bit of rounding. parseInt gives a bunch of latitude to implementations for rounding, while parseFloat and everything with Numbers uses this sort of wording. I'm happy with these semantics, but when writing test262 tests, it will be helpful to have strong tests to ensure that exact rounding is used.

</emu-alg>
<emu-note>
<p>`Number.fromString` may interpret only a leading portion of _string_ as an integer value; it ignores any code units that cannot be interpreted as part of the notation of an integer, and no indication is given that any such code units were ignored.</p>
</emu-note>
</emu-clause>
