<input type="number"> does not support full-width numbers (e.g., "ー200") for Japanese IME users #47095

Open: 1 commit to be merged into main

@kyouhei-horizumi kyouhei-horizumi commented Jun 24, 2025

c9d4536

<input type="number"> does not support full-width numbers (e.g., "ー200") for Japanese IME users
https://bugs.webkit.org/show_bug.cgi?id=284363

Reviewed by Alexey Proskuryakov.

This patch adds normalization support for full-width digits, the minus sign variants (U+FF0D, U+30FC, U+2212), and the full-width full stop (U+FF0E) by mapping them to their ASCII equivalents (e.g., U+002D for minus, U+002E for dot) when handling typed input in <input type=number> fields.
These characters are commonly produced by Japanese IMEs when entering numeric values.

Test: fast/forms/number/number-fullwidth-numeral-normalization.html
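
The mapping described in the commit message can be sketched as a small lookup table. This is an illustrative Python model, not the WebKit C++ code from the patch; the names `NORMALIZATION_TABLE` and `normalize_typed_number` are hypothetical, and only the code points are taken from the description above.

```python
# Hypothetical sketch of the character mapping; not the patch's actual code.
FULLWIDTH_DIGIT_ZERO = 0xFF10  # U+FF10 FULLWIDTH DIGIT ZERO

NORMALIZATION_TABLE = {
    # U+FF10..U+FF19 (full-width digits) -> ASCII '0'..'9'
    **{FULLWIDTH_DIGIT_ZERO + i: ord("0") + i for i in range(10)},
    0xFF0D: ord("-"),  # FULLWIDTH HYPHEN-MINUS
    0x30FC: ord("-"),  # KATAKANA-HIRAGANA PROLONGED SOUND MARK
    0x2212: ord("-"),  # MINUS SIGN
    0xFF0E: ord("."),  # FULLWIDTH FULL STOP
}


def normalize_typed_number(text: str) -> str:
    """Map the characters listed above to ASCII; leave everything else alone."""
    return text.translate(NORMALIZATION_TABLE)
```

Characters outside the table pass through unchanged, so this step alone does not reject invalid input.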

e782662

Misc: ❌ 🧪 style, ✅ 🧪 bindings, ✅ 🧪 webkitperl, ✅ 🛠 🧪 jsc, ✅ 🛠 🧪 jsc-arm64
iOS, visionOS, tvOS & watchOS: ✅ 🛠 ios, ✅ 🛠 ios-sim, ❌ 🧪 ios-wk2, ✅ 🧪 ios-wk2-wpt, ✅ 🧪 api-ios, ✅ 🛠 vision, ✅ 🛠 vision-sim, ✅ 🧪 vision-wk2, ✅ 🛠 tv, ✅ 🛠 tv-sim, ✅ 🛠 watch, ✅ 🛠 watch-sim
macOS: ✅ 🛠 mac, ✅ 🛠 mac-AS-debug, ✅ 🧪 api-mac, ❌ 🧪 mac-wk1, ❌ 🧪 mac-wk2, ❌ 🧪 mac-AS-debug-wk2, ✅ 🧪 mac-wk2-stress, ❌ 🧪 mac-intel-wk2, ✅ 🛠 mac-safer-cpp
Linux: ✅ 🛠 wpe, ✅ 🧪 wpe-wk2, ✅ 🧪 api-wpe, ✅ 🛠 wpe-cairo, ✅ 🛠 gtk, ❌ 🧪 gtk-wk2, ✅ 🧪 api-gtk, 🛠 playstation, ✅ 🛠 jsc-armv7, ✅ 🧪 jsc-armv7-tests
Windows: ✅ 🛠 win, ⏳ 🧪 win-tests

@Ahmad-S792 added the Forms label (for bugs specific to form elements: checkboxes, buttons, text fields, etc.) on Jun 24, 2025
@nt1m requested a review from annevk on June 24, 2025 04:41
Contributor

@annevk annevk left a comment

This does not seem correct to me. I don't think we should convert whatever is assigned to the value setter. Instead we should convert what the end user types in before changing the underlying value.

We probably also want web-platform-tests to ensure other browsers do it in the same way.

@kyouhei-horizumi
Author

This does not seem correct to me. I don't think we should convert whatever is assigned to the value setter. Instead we should convert what the end user types in before changing the underlying value.

We probably also want web-platform-tests to ensure other browsers do it in the same way.

Thanks for the clarification — you're absolutely right from the HTML spec perspective.

This patch currently applies normalization even when assigning full-width characters via .value, which is technically out of scope per the spec. Chromium, for that reason, limits normalization strictly to native keyboard input, avoiding changes to setter behavior. That behavior aligns with the spec as written.

Interestingly, Firefox does accept .value = '１２３' and normalizes it internally, even though that's not defined by the spec either.

As I understand it, Web Platform Tests (WPT) are meant to test only behavior that is explicitly specified in the HTML standard. So if we restrict normalization in WebKit to occur only during native keyboard input, that behavior would fall outside the spec — and therefore outside the scope of WPT. Similarly, the current LayoutTests covering this would no longer be valid unless the behavior is spec-defined.

With that in mind, I'd like to ask:
Is there interest in proposing a change to the HTML specification to define this normalization behavior more formally?
This would clarify expectations across engines, and would allow WPT coverage to be added consistently.

I'm happy to help draft a spec proposal or open an issue on WHATWG if that's of interest.

@annevk
Contributor

annevk commented Jun 24, 2025

That makes sense to me, here's what I'd suggest:

  1. We test that the value setter and value attribute don't support this in WPT.
  2. I think proposing a change to the HTML standard whereby at least a set of non-ASCII numbers are a "should" is perfectly reasonable.
  3. That would then allow writing WPT .optional tests. Arguably the existing note in the type=number section would already allow for that, but it's probably worth strengthening it with a normative paragraph.

@kyouhei-horizumi
Author

Thanks so much for the quick and helpful reply — really appreciate it.

I've never proposed a change to the HTML spec before, so it might take me a bit of time to navigate the process, but I'm happy to take a shot at it.

In the meantime, I'll keep this pull request on hold until we see how the spec discussion evolves.

@annevk
Contributor

annevk commented Jun 24, 2025

Thank you for improving this part of the web! If you fill out "New issue" at https://github.com/whatwg/html/issues/new/choose with the basic information I'm happy to copy the relevant people. I suspect people will agree though it might take a while to get everyone to comment, so you could do some of the other work in parallel.

@kyouhei-horizumi
Author

I've opened a WHATWG HTML issue to propose this behavior formally:
whatwg/html#11395

Thank you again for your feedback and support!

@kyouhei-horizumi
Author

Proposal: Improve number input sanitization to match Chromium behavior

While reviewing and updating the <input type="number"> sanitization behavior, I noticed that Chromium filters and normalizes user input more strictly than WebKit currently does.

Specifically, Chromium's HandleBeforeTextInsertedEvent implementation performs:

  • Normalization of full-width numerals, minus signs, and full-width full stops to ASCII equivalents
  • Removal of any characters not in the allowed set [0-9 . e E - +]
  • Additional validation of placement rules (e.g., no duplicate decimal points, only one exponent character, correct placement of signs)

In contrast, WebKit currently only normalizes some characters and allows others to remain in the field, which can result in invalid floating-point representations.
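
The character-filtering step from the list above can be illustrated with a short sketch. It is a hypothetical Python model rather than Chromium's or WebKit's code; `ALLOWED_CHARS` and `strip_disallowed` are made-up names, and the spaces in "[0-9 . e E - +]" are assumed to be separators, not allowed characters.

```python
# Hypothetical sketch: drop every character outside the allowed ASCII set.
ALLOWED_CHARS = frozenset("0123456789.eE-+")


def strip_disallowed(text: str) -> str:
    """Keep only characters that may appear in a typed number."""
    return "".join(ch for ch in text if ch in ALLOWED_CHARS)


filtered = strip_disallowed("1a2,3")   # letters and commas are removed
dropped = strip_disallowed("１２")      # full-width digits are not ASCII digits
```

Because full-width digits are not in the allowed set, a filter like this has to run after normalization; run first, it would discard the very input the patch is meant to accept.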

Given that the HTML specification states that:

User agents must not allow the user to set the value to a non-empty string that is not a valid floating-point number.

... it seems appropriate to consider aligning WebKit's sanitization more closely with Chromium's stricter approach.

Current progress

So far, I have implemented an initial version of normalization that:

  • Only applies to user input (keyboard or paste), not to .value assignments via script
  • Converts full-width numerals, minus signs, and full-width full stops to ASCII equivalents
  • Leaves other characters unchanged for now

This approach is working well in early testing and ensures that scripted .value assignments are not unexpectedly transformed, while user input is consistently normalized.
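
The split between user input and script assignment can be modelled in a few lines. This is a toy Python stand-in for the element, under the assumption that user-originated text flows through one hook while script writes go straight to the stored value; `NumberFieldModel` and `insert_user_text` are hypothetical names. The normalizer covers only full-width digits, and the value sanitization a real element performs on script assignment is not modelled.

```python
# Stand-in normalizer: full-width digits only, enough to show the split.
_FULLWIDTH_DIGITS = str.maketrans("０１２３４５６７８９", "0123456789")


class NumberFieldModel:
    """Toy stand-in for an <input type=number> element (hypothetical)."""

    def __init__(self) -> None:
        # Script-facing value: a plain attribute, so assignment stores the
        # text exactly as given.
        self.value = ""

    def insert_user_text(self, text: str) -> None:
        # User path (typing or paste): normalize before changing the value.
        self.value += text.translate(_FULLWIDTH_DIGITS)


field = NumberFieldModel()
field.insert_user_text("１２３")   # user path: text is normalized
typed = field.value

field.value = "１２３"             # script path: no normalization applied
assigned = field.value
```

Keeping normalization in the user-input hook means script code never sees its assigned strings silently rewritten.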

Proposed next steps

  • Evaluate implementing stricter character filtering similar to Chromium:
    • Removing any characters outside [0-9 . e E - +]
    • Enforcing placement constraints (single decimal point, single exponent, correct sign placement)
  • Maintain the distinction between script assignments and user input (i.e., script .value remains unaltered)
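
One way to picture the placement constraints is a check that the field's text could still grow into a valid number. The pattern below is an assumption about what "placement constraints" might mean, not a transcription of Chromium's logic; `is_viable_number_prefix` is a hypothetical name, and a leading "+" is rejected here on the assumption that signs before the mantissa are limited to "-".

```python
import re

# Loose pattern for text that is still being typed: optional leading '-',
# at most one '.', at most one exponent marker, and a sign only directly
# after the exponent marker. Partial input such as "-" or "1e" is accepted.
_VIABLE_PREFIX = re.compile(r"-?[0-9]*\.?[0-9]*(?:[eE][-+]?[0-9]*)?")


def is_viable_number_prefix(text: str) -> bool:
    """Return True if `text` does not yet violate a placement rule."""
    return _VIABLE_PREFIX.fullmatch(text) is not None


checks = {
    candidate: is_viable_number_prefix(candidate)
    for candidate in ["-", "1.5", "1e-", "1.5.2", "1e5e", "1-2"]
}
```

The check is deliberately more permissive than the valid floating-point number grammar, since intermediate states such as "-" or "1e" have to be typeable before the number is complete.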

Since building and testing these changes locally is time-consuming, I'm leaving this note here to track the proposal and current progress before proceeding with a full patch.

Reference: Chromium's HandleBeforeTextInsertedEvent implementation


Please let me know if there are any objections, suggestions, or related efforts before moving forward.

@kyouhei-horizumi

This comment was marked as resolved.

…0") for Japanese IME users

https://bugs.webkit.org/show_bug.cgi?id=284363

Reviewed by NOBODY (OOPS!).

This patch adds normalization support for full-width digits, the minus sign variants (U+FF0D, U+30FC, U+2212), and the full-width full stop (U+FF0E) by mapping them to their ASCII equivalents (e.g., U+002D for minus, U+002E for dot) when handling typed input in <input type=number> fields.
These characters are commonly produced by Japanese IMEs when entering numeric values.
In addition, like Chromium, disallowed characters are now rejected, and normalization is applied immediately after input.
@kyouhei-horizumi force-pushed the feat-input-number-fullwidth-support branch from c9d4536 to e782662 on June 30, 2025 16:31
@kyouhei-horizumi
Author

I’ve implemented the behavior for now. I’ll add tests and further improvements afterwards.

@webkit-ews-buildbot added the merging-blocked label (applied to prevent a change from being merged) on Jun 30, 2025