Skip to content

AX: AXTextMarker::findMarker, a frequently called function, can copy all the text-runs strings, causing performance issues on pages with lots of text #47209

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

twilco
Copy link
Contributor

@twilco twilco commented Jun 25, 2025

d05070c

AX: AXTextMarker::findMarker, a frequently called function, can copy all the text-runs strings, causing performance issues on pages with lots of text
https://bugs.webkit.org/show_bug.cgi?id=294996
rdar://154332688

Reviewed by Joshua Hoffman.

For groups of text runs that contain non-ASCII characters, AXTextMarker::findMarker needs to use CachedTextBreakIterator
to determine how many offsets to move forward or backward. This is relevant for things like multi-byte emojis. CachedTextBreakIterator
takes a StringView, but we were always doing the far less efficient thing of combining the text from all AXTextRun structs
into a single String, requiring multiple allocations along the way. This is a major performance problem on webpages that
have lots of text (e.g. 40 seconds of a 60 second sample was spent building and allocating strings underneath findMarker).

We had to do it this way because CachedTextBreakIterator needs the full string from all runs, but because each AXTextRun
owned its own string, we couldn't just naively use StringView here, because you can't combine disparate StringViews (one
for each AXTextRun) into one larger StringView (that's not how StringViews work).

Solve the issue by making AXTextRuns own a singular string for all runs, and add start and end index fields to AXTextRun
that point into this string. This allows us to use StringViews in many more places, which this commit does.

* Source/WebCore/accessibility/AXLogger.cpp:
(WebCore::streamTextRuns):
* Source/WebCore/accessibility/AXTextMarker.cpp:
(WebCore::AXTextMarkerRange::toString const):
(WebCore::AXTextMarker::findMarker const):
(WebCore::AXTextMarker::findParagraph const):
(WebCore::AXTextMarker::findWordOrSentence const):
* Source/WebCore/accessibility/AXTextRun.cpp:
(WebCore::AXTextRuns::debugDescription const):
(WebCore::AXTextRuns::localRect const):
(WebCore::AXTextRuns::substring const): Deleted.
* Source/WebCore/accessibility/AXTextRun.h:
(WebCore::AXTextRun::AXTextRun):
(WebCore::AXTextRun::length const):
(WebCore::AXTextRuns::AXTextRuns):
(WebCore::AXTextRuns::at const):
(WebCore::AXTextRuns::operator[] const):
(WebCore::AXTextRuns::runLength const):
(WebCore::AXTextRuns::lastRunLength const):
(WebCore::AXTextRuns::lineID const):
(WebCore::AXTextRuns::toString const):
(WebCore::AXTextRuns::toStringView const):
(WebCore::AXTextRuns::substring const):
(WebCore::AXTextRuns::runString const):
(WebCore::AXTextRuns::runStringView const):
(WebCore::AXTextRuns::runStartsWithLineBreak const):
(WebCore::AXTextRuns::runEndsWithLineBreak const):
(WebCore::AXTextRun::debugDescription const): Deleted.
(WebCore::AXTextRun::startsWithLineBreak const): Deleted.
(WebCore::AXTextRun::endsWithLineBreak const): Deleted.
(WebCore::AXTextRun::characterAt const): Deleted.
(WebCore::AXTextRun::textLength const): Deleted.
* Source/WebCore/accessibility/AccessibilityRenderObject.cpp:
(WebCore::AccessibilityRenderObject::textRuns):
* Source/WebCore/accessibility/cocoa/AXTextMarkerCocoa.mm:
(WebCore::AXTextMarkerRange::toAttributedString const):

Canonical link: https://commits.webkit.org/296681@main

aff5d98

Misc iOS, visionOS, tvOS & watchOS macOS Linux Windows
✅ 🧪 style ✅ 🛠 ios ✅ 🛠 mac ✅ 🛠 wpe ✅ 🛠 win
✅ 🧪 bindings ✅ 🛠 ios-sim ✅ 🛠 mac-AS-debug ✅ 🧪 wpe-wk2 ⏳ 🧪 win-tests
✅ 🧪 webkitperl ✅ 🧪 ios-wk2 ✅ 🧪 api-mac ✅ 🧪 api-wpe
✅ 🧪 ios-wk2-wpt ✅ 🧪 mac-wk1 ✅ 🛠 wpe-cairo
✅ 🧪 api-ios ❌ 🧪 mac-wk2 ✅ 🛠 gtk
✅ 🛠 vision ✅ 🧪 mac-AS-debug-wk2 ✅ 🧪 gtk-wk2
✅ 🛠 vision-sim ✅ 🧪 mac-wk2-stress ✅ 🧪 api-gtk
✅ 🛠 🧪 merge ✅ 🧪 vision-wk2 ✅ 🧪 mac-intel-wk2 ✅ 🛠 playstation
✅ 🛠 tv ✅ 🛠 mac-safer-cpp
✅ 🛠 tv-sim
✅ 🛠 watch
✅ 🛠 watch-sim

@twilco twilco self-assigned this Jun 25, 2025
@twilco twilco added the Accessibility For bugs related to accessibility. label Jun 25, 2025
Copy link
Contributor

@minorninth minorninth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

I think we talked about comparing stringValue to the concatenated textRuns strings to assert that they're equal, and maybe even avoid saving both - that'd be even easier now.

@twilco twilco force-pushed the eng/AX-AXTextMarker-findMarker-a-frequently-called-function-can-copy-all-the-text-runs-strings-causing-performance-issues-on-pages-with-lots-of-text branch from 44d3719 to aff5d98 Compare June 25, 2025 22:52
@webkit-ews-buildbot webkit-ews-buildbot added the merging-blocked Applied to prevent a change from being merged label Jun 26, 2025
@twilco twilco added merge-queue Applied to send a pull request to merge-queue and removed merging-blocked Applied to prevent a change from being merged labels Jun 26, 2025
…all the text-runs strings, causing performance issues on pages with lots of text

https://bugs.webkit.org/show_bug.cgi?id=294996
rdar://154332688

Reviewed by Joshua Hoffman.

For groups of text runs that contain non-ASCII characters, AXTextMarker::findMarker needs to use CachedTextBreakIterator
to determine how many offsets to move forward or backward. This is relevant for things like multi-byte emojis. CachedTextBreakIterator
takes a StringView, but we were always doing the far less efficient thing of combining the text from all AXTextRun structs
into a single String, requiring multiple allocations along the way. This is a major performance problem on webpages that
have lots of text (e.g. 40 seconds of a 60 second sample was spent building and allocating strings underneath findMarker).

We had to do it this way because CachedTextBreakIterator needs the full string from all runs, but because each AXTextRun
owned its own string, we couldn't just naively use StringView here, because you can't combine disparate StringViews (one
for each AXTextRun) into one larger StringView (that's not how StringViews work).

Solve the issue by making AXTextRuns own a singular string for all runs, and add start and end index fields to AXTextRun
that point into this string. This allows us to use StringViews in many more places, which this commit does.

* Source/WebCore/accessibility/AXLogger.cpp:
(WebCore::streamTextRuns):
* Source/WebCore/accessibility/AXTextMarker.cpp:
(WebCore::AXTextMarkerRange::toString const):
(WebCore::AXTextMarker::findMarker const):
(WebCore::AXTextMarker::findParagraph const):
(WebCore::AXTextMarker::findWordOrSentence const):
* Source/WebCore/accessibility/AXTextRun.cpp:
(WebCore::AXTextRuns::debugDescription const):
(WebCore::AXTextRuns::localRect const):
(WebCore::AXTextRuns::substring const): Deleted.
* Source/WebCore/accessibility/AXTextRun.h:
(WebCore::AXTextRun::AXTextRun):
(WebCore::AXTextRun::length const):
(WebCore::AXTextRuns::AXTextRuns):
(WebCore::AXTextRuns::at const):
(WebCore::AXTextRuns::operator[] const):
(WebCore::AXTextRuns::runLength const):
(WebCore::AXTextRuns::lastRunLength const):
(WebCore::AXTextRuns::lineID const):
(WebCore::AXTextRuns::toString const):
(WebCore::AXTextRuns::toStringView const):
(WebCore::AXTextRuns::substring const):
(WebCore::AXTextRuns::runString const):
(WebCore::AXTextRuns::runStringView const):
(WebCore::AXTextRuns::runStartsWithLineBreak const):
(WebCore::AXTextRuns::runEndsWithLineBreak const):
(WebCore::AXTextRun::debugDescription const): Deleted.
(WebCore::AXTextRun::startsWithLineBreak const): Deleted.
(WebCore::AXTextRun::endsWithLineBreak const): Deleted.
(WebCore::AXTextRun::characterAt const): Deleted.
(WebCore::AXTextRun::textLength const): Deleted.
* Source/WebCore/accessibility/AccessibilityRenderObject.cpp:
(WebCore::AccessibilityRenderObject::textRuns):
* Source/WebCore/accessibility/cocoa/AXTextMarkerCocoa.mm:
(WebCore::AXTextMarkerRange::toAttributedString const):

Canonical link: https://commits.webkit.org/296681@main
@webkit-commit-queue webkit-commit-queue force-pushed the eng/AX-AXTextMarker-findMarker-a-frequently-called-function-can-copy-all-the-text-runs-strings-causing-performance-issues-on-pages-with-lots-of-text branch from aff5d98 to d05070c Compare June 26, 2025 20:39
@webkit-commit-queue
Copy link
Collaborator

Committed 296681@main (d05070c): https://commits.webkit.org/296681@main

Reviewed commits have been landed. Closing PR #47209 and removing active labels.

@webkit-commit-queue webkit-commit-queue merged commit d05070c into WebKit:main Jun 26, 2025
@webkit-commit-queue webkit-commit-queue removed the merge-queue Applied to send a pull request to merge-queue label Jun 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Accessibility For bugs related to accessibility.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants