-
Notifications
You must be signed in to change notification settings - Fork 9.5k
core: add entity classification of origins to the LHR #14622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-authored-by: Connor Clark <[email protected]>
Added to-do item: rename |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this LGTM with a few nits + the rename
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once the property has been renamed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! A couple of comments but otherwise LGTM
* @return {boolean} | ||
*/ | ||
function isFirstParty(url) { | ||
return entityByUrl.get(url) === firstParty; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
entityByUrl
not having a URL should be a bug, do you foresee a scenario?
it definitely feels like there should be a fallback or a loud error for this case, e.g. if I made this request, would it have been first/third party. I don't think there's anywhere where we do this, but I think it's very possible someone wouldn't notice if they added it in the future and this just always returns false
for that case.
A if (!entityByUrl.get(url)) throw new Error('...')
equivalent would probably be fine to keep things simple until it's actually needed.
@@ -17,7 +17,7 @@ describe('Third party summary', () => { | |||
const artifacts = { | |||
devtoolsLogs: {defaultPass: pwaDevtoolsLog}, | |||
traces: {defaultPass: pwaTrace}, | |||
URL: {finalDisplayedUrl: 'https://pwa-rocks.com'}, | |||
URL: {finalDisplayedUrl: 'https://pwa.rocks'}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happened here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seemed to be an incorrect entry, per progressive-app-m60.json
trace which has no hostname matching pwa-rocks.com
.
isUnrecognized?: boolean; | ||
} | ||
|
||
interface EntityClassification { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any way this can be moved into the file itself? Type dumping grounds make me sad, and since this isn't an artifact but a live function return value, seems better to have defined with the source?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is cross referenced across several audits and is there a cleaner way to share? LH.Artifacts.*
is where all computed audits that require sharing is defined. Example: LH.Artifacts.NetworkRequest
* rename entityClassification to entities in LHR. * update comment * Update proto def with LhrEntity similar to LhrCategory
Addressing #14440 and the design doc, this PR introduces
EntityClassification
computed artifact that classifies all 1st and 3rd party entities (recognized by third-party-web and unrecognized). The classification is exposed in LHR as a top-level propertyLHR.entities
. By and large, this achieves a few important things:item
s with their entities during audit time). Audits that don't resolve entities during audit, but feature aurl
in any form (url
, orsource-location
, etc.) will be resolved duringprepareReportResult
phase.This is an alternate implementation to #14614, based off discussion on the PR. This will attempt to do the same outcome as the original PR, but without classifying entities during most of the audits.
I've completed the checked items on this branch:
entity-classification
parts.entity-classification
audit itself (and tests), and the the audits that require to find 1st/3rd parties during audit (see below for a list)._computeTableItemEntities
function inDetailsRenderer
that (Now moved to report: use entity classification to filter third-parties #14697),TableColumnHeading.valueType === 'url'
and picks up the key to identify a URL to lookup entity for. This should cover all audits that include aurl
in any column (checks only the first level items).item.entity
during audit.source-location
initem.source
level.no-unload-listeners
geolocation-on-start
no-document-write
notification-on-start
uses-passive-event-listeners
deprecations
origin
corresponds tourl
but marked asvalueType: text
network-rtt
hidden
audit.network-server-latency
hidden
audit.Additional targets based on the meeting discussion:
entity-classification
available as an LHR level property, and remove it from being an audit.ComputedEntityClassification
for any audit that would like to classify entities during audit.item.entity
query-ability from LHR JSON.prepareReportResult
(Now moved to report: use entity classification to filter third-parties #14697).lhr.entityClassification
tolhr.entities
andlhr.entities.list
.Audits that make use of
ComputedEntityClassification
to resolve 1p vs. 3p during audit time:The following audits previously assumed that all unrecognized third-parties are first-parties.
valid-sourcemaps
third-party-summary
third-party-facades
(Marks entities on items during audit)legacy-javascript
Comparison to 14614: entity-based-3p...entity-based-3p-reportonly
I've tried to document all the audit results that deal with a URL and could use entity classification, but have differences in their output format, and requires special handling, in this document. It might not be exhaustive yet, but we could use it to collect them and discuss how to handle those differences.