Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explainer and spec update for padding #186

Merged
merged 13 commits into from
Jun 1, 2023

Conversation

xyaoinum
Copy link
Collaborator

@xyaoinum xyaoinum commented May 26, 2023

  1. Update the header format, and include the padding algorithm, per discussion in Sec-Browsing-Topics header format seems a bit awkward #183.
  2. Reorder the comparator conditions of BrowsingTopic (i.e. "version" first, and then "topic id"). This aligns with the current implementation.

@xyaoinum
Copy link
Collaborator Author

@domenic PTAL. Thanks! (Thanks for filing the issues giving suggestions to the spec. I will address them later, but this patch is a behavior change and we'd like to add it to the spec first)

@jkarlin PTAL. Thanks!

@xyaoinum xyaoinum requested a review from jkarlin May 26, 2023 20:20
Copy link

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems closely related to my open issue of #183. Although I am glad you are changing the format to be a bit less problematic, I think the core issue remains: the desire to not be explicit about the version association.

Have you considered just always adding the version, to every topic? That makes parsing much easier, and if I understand this correctly, it also would make the padding less necessary. (Although still necessary in some cases.)

spec.bs Outdated
<h2 id="topics-for-caller-header">Topics for caller</h2>
<div algorithm>
To <dfn>calculate the topics for caller</dfn>, given a [=topics caller context=] |callerContext|, perform the following steps. They return a list of {{BrowsingTopic}}s.
To <dfn>calculate the topics for caller</dfn>, given a [=topics caller context=] |callerContext|, perform the following steps. They return a list of {{BrowsingTopic}}s
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the period was good, keep the period :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

spec.bs Outdated
1. Insert |topicItem| to |topicsInnerList|.
1. Let |maxNumberOfEpochs| be 3 (i.e. topics are selected from the last 3 epochs).
1. Let |topicMaxLength| be 3 (i.e. [=browsing topics types/topic id=] has maximum 3 digits).
1. Let |versionMaxLength| be 13 (i.e. chrome.1:1:11).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "i.e." isn't clarifying to me. Why is 13 related to chrome.1:1:11? What does this mean for user agents with long names, like, say, internetexplorer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified -- switched to rely on the current [=browsing topics types/version=].

spec.bs Outdated
1. Let |innerListItemsSeparatorLength| be 1 (i.e. structured fields use a single character (' ') to separate inner list items).
1. Let |maxPaddingLength| be <code>|maxNumberOfEpochs| * |topicMaxLength| + |numVersionsInEpochs| * (|versionMaxLength| + |paramsSeparatorLength| + |paramKeyLength|) + (|maxNumberOfEpochs| - 1) * |innerListItemsSeparatorLength|</code>.

Note: the numbers used in the step above are specific to one of Chrome's implementation. Implementers should use numbers compatible with their configuration.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's write the spec to fit any user agent. Putting on my hat as another implementer, it's not clear which of these steps are applicable to me and which aren't.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Some are switched to use the configuration params, some are intended to be fixed constant.

spec.bs Outdated
<div class="example">
One returned topics, and underlying epochs have two different versions:

<code>t=(1;v=chrome.1:1:2 2), p=P0000000000000000000000000</code>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only see one version here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. (Removed. As example are less applicable given we'd always expose all versions.)

@xyaoinum
Copy link
Collaborator Author

@domenic I switched to always expose the version per your suggestion (although still would like to see Josh's position), and addressed your other comments. PTAL again. Thanks!

Copy link

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the move to the simpler format. We can continue discussing other ideas in #183.

Current significant blocker is that tokens cannot contain :.

spec.bs Outdated
1. For each |topic| in |topics|:
1. Let |digitsInTopic| be the number of digits in |topic|["{{BrowsingTopic/topic}}"].
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. Let |digitsInTopic| be the number of digits in |topic|["{{BrowsingTopic/topic}}"].
1. Let |digitsInTopic| be the number of base-10 digits in |topic|["{{BrowsingTopic/topic}}"].

just for extra clarity

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

README.md Outdated
@@ -71,9 +71,9 @@ The topics will be inferred by the browser. The browser will leverage a classifi
* The request header will not modify state for the caller unless there is a corresponding response header. That is, the topic of the page won't be considered observed, nor will it affect the user's topic calculation for the next epoch.
* The response header will only be honored if the corresponding request included the topics header (or would have included the header if it wasn't empty).
* The registrable domain used for topic observation is that of the url of the request.
* Example request header: `Sec-Browsing-Topics: 123;v=chrome.1:1:2, 2`
* Example request header: `Sec-Browsing-Topics: t=(123;v=chrome.1:1:2 2;v=chrome.1:1:2), p=P000000000000000000000000`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I realized that : is not allowed in structured field tokens. Probably you need to change them to be strings instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the spec does allow ":" and "."? https://www.rfc-editor.org/rfc/rfc8941.html#section-3.3.4

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for missing this; you're right!

spec.bs Outdated
1. Let |topicMaxLength| be number of digits in the maximum [=browsing topics types/topic id=] (e.g. for <a href="https://github.com/patcg-individual-drafts/topics/blob/main/taxonomy_v1.md">Chrome's initial taxonomy</a>, |topicMaxLength| is 3, as the [=browsing topics types/topic id=] has maximum 3 digits).
1. Let |versionMaxLength| be the length of the current [=browsing topics types/version=].
1. Let |innerListItemsSeparatorLength| be 1 (i.e. structured fields use a single character (' ') to separate inner list items).
1. Let |maxPaddingLength| be <code>|maxNumberOfEpochs| * (|topicMaxLength| + |versionMaxLength| + |paramsSeparatorLength| + |paramKeyLength|) + (|maxNumberOfEpochs| - 1) * |innerListItemsSeparatorLength|</code>.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
1. Let |maxPaddingLength| be <code>|maxNumberOfEpochs| * (|topicMaxLength| + |versionMaxLength| + |paramsSeparatorLength| + |paramKeyLength|) + (|maxNumberOfEpochs| - 1) * |innerListItemsSeparatorLength|</code>.
1. Let |maxPaddingLength| be |maxNumberOfEpochs| * (|topicMaxLength| + |versionMaxLength| + |paramsSeparatorLength| + |paramKeyLength|) + (|maxNumberOfEpochs| - 1) * |innerListItemsSeparatorLength|.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

spec.bs Outdated
1. Insert |topicItem| to |topicsInnerList|.
1. Let |maxNumberOfEpochs| be 3 (i.e. topics are selected from the last 3 epochs).
1. Let |topicMaxLength| be number of digits in the maximum [=browsing topics types/topic id=] (e.g. for <a href="https://github.com/patcg-individual-drafts/topics/blob/main/taxonomy_v1.md">Chrome's initial taxonomy</a>, |topicMaxLength| is 3, as the [=browsing topics types/topic id=] has maximum 3 digits).
1. Let |versionMaxLength| be the length of the current [=browsing topics types/version=].
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the "current" one? [=browsing topics types/version=] is only defined given the three inputs configurationVersion, taxonomyVersion, and modelVersion.

Probably the best solution here is to move this definition to https://patcg-individual-drafts.github.io/topics/#terminology-and-types-header to make it clear you're trying to figure out a user-agent-wide maximum. E.g. something like

The maximum version string length is the maximum possible string length of a [=browsing topics types/version=] that a user agent could possibly generate in a given browsing session.

(or maybe in a given "software release"??)

and then adding an example would be even nicer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

1. [=Set a structured field value=] given ([:Sec-Browsing-Topics:], |headerStructuredFields|) in |request|'s [=request/header list=].

Note: In Chrome's experimentation phase, it will additionally require a valid <a href="https://github.com/GoogleChrome/OriginTrials/blob/gh-pages/explainer.md">Origin Trial</a> token to exist in |initiatorWindow|'s <a data-cite="!HTML#concept-document-window">associated document</a> for the request to be eligible for topics.
<div class="note">
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible idea for spec-writing, feel free to disregard: you could try to generate the padding length by first generating an artificial maximum-length header value, then generating a first draft of your real header value, and then subtracting their serialized lengths. It's inefficient, but it's perhaps easier to understand for the reader, and the implementation could do the more efficient thing.

(You could maybe even generate the artificial maximum-length header value ahead of time, in the terminology section or something?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I incorporated your first suggestion. (I kept the generation of maximum-length inline as I feel it'd give more context here.)

@xyaoinum xyaoinum requested a review from domenic June 1, 2023 04:15
Copy link

@domenic domenic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nits and one suggestion for clarity improvement, which you can either do via another review pass or via a followup PR (since I imagine it might be important to get the normative changes settled quickly).

I think the current algorithm is efficient but hard to read. The reuse of currentTopicsInnerList, and the way it relies on the topics that come back being sorted, makes it quite stateful. I'd suggest an easier to read two-pass algorithm like so:

  1. Let versionsToTopics be an ordered map.
  2. For each topic of topics:
    1. Let version be topic["version"].
    2. Let topicInteger be topic["topic"].
    3. If versionsToTopics[version] does not exist, set it to an empty list.
    4. Append topicInteger to versionsToTopics[version].
  3. Let topicsStructuredFieldsList be an empty Structured Fields List.
  4. For each versiontopicIntegers of versionsToTopics:
    1. Let innerList be an empty Structured Fields Inner List.
    2. Append all items from topicIntegers to innerList.
    3. Let topicParameters be an empty Structured Fields Parameters.
    4. Set topicParameters["v"] to a Structured Field Token having value version.
    5. Associated topicParamters with innerList.
    6. Append innerList to topicsStructuredFieldsList.

spec.bs Outdated Show resolved Hide resolved
spec.bs Outdated Show resolved Hide resolved
spec.bs Outdated Show resolved Hide resolved
spec.bs Outdated Show resolved Hide resolved
spec.bs Outdated Show resolved Hide resolved
spec.bs Outdated Show resolved Hide resolved
@xyaoinum
Copy link
Collaborator Author

xyaoinum commented Jun 1, 2023

Thanks @domenic! Landing this now. I will create followup PR for the clarity improvement and other feedback if any.

@xyaoinum xyaoinum merged commit 9d8fec1 into patcg-individual-drafts:main Jun 1, 2023
github-actions bot added a commit that referenced this pull request Jun 1, 2023
SHA: 9d8fec1
Reason: push, by xyaoinum

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions bot added a commit to durangoe/topics that referenced this pull request Jun 2, 2023
SHA: 9d8fec1
Reason: push, by pull[bot]

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions bot added a commit to admariner/topics-API that referenced this pull request Jun 2, 2023
SHA: 9d8fec1
Reason: push, by pull[bot]

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
xyaoinum added a commit to xyaoinum/topics that referenced this pull request Jun 6, 2023
@xyaoinum xyaoinum mentioned this pull request Jun 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants