Loosen the per-caller filtering #143

Merged

xyaoinum
Collaborator

No description provided.

README.md Outdated
@@ -81,7 +81,7 @@ The topics will be inferred by the browser. The browser will leverage a classifi
* The reason that each site gets associated with only one of the user's topics for that epoch is to ensure that callers on different sites for the same user see different topics. This makes it harder to reidentify the user across sites.
* e.g., site A might see topic ‘cats’ for the user, but site B might see topic ‘automobiles’. It’s difficult for the two to determine that they’re looking at the same user.
* The beginning of a week is per-user and per-site. That is, for the same user, site A may see the new week's topics introduced at a different time than site B. This is to make it harder to correlate the same user across sites via the time that they change topics.
-* Not every API caller will receive a topic. Only callers that observed the user visit a site about the topic in question within the past three weeks can receive the topic. If the caller (specifically the site of the calling context) did not call the API in the past for that user on a site about that topic, then the topic will not be included in the array returned by the API. The exception to this filtering is the 5% random topic, that topic will not be filtered.
+* Not every API caller will receive a topic. Only callers that observed the user visit a site about the topic in question within the past three weeks can receive the topic. If the topic in question was not observed by the caller, but one of its ancestor topics in the taxonomy hierarchy was observed by the caller, then the caller can also receive the closest ancestor topic that was observed by the caller. If the caller (specifically the site of the calling context) did not call the API in the past for that user on a site about that topic and its ancestor topics, then the topic will not be included in the array returned by the API. The exception to this filtering is the 5% random topic, that topic will not be filtered.
Collaborator

Perhaps leave this paragraph alone and instead add a separate bullet as a note: Note that observing a topic also includes observing the topics entire ancestry tree. For instance, observing /Arts & Entertainment/Humor/Live Comedy also counts as having observed /Arts & Entertainment/Humor/ and /Arts & Entertainment.

WDYT?
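
As an illustration of the suggested note, here is a minimal Python sketch of how observing one topic could be expanded to cover its ancestry. It assumes topics are represented as slash-separated taxonomy paths, as in the example above; the function names `ancestry` and `expand_observed` are hypothetical and are not part of the Topics API or any browser implementation.

```python
def ancestry(topic: str) -> list[str]:
    """Return the topic itself followed by its ancestors, nearest first.

    Assumes a slash-separated taxonomy path; a trailing slash is ignored.
    """
    parts = [part for part in topic.split("/") if part]
    return ["/" + "/".join(parts[:depth]) for depth in range(len(parts), 0, -1)]


def expand_observed(directly_observed: set[str]) -> set[str]:
    """Treat every directly observed topic as also observing its ancestors."""
    expanded: set[str] = set()
    for topic in directly_observed:
        expanded.update(ancestry(topic))
    return expanded


observed = expand_observed({"/Arts & Entertainment/Humor/Live Comedy"})
```

Under these assumptions, `observed` contains `/Arts & Entertainment/Humor/Live Comedy`, `/Arts & Entertainment/Humor`, and `/Arts & Entertainment`.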

Collaborator Author

Done.

README.md Outdated
@@ -82,6 +82,7 @@ The topics will be inferred by the browser. The browser will leverage a classifi
* e.g., site A might see topic ‘cats’ for the user, but site B might see topic ‘automobiles’. It’s difficult for the two to determine that they’re looking at the same user.
* The beginning of a week is per-user and per-site. That is, for the same user, site A may see the new week's topics introduced at a different time than site B. This is to make it harder to correlate the same user across sites via the time that they change topics.
* Not every API caller will receive a topic. Only callers that observed the user visit a site about the topic in question within the past three weeks can receive the topic. If the caller (specifically the site of the calling context) did not call the API in the past for that user on a site about that topic, then the topic will not be included in the array returned by the API. The exception to this filtering is the 5% random topic, that topic will not be filtered.
+* Note that observing a topic also includes observing the topics entire ancestry tree. For instance, observing `/Arts & Entertainment/Humor/Live Comedy` also counts as having observed `/Arts & Entertainment/Humor/` and `/Arts & Entertainment`.
Collaborator

s/topics/topic's/

Collaborator Author

Done.

@@ -82,6 +82,7 @@ The topics will be inferred by the browser. The browser will leverage a classifi
* e.g., site A might see topic ‘cats’ for the user, but site B might see topic ‘automobiles’. It’s difficult for the two to determine that they’re looking at the same user.
* The beginning of a week is per-user and per-site. That is, for the same user, site A may see the new week's topics introduced at a different time than site B. This is to make it harder to correlate the same user across sites via the time that they change topics.
* Not every API caller will receive a topic. Only callers that observed the user visit a site about the topic in question within the past three weeks can receive the topic. If the caller (specifically the site of the calling context) did not call the API in the past for that user on a site about that topic, then the topic will not be included in the array returned by the API. The exception to this filtering is the 5% random topic, that topic will not be filtered.
Collaborator

I think this PR also needs to update the Privacy and security considerations section. The statement, "There is one piece of information that the API reveals that goes beyond the capabilities of third-party cookies: that the topic returned is one of the top 5 browsing topics for the user for the given week." is now wrong. There are now two pieces of information: 1) the topic is one of the top 5 topics for the given week, and 2) if the topic returned is an ancestor of the topic the caller actually observed, the caller may learn that the user visited a page about the ancestor topic.
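
To make the second point concrete, the following Python sketch compares the original per-caller filter with the loosened one. It is a simplification under stated assumptions: topics are slash-separated paths, the three-week observation window is ignored, and `EpochTopic`, `filter_for_caller`, and the sample topics are hypothetical names chosen for illustration, not the browser's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EpochTopic:
    path: str                 # slash-separated taxonomy path (assumed representation)
    is_random: bool = False   # True for the 5% random topic, which is never filtered


def _with_ancestors(path: str) -> set[str]:
    """Return the path and all of its ancestors, ignoring a trailing slash."""
    parts = [part for part in path.split("/") if part]
    return {"/" + "/".join(parts[:depth]) for depth in range(1, len(parts) + 1)}


def filter_for_caller(epoch_topics, directly_observed, loosened=True):
    """Return the topic paths this caller may receive.

    loosened=False: only directly observed topics pass (behavior before this PR).
    loosened=True: ancestors of observed topics also count as observed.
    """
    observed = set()
    for path in directly_observed:
        observed |= _with_ancestors(path) if loosened else {path}
    return [t.path for t in epoch_topics if t.is_random or t.path in observed]


# The caller only ever saw the user on pages about Live Comedy.
caller_observed = {"/Arts & Entertainment/Humor/Live Comedy"}
# Illustrative top topics for the user in one epoch.
user_topics = [EpochTopic("/Arts & Entertainment/Humor"), EpochTopic("/Autos & Vehicles")]

before = filter_for_caller(user_topics, caller_observed, loosened=False)
after = filter_for_caller(user_topics, caller_observed, loosened=True)
```

Here `before` is empty, while `after` contains `/Arts & Entertainment/Humor`: the caller receives a topic it never directly observed, which is the additional information described above.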

Collaborator Author

Done.

@jkarlin
Collaborator

jkarlin commented Feb 23, 2023

lgtm

xyaoinum merged commit 86945ea into patcg-individual-drafts:main Mar 3, 2023
xyaoinum deleted the loosen-per-caller-filtering branch March 3, 2023 14:42
xyaoinum added a commit to xyaoinum/topics that referenced this pull request Mar 29, 2023
Update the spec to include <iframe> request header and relaxed filtering behavior. Those new behaviors were added after the spec was drafted.
patcg-individual-drafts#145
patcg-individual-drafts#143
3 participants