Skip to content
This repository has been archived by the owner on Mar 16, 2023. It is now read-only.

Commit

Permalink
Specify registrable domain
Browse files Browse the repository at this point in the history
  • Loading branch information
jkarlin committed Mar 2, 2021
1 parent 0dbb92b commit 18e1374
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ As a first step toward implementing FLoC, browsers will need to perform closed e
For this initial phase of Chrome’s Proof-Of-Concept, simple client-side methods will be used to calculate the user’s cohort based on all of the sites that they visit with public IP addresses. The qualifying subset of users who meet the criteria described below will have their cohort temporarily logged with their sync data to perform the sensitivity analysis by Chrome described below. The collection of cohorts will be analyzed to ensure that cohorts are of sufficient size and do not correlate too strongly with known [sensitive categories](https://support.google.com/adspolicy/answer/143465?hl=en). Cohorts that don’t pass the test will be concealed by the browser in any subsequent phases.

### How the Interest Cohort will be calculated
This is where most of the experimentation will occur as we explore the privacy and utility space of FLoC. Our first approach involves applying a SimHash algorithm to the domains of the sites visited by the user in order to cluster users that visit similar sites together. Other ideas include adding other features, such as the full path of the URL or categories of pages provided by an on-device classifier. We may also apply federated learning methods to estimate client models in a distributed fashion. To further enhance user privacy, we will also experiment with adding noise to the output of the hash function, or with occasionally replacing the user's true cohort with a random one.
This is where most of the experimentation will occur as we explore the privacy and utility space of FLoC. Our first approach involves applying a SimHash algorithm to the registrable domains of the sites visited by the user in order to cluster users that visit similar sites together. Other ideas include adding other features, such as the full path of the URL or categories of pages provided by an on-device classifier. We may also apply federated learning methods to estimate client models in a distributed fashion. To further enhance user privacy, we will also experiment with adding noise to the output of the hash function, or with occasionally replacing the user's true cohort with a random one.

### Qualifying users for whom a cohort will be logged with their sync data
For Chrome’s POC, cohorts will be logged with sync in a limited set of circumstances. Namely, all of the following conditions must be met:
Expand Down

0 comments on commit 18e1374

Please sign in to comment.