From 18e1374c0fb658f209e633145a74c9374ca89545 Mon Sep 17 00:00:00 2001
From: Josh Karlin
Date: Tue, 2 Mar 2021 14:22:56 -0500
Subject: [PATCH] Specify registrable domain

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 3d21180..1fec882 100644
--- a/README.md
+++ b/README.md
@@ -59,7 +59,7 @@ As a first step toward implementing FLoC, browsers will need to perform closed e
 For this initial phase of Chrome’s Proof-Of-Concept, simple client-side methods will be used to calculate the user’s cohort based on all of the sites that they visit with public IP addresses. The qualifying subset of users who meet the criteria described below will have their cohort temporarily logged with their sync data to perform the sensitivity analysis by Chrome described below. The collection of cohorts will be analyzed to ensure that cohorts are of sufficient size and do not correlate too strongly with known [sensitive categories](https://support.google.com/adspolicy/answer/143465?hl=en). Cohorts that don’t pass the test will be concealed by the browser in any subsequent phases.
 
 ### How the Interest Cohort will be calculated
-This is where most of the experimentation will occur as we explore the privacy and utility space of FLoC. Our first approach involves applying a SimHash algorithm to the domains of the sites visited by the user in order to cluster users that visit similar sites together. Other ideas include adding other features, such as the full path of the URL or categories of pages provided by an on-device classifier. We may also apply federated learning methods to estimate client models in a distributed fashion. To further enhance user privacy, we will also experiment with adding noise to the output of the hash function, or with occasionally replacing the user's true cohort with a random one.
+This is where most of the experimentation will occur as we explore the privacy and utility space of FLoC. Our first approach involves applying a SimHash algorithm to the registrable domains of the sites visited by the user in order to cluster users that visit similar sites together. Other ideas include adding other features, such as the full path of the URL or categories of pages provided by an on-device classifier. We may also apply federated learning methods to estimate client models in a distributed fashion. To further enhance user privacy, we will also experiment with adding noise to the output of the hash function, or with occasionally replacing the user's true cohort with a random one.
 
 ### Qualifying users for whom a cohort will be logged with their sync data
 For Chrome’s POC, cohorts will be logged with sync in a limited set of circumstances. Namely, all of the following conditions must be met: