Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sendHistogramReport is not an accurate name #44

Closed
yoavweiss opened this issue May 16, 2023 · 7 comments
Closed

sendHistogramReport is not an accurate name #44

yoavweiss opened this issue May 16, 2023 · 7 comments

Comments

@yoavweiss
Copy link
Contributor

IIUC, sendHistogramReport doesn't in fact send a report, but queues up a (single) PAHistogramContribution.

As such, we may want to consider renaming it, e.g. to aggregateHistogramContribution or appendHistogramContribution.

Aside: would it make sense to enable it to get an array of PAHistogramContributions as input?

@alexmturner
Copy link
Collaborator

Changing makes definitely sense. Wdyt about contributeToHistogram()? This would avoid having to make it plural if we allow an array as input as well.

alexmturner added a commit that referenced this issue May 19, 2023
Changes sendHistogramReport() to contributeToHistogram() and
reportContributionForEvent() to contributeToHistogramOnEvent().
Functionality is otherwise unchanged.

See issue #44.
alexmturner added a commit that referenced this issue May 23, 2023
Changes sendHistogramReport() to contributeToHistogram() and
reportContributionForEvent() to contributeToHistogramOnEvent().
Functionality is otherwise unchanged.

See issue #44.
@alexmturner
Copy link
Collaborator

(Leaving open for the array suggestion)

@xottabut
Copy link

Hi Alex and Yoav, thanks for selecting the better name for API functions.

I just wanted to create a new issue with proposal to add the method that accepts multiple keys/values, but then I noticed that this issue left open for the array suggestion.

The proposal is that browser sends multiple reports to ad tech side in one request when those request passed in one API method call.

Our use case:

  • We have multiple report and time grains, that requires to send many reports at the time (in some cases might be > 100).

Benefits:

  • It will be more optimized for both user and ad tech side as less requests will be sent out.

To preserve privacy:

  • random number (up to some X) of "null" reports could be added to the batch
  • aligning the size of the batched report to some extent, so it's difficult to reverse-engineer the number of reports in one batch
  • browser might split batch in few smaller requests instead of sending one request, this also makes it hard to reverse-engineer the number of reports in one batch

Alex, what would be your thoughts about this proposal?

@xottabut
Copy link

xottabut commented Jun 2, 2023

Just found the next information about the batching:
https://github.com/patcg-individual-drafts/private-aggregation-api#reducing-volume-by-batching

This is something that would work for us. The only concern for me is truncating to the 20 reports per batch, another proposed option with splitting bigger batches into few requests each with 20 reports max per one request looks better.

Is there any information what is the status of this proposal? Will it be implemented?

@alexmturner
Copy link
Collaborator

Hi @xottabut! Sorry for the delay in responding.

Yes, as you saw, our current design involves consolidating/batching any contributions in a "batching scope" together into one report. We're still considering allowing an array to be passed to contributeToHistogram(), but this would just be syntactic sugar -- the effects would be identical to multiple calls each with one contribution.

This batching plan is currently implemented in Chrome and available under an Origin Trial. You can find some developer documentation here (although I don't believe it discusses this batching): https://developer.chrome.com/docs/privacy-sandbox/private-aggregation/. You can also find some more discussion of the design here: #32. But basically, the limit of 20 came to mitigate the size increase that will be introduced with padding.

However, we're definitely looking feedback on the design, for example about:

  • the appropriate value for the maximum number of contributions per report
  • the appropriate behavior if there are more contributions than that value

It sounds like if we moved from truncation to splitting the contributions into multiple reports each with 20 (or fewer) contributions, that would work for you -- is that correct?

Can you also share more details around your specific use case?

Thanks for the feedback!
Alex

@xottabut
Copy link

xottabut commented Jun 5, 2023

Hi @alexmturner , thank you very much for providing more details.

Yes, splitting them into multiple batches instead of truncating would work for us.
Example of the use case is to have 6-month reports on the daily basis.
With the current state of the API this requires to have a separate key for each day. And on any day the ad event will contribute at maximum to ~180 keys (situation when given ad is happening first time for the given chrome instance and chrome profile).

Based on the calculations you provided in #32 (~1.2 kB for 50 contributions and ~500 B for 20 contributions) having 50 contributions per one request instead of 20 would work better for us.

Here I have another question: since report is scheduled to be sent with a delay, will there be any batching for reports that were created from different worklet executions (i.e. different ad events)?

  • if no: is there any plans to introduce it?
  • if yes: what is the logic behind batching?

Summarizing for our use case with X duration on a daily basis reports:

  • splitting is required (with truncating we will lose reports)
  • batching of 50 is better (but 20 will still work, just producing more requests from user side to ad-tech)

Anatolii

@yoavweiss
Copy link
Contributor Author

I opened #70 to tackle the "should we accept an array of contributions" question.
@xottabut - If there are still open questions around batching, I think it'd be better to open a new issue specific to those questions.

Otherwise, I think we can close this to avoid confusion, as the OP issue was resolved with #48

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants