PARAKEET with Noisy Ranking: Advertiser isolation #47
Comments
Shared storage representation, while scoped to the buyer, allows the buyer to build representations of their choosing. Hence the buyer (exactly like today) can build representations at the advertiser level and use those as part of the relevance evaluation within the worklet provided to the TM. For example, the buyer can set the list of advertiser domains, which might be used later for something like boosting bids for advertisers the user has seen previously. What isn't explicit here is that per-advertiser models could be run as part of that worklet at the TM. The work to do this is mostly implementation and API type specifications: all of the data is in place within the TM to make that evaluation possible. In contrast, the noisy ranking vector sent from the TM to the buyer at request time likely shouldn't be advertiser specific. In theory, a DSP can choose what representation they want, but either they have to choose only a single advertiser to pass through the embedding model, or they have to somehow concatenate a bunch together. In either case, the amount of noise needed for privacy will likely muddy any per-advertiser signal, with the exception perhaps of a couple of extremely large cases. A more workable approach is for the more general noised representation to be sent; the DSP then stratifies a variety of advertisers to send back, and lets the TM handle the per-advertiser representation cases.
I see how the buyer could satisfy the isolation requirement by choosing a single advertiser to pass through the embedding model, stratifying ad selection and finally filtering. This, however, seems inefficient: a niche advertiser in a crowded market would have very few ads in the mix, if any, and even then only if it is the one selected to pass through the embedding model. Another solution allowed by PARAKEET with Noisy Ranking would be for a buyer to deploy a private instance of its system for the advertiser, so that it would act as its own buyer. That would solve all the problems related to sharing any quotas with other advertisers. However, it has another flaw: it multiplies the number of embedding models trained on the TM. If the buying system provider deploys a single embedding model for different advertisers’ private instances, they will be trained separately and diverge. Maybe there is a middle ground? What do you think about allowing buyers to put advertisers into groups, isolating them for all purposes except for the embedding models? That could satisfy the advertiser isolation requirement without the inefficiency of a stratified ad selection. As for privacy, this should be no worse than a DSP deploying a private instance of their service for a subset of advertisers under a new domain.
Thought we were going to cover this in the meeting, but apparently it was missed. Ideally, and this is the plan, the training / evaluation module is robust enough to support this as is, if the buyer wants it. E.g., you have K different "submodels" that run for different subgroups of advertisers (similar to the example which has 2 models but for different feature sets). When you train, the script identifies which submodel should be triggered for a sample based on, say, some addomain, and so the loss is only computed on that part of the sample. When you evaluate, the evaluator returns the vectors from all the models, say concatenated together. A hacked-together class might look something like this:
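As a rough illustration of the K-submodel idea described above, here is a minimal pure-Python sketch. It is not the commenter's code and not the PARAKEET training API: the class name `GroupedEmbeddingModel`, the linear submodels, the squared-error loss, and the example domains are all assumptions chosen to keep the example small. It shows only the two behaviours mentioned: training touches the one submodel selected by the sample's ad domain, and evaluation returns every submodel's vector concatenated.

```python
import random


class GroupedEmbeddingModel:
    """K linear submodels, one per advertiser group (hypothetical sketch)."""

    def __init__(self, group_of_domain, num_groups, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.group_of_domain = group_of_domain  # ad domain -> submodel index
        # One (out_dim x in_dim) weight matrix per submodel.
        self.weights = [
            [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
             for _ in range(out_dim)]
            for _ in range(num_groups)
        ]

    def _forward(self, k, features):
        # Output of submodel k: one dot product per output dimension.
        return [sum(w * x for w, x in zip(row, features))
                for row in self.weights[k]]

    def train_step(self, features, target, ad_domain, lr=0.1):
        """One squared-error SGD step on the submodel chosen by ad_domain.

        The loss is computed on that submodel only, so the other
        submodels' weights are left untouched. Returns the loss
        measured before the update.
        """
        k = self.group_of_domain[ad_domain]
        errors = [p - t for p, t in zip(self._forward(k, features), target)]
        for row, err in zip(self.weights[k], errors):
            for j, x in enumerate(features):
                row[j] -= lr * 2.0 * err * x
        return sum(e * e for e in errors)

    def evaluate(self, features):
        """Return the vectors of all submodels, concatenated in group order."""
        out = []
        for k in range(len(self.weights)):
            out.extend(self._forward(k, features))
        return out


# Usage with two hypothetical advertiser groups.
model = GroupedEmbeddingModel(
    {"shoes.example": 0, "cars.example": 1},
    num_groups=2, in_dim=3, out_dim=2,
)
before = [[row[:] for row in matrix] for matrix in model.weights]
first_loss = model.train_step([1.0, 0.0, 2.0], [0.5, -0.5], "cars.example")
second_loss = model.train_step([1.0, 0.0, 2.0], [0.5, -0.5], "cars.example")
vector = model.evaluate([1.0, 0.0, 2.0])  # length is num_groups * out_dim
```

In this sketch the groups share nothing but the input features; whether the noise budget for the concatenated vector makes this practical is the open question raised earlier in the thread.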
If I understand correctly, this is how the solutions mentioned so far compare
If this is accurate, D does not seem to be an improvement over B. Is this comparison fair, or have I missed something?
An advertiser might not want to provide signals to be potentially used to their competitors’ advantage. To satisfy the advertiser’s needs, the buyer could try to isolate users’ data and compute embeddings separately. During an auction, such isolated advertisers should be considered independently, even if represented by the same buyer.
PARAKEET with Noisy Ranking seems to lack mechanisms to support that, with the following scoped per buyer:
What do you think about this use case?