Differential Privacy: The Future of Advertising & Marketing

The pressure for responsible data privacy in marketing and advertising is certainly the most important sea change in media since the beginnings of social. Interestingly, social media is probably most responsible for the awareness of data practices, some malicious and some merely unfortunate, that began the swell of consumer data regulations.

We now know we are only at the beginning of what will be continued restrictions around data collection, sharing and availability. Not only is the cookie, the main identifier for much of ad targeting, being deprecated, but working groups are in place to remove mobile ad identifiers and even the User Agent, a foundational piece of web measurement and analytics that predates use of the cookie.

For brands and the technology that supports marketing, ‘privacy-by-design’ has become an essential product component. So too is the ability to see what a privacy-centric world looks like above and beyond current GDPR. This is an issue for many. The industry was seemingly caught off guard by Google’s announcement that it would kill third-party cookies in Chrome, and it has been reactive and reluctant toward legislative efforts.

Without a privacy compliant foundation, many adtech and martech tools have no choice but to productize guardrails and incremental restrictions for data privacy. Rather than looking at where we are today and trying to solve the problem step by step, companies must solve it by starting at the end point: the most privacy compliant and data secure state possible. That end point seems more and more to be the set of mechanisms and theorems called ‘differential privacy.’ The initial applications built from differential privacy for advertising and marketing are being called ‘clean rooms.’ These are terms the industry will become familiar with in the next few years.

It is important to know the foundations of differential privacy:

  • Noise and inaccuracies are algorithmically injected into data to ensure privacy against adversaries. This prevents re-identification and neutralizes ID linkage.
  • A number of different algorithms can power these systems, but importantly, privacy loss can be quantified, measured and tuned via epsilon values (a smaller epsilon means more noise and stronger privacy).
  • The data is accessed through queries, and the results are immune to post-processing. This means an analyst cannot weaken the privacy guarantee by further computation on the query results, no matter what additional information is available outside the dataset.
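
To make the three points above concrete, here is a minimal sketch of the Laplace mechanism, one standard way to answer a counting query under differential privacy. It is an illustration only, not the method used by any platform named in this post. The function names (`laplace_noise`, `private_count`, `report`) and the toy `customers` data are hypothetical, and a production system would need a cryptographically secure noise source, which this sketch does not have.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent exponentials with mean `scale`
    # is Laplace-distributed with that scale.
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon, rng):
    """Answer 'how many records match?' with epsilon-differential privacy."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1,
    # so the sensitivity of this query is 1.
    sensitivity = 1.0
    # Smaller epsilon -> larger noise scale -> stronger privacy.
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def report(noisy_count):
    """Post-processing: clamp and round the noisy answer for display.

    This step uses only the noisy output, so it cannot weaken the guarantee.
    """
    return max(0, round(noisy_count))


# Hypothetical toy dataset: 1,000 customers, every third one made a purchase.
customers = [{"id": i, "bought": i % 3 == 0} for i in range(1000)]

rng = random.Random(7)  # seeded only so the demo is repeatable
for epsilon in (0.1, 1.0, 10.0):
    noisy = private_count(customers, lambda r: r["bought"], epsilon, rng)
    print(f"epsilon={epsilon}: reported count = {report(noisy)}")
```

Running the loop shows the trade-off: at a small epsilon the reported count can drift well away from the true value, while at a large epsilon it stays close but offers much weaker protection. Each query also spends privacy budget, so a real system would track the total epsilon used across queries.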

While work on differential privacy is sixteen years old, use in production applications is fairly nascent. This is about to change. Differential privacy’s largest general use case to date will be the 2020 US Census. It is also worth noting that differential privacy has been the core methodology for Apple in iOS (though not without some early concerns about its epsilon values). There is a lot of documentation on the web about differential privacy, including open source libraries from Google and the TensorFlow project.

Differential privacy forms the computational foundation for the clean room/private algorithms that are emerging from Google’s Ads Data Hub, Amazon Marketing Cloud (currently in closed beta), Facebook and start-ups like InfoSum, all of which are building and providing clean room services for marketers, publishers and brands.

At the highest level, the backing of the platforms will undoubtedly help establish the viability of differential privacy, and they will lobby regulators to recognize these methods. Differential privacy should pass the rigorous standards of data privacy being legislated now for advertising and marketing because it creates privacy through randomness and only outputs cohorts. This would seem to handle any compliance issues around targeting individuals or needing a unique identifier, though those important specifics remain in question and likely will for some time. Just as important, it safeguards data ownership privileges and the enterprise value attached to the transaction/exchange between brand and consumer.

At another level, the computational costs can be very large, and the platforms can leverage their size advantage to provide privacy solutions to the market. This should raise some concerns, since the core advertising use cases bring the platforms even more customer data than they already have. Interesting times.

One of the things I’m most excited about with differential privacy is that it is NOT about 1:1 marketing. The idea of personalization has gone much farther than it should have for marketers, leading to the issues facing ad tech today. Simply put, 1:1 never delivered on its promise. A single person’s behaviors are unpredictable, change constantly with context and comprise a data set too small to make accurate predictions from. This is why display advertising has delivered irrelevant ads for the better part of 20 years. Anyone who has done A/B testing knows that segments and cohorts (or what media calls audiences) need enough data to be statistically significant. In addition, as machine learning becomes more widespread in marketing, data volume plays a critical role in developing accurate predictive models. Without lots of high quality data, which is generally unavailable at a personal level, systems are incapable of driving “personalization” and 1:1 targeting. Ironically, differential privacy might just take personalization to levels of ROI it never reached with the cookie, though to be clear, how this data is keyed and activated is still a work in progress.

As use cases get better defined among applications (attribution, for example) and as feature sets and graphs become better understood (by vertical, for example), these privacy-centric algorithms will evolve and improve over time. It is easy to see how, in five years, these methods will not only be essential, but the organizations that have the capacity to develop and tune their algos to increase the value of their result sets will gain increasing returns. Be that a digital platform, a retailer, a brand or a marketer, as with all ML there is a first mover advantage.

In the vein of software eating the world, one other key change is the movement of marketing and advertising apps away from vendor-based point solutions and into larger scale, brand owned or rented cloud infrastructure. While not every organization will be able to afford or operate clean rooms and establish an org to build applications on top of the datasets, those that do will have distinct advantages. It’s easy to see how CDPs will connect to a clean room and how those connections increase the value of each. Advances in differential privacy create new dynamics in data sharing and, along with them, some questions yet to be answered around the data management processes and controls that are essential to compliance.

For now, the organization that owns the clean room makes the rules. But all data providers in the room where it happens need to mutually benefit from sharing. For example, the benefits of a CPG company having a clean room where retailer data can be joined and queried are clear. It is also clear how the retailer can benefit by having a clean room where CPGs can join data and query. And what about competitive data or rights to data models? Expect to see a number of new start-ups and solutions building tools for Private Set Intersection and Secure Multi-Party Computation in the coming years as these questions gain importance.

To be sure, many more questions than these will emerge in the coming years, as will different attempts to control/comply with consumer privacy in advertising and marketing. I believe differential privacy will provide the best foundation since it has a formal notion of privacy that can be agreed upon. This is no easy task. In addition, DP brings maturity of research, open-source development and several real-world, highly scaled deployments. After years of uncertainty about how to solve consumer privacy, the differential privacy solutions now being proposed are future proof and the most sensible starting point for building applications.

Continued investments in differential privacy will create the infrastructure and tooling for marketing/advertising going forward. Rather than meeting an apocalyptic end because the rights of consumers to ownership of their data were ignored, differential privacy serves as a consumer-centric foundation that will continue to be improved upon and will balance the needs of all parties over time. Let the future begin.





One response to “Differential Privacy: The Future of Advertising & Marketing”

  1. Puneet Gangrade

    Thanks, Jonathan, for sharing this. It’s insightful. I was checking the ADH documentation and it does not mention the use of ‘differential privacy’ anywhere in ADH. Instead, it uses the term ‘difference checks’. This is a little different from differential privacy because it only ‘checks’ and does not add the ‘random noise’. This randomness is the key piece of a differentially private mechanism. What do you think?

