As a follow-up to Tom Phillips’s post on the marketing implications of our first implementation of Non-Invasive Causal Estimation (NICE), this post explains why estimating causal effects from observational data is both important and actionable. The NICE methodology is explained in our recently published paper, and the results presented in Tom’s post extend that analysis. I will present the paper’s findings at the 5th Annual International Workshop on Data Mining and Audience Intelligence for Online Advertising at this year’s KDD conference, the preeminent annual conference on data mining, which will be held in San Diego from August 21st to 24th.
Although billions of dollars a year are spent on digital display advertising, little has been done to quantify the effect of such advertising on customer behavior. As far as we can tell, even less has been done to inform business decisions based on those estimates. Are the high conversion rates seen for subsets of browsers the result of choosing to display ads to a group that has a naturally higher tendency to convert, or does the advertisement itself cause an additional lift? How does showing an ad to different segments of the population affect their tendency to take a specific action (that is, to convert)? By applying NICE methods, we can use existing observed data to estimate the effect of display advertising on customer behavior and to assess the impact of potential advertising decisions. Several examples of this were presented in the prior post.
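To make the first of these questions concrete, here is a minimal simulated sketch in Python. It is an illustration only, not the analysis from the paper: the single binary covariate, the targeting probabilities, the conversion rates, and the function names (simulate_campaign, naive_lift, stratified_lift) are all assumptions made up for this example. The ad is shown more often to browsers who already convert more often, so the raw difference in conversion rates overstates the true lift, while a comparison made within covariate levels recovers it.

```python
import numpy as np

def simulate_campaign(n=400_000, seed=0):
    """Simulate a hypothetical targeted campaign (all numbers are made up)."""
    rng = np.random.default_rng(seed)
    # w: 1 if the browser belongs to a naturally high-converting segment
    w = rng.binomial(1, 0.3, size=n)
    # a: 1 if the ad was shown; targeting favors the high-converting segment
    a = rng.binomial(1, np.where(w == 1, 0.8, 0.2))
    # y: 1 if the browser converted; the true causal lift of the ad is 0.01
    y = rng.binomial(1, 0.01 + 0.04 * w + 0.01 * a)
    return w, a, y

def naive_lift(a, y):
    """Raw difference in conversion rates, ignoring who was targeted."""
    return y[a == 1].mean() - y[a == 0].mean()

def stratified_lift(w, a, y):
    """Compare exposed and unexposed browsers within each level of w,
    then average the differences over the distribution of w."""
    total = 0.0
    for level in (0, 1):
        in_level = w == level
        diff = y[in_level & (a == 1)].mean() - y[in_level & (a == 0)].mean()
        total += in_level.mean() * diff
    return total

if __name__ == "__main__":
    w, a, y = simulate_campaign()
    print("naive lift:     ", naive_lift(a, y))
    print("stratified lift:", stratified_lift(w, a, y))
```

In this toy setting the naive figure is roughly three times the true lift of 0.01, purely because of who was chosen to see the ad. Real campaigns involve many covariates, which is where the estimation methods discussed below come in.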
In brief, by using observational methods like NICE, one does not have to incur the significant costs associated with A/B testing, which include:
• The cost of displaying PSAs to the control group (untreated group).
• The overhead cost of implementing A/B tests and ensuring that they are done correctly. This is a significant cost, and several papers have been written addressing the issue (see e.g. Kohavi, 2010).
• The cost associated with waiting for results, and the resulting delay in critical decision-making.
Fortunately for the display advertising community, we do not have to reinvent the wheel. Many statisticians, economists and other scientists have devoted years of research to improving causal estimation from observational data. In fact, prior to joining Media6Degrees, I was part of a research group at U.C. Berkeley whose sole focus was developing optimal methods for estimating causal effects. The attached paper presents several causal methods that fall under the NICE umbrella, and explains how those methods may be used to estimate the causal effect of display advertising. One particular class of estimation methods, and the one we employ here, is called targeted maximum likelihood estimation (TMLE), and it stands above the others in its ability to return both accurate and precise estimates of causal effects.
One of the major reasons that TMLE exhibits these robust properties is that all of its computation and heavy lifting is focused strictly on estimating the causal effect as well as possible. One might assume that this is true of most estimators, but it rarely is. In fact, several of the other methods were developed with an eye to how well they behave asymptotically, that is, in very large samples, with little concern for how they behave in finite samples (the case for real data). As a result, those methods tend to behave unreliably when estimating the causal effect of advertising (where conversion data is quite finite) and can even return probability estimates that do not fall between zero and one. Other methods are concerned with estimating the overall distribution as well as possible, rather than focusing on the optimal estimation of the effect (as TMLE does). These advantages of TMLE, our NICE method of choice, are further explored in the KDD paper.
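For readers who want to see the shape of the procedure, the following is a minimal sketch of the textbook TMLE steps for the average effect of a binary exposure on a binary outcome. It is not our production implementation, and it is not taken from the paper: the simulated data, the simple logistic models, the truncation bound, and the names simulate_campaign, fit_logistic and tmle_ate are all assumptions for illustration. The sketch fits an initial outcome model, fits a model for the probability of being shown the ad, and then updates the outcome model along a single "clever covariate" so that the final fit is targeted at the effect of interest. Because the update is made on the logit scale, the resulting probabilities stay between zero and one.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    return np.log(p / (1.0 - p))

def simulate_campaign(n=200_000, seed=1):
    """Hypothetical data: targeting depends on w; the true ad lift is 0.01."""
    rng = np.random.default_rng(seed)
    w = rng.binomial(1, 0.3, size=n)
    a = rng.binomial(1, np.where(w == 1, 0.8, 0.2))
    y = rng.binomial(1, 0.01 + 0.04 * w + 0.01 * a)
    return w, a, y

def fit_logistic(X, y, offset=None, iters=50):
    """Logistic regression by Newton-Raphson, with an optional fixed offset."""
    offset = np.zeros(len(y)) if offset is None else offset
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = expit(X @ beta + offset)
        grad = X.T @ (y - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def tmle_ate(w, a, y, bound=0.01):
    """Sketch of TMLE for E[Y | ad shown] - E[Y | ad not shown], adjusted for w."""
    n = len(y)
    ones, zeros = np.ones(n), np.zeros(n)
    # Step 1: initial outcome model Q(A, W) = P(Y = 1 | A, W)
    beta_q = fit_logistic(np.column_stack([ones, a, w]), y)
    q_obs = expit(np.column_stack([ones, a, w]) @ beta_q)
    q1 = expit(np.column_stack([ones, ones, w]) @ beta_q)   # everyone exposed
    q0 = expit(np.column_stack([ones, zeros, w]) @ beta_q)  # no one exposed
    # Step 2: exposure model g(W) = P(A = 1 | W), truncated away from 0 and 1
    X_g = np.column_stack([ones, w])
    g = np.clip(expit(X_g @ fit_logistic(X_g, a)), bound, 1.0 - bound)
    # Step 3: the "clever covariate" under each exposure level
    h1 = 1.0 / g
    h0 = -1.0 / (1.0 - g)
    h_obs = np.where(a == 1, h1, h0)
    # Step 4: targeting step, a one-parameter logistic fit with the
    # initial outcome fit held fixed as an offset
    eps = fit_logistic(h_obs[:, None], y, offset=logit(q_obs))[0]
    # Step 5: updated predictions, averaged into the effect estimate
    q1_star = expit(logit(q1) + eps * h1)
    q0_star = expit(logit(q0) + eps * h0)
    return float(np.mean(q1_star - q0_star))

if __name__ == "__main__":
    w, a, y = simulate_campaign()
    print("TMLE estimate of lift:", tmle_ate(w, a, y))
```

Note that the initial outcome model here is deliberately simple and does not match the way the toy data were generated; the targeting step is what pulls the estimate toward the true lift. In practice both models would be fit with far richer covariates and more flexible learners.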
I would also like to acknowledge two Meetup groups in the New York City area to which data scientists at Media6Degrees are regular contributors: the NYC Machine Learning group and the NYC Predictive Analytics group. In fact, our chief scientist, Claudia Perlich, gave a talk last month at the Predictive Analytics Meetup group entitled “What’s in your wallet. Modeling quantiles for wallet estimation”, and I spoke at the same Meetup group just the other night, to an audience of over 150 members, on the NICE methodologies presented here.
If you get a chance, I encourage you to check out their monthly talks on cutting edge topics in machine learning and predictive analytics.
Here at Media6Degrees, we will continue to improve our methods and to tackle more complicated causal questions in the display advertising ecosystem. We will keep you posted on our progress!