November 1st, 2011

m6d CEO Tom Phillips Talks to Digiday’s Brian Morrissey

“It’s systemic and there’s an evolution that we’re going through and it takes time…The whole media consumption pattern is totally different than what we’ve had in the past”
-Tom Phillips, m6d CEO
[Video: interview in two parts, Part One and Part Two (digiday on livestream.com)]

August 26th, 2011

Claudia Perlich – m6d Chief Scientist – Wins Prestigious KDD Award

It is with great pride that we get to announce that our own Chief Scientist, Claudia Perlich, has once again taken home a coveted prize at this year’s ACM KDD conference.  Her paper, “Leakage in Data Mining: Formulation, Detection and Avoidance,” co-authored with two exceptional statisticians at Tel-Aviv University, has won the best paper award at the 2011 KDD conference.  The competition for this award was extraordinary — over 700 papers from many of the leading machine learning experts and data scientists worldwide.

Claudia is no stranger to winning contests at KDD, which is one of the world’s top data mining conferences, attended by both academia and top industry players (like Google, Yahoo, Microsoft, and now m6d!).  She has won the conference’s annual data mining competition three times in the past, and now sits on the committee that administers the competition.  Her current “best paper” is related to her experience winning these competitions.  She and her colleagues offer a formal analysis of a common pitfall in data mining and statistical analysis called “leakage.”  According to the paper, “leakage is essentially the introduction of information about the data mining target, which should not be legitimately available to mine from.”  In other words, information related to the outcome you are trying to predict has “leaked” into the data you are using to make the prediction.  A trivial example that might illustrate this is as follows:

I am tasked with predicting which prospects in a given pool will purchase a product online after being shown an ad for the product.  As the modeler, I pull all recent ad impressions from the data, and I use publisher, time of day and last site visited as my predictors.  I also pull in who has and hasn’t purchased, and who has and hasn’t visited the checkout page.  Now for those who purchased, the last site visited was the checkout page of the product being purchased.  Because last site visited is among my predictors, the checkout page would get a very high weight in my model, though in practice the model would be useless.  It is not feasible to target based on someone being on the checkout page, because visiting the checkout page is the event that, by design, always precedes a purchase.
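The effect described in this example can be reproduced on synthetic data. The sketch below is only an illustration: the field names (`last_site`, `purchased`), the list of sites, and the 2% purchase rate are all assumptions made up for the demonstration, not m6d data. As a simplification, it also assumes that non-purchasers never have the checkout page as their last site.

```python
import random

def make_impressions(n, seed=0):
    """Generate synthetic ad impressions; every name and rate here is made up."""
    rng = random.Random(seed)
    sites = ["news", "sports", "email", "search"]
    rows = []
    for _ in range(n):
        purchased = rng.random() < 0.02
        # The leak: a purchaser's logged "last site" is always the checkout page,
        # because checkout necessarily precedes the purchase being predicted.
        last_site = "checkout" if purchased else rng.choice(sites)
        rows.append({"last_site": last_site, "purchased": purchased})
    return rows

def purchase_rate_by_site(rows):
    """Purchase rate for each value of the last_site predictor."""
    totals, buys = {}, {}
    for r in rows:
        site = r["last_site"]
        totals[site] = totals.get(site, 0) + 1
        buys[site] = buys.get(site, 0) + int(r["purchased"])
    return {site: buys[site] / totals[site] for site in totals}

rows = make_impressions(50_000)
rates = purchase_rate_by_site(rows)
for site, rate in sorted(rates.items()):
    print(f"{site:>8}: {rate:.3f}")
```

In this toy data the checkout page "predicts" purchase perfectly and every other site predicts nothing, which is exactly why a model trained on it would look excellent on paper and be useless for targeting.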

The above is a somewhat trivial example, but leakage is not a trivial problem.  As the paper points out, this problem has occurred in many data mining competitions, designed by highly qualified statisticians.  Kudos to Claudia and team for discovering the issue in these competitions and calling attention to the persistence of the problem.  It is refreshing to read a paper that offers practical guidance around such a subtle, but model-effacing, misapplication of proper modeling methodology.

Reflecting on how this relates to our work at m6d, I am thrilled to have such creative and intuitive colleagues.  We face new modeling challenges all the time, especially in such a fast-moving and vast ecosystem.  Every algorithm starts with a team of people trying to solve a problem, and oftentimes these problems are so new that no textbook provides a how-to guide to designing an optimal solution.  In these situations, the algorithms are only as good as the statistical craftsmanship of the people who designed and planned them.  In our case, Claudia is one of the finest craftswomen in the field of data modeling and analysis, and so we are very fortunate to have her on our team.  And to our customers … never should you fear that leakage will corrupt your next campaign’s performance!

August 16th, 2011

The Science Behind NICE

As a follow-up to Tom Phillips’s post on the marketing implications of our first implementation of Non-Invasive Causal Estimation (NICE), this post concerns why it is important and actionable to estimate causal effects using observational data. The NICE methodology is explained in our recently published paper, and the results presented in Tom’s post are an extension of that analysis. I will be presenting the findings of this paper at the 5th Annual International Workshop on Data Mining and Audience Intelligence for Online Advertising at this year’s KDD conference, the preeminent annual conference on data mining, which will be held this year in San Diego from August 21st to 24th.

Despite the fact that billions of dollars a year are spent on digital display advertising, little has been done to quantify the effect of such advertising on customer behavior. As far as we can tell, even less has been done to inform business decisions based on those estimates. Are the high conversion rates seen for subsets of browsers the result of choosing to display ads to a group that has a naturally higher tendency to convert, or does the advertisement itself cause an additional lift? How does showing an ad to different segments of the population affect their tendencies to take a specific action, or convert? By applying NICE methods, we are able to use the existing observed data to estimate the effect of display advertising on customer behavior, and assess the impact of potential advertising decisions. Several examples of this were presented in the prior post.

In brief, by using observational methods like NICE, one does not have to incur the significant costs associated with A/B testing, which include:

• The cost of displaying PSAs to the control group (untreated group).
• The overhead cost of implementing A/B tests and ensuring that they are done correctly. This is a significant cost, and several papers have been written addressing the issue (see e.g. Kohavi, 2010).
• The cost associated with waiting for results, and the resulting delay in critical decision-making.

Fortunately for the display advertising community we do not have to reinvent the wheel. Many statisticians, economists and other scientists have devoted years of research toward improving causal estimation in observational data. In fact, prior to joining Media6Degrees, I was part of a research group at U.C. Berkeley whose sole focus was developing optimal methods for estimating causal effects. The attached paper presents several causal methods that fall under the NICE umbrella, and explains how those methods may be used for estimating the causal effect of display advertising. One particular class of estimation methods, and the one we employ here, is called targeted maximum likelihood estimation (TMLE), and it stands above the others in terms of its ability to return both accurate and precise estimates of causal effects.

One of the major reasons that TMLE exhibits these robust properties is that all of the computation and heavy lifting it does is focused strictly on estimating the causal effect as well as possible. One might assume that this is true of most estimators, but it rarely is. In fact, several of the other methods were developed with an eye to how well they behave asymptotically, or in very large samples, with little concern for how they behave in finite samples (the case for real data). As a result, those methods tend to behave unreliably when estimating the causal effect of advertising (where conversions are sparse) and can even return probability estimates that do not fall between zero and one. Other alternative methods are concerned with estimating the overall distribution as well as possible, rather than focusing on the optimal estimation of the effect (as TMLE does). These advantages of TMLE, our NICE method of choice, are further explored in the KDD paper.

I would also like to acknowledge two Meetup groups in the New York City area to which data scientists at Media6Degrees are regular contributors – the NYC Machine Learning group and the NYC Predictive Analytics group. In fact, our chief scientist, Claudia Perlich, gave a talk last month at the Predictive Analytics Meetup group entitled “What’s in your wallet. Modeling quantiles for wallet estimation”, and I spoke at the same Meetup group just the other night to over 150 members on the NICE methodologies presented here. If you get a chance, I encourage you to check out their monthly talks on cutting-edge topics in machine learning and predictive analytics.

[Figure: TMLE Analysis]

Here at Media6Degrees, we will continue to improve our methods and to tackle more complicated causal questions in the display advertising ecosystem. We will keep you posted on our progress!

August 12th, 2011

Ads Work!

One of our brilliant new data scientists decided to see whether some of the work he had done in epidemiology to determine the efficacy of disease treatments could be adapted to measure the impact of advertising.  What started as an academic exercise has produced a treasure trove of insights into the impact of digital advertising, and how ad effectiveness varies with targeting methods and creative strategies.
But as excited as we are about the potential of the new methodology to provide insights into our own work identifying high-performing audiences for major brands, the real headline is more fundamental and earthshaking.  ADS WORK!

In 29 of the 30 campaigns we measured, we saw a material lift in the site visit rate of browsers that viewed the ad of our client (more about that one outlier later).  Again and again, we hear – to our dismay – that because click rates on digital ads are microscopic, the ads don’t work.  Well, there is bountiful research showing that clicks alone are an incomplete and even misleading measure of impact.  But beyond that non-negative finding, which our own research has substantiated, we have very positive evidence that digital ads work, and that they deliver lift varying from material to magnificent.

I’m going to defer to our brilliant data scientist – Ori Stitelman – to describe his methodology.  Suffice it to say that he has developed an alternative to A/B testing; the issue with standard A/B testing is that it tends to be both noisy and expensive.  The new methodology is called Non-Invasive Causal Estimation (“NICE”).  NICE is a non-invasive analysis that uses mathematical models to calculate the likelihood of populations to convert, comparing those exposed to an ad to those not exposed.  The models constructed control for all confounders (factors other than ad exposure that distinguish the two populations), including the fact that our targeting technology tends to serve ads to people who are more likely to convert.
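To make the idea of controlling for a confounder concrete, here is a minimal sketch that uses simple stratification on a single confounder. It is not the NICE or TMLE methodology itself, only an illustration of why a naive exposed-versus-unexposed comparison overstates lift when ads are served preferentially to people who are already likely to visit. All counts and the stratum names (`likely`, `unlikely`) are hypothetical.

```python
# Hypothetical counts: (stratum, exposed?) -> (browsers, site visits).
# "likely" browsers are both more likely to be shown the ad and more likely
# to visit the site anyway, which makes the stratum a confounder.
counts = {
    ("likely", True): (8000, 800),
    ("likely", False): (2000, 160),
    ("unlikely", True): (2000, 40),
    ("unlikely", False): (8000, 80),
}

def naive_rate(exposed):
    """Pooled visit rate for one exposure group, ignoring the stratum."""
    n = sum(c[0] for (_, e), c in counts.items() if e == exposed)
    v = sum(c[1] for (_, e), c in counts.items() if e == exposed)
    return v / n

def standardized_rate(exposed):
    """Average the within-stratum rate, weighting each stratum by its
    share of all browsers (exposed and unexposed together)."""
    total = sum(n for n, _ in counts.values())
    rate = 0.0
    for stratum in {s for s, _ in counts}:
        stratum_size = sum(counts[(stratum, e)][0] for e in (True, False))
        n, v = counts[(stratum, exposed)]
        rate += (stratum_size / total) * (v / n)
    return rate

naive_lift = naive_rate(True) / naive_rate(False) - 1
adjusted_lift = standardized_rate(True) / standardized_rate(False) - 1
print(f"naive lift:    {naive_lift:.0%}")
print(f"adjusted lift: {adjusted_lift:.0%}")
```

With these made-up numbers the naive comparison suggests a lift of roughly 250%, while comparing like with like within each stratum brings it down to roughly 33%. A real analysis has to handle many confounders at once, which is where model-based estimators come in.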

The chart below details the lift we measured from running campaigns for thirty clients across two segments – prospecting and retargeting.  These results capture the impact of our ads, not all ads from that marketer.  As such, the effect would likely be higher if the client limited the number of vendors used for a particular campaign.

[Figure: M6D Prospecting Lift]

The results are astounding.  For each of thirty clients, we looked at site visit lift for two populations – prospecting candidates and retargeting candidates.  The average lift for the prospecting candidates was 90% (median 70%).  The average lift for retargeting candidates was 20% (median 10%).

Translated, that means that serving ads to the people that we had identified as good prospects for the brand (but not current customers) caused them to nearly double their site visit rate!  No tricks, and no need for clicks.  The ads had impact, and eliminating all other factors, we found that digital ad exposure resonated with digital media consumers.

As indicated above, we had one case where the lift was zero for both prospecting and retargeting populations (see right side of chart above).  When we investigated this particular travel client, we found that the issue appeared to be in the creative.  Driven perhaps by the desire to maximize clicks, the creative team had generated an ad that highlighted sweepstakes over brand.  In fact, the brand itself was buried – just a thin line of text in the bottom right with a tiny logo.  No wonder we observed no lift in site visits from the population viewing this ad!

[Figure: Creative Branding]

Perhaps the most dramatic result was achieved with three of our telecom clients – all major marketers with huge budgets.  For these three, the relative lift for the prospecting segment averaged 106%, while the relative lift for the retargeting segments averaged 3%.  And the additive lift (the gross increase in site visit rates) for prospecting segments exceeded the additive lift for retargeting segments in two out of three cases.
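The difference between relative and additive lift is easy to see with a small calculation. The site visit rates below are hypothetical and chosen only to show how a segment with a low baseline can have a large relative lift and still beat a high-baseline segment on additive lift. They are not the measured telecom results.

```python
def relative_lift(exposed_rate, baseline_rate):
    """Proportional increase over the unexposed (baseline) rate."""
    return exposed_rate / baseline_rate - 1

def additive_lift(exposed_rate, baseline_rate):
    """Gross increase in the rate, in absolute terms."""
    return exposed_rate - baseline_rate

# Hypothetical (baseline, exposed) site visit rates, not measured values.
segments = {
    "prospecting": (0.010, 0.020),
    "retargeting": (0.200, 0.206),
}

for name, (baseline, exposed) in segments.items():
    rel = relative_lift(exposed, baseline)
    add = additive_lift(exposed, baseline)
    print(f"{name}: relative {rel:.0%}, additive {add * 100:.1f} points")
```

Here the prospecting segment doubles its visit rate and gains a full percentage point, while the retargeting segment gains only about 3% in relative terms and about 0.6 points in absolute terms, despite starting from a much higher baseline.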

[Figure: Relative Lift]

We are just getting started using the NICE methodology to prove the efficacy of digital advertising and improve the results we can deliver.  For now, we can happily report that in case after case, digital display advertising is having an impact, and that the impact is particularly robust when our M6D prospecting technology is used to deliver the audiences most receptive to a marketer’s brand message.

May 23rd, 2011

Ad Choices Update

For the past several months, Media6Degrees has been rolling out the Ad Choices icon in support of the Digital Advertising Alliance’s Self-Regulatory Program. At this point, we’re pleased to announce that the icon is being used on over 90% of our campaigns.

Given our wide adoption, we wanted to share some details on how users are responding to this new message:

  • To date, we’ve served almost a billion impressions that included the icon.
  • People who see the icon click through to expand the overlay at a rate of less than 0.005%.
  • The overall opt-out rate is 0.0001%.
  • Of the people who clicked on the icon to expand it, 3% eventually chose to opt out.
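As a rough consistency check on the figures above, the overall opt-out rate should equal the click rate multiplied by the post-click opt-out rate. The sketch below assumes that every opt-out happens through the expanded overlay, which the post does not state explicitly, and the variable names are ours.

```python
# Figures quoted in the list above, converted from percentages to fractions.
click_rate_upper = 0.005 / 100    # "less than 0.005%" of viewers click the icon
overall_opt_out = 0.0001 / 100    # overall opt-out rate
opt_out_given_click = 3 / 100     # share of clickers who opt out

# Assumption: opt-outs only occur after clicking the icon, so
# overall_opt_out = click_rate * opt_out_given_click.
implied_click_rate = overall_opt_out / opt_out_given_click
print(f"implied click rate: {implied_click_rate:.4%}")
print("consistent with the stated bound:", implied_click_rate < click_rate_upper)
```

Under that assumption the implied click rate is about 0.0033%, which sits below the stated ceiling of 0.005%, so the three figures hang together.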

These numbers tell us three things. First, marketers are lining up behind the program. Our customers are top brands that recognize the value of informing users of their choices regarding advertising. They should be commended for supporting this initiative.

Second, despite some sensationalism in the press and attention from Washington, actual users don’t appear to be that concerned about targeted advertising based on behavior. Perhaps after years of getting targeted direct mail at their homes, which often contains very personal information, people have come to understand that targeted marketing is generally benign, and occasionally quite beneficial. Or perhaps they truly appreciate the value of receiving customized ads. Whichever it is, users are overwhelmingly forgoing the opportunity to opt out.

Third, this initiative provides real information and choice to the subset of people who are interested in opting out. A 3% post-click conversion rate is not bad from a direct response perspective. The people who are concerned about their privacy are taking action when the option is offered to them. This is exactly how the program is designed to work.

We continue to be proud of these efforts, and truly believe that our privacy-by-design approach to targeting and our support of this industry initiative are valued by our customers. We hope the initiative will give users, customers and regulators the confidence that the companies that adhere to the Self-Regulatory Principles are using targeting responsibly.