September 7, 2011
NEW YORK, September 7, 2011 – m6d, the prospect engine for brands, today announced that its Chief Scientist, Claudia Perlich, is the 2011 winner of the Best Paper Award at the prestigious ACM KDD conference for her paper, “Leakage in Data Mining: Formulation, Detection and Avoidance.” The paper, co-authored by Perlich and two statisticians at Tel Aviv University, was chosen for the premier KDD prize from a field of over 700 papers submitted by many of the world’s leading machine learning experts and data scientists, both from academia and from industry leaders such as Google, Yahoo and Microsoft.
ACM KDD is the premier international conference on knowledge discovery and data mining. Each year it draws more than 1,000 experts in data analytics, predictive modeling, and data mining from industry and academia.
Perlich, who is responsible for developing the data science that powers m6d’s proprietary technology, won three consecutive KDD Cup Competitions in the last decade. In recognition of her excellence in bridging data science research and application, she now sits on the committee that organizes KDD Cup 2012, as well as the $3 million Heritage Health Prize. Her award-winning paper draws on her experience winning these competitions: she and her colleagues offer a formal analysis of a common pitfall in data mining and statistical analysis called “leakage.” Perlich identifies where leakage occurs in industry practice, including in data mining competitions.
The paper defines leakage as “essentially the introduction of information about the data mining target by the data generation, collection, and preparation process. This information should not be legitimately available to mine from.” “In other words, information related to the target you are trying to predict has leaked into the data you are using to make the prediction,” explains Brian Dalessandro, m6d’s Director of Data Science.
Perlich describes the challenge as follows: “I am tasked with predicting which prospects will be interested and ultimately purchase a product online. As a modeler, I might consider pulling all records of online browsers who made this purchase as well as browsers who did not. The available predictors could include some browsing/search history. In this case, the model will find out that examples without any browsing history are perfect prospects for this product. The only reason we have them in our universe is because they purchased the product. This, however, is a completely useless model in practice when looking for new customers.”
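The scenario Perlich describes can be illustrated with a short Python sketch. It is only a toy illustration under stated assumptions, not code from the paper or from m6d: the function names, the field names (pages_viewed, purchased) and the sample sizes are all hypothetical. Buyers are pulled from purchase records, so some have no browsing history, while non-buyers can only be found through browsing logs.

```python
def build_training_sample(n_buyers, n_browsers):
    """Hypothetical sample: buyers come from purchase records,
    non-buyers come from browsing logs."""
    rows = []
    for i in range(n_buyers):
        # Every second buyer was never seen browsing; such buyers are in
        # the data only because they purchased the product.
        pages = 0 if i % 2 == 0 else 3 + i % 5
        rows.append({"pages_viewed": pages, "purchased": True})
    for i in range(n_browsers):
        # Non-buyers enter the sample only via browsing logs,
        # so they always have at least one page view.
        rows.append({"pages_viewed": 1 + i % 10, "purchased": False})
    return rows


def leaky_rule(row):
    """Predict 'buyer' when there is no browsing history."""
    return row["pages_viewed"] == 0


def precision(rows, rule):
    """Share of flagged rows that are buyers, or None if nothing is flagged."""
    flagged = [r for r in rows if rule(r)]
    if not flagged:
        return None
    return sum(r["purchased"] for r in flagged) / len(flagged)


train = build_training_sample(n_buyers=200, n_browsers=800)
train_precision = precision(train, leaky_rule)

# New prospects are observed only through browsing, so none of them
# can have zero page views. The purchase rate here is made up.
prospects = [
    {"pages_viewed": 1 + i % 10, "purchased": i % 50 == 0}
    for i in range(1000)
]
prospect_precision = precision(prospects, leaky_rule)

print(train_precision)     # 1.0: the rule looks perfect on the sample
print(prospect_precision)  # None: the rule never fires on new prospects
```

In this sketch the rule scores perfectly on the training sample only because of how the sample was assembled. Applied to new prospects it selects no one, which is the "completely useless model in practice" that the quote describes.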
Perlich holds multiple patents in the area of machine learning and has published more than thirty scientific articles. Her earlier success at KDD was earned while she worked at IBM Research. Her continued success is evidence of her ability not only to do outstanding work, but also to consistently adapt herself – and thus m6d – to the ever-changing ecosystem of data mining and analysis.
About m6d
m6d is a marketing technology company that uses machine learning to identify patterns in web-wide data that generate high-performing online media campaigns and actionable marketing insights. We help marketers advertise to their best new customer prospects by calculating the unique pattern in which their online customers cluster around the web. We analyze this social pattern for each marketer – called the brand signature – to power our advertising decisions. Having run campaigns for hundreds of major marketers, we have refined and extended our technology to deliver unparalleled results that are revolutionizing the way digital media is planned and purchased.