Dynamic Micro Targeting: Fitness-Based Approach to Predicting Individual Preferences
Tianyi Jiang, Alexander Tuzhilin
Leonard N. Stern School of Business, New York University
February 2007

Personalization Research
From Amazon shopping to choosing your politician: personalizing your decision choices via data mining

Research Questions
• How to effectively segment the customer base?
• What is the “ideal” segmentation of the customer base?
• Is it practically achievable?
• What is the distribution of the segment sizes in this ideal segmentation scheme?
• Is it better to partition customers and products together to achieve better targeting?

Customer Segmentation via Direct Grouping Methods
Direct grouping of customers C into segments:
• Combine the transactional data of customers C_m, C_{m+1}, …, C_n into a group P_i = (C_m, C_{m+1}, …, C_n)
• Build a predictive model for P_i
• Define a fitness score on the model, e.g. RAE, RME, etc.

Customer Segmentation via Direct Grouping Methods (Example)
• C are Amazon customers
• P_i is the set of customers from New York City
• X_1, …, X_p are these customers' demographic and purchase attributes, such as age, gender, day of purchase, purchase total, etc.
• Y: will they buy a product while visiting Amazon.com?
• The model predicts these customers' propensity to purchase during an Amazon.com visit
• The fitness score is the Relative Absolute Error

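The grouping-and-scoring step in this example can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's implementation: the record layout (`customer`, `x`, `y`), the majority-class stand-in model, and the use of in-sample accuracy as the fitness score are all hypothetical choices made to keep the sketch short. The slides use classifiers such as J48 and fitness scores such as RAE.

```python
from collections import Counter

def pool(transactions, customer_ids):
    """Combine the transactions of the given customers into one group P_i."""
    wanted = set(customer_ids)
    return [t for t in transactions if t["customer"] in wanted]

def majority_model(group):
    """Stand-in predictive model (an assumption): always predict the most common Y."""
    label = Counter(t["y"] for t in group).most_common(1)[0][0]
    return lambda x: label

def fitness(group):
    """Toy fitness score: in-sample share of transactions the model gets right."""
    model = majority_model(group)
    hits = sum(1 for t in group if model(t["x"]) == t["y"])
    return hits / len(group)

# Hypothetical transactions: X holds attributes, Y is the purchase outcome.
transactions = [
    {"customer": "c1", "x": {"age": 30}, "y": "buy"},
    {"customer": "c1", "x": {"age": 30}, "y": "buy"},
    {"customer": "c2", "x": {"age": 45}, "y": "no_buy"},
    {"customer": "c3", "x": {"age": 52}, "y": "buy"},
]

group = pool(transactions, ["c1", "c2"])
print(len(group), round(fitness(group), 3))
```

Any model and any fitness score can be swapped in, as long as the score is computed on the pooled data of the group.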
Optimal Customer Segmentation (OCS) Problem
Given the customer base C of N customers and a predictive model, partition C into a set of mutually exclusive, collectively exhaustive segments P = {P_1, …, P_k}:
• Build a predictive model for each segment P_i
• Find the optimal partitioning P = {P_1, …, P_k} so that the objective function is maximized over all possible partitions P, where the fitness function is evaluated on each segment P_i and the weight α_i specifies the “importance” of segment i.

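One plausible way to write this objective is a weighted sum of per-segment fitness scores. The symbol f for the fitness function and the summation form are assumptions made for illustration. Only the weights α_i, the segments P_i, and the mutually exclusive, collectively exhaustive constraint come from the problem statement.

```latex
\max_{P = \{P_1, \dots, P_k\}} \; \sum_{i=1}^{k} \alpha_i \, f(P_i)
\quad \text{subject to} \quad
\bigcup_{i=1}^{k} P_i = C, \qquad P_i \cap P_j = \emptyset \ \ (i \neq j)
```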
OCS Solution Space
Theorem. The OCS problem is NP-hard.
Therefore, settle for a suboptimal solution:
• Find suboptimal polynomial-time customer segmentation methods that provide reasonable fitness

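To give a sense of how large the solution space is: the number of ways to partition N customers into non-empty segments is the Bell number B(N). The sketch below computes it with the Bell triangle. It illustrates only the size of the search space that an exhaustive method would face. It is not part of the NP-hardness argument, and the function name is hypothetical.

```python
def bell_number(n):
    """Number of ways to partition an n-element set, via the Bell triangle."""
    row = [1]  # row 0 of the triangle; its first entry is B(0)
    for _ in range(n):
        # Each new row starts with the last entry of the previous row,
        # and every further entry adds the entry above-left.
        nxt = [row[-1]]
        for value in row:
            nxt.append(nxt[-1] + value)
        row = nxt
    return row[0]

for customers in (5, 10, 20):
    print(customers, bell_number(customers))
```

Already at N = 10 there are 115,975 candidate partitions, and the count grows faster than exponentially, which is why the slides turn to polynomial-time heuristics.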
Related Work
• Combinatorial Optimization Problems in Operations Research
  – (Land et al. 1960, Guignard et al. 1987, Gomory 1958)
• Customer segmentation and clustering in Marketing Research
  – clustering, mixture models (Wedel et al. 2000)
• Data Mining Research on Customer Segmentation
  – basket shopping, hierarchical, & pattern-based clustering (Brijs et al. 2001, Jiang et al. 2006, Yang et al. 2003)

Traditional Segmentation Methods
Hierarchical Clustering (HC)
• Compute some summary statistics from customers' demographic and transactional data
• Consider these statistics as points in an n-dimensional space
• Group customers into segments by applying various clustering algorithms to these n-dimensional points
* Jiang, Tuzhilin, “Segmenting Customers from Population to Individuals: Does 1-to-1 Keep Your Customers Forever?” TKDE 18(10), 2006

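The three steps above can be sketched as follows. This is an illustrative sketch only: the two summary statistics, the `total` field, and the choice of centroid-linkage agglomeration are assumptions, since the slides do not say which statistics or linkage were used.

```python
import math
from itertools import combinations

def summary_vector(transactions):
    """Hypothetical summary statistics: (number of purchases, mean purchase total)."""
    totals = [t["total"] for t in transactions]
    return (float(len(totals)), sum(totals) / len(totals))

def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def agglomerate(vectors, k):
    """Merge the two clusters with the closest centroids until k clusters remain."""
    clusters = [[name] for name in vectors]
    while len(clusters) > k:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: math.dist(
                centroid([vectors[n] for n in clusters[ij[0]]]),
                centroid([vectors[n] for n in clusters[ij[1]]]),
            ),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters

# Hypothetical transactional data for three customers.
history = {
    "c1": [{"total": 10.0}, {"total": 12.0}],
    "c2": [{"total": 11.0}],
    "c3": [{"total": 200.0}, {"total": 220.0}, {"total": 210.0}],
}
vectors = {name: summary_vector(rows) for name, rows in history.items()}
clusters = agglomerate(vectors, k=2)
print(clusters)
```

In practice the statistics would usually be rescaled before computing distances, so that no single feature dominates.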
Traditional Segmentation Methods
Affinity Propagation (AP)
• n unique customers
• AP identifies a set of training points, exemplars, as cluster centers by recursively propagating “affinity messages” among training points
• Similar to greedy K-medoids algorithms, AP picks exemplars as cluster centers during every iteration
• Each exemplar in our study is a single customer represented by his/her summary statistics vector

Suboptimal Efficient Solution of OCS Problem using Direct Grouping
Iterative Merge (IM) Method:
• Start with segments containing individual customers
• Iteratively merge two existing segments, Seg_A and Seg_B, at a time when
  I. the predictive model based on the combined data performs better, and
  II. combining Seg_A with any other existing segment would have resulted in worse performance than combining Seg_A with Seg_B

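The merge rule above can be sketched as a greedy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: condition I is interpreted here as "the combined segment scores higher than either part alone", the fitness function is passed in as a parameter, and `toy_fitness` is a hypothetical score chosen only so the example runs.

```python
from collections import Counter

def iterative_merge(segments, fitness):
    """Greedily merge segments while a merge improves fitness."""
    segments = [list(s) for s in segments]
    merged = True
    while merged:
        merged = False
        for i, seg_a in enumerate(segments):
            partners = [j for j in range(len(segments)) if j != i]
            if not partners:
                break
            # Condition II: choose the partner whose combination with seg_a scores best.
            j = max(partners, key=lambda p: fitness(seg_a + segments[p]))
            combined = seg_a + segments[j]
            # Condition I (as interpreted here): the combination beats both parts.
            if fitness(combined) > max(fitness(seg_a), fitness(segments[j])):
                segments = [s for idx, s in enumerate(segments) if idx not in (i, j)]
                segments.append(combined)
                merged = True
                break  # restart the scan on the updated segment list
    return segments

def toy_fitness(segment):
    """Hypothetical fitness: smoothed share of the majority label in the segment."""
    return Counter(segment).most_common(1)[0][1] / (len(segment) + 2)

# Each starting segment is one customer, reduced here to a single label.
result = iterative_merge([["a"], ["a"], ["b"], ["b"]], toy_fitness)
print(result)
```

Each merge removes one segment, so the loop runs at most N - 1 merges, and every scan evaluates a polynomial number of candidate pairs.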
Micro Targeting…
Product Types × Customer Matrix (√ stands for a purchase of product type by customer)
[Matrix: rows are Product Type 1, Product Type 2, …, Product Type L; columns are Customer 1, Customer 2, …, Customer N; a √ in a cell marks a purchase]

Micro Targeting Method
Iterative Merge Products (IM_Prod):
• Start with segments containing an individual customer's transaction data for a specific product type
• Bootstrap operation to merge small segments based on K-nearest neighbors of customers' product type and demographic summary statistics vectors
• Run IM with customers' product type segments

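The first two steps can be sketched as below. This is a rough illustration under stated assumptions: the field names, the summary vectors, the `min_size` threshold, and the simplification to a single nearest neighbor (K = 1) are all hypothetical, since the slides do not give the details of the bootstrap operation.

```python
import math
from collections import defaultdict

def initial_segments(transactions):
    """One starting segment per (customer, product type) pair."""
    segments = defaultdict(list)
    for t in transactions:
        segments[(t["customer"], t["product_type"])].append(t)
    return dict(segments)

def bootstrap_merge(segments, vectors, min_size):
    """Fold every segment smaller than min_size into its nearest neighbor (K = 1)."""
    groups = {key: {"members": [key], "rows": list(rows)} for key, rows in segments.items()}
    for key in list(groups):
        if len(groups[key]["rows"]) >= min_size or len(groups) == 1:
            continue
        # Nearest remaining segment by distance between summary vectors.
        nearest = min(
            (other for other in groups if other != key),
            key=lambda other: math.dist(vectors[key], vectors[other]),
        )
        small = groups.pop(key)
        groups[nearest]["members"] += small["members"]
        groups[nearest]["rows"] += small["rows"]
    return groups

# Hypothetical transactions and summary vectors (purchase count, mean total).
transactions = [
    {"customer": "c1", "product_type": "books", "total": 20.0},
    {"customer": "c1", "product_type": "books", "total": 18.0},
    {"customer": "c1", "product_type": "books", "total": 22.0},
    {"customer": "c1", "product_type": "music", "total": 10.0},
    {"customer": "c2", "product_type": "music", "total": 11.0},
]
vectors = {
    ("c1", "books"): (3.0, 20.0),
    ("c1", "music"): (1.0, 10.0),
    ("c2", "music"): (1.0, 11.0),
}
segments = initial_segments(transactions)
groups = bootstrap_merge(segments, vectors, min_size=2)
print(sorted(groups))
```

The resulting groups would then be handed to IM as its starting segments.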
Empirical Comparisons of Different Approaches
Comparing three segmentation approaches:
• Statistics based
• Direct grouping based
• Micro Targeting based
Across five dimensions:
• Types of datasets (ComScore, Nielsen, synthetic data)
• Types of customers (high- vs. low-volume)
• Types of predictive models (classifiers J48 & Naïve Bayes)
• Dependent variables (3 variables per dataset)
• Performance measures
  – Root Mean Squared Error (RME)
  – Relative Absolute Error (RAE)
  – Correctly Classified Instances (CCI)

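For reference, the three performance measures can be computed as below. These are the standard textbook definitions, given as a sketch. The exact variants used in the study (for example, how errors are computed over class probabilities) are not specified in the slides, and the sample numbers are made up.

```python
import math

def rme(predicted, actual):
    """Root mean squared error (lower is better)."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

def rae(predicted, actual):
    """Relative absolute error: absolute error relative to always predicting the mean."""
    mean = sum(actual) / len(actual)
    model_error = sum(abs(p - a) for p, a in zip(predicted, actual))
    baseline_error = sum(abs(a - mean) for a in actual)
    return model_error / baseline_error

def cci(predicted_labels, actual_labels):
    """Share of correctly classified instances (higher is better)."""
    hits = sum(1 for p, a in zip(predicted_labels, actual_labels) if p == a)
    return hits / len(actual_labels)

# Hypothetical predicted purchase probabilities and true outcomes.
actual = [1, 0, 1, 1]
predicted = [0.9, 0.2, 0.6, 0.7]
labels = [1 if p >= 0.5 else 0 for p in predicted]

print(round(rme(predicted, actual), 4))
print(round(rae(predicted, actual), 4))
print(cci(labels, actual))
```

An RAE below 1 means the model beats the trivial predictor that always outputs the mean of the actual values.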
Data Sets: Customer Types & Transaction Counts
DataSet    Customer Type   % of Total Population   Families   Total Transactions   Average Transactions Per Family
ComScore   High            5%                      2,230      137,157              62
ComScore   Low             5%                      2,230      24,344               11
Nielsen    High            10%                     156        28,985               186
Nielsen    Low             10%                     156        5,007                32
Syn-High   High            100%                    2,048      204,800              100
Syn-Low    Low             100%                    2,048      20,480               10

Statistical Significance
We apply the Mann-Whitney rank test to compare any two performance distributions across
• 6 datasets
• 3 variables
• 2 classifiers
• 3 performance measures
for a total of 108 (6 × 3 × 2 × 3) pair-wise distribution tests between any two segmentation approaches

Statistical Significance
The hypotheses for comparing distributions generated by methods A and B for a performance measure are:
(I)
H_0: The distribution of a performance measure generated by method A is not different from the distribution of the performance measure generated by method B.
H_1+: The distribution of a performance measure generated by method A is different from the distribution of the performance measure generated by method B in the positive direction.
H_1-: The distribution of a performance measure generated by method A is different from the distribution of the performance measure generated by method B in the negative direction.

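A single test of this kind can be sketched with a hand-rolled Mann-Whitney U statistic. This sketch makes several assumptions: it uses the normal approximation without tie or continuity correction, a one-sided critical value of 1.645 (5% level), and it reads "positive direction" as "method A tends to produce larger values". The sample scores are made up.

```python
import math

def mann_whitney_u(sample_a, sample_b):
    """Return (U statistic for A, z score) using the normal approximation."""
    pooled = sorted((v, idx) for idx, v in enumerate(list(sample_a) + list(sample_b)))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank shared by tied values
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = avg_rank
        i = j + 1
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    return u_a, (u_a - mean_u) / sd_u

def decide(sample_a, sample_b, z_crit=1.645):
    """Map the z score to H1+, H1-, or H0 (fail to reject)."""
    _, z = mann_whitney_u(sample_a, sample_b)
    if z > z_crit:
        return "H1+"
    if z < -z_crit:
        return "H1-"
    return "H0"

# Hypothetical CCI scores for two segmentation methods.
method_a = [0.81, 0.84, 0.86, 0.88, 0.90, 0.92]
method_b = [0.60, 0.62, 0.65, 0.70, 0.72, 0.75]
print(mann_whitney_u(method_a, method_b)[0], decide(method_a, method_b))
```

Repeating this decision over all 108 dataset, variable, classifier, and measure combinations yields the H_1+ and H_1- counts for one pair of methods.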
Empirical Results
Comparing All Methods
            HC              IM              IM_Prod
Methods     H_1+    H_1-    H_1+    H_1-    H_1+    H_1-
AP          66      18      12      57      0       108
HC          -       -       6       90      0       108
IM_Prod     108     0       96      0       -       -
Performance tests across all statistics-based segmentation methods for Hypothesis Test (I) at the 95% confidence level. Numbers in columns H_1+ and H_1- indicate the number of statistical tests that reject hypothesis H_0. The total number of significance tests per method-to-method comparison pair is 108.

Empirical Results
Sample CCI score distributions (“Day of the Week” prediction across High- & Low-Volume ComScore Customers)
[Figure: CCI score distributions for IM_Prod and IM; panels: High-Volume Datasets, Low-Volume Datasets]

Empirical Results
Error distributions (“Day of the Week” prediction across High- & Low-Volume ComScore Customers)
[Figure: panels labeled RAE, RME, High Volume, and Low Volume; series: IM_Prod and IM]

Empirical Results
Segment Size Distribution Generated by IM_Prod and IM
[Figure: histograms of segment size (x-axis) against number of segments (y-axis) for IM_Prod and IM; panels: High-Volume Datasets, Low-Volume Datasets]

Empirical Results
Customer Segment Membership Count Distribution
[Figure: histograms of segment membership count (x-axis) against frequency (y-axis); panels: High-Volume Datasets, Low-Volume Datasets]

Empirical Results
Generated segments in the “Segment Count” × “Average CCI per segment” × “Number of Purchases in Segment” space
[Figure: plots for IM_Prod and IM; panels: High-Volume Datasets, Low-Volume Datasets]

IM_Prod Computational Expense

Conclusions
• Partitioning customers based on micro targeting results in the formation of “better” customer segmentations than traditional clustering-based and fitness-based direct grouping approaches
• Micro targeting produces smaller segments than Direct Grouping methods
• The above results add support for Micro Segmentation (partitioning based on both customer and product types) approaches to personalization

Future Research
• Improve the method not only in terms of predictive accuracy, but also in terms of standard marketing-oriented performance measures such as customer value, profitability, and other economics-based performance measures
• Investigate the scalability and generalizability of our approach on different types of very large real-world datasets, and extend it to handle incremental or time series data