


Customer Selection for Targeted Promotion: A Market Basket Analysis Approach

Yinghui (Catherine) Yang
Graduate School of Management
University of California, Davis
yiyang@ucdavis.edu
http://faculty.gsm.ucdavis.edu/~yiyang

Chunhui Hao & Ansheng Ge
Institute of Automation
Chinese Academy of Sciences
haochh@gmail.com; geansheng@gmail.com

Research questions

Data mining processes can sometimes generate an overwhelming amount of information, which is very difficult for managers to comprehend, let alone put into action. Directly incorporating managers' decision goals into the data mining process can potentially provide more value to managers. In this research, we integrate the decision on customer selection into market basket analysis (i.e. association rule analysis). Market basket analysis is a useful method for discovering customer purchasing patterns by extracting associations among products from purchase transactions. For a retailer (either online or offline), a very common decision is which customers to offer promotions to so that the money spent on the promotion achieves the maximum value. Given the potential for cross-selling among products, market basket analysis is a viable tool for discovering such correlations among different products. Integrating the results of market basket analysis into the customer selection process can provide potentially significant value for retailers.

Problem Formulation and Approach

We consider a market basket transaction database that contains purchase records generated by a group of consumers. Let $P = \{p_1, p_2, \ldots, p_n\}$ be a set of products and $C = \{c_1, c_2, \ldots, c_w\}$ be a set of consumers. Any consumer $c_k$ can purchase any product $p_i$ (for $k \in [1 \ldots w]$, $i \in [1 \ldots n]$). A shopping transaction is a list of products purchased by a consumer at a single check-out (it can also be extended to include purchases made within a certain period of time, e.g. a week): $c_k : (p_{k_1}, p_{k_2}, \ldots, p_{k_l})$ ($k \in [1 \ldots w]$, $k_j \in [1 \ldots n]$ for $j \in [1 \ldots l]$), where $p_{k_i} \in P$, $c_k \in C$ and $p_{k_i} \neq p_{k_j}$ (for $i \neq j$ and $i, j \in [1 \ldots l]$).

(1) Single item promotion

Assume a supermarket is considering offering discount coupons on a certain product, and the coupon for this item can only be offered to a limited number of customers due to budget limitations. The decision the supermarket needs to make is which N customers the coupons should be sent to in order to maximize the campaign profit, given that the purchases of products are correlated. Using the results of market basket analysis, we can estimate the likelihood of other related purchases from a certain customer once the coupon is offered to him or her. In a simpler version, we can calculate a "value" for each customer. Naturally, the customers with higher value should be offered coupons first. For each customer, we define the following value:

Definition 1. Value of a single item for a customer (denoted as function Value1()):

$$\mathrm{Value1}(A) = \sum_{\text{rules in } S_A} \left[ NT \times PR_{RHS} \times \left( \text{Confidence of the rule} - \text{Support of RHS} \right) \right]$$

$S_A$ is a set of rules with the item on promotion on the left hand side and one single item on the right hand side (e.g. $A \rightarrow X$). For rules with multiple items on the right hand side, we can simply consider them separately. For example, $A \rightarrow \{B, C\}$ can be separated into $A \rightarrow B$ and $A \rightarrow C$. RHS is short for the item on the right hand side of the rule. $NT$ is the number of transactions supporting the rule. $PR_{RHS}$ is the unit profit of the item on the right hand side of the rule.

The lift of the rule is defined as

$$\text{lift} = \frac{\text{Confidence of the rule}}{\text{Support of RHS}},$$

and it can be used to measure whether the purchase of the item on the left hand side positively affects the purchase of the item on the right hand side. Because the lift value is never negative, we cannot directly use it in Definition 1: a rule with lift smaller than 1 should not contribute positively to the value of the item on the left hand side. Therefore, we use a modified form of lift, $(\text{Confidence of the rule} - \text{Support of RHS})$. This value is negative if the lift is smaller than 1 and positive if the lift is greater than 1, and it lies in the range [-100%, 100%]. The rules discovered are from the transactions of the individual customer.

The problem with one item on promotion is fairly simple. For each customer, we calculate the value of the item based on Definition 1. We can then simply pick the top N customers to send coupons to.

(2) Multiple item promotion

For simultaneous multiple item promotion, the retailer can decide to send the coupon for a certain product (or products) to a certain customer. For simplicity, we assume that each item has the same number of coupons N. A naïve approach is to use the method for single item promotion for each item (i.e. pick the top N customers for each item to send coupons to). However, there are issues with this approach. For example, after we send a coupon for Item A to a customer, the additional benefit of Item B's coupon may not be as high as when it is sent without Item A. One reason is that Item A might already encourage the purchase of Item B, so Item B's coupon may not be as useful for this customer even though this item has high value independently for this customer. Item B's coupon might generate higher benefit for someone else with lower value for Item B alone.
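Before turning to the heuristic, the single-item building block (Definition 1 followed by top-N selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each customer's history is a list of baskets, that every item co-occurring with the promoted item yields a rule (no minimum support or confidence threshold is applied), and that unit profits are given in a dictionary. The names `value1`, `top_n_customers`, `history` and `unit_profit` are hypothetical.

```python
from collections import Counter


def value1(transactions, promo_item, unit_profit):
    """Definition 1 for one customer: sum over rules promo_item -> X of
    NT * PR_X * (confidence - support of X), mined from this customer's
    own transactions. Assumption: no support/confidence threshold."""
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    with_promo = [b for b in baskets if promo_item in b]
    if n == 0 or not with_promo:
        return 0.0  # no rule has promo_item on the left hand side

    item_count = Counter(x for b in baskets for x in b)
    # NT for rule promo_item -> x: baskets containing both items
    pair_count = Counter(x for b in with_promo for x in b if x != promo_item)

    total = 0.0
    for rhs, nt in pair_count.items():
        confidence = nt / len(with_promo)   # P(rhs | promo_item)
        support_rhs = item_count[rhs] / n   # P(rhs)
        total += nt * unit_profit[rhs] * (confidence - support_rhs)
    return total


def top_n_customers(history, promo_item, unit_profit, n):
    """Rank customers by Value1 for promo_item and keep the best n."""
    scores = {c: value1(t, promo_item, unit_profit) for c, t in history.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

For a customer with baskets (A, B), (A, B), (B), (C) and a unit profit of 2 for B, the rule A → B has NT = 2, confidence 1 and RHS support 0.75, so the customer's value for A is 2 × 2 × 0.25 = 1.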
Below is a heuristic for this multiple item problem.

Step 1, For each customer, calculate Value1() for all the items on promotion.
Step 2, For each item, calculate the aggregated value of the top N customers for this item.
Step 3, Rank the items according to the aggregated values. Let R be the list of ranked items.
Step 4, Start from the top item i1 in R.
Step 5, Select the top N customers for i1.
Step 6, Move to the next item i2 in R.
Step 7, For each of the top N customers for i2, check whether this customer was selected for the previous items. If not, select this customer. If yes, calculate the additional value that i2 can add on top of the previous items. Compare this added value with the value of the next available customer outside the top N customers (also consider its added value if this customer has been selected for other items), and pick the one with the higher value. Do this down the list of the top customers until the number of customers selected reaches N.
Step 8, Repeat Steps 6 and 7 until all items have been dealt with.

Expected contributions

First, we believe the paper addresses a very real and important problem faced by supermarkets, grocery stores, online retailers, etc. The specific problem this paper addresses belongs to the broader family of targeted marketing, which has very wide application in practice. Secondly, this research incorporates profit maximization into the data mining process and thus contributes to the actionability of data mining. Moreover, the methods proposed in this paper combine optimization and data mining, and thus can contribute to the existing literature on applying optimization in data mining and vice versa. In addition, the framework set up in this paper can be easily extended to related problems such as item recommendation.

Current status of the manuscript

Currently we are evaluating alternative heuristics to approach the problem. We have obtained data from a supermarket containing information about customers, products (price etc.) and purchase transactions. We are planning to implement several alternative heuristic-based algorithms on both synthetic data and the real supermarket data to test the effectiveness of our approach.
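As an illustration only, Steps 1 to 8 of the multiple item heuristic could be sketched in Python as follows. This is one possible reading of the steps, not the authors' implementation. The text does not define how the "additional value" of a second coupon is computed, so the sketch takes it as a caller-supplied function `added_value(customer, item, prior_items)`, which is purely hypothetical. The name `assign_coupons`, the layout `values[item][customer]` (precomputed Value1 scores, Step 1) and the tie-breaking rule (keep the top-N customer on a tie) are likewise assumptions.

```python
def assign_coupons(values, n, added_value):
    """values[item][customer] holds Value1 (Step 1).
    added_value(customer, item, prior_items) is a hypothetical,
    caller-supplied estimate of the extra value of `item` for a customer
    who already holds coupons for `prior_items`.
    Returns {item: [selected customers]}."""

    def ranked_customers(item):
        return sorted(values[item], key=values[item].get, reverse=True)

    # Steps 2-3: rank items by the aggregated value of their top-n customers.
    ranked_items = sorted(
        values,
        key=lambda it: sum(values[it][c] for c in ranked_customers(it)[:n]),
        reverse=True,
    )

    selected = {}   # item -> customers chosen for that item
    coupons = {}    # customer -> items assigned while handling earlier items

    # Steps 4-8: walk down the ranked item list.
    for item in ranked_items:
        order = ranked_customers(item)
        top, reserve = order[:n], order[n:]

        def effective(c):
            # Added value if the customer already holds a coupon,
            # plain Value1 otherwise.
            prior = coupons.get(c, [])
            return added_value(c, item, prior) if prior else values[item][c]

        chosen = []
        for c in top:
            if c not in coupons:
                chosen.append(c)                 # not selected before: take
            elif reserve and effective(reserve[0]) > effective(c):
                chosen.append(reserve.pop(0))    # outside customer is better
            else:
                chosen.append(c)                 # keep the top-n customer

        selected[item] = chosen
        for c in chosen:
            coupons.setdefault(c, []).append(item)
    return selected
```

For the first item no customer holds a coupon yet, so the loop reduces to Step 5 (take the top N). A simple placeholder for `added_value`, such as discounting Value1 by a fixed factor, is enough to exercise the sketch, but a realistic estimate would have to come from the rules linking the promoted items.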
