NOVEC Customer Segmentation Analysis
Anita Ahn, Mesele Aytenifsu, Bryan Barfield, Daniel Kim
Department of Systems Engineering and Operations Research
SYST/OR 699 – Fall 2016 Final Presentation
George Mason University
Agenda
• Introduction
• Problem Statement
• Methodology / Data Description
• Cluster Analysis
• Applications
• Difficulties / Lessons Learned
• Conclusion / Recommendation
Introduction – About
• NOVEC: Northern Virginia Electric Cooperative, a locally based electric distribution system
• Serves an area of 651 square miles
• 6,880 miles of power lines
• Provides electricity to more than 155,000 homes and businesses
• Stretches over multiple counties: Fairfax, Loudoun, Prince William, Stafford, Fauquier
• Well-known clients: Potomac Mills Mall, Verizon, AT&T
Introduction – Background on NOVEC's Customers
• NOVEC currently has 3-4 different qualitative consumer segments
  • Residential
  • Small Commercial
  • Large Commercial
  • Church
• These qualitative consumer segments are neither homogeneous nor good indicators of consumers' energy usage behavior
Problem Statement
• NOVEC wants to:
  • Segment customers based on their usage of electricity, using data already collected for another purpose
  • Determine how these customer segments contribute towards NOVEC's system peak usage
• Why is this important?
Assumptions & Limitations
• The current data is a stratified sample, collected for the purpose of rate making
• The sample contains a disproportionately high share of consumers who use large amounts of electricity (e.g., Large Commercial)
• The majority of NOVEC's consumers are Residential customers
Goals for this Project
1. Clustering for July data: Using NOVEC's data on July energy consumption, segment the consumers into groups with respect to NOVEC's total peak consumption
2. Clustering for January data: Using the same clustering technique, segment the consumers by January usage with respect to NOVEC's total peak consumption
3. Validate consumer clusters: Validate the consumer segments by looking at load profiles
4. NOVEC implements: NOVEC intends to use these customer segments for future forecasting, pricing analysis, and capacity planning
* The project team will focus on Goals 1-3; Goal 4 will be done by NOVEC
Data Description

Provided Variables   Description
Account              Unique customer identifier
Map Location         Geospatial identifier
Group                Customer billing classification (RES, LGCOM, SMCOM, CHRCH)
Usage                Energy expenditure in kilowatt-hours (kWh)
DateTime             MM-DD-YYYY 00:00 (24-hour)

Useful Variables     Description
Account              Unique customer identifier
Map Location         Geospatial identifier
Group                Customer billing classification (RES, LGCOM, SMCOM, CHRCH)
Usage                Energy expenditure in kilowatt-hours (kWh)
DateTime             MM-DD-YYYY 00:00 (24-hour)
Terminology Used in Analysis
• Consumer's Peak Consumption: the consumer's highest energy usage in the time period
• Consumer's Average Energy Use: the consumer's average energy usage in kWh in the time period
• Peak System Load: the maximum electricity usage in kWh for NOVEC's entire system in the time period
• Coincident Peak Usage: the consumer's kWh usage at the time NOVEC's system peaked
• Worknight/Workday Total Usage: the consumer's total kWh usage during 8am-4pm/ on Monday-Friday for the entire month
• Weekday/Weekend Total Usage: the consumer's total kWh usage during Monday-Friday / Saturday-Sunday for the entire month
Derived Variables
• Demand Factor: $\frac{\text{Consumer's Peak Consumption}}{\text{Peak System Load}}$
• Load Factor: $\frac{\text{Consumer's Avg Energy Use}}{\text{Consumer's Peak Consumption}}$
• Coincident Usage Ratio: $\frac{\text{Consumer's Coincident Peak Usage}}{\text{Peak System Load}}$
• Coincident Peak Ratio: $\frac{\text{Consumer's Coincident Peak Usage}}{\text{Consumer's Peak Consumption}}$
• Worknight to Workday Usage Ratio: $\frac{\text{Worknight Total Usage}}{\text{Workday Total Usage}}$
• Weekday to Weekend Usage Ratio: $\frac{\text{Weekday Total Usage}}{\text{Weekend Total Usage}}$
• Source variables: Account, Usage, DateTime
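As a rough illustration of the six ratios on this slide, the sketch below computes them for a single account from hourly readings. The dictionary layout, the function name `derived_variables`, and the sample values are all hypothetical. The slide does not give the worknight hours, so the code assumes worknight means every Monday-Friday hour outside 8am-4pm.

```python
from datetime import datetime

def derived_variables(readings, system_load):
    """Compute the six derived ratios for one account.

    readings:    dict mapping datetime -> that account's kWh in the hour
    system_load: dict mapping datetime -> total system kWh in the hour
    Both layouts are hypothetical; the slides only name the variables.
    """
    peak_time = max(system_load, key=system_load.get)  # hour the system peaked
    peak_system_load = system_load[peak_time]
    consumer_peak = max(readings.values())
    consumer_avg = sum(readings.values()) / len(readings)
    coincident = readings[peak_time]  # account usage at the system peak hour

    workday = worknight = weekday = weekend = 0.0
    for ts, kwh in readings.items():
        if ts.weekday() < 5:  # Monday-Friday
            weekday += kwh
            if 8 <= ts.hour < 16:  # 8am-4pm
                workday += kwh
            else:  # assumption: all other Mon-Fri hours count as worknight
                worknight += kwh
        else:  # Saturday-Sunday
            weekend += kwh

    return {
        "demand_factor": consumer_peak / peak_system_load,
        "load_factor": consumer_avg / consumer_peak,
        "coincident_usage_ratio": coincident / peak_system_load,
        "coincident_peak_ratio": coincident / consumer_peak,
        "worknight_workday_ratio": worknight / workday,
        "weekday_weekend_ratio": weekday / weekend,
    }

# Tiny made-up data set: four hourly readings over a Friday and a Saturday.
hours = [
    datetime(2016, 7, 1, 14),  # Friday, workday hours
    datetime(2016, 7, 1, 20),  # Friday, worknight hours
    datetime(2016, 7, 2, 14),  # Saturday
    datetime(2016, 7, 2, 20),  # Saturday
]
account = dict(zip(hours, [4.0, 2.0, 3.0, 1.0]))
system = dict(zip(hours, [100.0, 80.0, 60.0, 40.0]))
ratios = derived_variables(account, system)
print(ratios)
```

A real month of data would need guards against zero denominators, for example an account with no weekend usage.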
Method for Customer Segmentation
1. Manipulate and transform the data so that it is suitable for the K-means algorithm
2. Determine the optimal number of clusters
3. Run the K-means algorithm
4. Analyze and profile the clusters
Variable Exploration (Demand Factor, Coincident Usage Ratio, Weekday-Weekend, Worknight-Workday Ratios)
• The histograms for these variables show heavily right-skewed distributions
• These variables need a log transformation
Load Factor and Coincident Peak Ratio Variables
• The histograms for these variables are not skewed
• The data can be used without a log transformation
Data Transformation
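The effect of a log transformation on a right-skewed variable can be sketched as follows. The slides do not state the log base or show the data, so the natural log, the helper names, and the sample values are assumptions made for illustration only.

```python
import math

def log_transform(values):
    """Natural-log transform for strictly positive, right-skewed values."""
    return [math.log(v) for v in values]

def skewness(values):
    """Skewness as the mean cubed z-score (population form)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum(((v - mean) / sd) ** 3 for v in values) / n

# Made-up demand factors with a long right tail (each value doubles the last).
demand_factor = [0.001, 0.002, 0.004, 0.008, 0.016, 0.032, 0.064]
logged = log_transform(demand_factor)

# Raw skewness is clearly positive; after the log it is close to zero.
print(skewness(demand_factor), skewness(logged))
```

Ratios that can be zero would need an offset or a different transform before taking logs. The sample values are strictly positive, so the issue does not arise here.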
How do we segment the customers?
Using the K-means clustering algorithm.
Steps:
1) Choose the number of clusters, k.
2) Generate k random points as cluster centroids.
3) Assign each point to the nearest cluster centroid.
4) Recompute the new cluster centroid.
5) Repeat the two previous steps until some convergence criterion is met (usually when assignment of clusters has not changed over multiple iterations).
Requires the user to choose the number of clusters to be generated beforehand.
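The steps on this slide can be sketched in plain Python as below. This is a minimal illustration, not the implementation used in the project. It assumes the initial centroids are k randomly chosen data points, a common variant of step 2, and it stops as soon as one pass leaves the assignments unchanged. The function names and sample points are hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    """Basic K-means (Lloyd's algorithm) on a list of equal-length tuples."""
    rng = random.Random(seed)
    # Step 2: start from k randomly chosen data points as centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 3: assign each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 5: stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Two well-separated groups of made-up 2-D points.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

Because the result depends on the random start, practical runs usually repeat the algorithm from several starts and keep the solution with the lowest within-cluster sum of squares.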
Finding Optimal Number of Clusters

Index        Reference                       Number of Clusters
KL           Krzanowski and Lai 1988         6
CH           Calinski and Harabasz 1974      10
Hartigan     Hartigan 1975                   5
CCC          Sarle 1983                      10
Scott        Scott and Symons 1971           6
Marriot      Marriott 1971                   6
TrCovW       Milligan and Cooper 1985        3
TraceW       Milligan and Cooper 1985        6
Friedman     Friedman and Rubin 1967         6
Rubin        Friedman and Rubin 1967         6
Cindex       Hubert and Levin 1976           2
DB           Davies and Bouldin 1979         2
Silhouette   Rousseeuw 1987                  2
Duda         Duda and Hart 1973              2
Pseudot2     Duda and Hart 1973              2
Beale        Beale 1969                      2
Ratkowsky    Ratkowsky and Lance 1978        6
Ball         Ball and Hall 1965              3
Ptbiserial   Milligan 1980, 1981             3
Frey         Frey and Van Groenewoud 1972    13
McClain      McClain and Rao 1975            2
Dunn         Dunn 1974                       2
Hubert       Hubert and Arabie 1985          6
SDindex      Halkidi et al. 2000             13
Dindex       Lebart et al. 2000              6
SDbw         Halkidi and Vazirgiannis 2001   15

Optimal number of clusters: 2 or 6
[Figure: bar chart of the count of indices (y-axis) recommending each number of clusters (x-axis, 2-15)]
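The "2 or 6" conclusion can be checked by tallying the recommendations listed on this slide. The sketch below copies the per-index values from the slide and counts every index equally, which is an assumption; the chart on the slide may have been built with a different counting rule.

```python
from collections import Counter

# Number of clusters recommended by each index, as listed on the slide.
recommended_k = {
    "KL": 6, "CH": 10, "Hartigan": 5, "CCC": 10, "Scott": 6, "Marriot": 6,
    "TrCovW": 3, "TraceW": 6, "Friedman": 6, "Rubin": 6, "Cindex": 2,
    "DB": 2, "Silhouette": 2, "Duda": 2, "Pseudot2": 2, "Beale": 2,
    "Ratkowsky": 6, "Ball": 3, "Ptbiserial": 3, "Frey": 13, "McClain": 2,
    "Dunn": 2, "Hubert": 6, "SDindex": 13, "Dindex": 6, "SDbw": 15,
}

# Count how many indices vote for each candidate number of clusters.
votes = Counter(recommended_k.values())
for k, count in votes.most_common():
    print(f"k={k}: {count} indices")
```

Under this equal-weight tally, k = 6 and k = 2 receive the most votes, which matches the slide's conclusion.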
Are the clusters really different from each other?
Kruskal-Wallis Test: for each variable, at least two clusters are statistically different.

Variable                DF   Chi-Square   P-value
DemandFactor            5    2586.1       < 0.0001
Load_Factor             5    2928.4       < 0.0001
CoincidentUsageRatio    5    3179.2       < 0.0001
Coincident_Peak_Ratio   5    3022.9       < 0.0001
Wknight_wkday_Ratio     5    1504.9       < 0.0001
Wkday_wkend_Ratio       5    1335.8       < 0.0001

Variable                           Description
Demand Factor                      Customer's Peak Consumption / Peak System Load
Load Factor                        Customer's Avg Energy Usage / Customer's Peak Consumption
Coincident Usage Ratio             Customer's Coincident Peak Usage / Peak System Load
Coincident Peak Ratio              Customer's Coincident Peak Usage / Customer's Peak Consumption
Weekday to Weekend Usage Ratio     Weekday Total Usage / Weekend Total Usage
Worknight to Workday Usage Ratio   Worknight Total Usage / Workday Total Usage
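For readers unfamiliar with the test, the sketch below computes the Kruskal-Wallis H statistic (the chi-square column on this slide) and its degrees of freedom in plain Python, using the standard tie correction. The function names and sample groups are hypothetical. In practice a statistics library such as SciPy's `scipy.stats.kruskal` would also supply the p-value.

```python
from collections import Counter

def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Return (H, degrees of freedom) with the standard tie correction."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    h /= 1 - ties / (n ** 3 - n)
    return h, len(groups) - 1

# Three made-up groups standing in for one variable split by cluster.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h, df = kruskal_wallis_h(groups)
print(h, df)
```

With six clusters the degrees of freedom are 6 - 1 = 5, which matches the DF column on the slide.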
Are the clusters really different from each other?
Post-Hoc Analysis: Dunn test for multiple pairwise comparisons
[Figure: four panels (Load Factor, Demand Factor, Coincident Usage Ratio, Coincident Peak Ratio), each showing Mean Rank (y-axis) by Group 1-6 (x-axis)]
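The mean ranks plotted on this slide are the quantities the Dunn test compares. The sketch below computes each group's mean rank and the pairwise Dunn z statistics with a tie-corrected variance. The Bonferroni adjustment is an assumption, since the slides do not say which multiple-comparison correction was applied, and the function names and sample data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def average_ranks(values):
    """Rank values 1..N; tied values share the average of their ranks."""
    positions = {}
    for pos, v in enumerate(sorted(values), start=1):
        positions.setdefault(v, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def dunn_test(groups):
    """Mean ranks plus pairwise (z, Bonferroni-adjusted two-sided p) values."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    mean_ranks, start = [], 0
    for g in groups:
        mean_ranks.append(sum(ranks[start:start + len(g)]) / len(g))
        start += len(g)
    # Variance term shared by all pairs, reduced when there are ties.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    base_var = n * (n + 1) / 12 - ties / (12 * (n - 1))
    n_pairs = len(groups) * (len(groups) - 1) // 2
    results = {}
    for i, j in combinations(range(len(groups)), 2):
        se = math.sqrt(base_var * (1 / len(groups[i]) + 1 / len(groups[j])))
        z = (mean_ranks[i] - mean_ranks[j]) / se
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
        results[(i + 1, j + 1)] = (z, min(1.0, p * n_pairs))  # Bonferroni
    return mean_ranks, results

# Three made-up groups standing in for one variable split by cluster.
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
mean_ranks, results = dunn_test(groups)
for pair, (z, p_adj) in results.items():
    print(pair, round(z, 3), round(p_adj, 4))
```

Pairs whose adjusted p-value falls below the chosen significance level are the clusters that differ on that variable.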
Are the clusters really different from each other?
[Figure: two panels (Weekday vs Weekend Usage Ratio, Worknight vs Workday Usage Ratio), each showing Mean Rank (y-axis) by Group 1-6 (x-axis)]