Customer and product segmentation basics
MACHINE LEARNING FOR MARKETING IN PYTHON
Karolis Urbonas, Head of Analytics & Science, Amazon

Data format

    # Customer by product/service matrix
    wholesale.head()

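As a rough illustration of what a customer by product matrix looks like, the sketch below builds one from a small transaction log. The `transactions` table, its column names, and its values are all hypothetical and are not part of the `wholesale` dataset used in these slides.

```python
import pandas as pd

# Hypothetical transaction log: one row per purchase (made-up values)
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3],
    "product": ["Fresh", "Milk", "Fresh", "Fresh", "Grocery"],
    "spend": [100, 20, 50, 25, 70],
})

# One row per customer, one column per product, total spend in each cell
matrix = transactions.pivot_table(
    index="customer_id",
    columns="product",
    values="spend",
    aggfunc="sum",
    fill_value=0,  # customers who never bought a product get 0
)
print(matrix)
```

Each row is then a customer and each column a product category, which is the shape the clustering models in the following slides expect.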
Unsupervised learning models
- Hierarchical clustering
- K-means
- Non-negative matrix factorization (NMF)
- Biclustering
- Gaussian mixture models (GMM)
- And many more

Unsupervised learning steps
1. Initialize the model
2. Fit the model
3. Assign cluster values
4. Explore results

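The four steps above can be sketched end to end on toy data. This is a minimal illustration, assuming K-means as the model and a made-up two-column dataset (`toy`, `product_a`, `product_b` are hypothetical names), not the course's own walkthrough.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical toy data: two well-separated groups of 50 customers each
rng = np.random.default_rng(0)
toy = pd.DataFrame(
    np.vstack([
        rng.normal(0, 1, size=(50, 2)),
        rng.normal(10, 1, size=(50, 2)),
    ]),
    columns=["product_a", "product_b"],
)

# 1. Initialize the model
model = KMeans(n_clusters=2, n_init=10, random_state=0)

# 2. Fit the model
model.fit(toy)

# 3. Assign cluster values
toy_segmented = toy.assign(segment=model.labels_)

# 4. Explore results
segment_summary = toy_segmented.groupby("segment").mean()
print(segment_summary)
```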
Explore variables

    wholesale.agg(['mean','std']).round(0)

              Fresh    Milk  Grocery  Frozen  Detergents_Paper  Delicassen
    mean    12000.0  5796.0   7951.0  3072.0            2881.0      1525.0
    std     12647.0  7380.0   9503.0  4855.0            4768.0      2820.0

    # Get the statistics
    import numpy as np
    averages = wholesale.mean()
    st_dev = wholesale.std()
    x_names = wholesale.columns
    x_ix = np.arange(wholesale.shape[1])

    # Plot the data: side-by-side bars for each variable
    import matplotlib.pyplot as plt
    plt.bar(x_ix-0.2, averages, color='grey', label='Average', width=0.4)
    plt.bar(x_ix+0.2, st_dev, color='orange', label='Standard Deviation', width=0.4)
    plt.xticks(x_ix, x_names, rotation=90)
    plt.legend()
    plt.show()

Bar chart of averages and standard deviations

Visualize pairwise plot to explore distributions

    import seaborn as sns
    sns.pairplot(wholesale, diag_kind='kde')
    plt.show()

Pairwise plot review

Let's practice!

Data preparation for segmentation

Model assumptions
- First, we'll start with K-means
- K-means clustering works well when data is 1) approximately normally distributed (no skew), and 2) standardized (mean = 0, standard deviation = 1)
- The second model, NMF, can be used on raw data, especially if the matrix is sparse

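One way to check the first assumption numerically, rather than only by eye, is to compute the sample skewness of each column. The sketch below uses a synthetic right-skewed `spend` column (a hypothetical stand-in for a real spend variable) and pandas' `Series.skew()`. Values near 0 suggest a roughly symmetric distribution.

```python
import numpy as np
import pandas as pd

# Hypothetical right-skewed spend data drawn from a log-normal distribution
rng = np.random.default_rng(42)
skewed = pd.DataFrame({"spend": rng.lognormal(mean=8, sigma=1, size=1000)})

# Sample skewness before and after a log transformation
skew_raw = skewed["spend"].skew()
skew_log = np.log(skewed["spend"]).skew()

print(f"raw skew: {skew_raw:.2f}, log skew: {skew_log:.2f}")
```

On this synthetic data the raw column is strongly right-skewed, while the log-transformed column is close to symmetric.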
Unskewing data with log-transformation

    # First option - log transformation
    wholesale_log = np.log(wholesale)
    sns.pairplot(wholesale_log, diag_kind='kde')
    plt.show()

Explore log-transformed data

Unskewing data with Box-Cox transformation

    # Second option - Box-Cox transformation
    from scipy import stats

    def boxcox_df(x):
        # stats.boxcox returns the transformed values and the fitted lambda
        x_boxcox, _ = stats.boxcox(x)
        return x_boxcox

    wholesale_boxcox = wholesale.apply(boxcox_df, axis=0)
    sns.pairplot(wholesale_boxcox, diag_kind='kde')
    plt.show()

Explore Box-Cox transformed data

Scale the data
- Subtract column average from each column value
- Divide each column value by column standard deviation
- Will use the StandardScaler class from sklearn

    import pandas as pd
    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler()
    scaler.fit(wholesale_boxcox)
    wholesale_scaled = scaler.transform(wholesale_boxcox)
    # transform() returns a NumPy array, so restore the index and column names
    wholesale_scaled_df = pd.DataFrame(data=wholesale_scaled,
                                       index=wholesale_boxcox.index,
                                       columns=wholesale_boxcox.columns)
    wholesale_scaled_df.agg(['mean','std']).round()

          Fresh  Milk  Grocery  Frozen  Detergents_Paper  Delicassen
    mean   -0.0   0.0      0.0     0.0              -0.0         0.0
    std     1.0   1.0      1.0     1.0               1.0         1.0

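The two scaling steps can also be written out by hand to see what `StandardScaler` computes. The sketch below uses a small made-up table (`toy` and its values are hypothetical). One detail worth noting: `StandardScaler` divides by the population standard deviation (`ddof=0`), whereas pandas' `.std()` defaults to the sample version (`ddof=1`), so the manual version has to pass `ddof=0` to match.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data with two columns on different scales
toy = pd.DataFrame({
    "Fresh": [1.0, 2.0, 3.0, 4.0],
    "Milk": [10.0, 20.0, 30.0, 60.0],
})

# Manual standardization: subtract the column mean, divide by the column std
manual = (toy - toy.mean()) / toy.std(ddof=0)

# The same operation with scikit-learn
scaled = StandardScaler().fit_transform(toy)

# Both approaches should agree up to floating-point error
print(np.allclose(manual.values, scaled))
```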
Let's practice!

Build customer and product segmentation

Segmentation steps with K-means
Segmentation with K-means (for k number of clusters):

    from sklearn.cluster import KMeans

    kmeans = KMeans(n_clusters=k)
    kmeans.fit(wholesale_scaled_df)
    # Attach the cluster labels to the original (unscaled) data
    wholesale_kmeans4 = wholesale.assign(segment = kmeans.labels_)

Segmentation steps with NMF
Segmentation with NMF (k number of clusters):

    from sklearn.decomposition import NMF

    nmf = NMF(k)
    nmf.fit(wholesale)
    components = pd.DataFrame(nmf.components_, columns=wholesale.columns)

Extracting segment assignment:

    # Each customer gets one weight per component; the largest weight picks the segment
    segment_weights = pd.DataFrame(nmf.transform(wholesale), columns=components.index)
    segment_weights.index = wholesale.index
    wholesale_nmf = wholesale.assign(segment = segment_weights.idxmax(axis=1))

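A self-contained version of the NMF steps, run on synthetic non-negative data, might look like the sketch below. The `toy` table, its values, and the `init`, `max_iter`, and `random_state` settings are assumptions chosen for illustration, not values taken from the slides.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import NMF

# Hypothetical non-negative spend matrix: 30 customers by 4 product categories
rng = np.random.default_rng(1)
toy = pd.DataFrame(
    rng.uniform(0, 100, size=(30, 4)),
    columns=["Fresh", "Milk", "Grocery", "Frozen"],
)

# Factorize into 3 components (NMF requires non-negative input)
nmf = NMF(n_components=3, init="nndsvda", max_iter=1000, random_state=0)
weights = nmf.fit_transform(toy)

# Component loadings: one row per segment, one column per product
components = pd.DataFrame(nmf.components_, columns=toy.columns)

# Customer weights: one row per customer, one column per segment
segment_weights = pd.DataFrame(weights, index=toy.index, columns=components.index)

# Assign each customer to the segment with the largest weight
toy_nmf = toy.assign(segment=segment_weights.idxmax(axis=1))
print(toy_nmf["segment"].value_counts())
```

Unlike K-means, the assignment here is a post-processing choice: NMF itself returns soft weights, and taking the largest weight per customer is what turns them into hard segments.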
How to initialize the number of segments?
- Both K-means and NMF require setting the number of clusters (k)
- Two ways to define k: 1) mathematically, 2) test & learn
- We'll explore the mathematical elbow criterion method to get a ballpark estimate

Elbow criterion method
- Iterate through a number of k values
- Run clustering for each on the same data
- Calculate sum of squared errors (SSE) for each
- Plot SSE against k and identify the "elbow": the point of diminishing incremental improvements in error reduction

Calculate sum of squared errors and plot the results

    sse = {}
    for k in range(1, 11):
        kmeans = KMeans(n_clusters=k, random_state=333)
        kmeans.fit(wholesale_scaled_df)
        # inertia_ is the sum of squared distances to the nearest cluster center
        sse[k] = kmeans.inertia_

    plt.title('Elbow criterion method chart')
    sns.pointplot(x=list(sse.keys()), y=list(sse.values()))
    plt.show()

Identifying the optimal number of segments

Test & learn method
- First, calculate the mathematically optimal number of segments
- Build segmentation with multiple values around the optimal k value
- Explore the results and choose the one with the most business relevance (Can you name the segments? Are they ambiguous / overlapping?)

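A minimal sketch of the test & learn loop, assuming the elbow method suggested k = 3 and that the neighbors 2 and 4 are also worth trying. The `toy` data and the candidate k values are hypothetical. Comparing segment sizes is only a first look; judging business relevance still means inspecting the segment profiles.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical standardized data: 200 customers, 3 product categories
rng = np.random.default_rng(7)
toy = pd.DataFrame(rng.normal(size=(200, 3)), columns=["Fresh", "Milk", "Grocery"])

# Build one segmentation per candidate k and record the segment sizes
segment_sizes = {}
for k in [2, 3, 4]:
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=333)
    labels = kmeans.fit_predict(toy)
    segment_sizes[k] = pd.Series(labels).value_counts().sort_index()

# Rows are k values, columns are segment labels (NaN where a label is unused)
size_table = pd.DataFrame(segment_sizes).T
print(size_table)
```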
Let's build customer segments!

Visualize and interpret segmentation solutions

Methods to explore segments
- Calculate average / median / other percentile values for each variable by segment
- Calculate relative importance for each variable by segment
- We can explore the data table or plot it (a heatmap is a good choice)

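The slide does not define relative importance, so the sketch below assumes one common definition: the segment average divided by the population average, minus 1. A value of 0 means the segment matches the overall average, positive values mean the segment over-indexes on that variable, and negative values mean it under-indexes. The `toy` table and its values are hypothetical.

```python
import pandas as pd

# Hypothetical segmented data: four customers in two segments
toy = pd.DataFrame({
    "Fresh": [100, 120, 10, 20],
    "Milk": [10, 30, 200, 160],
    "segment": [0, 0, 1, 1],
})

# Average of each variable within each segment
segment_avg = toy.groupby("segment").mean()

# Average of each variable across all customers
population_avg = toy.drop(columns="segment").mean()

# Assumed definition: ratio to the population average, centered at 0
relative_importance = segment_avg / population_avg - 1
print(relative_importance.round(2))
```

The resulting table can be passed to `sns.heatmap` in the same way as the segment averages.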
Analyze average K-means segmentation attributes

    kmeans4_averages = wholesale_kmeans4.groupby(['segment']).mean().round(0)
    print(kmeans4_averages)

Plot average K-means segmentation attributes

    sns.heatmap(kmeans4_averages.T, cmap='YlGnBu')
    plt.show()

Plot average NMF segmentation attributes

    nmf4_averages = wholesale_nmf4.groupby('segment').mean().round(0)
    sns.heatmap(nmf4_averages.T, cmap='YlGnBu')
    plt.show()

Let's build 3-segment solutions!

Congratulations!

What have we learned?
- Different types of machine learning: supervised, unsupervised, reinforcement
- Machine learning steps
- Data preparation techniques for different kinds of models
- Predict telecom customer churn with logistic regression and decision trees
- Calculate customer lifetime value
- Predict next month transactions with linear regression
- Measure model performance with multiple metrics
- Segment customers based on their product purchase history with K-means and NMF

What's next?
- Dive deeper into each topic
- Explore the datasets, change the parameters and try to improve model accuracy or segmentation interpretability
- Take on a project with another dataset, and build commented models by yourself
- Write a blog post with a link to your GitHub code once you finish your project
- Test your knowledge in your job

Thank you and great learning!