Introduction to click-through rates

Introduction to click-through rates: Predicting CTR with Machine Learning in Python (PowerPoint PPT Presentation)



  1. Introduction to click-through rates. Predicting CTR with Machine Learning in Python. Kevin Huo, Instructor.

  2. Click-through rates. Click-through rate = # of clicks on ads / # of views of ads. Companies and marketers serving ads want to maximize click-through rate. Prediction of click-through rates is therefore critical for companies and marketers.
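The ratio defined above can be sketched in one line (the counts here are hypothetical, purely for illustration):

```python
# Click-through rate = # of clicks on ads / # of views of ads
clicks = 40    # hypothetical click count
views = 1000   # hypothetical view (impression) count
ctr = clicks / views
print(ctr)  # 0.04
```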

  3. A classification lens. Classification: assigning categories to observations. Classifiers use training data and are evaluated on testing data. Target: a binary variable, 0/1 for non-click or click. Feature: any variable used to help predict the target.

  4. A brief look at sample data. Each row represents a particular outcome, click or no click, for a given user and a given ad. Filtering for columns can be done through .isin(): df.loc[:, df.columns.isin(['device'])]. Assuming y is a column of clicks, CTR can be found by: y.sum()/len(y).
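A minimal runnable sketch of the column filtering and CTR computation above, using a small hypothetical DataFrame in place of the course dataset:

```python
import pandas as pd

# Hypothetical impression-level data: one row per ad shown to a user
df = pd.DataFrame({
    'device': ['mobile', 'desktop', 'mobile', 'mobile'],
    'click':  [1, 0, 0, 1],
})

# Keep only the columns named in the list, via a boolean column mask
device_only = df.loc[:, df.columns.isin(['device'])]
print(device_only.columns.tolist())  # ['device']

# CTR = total clicks / number of impressions
y = df['click']
ctr = y.sum() / len(y)
print(ctr)  # 0.5
```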

  5. Analyzing features.
     print(df.device_type.value_counts())
     1    45902
     0     2947
     print(df.groupby('device_type')['click'].sum())
     0      633
     1     7890
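The two calls on the slide can be reproduced on a small hypothetical frame (the counts printed on the slide come from the course dataset; the data below is made up):

```python
import pandas as pd

# Hypothetical data with the same columns as the slide
df = pd.DataFrame({
    'device_type': [1, 1, 0, 1, 0, 1],
    'click':       [1, 0, 0, 1, 1, 0],
})

# Number of impressions per device type
counts = df['device_type'].value_counts()
print(counts.to_dict())            # {1: 4, 0: 2}

# Number of clicks per device type
clicks_by_device = df.groupby('device_type')['click'].sum()
print(clicks_by_device.to_dict())  # {0: 1, 1: 2}
```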

  6. Let's practice!

  7. Overview of machine learning models.

  8. Logistic regression. Logistic regression: a linear classifier relating the independent variables (features) to a binary dependent variable (the target).

  9. Training the model. Create the model via: clf = LogisticRegression(). Each classifier has a fit() method which takes in an X_train and y_train: clf.fit(X_train, y_train). X_train is the matrix of training features; y_train is the vector of training targets. The classifier should only see training data, to avoid "seeing the answers beforehand".
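A runnable sketch of the fit step, with a tiny synthetic training set standing in for the course data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Tiny synthetic training set: one feature, binary click target
X_train = np.array([[0.1], [0.2], [0.8], [0.9]])
y_train = np.array([0, 0, 1, 1])

clf = LogisticRegression()
clf.fit(X_train, y_train)  # the classifier sees only training data
print(clf.classes_)  # [0 1]
```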

  10. Testing the model. Each classifier has a predict() method which takes in an X_test to generate predicted labels, as follows: array([0, 1, 1, ..., 1, 0, 1]). The predict_proba() method produces probability scores: array([[0.2, 0.8], [0.4, 0.6], ..., [0.1, 0.9], [0.3, 0.7]]). Each score reflects the probability of a particular ad being clicked by a particular user.
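predict() and predict_proba() in minimal runnable form (the training and test data are synthetic assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X_train = np.array([[0.0], [0.2], [0.8], [1.0]])
y_train = np.array([0, 0, 1, 1])
clf = LogisticRegression().fit(X_train, y_train)

X_test = np.array([[0.1], [0.9]])
y_pred = clf.predict(X_test)       # hard 0/1 click labels
proba = clf.predict_proba(X_test)  # one [P(no click), P(click)] row per ad
print(proba.shape)  # (2, 2)
```

Each row of the probability matrix sums to 1, one column per class.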

  11. Evaluating the model. Accuracy: the percentage of test targets correctly identified: accuracy_score(y_test, y_pred). It should not be the only metric used to evaluate a model, particularly on imbalanced datasets. CTR prediction is an example where the classes are imbalanced.
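The imbalance caveat can be made concrete: on a hypothetical test set where only 2 of 10 impressions are clicks, a model that never predicts a click still scores 80% accuracy.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical imbalanced test set: 8 non-clicks, 2 clicks
y_test = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_pred = np.zeros(10, dtype=int)  # always predict "no click"

acc = accuracy_score(y_test, y_pred)
print(acc)  # 0.8
```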

  12. Let's practice!

  13. CTR prediction using decision trees.

  14. Decision trees. Sample outcomes are shown in the table below:

      age          | is_student | loan
      middle_aged  | -          | 1
      youth        | no         | 0
      youth        | yes        | 1

      The first split is based on the age of the applicant. For the youth group, the second split is based on student status. The model provides heuristics for understanding. Nodes represent the features; branches represent the decisions based on those features.

  15. Training and testing the model. Create via: clf = DecisionTreeClassifier(). Similar to logistic regression, a decision tree also involves clf.fit(X_train, y_train) for training data and clf.predict(X_test) for testing labels: array([0, 1, 1, ..., 1, 0, 1]). clf.predict_proba(X_test) gives probability scores: array([[0.2, 0.8], [0.4, 0.6], ..., [0.1, 0.9], [0.3, 0.7]]). Example of randomly splitting training and testing data, where testing data is 30% of the total sample size: train_test_split(X, y, test_size=0.3, random_state=0).
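A runnable sketch of the full split/fit/predict flow described above, on synthetic features and labels (an assumption, not the course dataset):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic features and click labels
rng = np.random.RandomState(0)
X = rng.rand(100, 3)
y = (X[:, 0] > 0.5).astype(int)

# Hold out 30% of the rows as the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)         # 0/1 labels
y_score = clf.predict_proba(X_test)  # probability scores per class
print(len(X_test))  # 30
```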

  16. Evaluation with the ROC curve. True positive rate (Y-axis) = #(classifier predicts positive, actually positive) / #(positives). False positive rate (X-axis) = #(classifier predicts positive, actually negative) / #(negatives). Dotted blue line: baseline AUC of 0.5. We want the orange line (the model's AUC) to be as close to 1 as possible.
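The two rate definitions can be computed directly from counts; the labels below are hypothetical:

```python
import numpy as np

# Hypothetical predictions vs. actual labels
y_true = np.array([1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0])

# TPR = #(predicted positive, actually positive) / #(positives)
tp = int(np.sum((y_pred == 1) & (y_true == 1)))
tpr = tp / int(np.sum(y_true == 1))

# FPR = #(predicted positive, actually negative) / #(negatives)
fp = int(np.sum((y_pred == 1) & (y_true == 0)))
fpr = fp / int(np.sum(y_true == 0))
print(tpr, fpr)  # 2/3 and 1/3
```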

  17. AUC of the ROC curve. Y_score = clf.predict_proba(X_test); then fpr, tpr, thresholds = roc_curve(Y_test, Y_score[:, 1]). roc_curve() inputs: the test and score arrays. Then roc_auc = auc(fpr, tpr). auc() inputs: the false-positive and true-positive rate arrays. If the model is accurate but CTR is low, you may want to reassess how the ad message is relayed and what audience it is targeted at.
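The roc_curve()/auc() calls above, wired together on a tiny hypothetical label vector and predict_proba-style score matrix:

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

# Hypothetical test labels and probability scores
y_test = np.array([0, 0, 1, 1])
y_score = np.array([[0.90, 0.10],
                    [0.60, 0.40],
                    [0.65, 0.35],
                    [0.20, 0.80]])

# roc_curve takes the test labels and the positive-class scores
fpr, tpr, thresholds = roc_curve(y_test, y_score[:, 1])
roc_auc = auc(fpr, tpr)
print(roc_auc)  # 0.75
```

Here 3 of the 4 positive/negative score pairs are ranked correctly, which is what AUC = 0.75 measures.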

  18. Let's practice!
