Applications of metric evaluation



  1. Applications of metric evaluation
     Predicting CTR with Machine Learning in Python
     Kevin Huo, Instructor

  2. Four categories of outcomes
     - The first part of the category (true/false) represents whether the model was correct or not.
     - The second part of the category (positive/negative) represents the target label the model applied.

  3. Interpretations of four categories
     - If the model predicts a click, there is a bid for that impression, which costs money.
     - If no click is predicted, there is no bidding and hence no cost.
     - True positives (TP): money gained (impressions paid for that were clicked on).
     - False positives (FP): money lost (impressions that were paid for, but not clicked).
     - True negatives (TN): money saved (no click predicted, so no impressions bought).
     - False negatives (FN): money lost out on (no click predicted, but there would have been an actual click in reality).

  4. Confusion matrix

         from sklearn.metrics import confusion_matrix

         print(confusion_matrix(y_test, y_pred))
         # [[8163 166]
         #  [1517 154]]

         # Order: tn, fp, fn, tp
         print(confusion_matrix(y_test, y_pred).ravel())
         # [8163, 166, 1517, 154]
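To make the tn, fp, fn, tp ordering concrete, here is a minimal plain-Python sketch that counts the four outcomes by hand. The helper `confusion_counts` and the toy labels are hypothetical and only illustrate the idea; they are not part of the course material or of scikit-learn.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, where 1 means "click"."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1 and actual == 1:
            tp += 1  # predicted click, was a click
        elif predicted == 1 and actual == 0:
            fp += 1  # predicted click, was not a click
        elif predicted == 0 and actual == 1:
            fn += 1  # predicted no click, was a click
        else:
            tn += 1  # predicted no click, was not a click
    return tn, fp, fn, tp


# Hypothetical toy labels, for illustration only
y_true = [0, 0, 1, 1, 0, 1, 0, 0]
y_pred = [0, 1, 1, 0, 0, 1, 0, 0]
tn, fp, fn, tp = confusion_counts(y_true, y_pred)
print(tn, fp, fn, tp)
```

The returned tuple follows the same order as the flattened 2x2 matrix on the slide: first row (actual 0), then second row (actual 1).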

  5. ROI analysis
     - Assume some cost c and return r per X number of impressions.
     - Bidding is profitable when the total return exceeds the total spend: tp * r > (tp + fp) * c.

         total_return = tp * r
         total_spent = (tp + fp) * c
         roi = total_return / total_spent
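As a rough worked example, the ROI formulas can be applied to the confusion-matrix counts shown on the previous slide. The values of `r` and `c` below are assumptions chosen only for illustration, and they are treated as per-impression amounts for simplicity; the slides do not give actual figures.

```python
def compute_roi(tp, fp, r, c):
    """ROI = total return / total spend, following the slide's definitions."""
    total_return = tp * r        # return earned on clicked impressions
    total_spent = (tp + fp) * c  # every predicted click triggers a paid bid
    return total_return / total_spent


# Counts from the confusion matrix slide (order: tn, fp, fn, tp)
tn, fp, fn, tp = 8163, 166, 1517, 154

# Hypothetical return and cost per impression (not from the source)
r, c = 0.2, 0.05

roi = compute_roi(tp, fp, r, c)
print(roi)            # 1.925 up to floating-point rounding
print(tp * r > (tp + fp) * c)  # profitability check from the slide
```

With these assumed values the return exceeds the spend, so the ROI is above 1; a different `r` or `c` could easily flip that conclusion.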

  6. Let's practice!

  7. Model evaluation

  8. Precision and recall
     - Precision: the proportion of clicks relative to the total number of impressions bid on, TP / (TP + FP).
     - Higher precision means higher ROI on ad spend.
     - Recall: the proportion of clicks gotten out of all clicks available, TP / (TP + FN).
     - Higher recall means better targeting of the relevant audience.
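The two formulas can be checked by hand against the counts from the confusion matrix slide. This is a small sketch for the positive (click) class only; the helper names are hypothetical.

```python
def precision(tp, fp):
    """Share of impressions bid on that were actually clicked."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Share of all available clicks that the model captured."""
    return tp / (tp + fn)


# Counts from the confusion matrix slide (order: tn, fp, fn, tp)
tn, fp, fn, tp = 8163, 166, 1517, 154

click_precision = precision(tp, fp)  # 154 / 320
click_recall = recall(tp, fn)        # 154 / 1671
print(round(click_precision, 3), round(click_recall, 3))
```

For that matrix, roughly half of the paid impressions were clicked, while fewer than one in ten of the available clicks were captured.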

  9. Calculating precision and recall

         from sklearn.metrics import precision_score, recall_score

         print(precision_score(y_test, y_pred, average='weighted'))
         # 0.73
         print(recall_score(y_test, y_pred, average='weighted'))
         # 0.75
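With `average='weighted'`, scikit-learn computes the metric for each class and averages the results weighted by each class's support (its number of true instances). The sketch below reproduces that averaging by hand using the counts from the confusion matrix slide. The results need not match the 0.73 and 0.75 printed above, since those presumably come from predictions not shown here.

```python
# Counts from the confusion matrix slide (order: tn, fp, fn, tp)
tn, fp, fn, tp = 8163, 166, 1517, 154

# Per-class precision and recall
precision_0 = tn / (tn + fn)  # class 0 (no click)
precision_1 = tp / (tp + fp)  # class 1 (click)
recall_0 = tn / (tn + fp)
recall_1 = tp / (tp + fn)

# Support: number of true instances of each class
support_0 = tn + fp
support_1 = fn + tp
total = support_0 + support_1

weighted_precision = (precision_0 * support_0 + precision_1 * support_1) / total
weighted_recall = (recall_0 * support_0 + recall_1 * support_1) / total
print(round(weighted_precision, 4), round(weighted_recall, 4))
```

Note that weighted recall reduces to plain accuracy, (tn + tp) / total, so on imbalanced click data it is dominated by the majority no-click class.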

  10. Baseline classifiers
      - It is important to evaluate classifiers relative to an appropriate baseline.
      - The baseline here, due to the imbalanced nature of click data, is a classifier that always predicts no click.

          import numpy as np

          y_pred = np.asarray([0 for x in range(len(X_test))])
          # [[0] [0] ...]

  11. Implications on ROI analysis
      - For the baseline classifier, tp and fp will be zero.
      - Therefore total return and total spend will be zero, and ROI is undefined.
      - Use the confusion matrix via confusion_matrix() along with ravel() to get the four categories of outcomes.

          total_return = tp * r
          total_spent = (tp + fp) * c
          roi = total_return / total_spent
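Because the baseline never bids, dividing by the total spend would raise a `ZeroDivisionError`. One possible way to handle this, sketched below, is to return `None` when nothing was spent. The helper `roi_or_none`, the toy labels, and the values of `r` and `c` are all hypothetical.

```python
def roi_or_none(tp, fp, r, c):
    """Return ROI, or None when no impressions were bought (ROI undefined)."""
    total_spent = (tp + fp) * c
    if total_spent == 0:
        return None
    return (tp * r) / total_spent


# Hypothetical test labels; the baseline always predicts "no click" (0)
y_test = [0, 1, 0, 0, 1, 0]
y_pred = [0] * len(y_test)

tp = sum(1 for a, p in zip(y_test, y_pred) if a == 1 and p == 1)
fp = sum(1 for a, p in zip(y_test, y_pred) if a == 0 and p == 1)

baseline_roi = roi_or_none(tp, fp, r=0.2, c=0.05)
print(tp, fp, baseline_roi)
```

Returning `None` rather than 0 keeps "no spend at all" distinguishable from "spend with no return" when comparing a model against the baseline.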

  12. Let's practice!

  13. Tuning models

  14. Regularization
      - Regularization: addressing overfitting by altering the magnitude of the coefficients of parameters within a model.
      - Regularization can increase performance metrics and hence ROI on ad spend.

  15. Examples of regularization
      - Logistic Regression: the C parameter is the inverse of the regularization strength. From least to most complex: C=0.05 < C=0.5 < C=1
      - Decision Tree: the max_depth parameter controls how many layers deep the tree can grow. From least to most complex: max_depth=3 < max_depth=5 < max_depth=10
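The effect of C can be observed directly on the fitted coefficients. The sketch below assumes scikit-learn and NumPy are installed and uses synthetic data (not the course's click data) purely for illustration. It relies on `LogisticRegression`'s default L2 penalty.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data (an assumption for illustration): only two of five features matter
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
logits = 2.0 * X[:, 0] - 1.0 * X[:, 1]
y = (logits + rng.normal(size=200) > 0).astype(int)

coef_norms = {}
for C in [0.05, 0.5, 1]:
    # Smaller C means stronger regularization (C is the inverse strength)
    clf = LogisticRegression(C=C, max_iter=1000).fit(X, y)
    coef_norms[C] = float(np.linalg.norm(clf.coef_))
    print(f"C={C}: coefficient norm = {coef_norms[C]:.3f}")
```

The coefficient norm should grow as C increases, which is the sense in which larger C corresponds to a more complex model.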

  16. Cross validation
      - For each of the k folds, that fold will be used as a testing set (for validation) while the other k-1 folds are used for training.
      - Therefore, you have k evaluations of model performance.
      - Note that you still have the separate evaluation testing set.
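The fold mechanics described above can be sketched in plain Python. The helper `k_fold_indices` is hypothetical and splits the data into contiguous folds without shuffling; it only illustrates how each fold takes one turn as the validation set.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs, one per fold, using contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


folds = list(k_fold_indices(n_samples=10, k=4))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)
```

Each sample appears in exactly one validation fold, so the k validation scores together cover the whole training set once.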

  17. Examples of cross validation

          from sklearn.model_selection import KFold, cross_val_score
          from sklearn.tree import DecisionTreeClassifier

          # random_state only takes effect when shuffle=True;
          # recent scikit-learn versions raise an error otherwise
          k_fold = KFold(n_splits=4, shuffle=True, random_state=0)
          for i in [3, 5, 10]:
              clf = DecisionTreeClassifier(max_depth=i)
              cv_precision = cross_val_score(
                  clf, X_train, y_train, cv=k_fold,
                  scoring='precision_weighted')

      - Scoring strings: precision_weighted, recall_weighted, roc_auc

  18. Let's practice!

  19. Ensembles and hyperparameter tuning

  20. Ensemble methods
      - Bagging: random samples are selected for different models, then the models are individually trained and combined.
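A toy plain-Python sketch of bagging follows. Each "model" here is a deliberately simple one-feature threshold rule (an assumption made to keep the example short, not the decision trees used in random forests), trained on its own bootstrap sample and combined by majority vote. All names and data are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Toy model: threshold halfway between the two class means.

    Assumes class 1 tends to have larger x than class 0.
    """
    zeros = [x for x, y in zip(xs, ys) if y == 0]
    ones = [x for x, y in zip(xs, ys) if y == 1]
    return (mean(zeros) + mean(ones)) / 2


def bagging_fit(xs, ys, n_models=5, seed=0):
    """Train n_models stumps, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    while len(thresholds) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_y = [ys[i] for i in idx]
        if len(set(sample_y)) < 2:
            continue  # skip samples that happen to miss one class
        thresholds.append(fit_stump([xs[i] for i in idx], sample_y))
    return thresholds


def bagging_predict(thresholds, x):
    """Combine the individual models by majority vote."""
    votes = sum(1 for t in thresholds if x > t)
    return 1 if votes > len(thresholds) / 2 else 0


# Hypothetical one-feature data set
xs = [1.0, 2.0, 3.0, 4.0, 6.0, 7.0, 8.0, 9.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
thresholds = bagging_fit(xs, ys)
print(bagging_predict(thresholds, 0.0), bagging_predict(thresholds, 10.0))
```

The individual thresholds differ because each bootstrap sample differs; voting across them is what smooths out the variation of any single model.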

  21. Random forests

          from sklearn.ensemble import RandomForestClassifier

          clf = RandomForestClassifier()
          print(clf)
          # RandomForestClassifier(bootstrap=True, ... max_depth = 10, ... n_estimators = 100, ...)

  22. Hyperparameter tuning
      - Hyperparameter: a parameter configured before training, and external to a model.
      - Examples of parameters that are NOT hyperparameters: slope coefficient in linear regression, weights in logistic regression, etc.
      - Examples of hyperparameters: max_depth, n_estimators, etc.

  23. Grid search

          from sklearn.model_selection import GridSearchCV

          param_grid = {'n_estimators': n_estimators, 'max_depth': max_depth}
          clf = GridSearchCV(estimator=model, param_grid=param_grid, scoring='roc_auc')
          clf.fit(X_train, y_train)  # best_score_ and best_estimator_ exist only after fitting
          print(clf.best_score_)
          print(clf.best_estimator_)
          # 0.6777
          # RandomForestClassifier(max_depth = 100, ...)

  24. Let's practice!
