
Unsupervised Ensemble of Ranking - PowerPoint PPT Presentation



  1. Unsupervised Ensemble of Ranking Models for News Comments Using Pseudo Answers
  Soichiro Fujita 1, Hayato Kobayashi 2, Manabu Okumura 1
  1. Tokyo Institute of Technology  2. Yahoo Japan Corporation / RIKEN AIP

  2. Background
  Task: Ranking comments on online news services
  Goal: Display high-quality comments
  Problem: "high quality" has complex factors
  [Figure: a ranking model orders user comments (No.1, No.2, No.3) for a news article]

  3. Ranking news comments is difficult
  • We have various situations for judging whether a comment is good:
    - Indicating rare user experiences
    - Providing new ideas
    - Causing discussions
  • Ranking models often fail to capture this information
  How to deal with this problem? → Ensemble techniques: if we prepare many models, some models can capture this information

  4. Two Basic Ensemble Techniques (sketched in code below)
  • Selecting: pick one model's output as the final output
    ✓ Denoises lower-accuracy models
    ✘ Depends on a single model output
  • Averaging: combine (⨁) all model outputs into the final output
    ✓ Makes up for other models' mistakes
    ✘ Lower-accuracy models can add noise
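A minimal sketch of the two techniques (not from the paper), assuming each row of a NumPy array `R` holds one model's ranking scores for the same set of comments, and `quality` is a hypothetical per-model accuracy estimate:

```python
import numpy as np

def select_best(R, quality):
    """Selecting: keep the single output judged best (here by an
    externally supplied per-model quality score)."""
    return R[int(np.argmax(quality))]

def average_outputs(R):
    """Averaging: combine all outputs, so strong models can make up
    for others' mistakes, but weak models add noise."""
    return R.mean(axis=0)
```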

  5. Proposed method: HPA ensemble
  • Hybrid method using the Pseudo Answer
    - Hybrid of an output-selection method and a typical averaging method
    - Dynamic denoising of outputs via a pseudo answer r̄, treated as an ideal ranking
  [Figure: each ranking is weighted by its similarity to r̄ (Weighting), and lower-accuracy outputs are removed (Selecting)]

  6. Step 1: Calculate a pseudo answer
  Normalize and average (⨁) the rankings of all models to obtain the pseudo answer:
  $\bar{r} = \frac{1}{|R|} \sum_{r \in R} \frac{r}{\| r \|}$
  where R is the set of ranking-score vectors r; each element of r is the ranking score of one comment.
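A sketch of Step 1, assuming the rankings are stacked in a NumPy array `R` of shape (num_models, num_comments); reading $\|r\|$ as the L2 norm is an assumption consistent with the formula above:

```python
import numpy as np

def pseudo_answer(R):
    """Pseudo answer r_bar = (1/|R|) * sum over r in R of r / ||r||:
    L2-normalize each model's ranking-score vector, then average."""
    normalized = R / np.linalg.norm(R, axis=1, keepdims=True)
    return normalized.mean(axis=0)
```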

  7. Step 2: Calculate similarity scores of each predicted ranking
  Regard the pseudo answer r̄ as an ideal ranking and use NDCG as the similarity function:
  $g(r) = \mathrm{NDCG}(\bar{r}, r)$
  [Figure: each model's ranking receives a similarity score, e.g. 0.92, 0.87, 0.75, 0.65]
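A sketch of Step 2 under the same assumptions. The slide does not specify the gain and discount conventions for NDCG, so this uses the pseudo-answer scores directly as gains with the usual log2 discount:

```python
import numpy as np

def dcg(gains):
    """Discounted cumulative gain of gains listed in rank order."""
    discounts = 1.0 / np.log2(np.arange(2, len(gains) + 2))
    return float(np.sum(gains * discounts))

def similarity(r_bar, r):
    """g(r) = NDCG(r_bar, r): score ranking r against the pseudo
    answer, using the pseudo-answer scores as graded relevance."""
    order = np.argsort(-r)              # comment order induced by r
    ideal_gains = np.sort(r_bar)[::-1]  # best achievable order
    return dcg(r_bar[order]) / dcg(ideal_gains)
```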

  8. Step 3: Calculate the final ranking from similarity scores
  • Selecting: keep the top-k models with the highest similarity scores
  • Weighting: take the weighted average (⨁) of the selected models' rankings as the final ranking
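Putting the three steps together, a sketch of the HPA ensemble built on `pseudo_answer` and `similarity` from the previous sketches. Normalizing the similarity scores into weights, and averaging the L2-normalized rankings, are assumptions; the slide only says "weighted average of selected models":

```python
import numpy as np

def hpa_ensemble(R, k):
    """HPA sketch: select the top-k rankings by similarity to the
    pseudo answer, then average them weighted by those similarities."""
    r_bar = pseudo_answer(R)                              # Step 1
    scores = np.array([similarity(r_bar, r) for r in R])  # Step 2
    top = np.argsort(-scores)[:k]                         # Step 3: selecting
    weights = scores[top] / scores[top].sum()             # Step 3: weighting
    normalized = R[top] / np.linalg.norm(R[top], axis=1, keepdims=True)
    return weights @ normalized   # final ranking score per comment
```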

  9. Experimental Settings
  Dataset: YJ Constructive Comment Ranking Dataset
    Train 1,300 articles, Validation 113 articles, Test 200 articles (each article associated with more than 100 comments)
  Models: LSTM-based RankNet; 100 different models prepared by random initialization
  Metrics: NDCG@k and Precision@k (k ∈ {1, 5, 10})
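For concreteness, hedged sketches of the two metrics. The gold relevance labels `rel`, the cutoff `threshold` for a "good" comment, and the use of labels directly as NDCG gains are all assumptions; the paper defines the exact conventions:

```python
import numpy as np

def precision_at_k(rel, pred, k, threshold=1):
    """Fraction of the k top-scored comments whose gold label
    meets `threshold` (a hypothetical cutoff)."""
    top = np.argsort(-pred)[:k]
    return float(np.mean(rel[top] >= threshold))

def ndcg_at_k(rel, pred, k):
    """NDCG@k with gold labels used directly as gains."""
    discounts = 1.0 / np.log2(np.arange(2, k + 2))
    top = np.argsort(-pred)[:k]
    ideal = np.sort(rel)[::-1][:k]
    return float(np.sum(rel[top] * discounts) / np.sum(ideal * discounts))
```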

  10. Evaluation Results
  Methods   | NDCG@1 NDCG@5 NDCG@10 | Prec@1 Prec@5 Prec@10 | Notes
  RankNet   |  76.35  77.97   79.52 |  15.0   33.20   42.99 | Best single model
  NormAvg   |  79.83  80.77   82.16 |  17.08  37.18   46.48 | Unsupervised baseline
  SupWeight |  78.64  80.33   81.94 |  16.28  35.47   46.58 | Supervised baseline
  HPA       |  79.87  81.43   82.33 |  17.08  37.39   47.34 | Ours
  SPA       |  79.68  80.96   82.19 |  17.08  35.87   46.68 | Ours w/o weighting
  WPA       |  79.87  81.39   82.17 |  17.08  37.88   46.63 | Ours w/o selecting
  This is a part of the results; please see Table 1 in our paper for the other baselines.

  11. Evaluation Results (same table as slide 10)
  Our method achieved the best performance

  12. Evaluation Results (same table as slide 10)
  Hybrid of weighting and selecting is effective

  13. Conclusion
  Proposed method:
  - A hybrid unsupervised ensemble method using pseudo answers
  Results:
  - Our method achieved the best performance
  - Denoising predicted rankings using the pseudo answer is effective
  Future work:
  - Combine various types of network structures
  - Investigate the effectiveness of our method on other ranking datasets
