Relevance of Time Spent on Web Pages WEBKDD August 20, 2006, Philadelphia, USA Peter I. Hofgesang hpi@few.vu.nl
Intention of an online visitor
• Real-world: customers have the ability to explicitly express what they are looking for
• Web: intention is hidden and can only be partially revealed from implicit indicators in the traces users leave behind
WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
(Broadly) Available information
• Order of visited pages (P1 → P2 → P3 …)
• Page popularity (number of times visited)
• Time Spent on Page (TSP)?
  – claimed to be important in IR, HCI, E-learning
  – only rarely used in WUM
  – details are often not reported (however, preprocessing is not obvious!)
Example I
[Figure: time spent on page (TSoP, seconds; axis 0 to 120) for each of 26 pages visited. Pages in visit order: Home (1), SALES-Rest (2), SALES-Household (3-4), SHOP-Household (5-10), SALES-Rest (11), SALES-Man (12-19), SALES-Woman (20-26).]
Example II
[Figure: time spent on page (TSoP, seconds; axis 0 to 200) for each of 11 pages visited. Pages in visit order: SHOP-PC (1), Info (2), SHOP-TV (3), SHOP-Child (4), SHOP-PC (5), Info (6), SHOP-Child (7), SHOP-Home (8), SHOP-Child (9), SHOP-Home (10), SHOP-PC (11).]
Example III
[Figure: time spent on page (TSoP, seconds; axis 0 to 1500) for each of 25 pages visited. Pages in visit order: Home (1), SHOP-TV (2), CART-Home (3), CART-Add (4-5), CART-Home (6), Order (7-9), Personal (10-15), Home (16), SHOP-DVD (17-25).]
Influential factors I
TSP_1 = T_2 − T_1 (optimistic!)
• Data preprocessing
  – filtering out robot transactions
  – session identification
• Distraction
[Figure: number of clicks (×10^5) versus time (seconds; axis 0 to 110), with legend entries Bank, Retail 1, Retail 2.]
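The optimistic estimate (time of the next click minus time of the current click) and the two preprocessing steps can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the talk: the click log is assumed to be a list of (visitor_id, timestamp_seconds, page) tuples, robots are assumed to be known in advance as a set of visitor ids, and sessions are split with an inactivity timeout of 1800 seconds, a common heuristic that the slides do not mention. The function name `sessions_with_tsp` is hypothetical.

```python
from collections import defaultdict

SESSION_TIMEOUT = 1800  # seconds; assumed inactivity threshold, not from the slides


def sessions_with_tsp(clicks, robot_ids=frozenset(), timeout=SESSION_TIMEOUT):
    """Return a list of sessions; each session is a list of (page, tsp) pairs.

    clicks: iterable of (visitor_id, timestamp_seconds, page) tuples.
    tsp is T2 - T1 to the next click in the same session, or None for the
    last page of a session, where no later timestamp exists.
    """
    by_visitor = defaultdict(list)
    for visitor, ts, page in clicks:
        if visitor not in robot_ids:  # filter out robot transactions
            by_visitor[visitor].append((ts, page))

    sessions = []
    for visitor in sorted(by_visitor):
        hits = sorted(by_visitor[visitor])  # order clicks by timestamp
        current = []
        for i, (ts, page) in enumerate(hits):
            gap = hits[i + 1][0] - ts if i + 1 < len(hits) else None
            if gap is None or gap > timeout:
                current.append((page, None))  # last page of this session
                sessions.append(current)
                current = []
            else:
                current.append((page, gap))
    return sessions


example_clicks = [
    ("u1", 0, "Home"), ("u1", 20, "SHOP-PC"), ("u1", 65, "Info"),
    ("u1", 5000, "Home"),                      # long gap: starts a new session
    ("bot", 1, "Home"), ("bot", 2, "Info"),    # assumed robot, filtered out
]
example_sessions = sessions_with_tsp(example_clicks, robot_ids={"bot"})
```

Note that the last page of every session gets no TSP at all, which is one reason the simple difference is optimistic: the log alone cannot say how long the visitor stayed on the final page.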
Influential factors II
• Page type (granularity of pages)
• Hierarchy
• Network bandwidth and server load
• Speed of reading, etc.
TSP_2 = T_2 − T_1 − T_networkTraffic − T_serverPageGeneration − T_distraction
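The second TSP formula above is plain arithmetic once the correction terms are known, which in practice they often are not; server logs rarely record distraction time directly. A minimal sketch, assuming all terms are given in seconds and that a negative result is clamped to zero (the clamping is an assumption, not part of the formula on the slide); the function name `corrected_tsp` is hypothetical:

```python
def corrected_tsp(t1, t2, network=0.0, server=0.0, distraction=0.0):
    """TSP_2 = T2 - T1 - T_networkTraffic - T_serverPageGeneration - T_distraction.

    All arguments are in seconds. The result is clamped at 0.0 (an assumption)
    so that over-estimated corrections never yield a negative time.
    """
    raw = t2 - t1  # the optimistic estimate TSP_1
    return max(0.0, raw - network - server - distraction)


# 60 s between clicks, minus 1.5 s network, 0.5 s page generation, 20 s distraction
example_tsp2 = corrected_tsp(100.0, 160.0, network=1.5, server=0.5, distraction=20.0)
```

With all correction terms left at their default of zero, the function reduces to the optimistic estimate T2 − T1.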
Clustering
[Figure: two rows of five cluster panels, one row per method. The y axis of every panel is PageId (1 to 40); the x axes run from 0 up to between 0.2 and 1.
Top row, Cadez et al. (2001): panels labeled 27.63%, 21.18%, 19.45%, 19.14%, 12.6%.
Bottom row, similarity based: panels labeled 33.03%, 22.01%, 19.56%, 17.63%, 7.78%.
PageId key: 1 Home, 2 Personal, 3 SHOP-Woman, 4 SHOP-Man, 5 SHOP-Child, 6 SHOP-Beauty, 7 SHOP-Home, 8 SHOP-Household, 9 SHOP-Sport, 10 SHOP-PC, 11 SHOP-GSM, 12 SHOP-DVD, 13 SHOP-TV, 14 SHOP-Games, 15 SHOP-Special, 16 SHOP-Rest, 17 SALES-Woman, 18 SALES-Man, 19 SALES-Child, 20 SALES-Beauty, 21 SALES-Home, 22 SALES-Household, 23 SALES-Sport, 24 SALES-PC, 25 SALES-GSM, 26 SALES-DVD, 27 SALES-TV, 28 SALES-Games, 29 SALES-Rest, 30 Search, 31 CART-Home, 32 CART-Add, 33 CART-Change, 34 CART-Remove, 35 DirectOrder, 36 Order, 37 ThankYou, 38 Info, 39 Contact, 40 Catalog.]
Conclusion
• TSP is a sensitive measure
• Web log data preprocessing and time normalization required
• Added value in identifying user intention
• For many applications the combination of TSP and frequency may be the optimal choice
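The slides do not say how time should be normalized or how TSP and frequency should be combined. One possible reading, given purely as an illustration: within a session, express each page's time as a share of the total known time, express its visit count as a share of all page views, and mix the two with a weight `alpha`. The function name `page_profile`, the share-based normalization, and the weight `alpha` are all assumptions.

```python
from collections import defaultdict


def page_profile(session, alpha=0.5):
    """session: list of (page, tsp_seconds) pairs; tsp may be None if unknown.

    Returns {page: score}, where score mixes the page's share of the known
    time spent in the session (weight alpha) with its share of page views
    (weight 1 - alpha).
    """
    visits = defaultdict(int)
    seconds = defaultdict(float)
    for page, tsp in session:
        visits[page] += 1
        if tsp is not None:
            seconds[page] += tsp

    total_visits = sum(visits.values())
    total_seconds = sum(seconds.values())
    profile = {}
    for page in visits:
        freq_share = visits[page] / total_visits
        time_share = seconds[page] / total_seconds if total_seconds > 0 else 0.0
        profile[page] = alpha * time_share + (1 - alpha) * freq_share
    return profile


example_session = [("Home", 10), ("SHOP-DVD", 70), ("SHOP-DVD", 20), ("Order", None)]
example_profile = page_profile(example_session, alpha=0.5)
```

Setting `alpha=0` gives a pure frequency profile and `alpha=1` a pure time profile, so the same function can be used to compare the three variants on the same sessions.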
Future (current) work
• Objective measures of relevance
• Normally a field experiment is needed to provide some kind of labeled data
• Special testbed
  – e.g., in a retail shop environment we have special labels for buyers
  – the purchased items indicate user interest and can be compared with the visit
Questions?