Usage Aware Average-Clicks
Kalyan Beemanapalli – University of Minnesota
Ramya Rangarajan – University of Minnesota
Jaideep Srivastava – University of Minnesota
Presenter: Kalyan Beemanapalli
WebKDD 2006 Workshop on Knowledge Discovery on the Web, Aug. 20, 2006, at KDD 2006, Philadelphia, PA, USA
Outline
- Introduction
- Related Work
- Background
- Method
- Experiments and Results
- Key Contributions
- Conclusions and Future Work
- Questions
Related Work – Link Analysis
- Applications
  - PageRank
  - HITS
  - Average-Clicks (Matsuo et al.)
- Disadvantage
  - Static: based on the link structure alone
Related Work
- Solution: usage data
- Why Usage Aware Average-Clicks?
  - Average-Clicks is a fairly new algorithm
  - Proposes a new definition of distance between web pages
  - Measures distance in the user's context
- Ideas from
  - Usage Aware PageRank (Oztekin et al.)
  - Extensions to HITS (Miller et al.)
Average-Clicks
- A measure of distance between web pages
- Definition: an average click is one click among n links
- Probability of a random surfer on page p clicking any one of its links: α / Outdegree(p), where α = damping factor
Average-Clicks
- Average-click length of a link on page p: -log_n(α / Outdegree(p)), where α = damping factor and n = average number of links on a page
- Distance between pages p and q: length of the shortest path between the nodes representing the pages in the link graph
- A path through a longer chain of links can be shorter (in average clicks) than one through fewer links
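For concreteness, a minimal Python sketch of the average-click link length defined above; the specific values α = 0.85 and n = 7 are assumptions used for illustration, not taken from the slides.

```python
import math

def avg_click_length(outdegree, alpha=0.85, n=7):
    """Length, in average clicks, of one link on a page with the given
    out-degree. alpha (damping factor) and n (average links per page)
    are assumed values, not taken from the slides."""
    # Probability of following any single link on the page is alpha / outdegree;
    # expressing it in "average clicks" means taking the log base n.
    return -math.log(alpha / outdegree, n)

print(avg_click_length(7))   # ~1.08: a 7-link page costs about one average click
print(avg_click_length(49))  # ~2.08: a 49-link page costs about two
```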
Average Clicks - Example
Usage Aware Average-Clicks – Usage Graph
- Nodes P, Q, R, S, T represent pages; each node p is weighted by the number of occurrences of page p in the usage data
- Weight of the edge from p to q: C(p, q) = (number of co-occurrences of p and q) / (number of occurrences of p)
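A minimal sketch of how the usage-graph weights C(p, q) might be computed from sessionized logs; treating "co-occurrence" as p and q appearing in the same session is an assumption, and the function name usage_weights is hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def usage_weights(sessions):
    """C(p, q) = co-occurrences(p, q) / occurrences(p), where co-occurrence
    is assumed to mean p and q appearing in the same session."""
    occurrences = defaultdict(int)
    co_occurrences = defaultdict(int)
    for session in sessions:
        pages = set(session)
        for p in pages:
            occurrences[p] += 1
        for p, q in combinations(pages, 2):
            co_occurrences[(p, q)] += 1
            co_occurrences[(q, p)] += 1
    return {(p, q): c / occurrences[p] for (p, q), c in co_occurrences.items()}

C = usage_weights([["P", "Q", "R"], ["P", "Q"], ["P", "S"]])
print(C[("P", "Q")])  # 2/3: Q appears in 2 of the 3 sessions that contain P
```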
Usage Aware Average-Clicks – Link Graph
- Nodes P, Q, R, S, T represent pages
- D(i, j) = 1 / Outdegree(page i) if there is a link to page j on page i; ∞ otherwise
Usage Aware Average-Clicks
- We now have:
  - C(p, q) = (number of co-occurrences of p and q) / (number of occurrences of p)
  - D(p, q) = 1 / Outdegree(page p) if there is a link to page q on page p; ∞ otherwise
- Combining the link matrix and the usage matrix, the new distance between two pages is defined as:
  Distance(p, q) = (1 - C(p, q)) * (-log_n(α / Outdegree(p)))
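A minimal sketch of the combined usage-aware edge length, assuming a links map (page → outgoing links) and the C weights from the previous sketch; α and n are again assumed values.

```python
import math

def usage_aware_length(p, q, links, C, alpha=0.85, n=7):
    """Usage-aware average-click length of the edge p -> q.
    links maps a page to the set of pages it links to; C maps (p, q) to
    the usage weight. Heavily used links get shorter lengths."""
    if q not in links.get(p, ()):
        return math.inf                              # no link from p to q
    base = -math.log(alpha / len(links[p]), n)       # plain Average-Clicks length
    return (1 - C.get((p, q), 0.0)) * base           # scale down by usage

links = {"P": {"Q", "S"}, "Q": {"R"}}
print(usage_aware_length("P", "Q", links, {("P", "Q"): 2 / 3}))  # shorter than base
print(usage_aware_length("P", "R", links, {}))                   # inf: no direct link
```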
Usage Aware Average-Clicks
- Shortest distance between pairs of nodes: all-pairs shortest paths
- All-pairs shortest path algorithm used: Floyd-Warshall's algorithm (sketch below)
- Implementation issues
  - Poor scalability: cubic time in the number of pages, quadratic space for a dense matrix
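A dense-dictionary Floyd-Warshall sketch over the pairwise edge lengths, for illustration only; the slides' actual implementation uses the linked-list structure shown on the next slide.

```python
import math

def floyd_warshall(pages, edge_length):
    """All-pairs shortest usage-aware distances.
    edge_length maps (p, q) to the direct edge length (math.inf if no link).
    O(V^3) time and O(V^2) space, which is the scalability problem noted above."""
    dist = {(p, q): (0.0 if p == q else edge_length.get((p, q), math.inf))
            for p in pages for q in pages}
    for k in pages:
        for i in pages:
            for j in pages:
                via_k = dist[(i, k)] + dist[(k, j)]
                if via_k < dist[(i, j)]:
                    dist[(i, j)] = via_k
    return dist
```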
Solution
- Replace the dense matrix with an adjacency-list structure for Floyd-Warshall:
  - A vector holds the heads of linked lists, one list per page (e.g., the set of links for page 0)
  - Template for each node: Page ID, Avg-Click score, Usage score, Usage-Aware Avg-Click score
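A minimal sketch of the per-page node template described above; the field names mirror the slide, but the exact types and container choices are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class LinkNode:
    """One entry in a page's linked list of outgoing links,
    holding the scores named in the node template."""
    page_id: int
    avg_click_score: float = 0.0
    usage_score: float = 0.0
    usage_aware_avg_click_score: float = 0.0

# Vector holding the heads of the linked lists: one list of LinkNode entries
# per page, so only existing links are stored (sparse, unlike a dense matrix).
adjacency: Dict[int, List[LinkNode]] = {0: [LinkNode(1), LinkNode(2)]}
```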
Experimental Results
- Experiments conducted on www.cs.umn.edu
- Usage data collected in April 2006
- Data set reduced to 100,000 sessions
- Noise removed
- Link graph built using our crawler
Example Distances
Evaluation Methodology
- Domain expert's view
  - Questionnaires
- User's view
  - Questionnaires
  - Automated verification
- Our method
  - Predictive power
Evaluation Methodology
- Incorporated into a recommender system
- Idea: pages that are close to each other are more similar than pages that are farther apart
- Performance compared with the '2, -1' model
- Tested on www.cs.umn.edu
The Recommender System Architecture
[Architecture diagram: offline, web logs feed session identification and the website feeds Usage Aware Average-Clicks; session alignment using Usage Aware Average-Clicks produces a session similarity graph, which is partitioned into session clusters and then clickstream trees. Online, the web client sends a webpage request to the web server; the recommendation system consults the clickstream trees and returns HTML plus recommendations.]
Evaluation Measures
- Hit Ratio (HR): percentage of hits. If a recommended page is actually requested later in the session, we declare a hit.
- Click Reduction (CR): for a test session (p1, p2, ..., pi, ..., pj, ..., pn), if pj is recommended at page pi and pj is subsequently accessed in the session, then the click reduction due to this recommendation is: Click reduction = (j - i) / i
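A minimal sketch of computing HR and CR for a single test session, assuming recommendations are keyed by the 1-based position at which they were shown; this is illustrative, not the paper's actual evaluation code.

```python
def evaluate_session(session, recommendations):
    """session: ordered list of page IDs (p1 ... pn).
    recommendations: dict mapping a 1-based position i to the set of pages
    recommended when p_i was requested.
    Returns (hit ratio, average click reduction) for this session."""
    hits, total_recs, reductions = 0, 0, []
    for i, recs in recommendations.items():
        total_recs += len(recs)
        for page in recs:
            # Positions j > i where the recommended page is actually requested.
            later = [j for j in range(i + 1, len(session) + 1) if session[j - 1] == page]
            if later:
                hits += 1                                # a hit
                reductions.append((later[0] - i) / i)    # click reduction (j - i) / i
    hit_ratio = hits / total_recs if total_recs else 0.0
    avg_cr = sum(reductions) / len(reductions) if reductions else 0.0
    return hit_ratio, avg_cr

# p3 recommended at p1 and actually visited two pages later.
print(evaluate_session(["a", "b", "c", "d"], {1: {"c"}}))  # (1.0, 2.0)
```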
Experimental Set-up
- 1000 training sessions
- 3, 5, and 10 recommendations
- 10, 15, and 20 clickstream clusters
- Different testing sessions
- Experiment repeated 5 times using different training sets
- Results compared against the '2, -1' model
- t-tests performed
- Same procedure repeated for 3000 training sessions
Results
% Path Reduction
Conclusions
- Incorporated usage data into the Average-Clicks algorithm
- Proposed a distance model that uses usage data together with the link graph
- Used this method to calculate the similarity between pages in an intranet domain
- Showed that combining the usage graph and the link graph provides better recommendations
Future Work
- Validate the algorithm using other testing methods, such as
  - Domain expert testing
  - User's perspective
- Compare the algorithm against other usage-based link analysis algorithms
- Compare the quality of recommendations with those obtained using other kinds of domain information
Questions