
Depth-first Traversal over a Mirrored Space for Non-redundant Discriminative Itemsets
Yoshitaka Kameya and Hiroki Asaoka
Meijo University
DaWaK-13

Outline
• Background
• Details of our proposed method
• Experiments


Background: Discriminative patterns
• Discriminative patterns:
  – Show differences between two groups (classes)
  – Used for:
    • Characterizing the class of interest
    • Building more precise classifiers
• Example of a discriminative pattern x: milk=True ∧ aquatic=False
  (+: positive class, i.e. the class c of interest; –: negative class)
• We focus on top-k mining

Background: Coping with redundancy (1)
Problem: Redundancy among patterns
• Item i is significantly relevant to the target class → Patterns containing i tend to occupy the top-k list

Dataset:
Class  Transaction
+      {A, B, D, E}
+      {A, B, C, D, E}
+      {A, C, D, E}
+      {A, B, C}
+      {B}
–      {A, B, D, E}
–      {B, C, D}
–      {A, D, E}
–      {B, D, E}
–      {C}

Top-10 patterns (including ties):
Rank  Pattern        Positive support  F-score
1     {A, C}         3                 0.75
2     {A}            4                 0.73
3     {B}            4                 0.67
3     {A, B}         3                 0.67
5     {A, D}         3                 0.6
5     {A, E}         3                 0.6
5     {A, E, D}      3                 0.6
5     {C}            3                 0.6
9     {A, B, C}      2                 0.57
9     {A, C, D}      2                 0.57
9     {A, C, E}      2                 0.57
9     {A, C, E, D}   2                 0.57
9     {C, E}         2                 0.57
9     {C, E, D}      2                 0.57

Positive support: support over the positive transactions
F-score: relevance score to the positive class
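The ranking above can be reproduced with a few lines of Python. This is only a sketch: it assumes the plain, unsmoothed F-score 2·TP / (|positives| + |transactions covered by x|), and the names `DATASET`, `supports` and `f_score` are illustrative rather than taken from the slides.

```python
# Toy dataset from the slide: (class label, transaction).
DATASET = [
    ("+", {"A", "B", "D", "E"}),
    ("+", {"A", "B", "C", "D", "E"}),
    ("+", {"A", "C", "D", "E"}),
    ("+", {"A", "B", "C"}),
    ("+", {"B"}),
    ("-", {"A", "B", "D", "E"}),
    ("-", {"B", "C", "D"}),
    ("-", {"A", "D", "E"}),
    ("-", {"B", "D", "E"}),
    ("-", {"C"}),
]


def supports(pattern, dataset=DATASET):
    """Return (positive support, negative support) of an itemset."""
    pattern = set(pattern)
    pos = sum(1 for label, t in dataset if label == "+" and pattern <= t)
    neg = sum(1 for label, t in dataset if label == "-" and pattern <= t)
    return pos, neg


def f_score(pattern, dataset=DATASET):
    """Unsmoothed F-score of the rule 'pattern => positive class' (an assumption)."""
    pos, neg = supports(pattern, dataset)
    n_pos = sum(1 for label, _ in dataset if label == "+")
    # F = 2PR / (P + R) with P = pos / (pos + neg) and R = pos / n_pos.
    return 2.0 * pos / (n_pos + pos + neg)


for p in ["AC", "A", "B", "AB", "CE"]:
    print(sorted(p), supports(p)[0], round(f_score(p), 2))
```

Four of the five best-ranked patterns contain item A, which is the redundancy the slide points at.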

Background: Coping with redundancy (2)
• Set-inclusion-based constraints
  – Closedness [Pasquier+ 99]
  – Productivity [Bayardo 00][Webb 07]
• Closedness: With the same positive support, pick the super-pattern
• Productivity: Remove super-patterns with smaller relevance scores
(Dataset: the same ten transactions as on the previous slide)

Without closedness or productivity:
Rank  Pattern        Positive support  F-score
1     {A, C}         3                 0.75
2     {A}            4                 0.73
3     {B}            4                 0.67
3     {A, B}         3                 0.67
5     {A, D}         3                 0.6
5     {A, E}         3                 0.6
5     {A, E, D}      3                 0.6
5     {C}            3                 0.6
9     {A, B, C}      2                 0.57
9     {A, C, D}      2                 0.57
9     {A, C, E}      2                 0.57
9     {A, C, E, D}   2                 0.57
9     {C, E}         2                 0.57
9     {C, E, D}      2                 0.57

Only with closedness:
Rank  Pattern           Positive support  F-score
1     {A, C}            3                 0.75
2     {A}               4                 0.73
3     {B}               4                 0.67
3     {A, B}            3                 0.67
5     {A, E, D}         3                 0.6
6     {A, B, C}         2                 0.57
6     {A, C, E, D}      2                 0.57
8     {A, B, E, D}      2                 0.5
9     {A, B, C, E, D}   1                 0.33

With closedness & productivity:
Rank  Pattern  Positive support  F-score
1     {A, C}   3                 0.75
2     {A}      4                 0.73
3     {B}      4                 0.67
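The two filters can be sketched by brute force on the toy dataset. The code below makes several assumptions that the slide does not spell out: the F-score is unsmoothed, closedness is checked only against the positive transactions, and a pattern counts as productive when it scores strictly higher than every non-empty proper sub-pattern. All function names are illustrative.

```python
from itertools import combinations

POSITIVES = [frozenset(t) for t in ("ABDE", "ABCDE", "ACDE", "ABC", "B")]
NEGATIVES = [frozenset(t) for t in ("ABDE", "BCD", "ADE", "BDE", "C")]
ITEMS = "ABCDE"


def f_score(pattern):
    """Unsmoothed F-score with respect to the positive class (an assumption)."""
    pos = sum(1 for t in POSITIVES if pattern <= t)
    neg = sum(1 for t in NEGATIVES if pattern <= t)
    return 2.0 * pos / (len(POSITIVES) + pos + neg)


def closure_on_positives(pattern):
    """Intersection of all positive transactions that contain the pattern."""
    covering = [t for t in POSITIVES if pattern <= t]
    if not covering:
        return None  # the pattern never occurs in a positive transaction
    return frozenset.intersection(*covering)


def is_closed(pattern):
    return closure_on_positives(pattern) == pattern


def is_productive(pattern):
    """Strictly better than every non-empty proper sub-pattern (assumed definition)."""
    return all(
        f_score(frozenset(sub)) < f_score(pattern)
        for size in range(1, len(pattern))
        for sub in combinations(pattern, size)
    )


candidates = [
    frozenset(c)
    for size in range(1, len(ITEMS) + 1)
    for c in combinations(ITEMS, size)
]
closed = [p for p in candidates if is_closed(p)]
survivors = sorted((p for p in closed if is_productive(p)), key=f_score, reverse=True)

print(len(closed), [sorted(p) for p in survivors])
```

Exhaustive enumeration is used here only because the example has five items; it is not the search strategy the slides propose.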

Background: Suffix enumeration trees
• We test:
  – Closedness by “on-the-fly” closure check
  – Productivity over suffix enumeration trees [Kameya+ SDM12]
[Figure: a prefix enumeration tree (traditional search space) next to a suffix enumeration tree (mirrored search space) over items A to D, with nodes annotated by F-score]
  Prefix enumeration tree, level by level:
    ∅
    {A} {B} {C} {D}
    {A,B} {A,C} {A,D} {B,C} {B,D} {C,D}
    {A,B,C} {A,B,D} {A,C,D} {B,C,D}
    {A,B,C,D}
  Suffix enumeration tree, level by level:
    ∅
    {A} {B} {C} {D}
    {A,B} {A,C} {B,C} {A,D} {B,D} {C,D}
    {A,B,C} {A,B,D} {A,C,D} {B,C,D}
    {A,B,C,D}
• Prefix enumeration tree: productivity is uncertain at the moment a pattern is visited
• Suffix enumeration tree: a pattern is immediately judged as non-productive, even in depth-first search
→ Memory-efficient (depth-first) search is possible with safe productivity tests
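A small sketch may help to see why the mirrored space allows an immediate productivity test. It assumes that a child in the suffix enumeration tree is formed by prepending an item that precedes the current smallest item, and that children are visited from the youngest item onwards; `suffix_dfs` is an illustrative name, not one used in the slides.

```python
from itertools import combinations


def suffix_dfs(items):
    """Yield itemsets in depth-first order over a suffix enumeration tree."""

    def visit(pattern, smallest):
        yield pattern
        # Children prepend an item that precedes the current smallest item,
        # so every child keeps its parent as a suffix.
        for i in range(smallest):
            yield from visit((items[i],) + pattern, i)

    for i in range(len(items)):
        yield from visit((items[i],), i)


order = list(suffix_dfs("ABCD"))
print(["".join(p) for p in order])

# Every non-empty proper sub-pattern is visited before the pattern itself.
position = {p: n for n, p in enumerate(order)}
for p in order:
    for size in range(1, len(p)):
        for sub in combinations(p, size):
            assert position[sub] < position[p]
```

Because all sub-patterns of a pattern have already been scored when the pattern is reached, comparing it against them needs no look-ahead.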

Our goal
• To propose an efficient, exact method for finding top-k productive “closed-on-the-positives” (closed patterns over the positive transactions)
• Contributions:
  – Dual-monotonicity
    • A generalized condition on relevance scores
    • Gives a theoretical basis
  – Suffix-preserving closure extension
    • A mirrored operation of the one used in LCM [Uno+ DS04]
    • Can work with closedness and productivity smoothly at the same time

Outline
✓ Background
• Details of our proposed method
  – Dual-monotonicity
  – Suffix-preserving closure extension
• Experiments


Dual-monotonicity: Preliminaries (1)
• Discriminative pattern x is often evaluated under a relevance score to the class c of interest
  – Confidence/PMI
  – Support Difference/WRA/Leverage
  – χ²
  – F-score/Dice/Jaccard
  – ...
  These scores measure the distributional overlap between x and c
  [Figure: two Venn diagrams of x and c, one marked as bad and one marked as good]
• Computational difficulty: Most popular relevance scores do not satisfy anti-monotonicity (the Apriori property)
  → Standard technique: Branch-and-bound search [Morishita+ 00][Zimmermann+ 09][Nijssen+ 09]

Dual-monotonicity: Preliminaries (2)
• ROC analysis of a relevance score R_c
  – Confusion matrix for a rule “x ⇒ c”:

          c                          ¬c
    x     True positive: p(c, x)     False positive: p(¬c, x)
    ¬x    False negative: p(c, ¬x)   True negative: p(¬c, ¬x)

  – Any relevance score R_c can be seen as a function of true positive rate (TPR) p(x | c) and false positive rate (FPR) p(x | ¬c)
[Figure: F-score’s ROC space, showing a region of good patterns, a region of bad patterns, and two patterns x and x’]
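As a concrete instance, the F-score can be rewritten in terms of TPR, FPR and the class prior p(c), using p(c, x) = p(c)·TPR and p(x) = p(c)·TPR + p(¬c)·FPR. The sketch below assumes the unsmoothed F-score 2·p(c, x) / (p(c) + p(x)); the function name is illustrative.

```python
def f_score_from_rates(tpr, fpr, p_c):
    """F-score as a function of TPR p(x|c), FPR p(x|not c) and the prior p(c)."""
    p_cx = p_c * tpr                  # joint probability p(c, x)
    p_x = p_cx + (1.0 - p_c) * fpr    # marginal probability p(x)
    denom = p_c + p_x
    return 2.0 * p_cx / denom if denom > 0 else 0.0


# Pattern {A} in the toy dataset: 4 of 5 positives, 2 of 5 negatives, p(c) = 0.5.
print(round(f_score_from_rates(0.8, 0.4, 0.5), 2))
# Pattern {A, C}: 3 of 5 positives and no negatives.
print(round(f_score_from_rates(0.6, 0.0, 0.5), 2))
```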

Dual-monotonicity: Definition
Relevance score R_c is dual-monotonic ⇔ R_c(x) is monotonically increasing w.r.t. TPR p(x | c) and R_c(x) is monotonically decreasing w.r.t. FPR p(x | ¬c) (wherever TPR ≥ FPR)
[Figure: F-score over the ROC space, with arrows marking the directions in which it increases]
• Dual-monotonicity is more general than convexity [Morishita+ 00][Nijssen+ 09] (e.g. F-score does not satisfy convexity but satisfies dual-monotonicity)
Property: Branch-and-bound (B&B) pruning is safe under dual-monotonicity
→ The applicability of B&B pruning is enlarged
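The definition can be checked numerically for the F-score, and it suggests how a pruning bound might be formed. The sketch below rests on assumptions not stated on the slide: the F-score is unsmoothed, the class prior is 0.5, and the bound is the usual optimistic estimate in which a super-pattern keeps the TPR of x and drops its FPR to 0 (a super-pattern covers a subset of the transactions covered by x, so neither rate can grow). The function names are illustrative.

```python
def f_score(tpr, fpr, p_c=0.5):
    """Unsmoothed F-score in ROC form (assumed prior p_c)."""
    denom = p_c + p_c * tpr + (1.0 - p_c) * fpr
    return 2.0 * p_c * tpr / denom if denom > 0 else 0.0


def is_dual_monotonic_on_grid(score, steps=20, eps=1e-12):
    """Check non-decreasing in TPR and non-increasing in FPR on a regular grid."""
    grid = [i / steps for i in range(steps + 1)]
    for low, high in zip(grid, grid[1:]):
        for other in grid:
            if score(high, other) < score(low, other) - eps:
                return False  # score dropped although TPR grew
            if score(other, high) > score(other, low) + eps:
                return False  # score grew although FPR grew
    return True


def optimistic_bound(tpr, p_c=0.5):
    """Upper bound on the score of any super-pattern of a pattern with this TPR."""
    return f_score(tpr, 0.0, p_c)


print(is_dual_monotonic_on_grid(f_score))
# Pattern {A} has TPR 0.8; no super-pattern of {A} can score above this bound.
print(round(optimistic_bound(0.8), 3))
```

In a branch-and-bound search, a branch below x can be skipped once the current k-th best score is at least `optimistic_bound` of x's TPR; a grid check like this is a sanity test, not a proof.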

Dual-monotonicity: Closed patterns
• We focus only on “closed-on-the-positives” (closed patterns over the positive transactions)
• Such closed patterns are beneficial in:
  – Efficiency:
    • Some set of patterns (“generators”) are compressed into a closed pattern
    • Search space is (possibly exponentially) reduced
  – Relevance: Under a dual-monotonic score, closed-on-the-positives are no less relevant than their generators [Soulet+ PAKDD04]

Pattern        Positive support  F-score  Closed on the positives?
{A, C}         3                 0.75     Yes
{A}            4                 0.73     Yes
{B}            4                 0.67     Yes
{A, B}         3                 0.67     Yes
{A, D}         3                 0.6      No
{A, E}         3                 0.6      No
{A, E, D}      3                 0.6      Yes
{C}            3                 0.6      No
{A, B, C}      2                 0.57     Yes
{A, C, D}      2                 0.57     No
{A, C, E}      2                 0.57     No
{A, C, E, D}   2                 0.57     Yes
{C, E}         2                 0.57     No
{C, E, D}      2                 0.57     No
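The relevance claim can be illustrated on the non-closed rows of the table: taking the closure over the positives keeps the positive support (hence the TPR) and can only lose negative transactions, so a dual-monotonic score cannot drop. The sketch assumes the unsmoothed F-score, and `closure` and `f_score` are illustrative names.

```python
POSITIVES = [frozenset(t) for t in ("ABDE", "ABCDE", "ACDE", "ABC", "B")]
NEGATIVES = [frozenset(t) for t in ("ABDE", "BCD", "ADE", "BDE", "C")]


def support(pattern, transactions):
    return sum(1 for t in transactions if pattern <= t)


def f_score(pattern):
    """Unsmoothed F-score with respect to the positive class (an assumption)."""
    pos = support(pattern, POSITIVES)
    neg = support(pattern, NEGATIVES)
    return 2.0 * pos / (len(POSITIVES) + pos + neg)


def closure(pattern):
    """Closure over the positives; assumes the pattern occurs in some positive."""
    covering = [t for t in POSITIVES if pattern <= t]
    return frozenset.intersection(*covering)


# The rows marked "No" in the table act as generators of a closed pattern.
for name in ["AD", "AE", "C", "ACD", "ACE", "CE", "CDE"]:
    gen = frozenset(name)
    closed = closure(gen)
    assert support(gen, POSITIVES) == support(closed, POSITIVES)
    assert f_score(closed) >= f_score(gen)
    print(name, "->", "".join(sorted(closed)),
          round(f_score(gen), 2), round(f_score(closed), 2))
```

For example, the generator {C} is closed into {A, C}, which loses both negative transactions containing C.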

Outline
✓ Background
• Details of our proposed method
  ✓ Dual-monotonicity
  – Suffix-preserving closure extension
• Experiments

SPC extension: Background
• Suffix-preserving closure (SPC) extension
  – A mirrored operation of the one used in LCM [Uno+ DS04]
  – Only generates closed patterns from closed patterns
    → We need not maintain the top-k list for closedness
  – Ensures the depth-first traversal over a space like a suffix enumeration tree
    → This makes it easy to maintain the top-k list for productivity
[Figure: suffix enumeration tree over items A to D, level by level:
    ∅
    {A} {B} {C} {D}
    {A,B} {A,C} {B,C} {A,D} {B,D} {C,D}
    {A,B,C} {A,B,D} {A,C,D} {B,C,D}
    {A,B,C,D}]

SPC extension: Illustrated example (1)
Preparation: Get the item order and reorder items in the transactions

Original dataset:
Class  Transaction
+      {A, B, D, E}
+      {A, B, C, D, E}
+      {A, C, D, E}
+      {A, B, C}
+      {B}
–      {A, B, D, E}
–      {B, C, D}
–      {A, D, E}
–      {B, D, E}
–      {C}

Item  F-score
A     0.78
B     0.63
C     0.57
D     0.46
E     0.51

Item order: A < B < C < E < D (A is the youngest, D the oldest)

Modified dataset:
Class  Transaction
+      {A, B, E, D}
+      {A, B, C, E, D}
+      {A, C, E, D}
+      {A, B, C}
+      {B}
–      {A, B, E, D}
–      {B, C, D}
–      {A, E, D}
–      {B, E, D}
–      {C}

We use negative transactions only when computing relevance scores (details are omitted)
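The preparation step can be sketched as follows. The sketch assumes the plain, unsmoothed F-score for single items; the absolute values it yields need not match the item table, but the resulting order does. The names `item_f_score`, `order` and `reordered` are illustrative.

```python
DATASET = [
    ("+", ["A", "B", "D", "E"]),
    ("+", ["A", "B", "C", "D", "E"]),
    ("+", ["A", "C", "D", "E"]),
    ("+", ["A", "B", "C"]),
    ("+", ["B"]),
    ("-", ["A", "B", "D", "E"]),
    ("-", ["B", "C", "D"]),
    ("-", ["A", "D", "E"]),
    ("-", ["B", "D", "E"]),
    ("-", ["C"]),
]


def item_f_score(item):
    """Unsmoothed F-score of a single item for the positive class (an assumption)."""
    pos = sum(1 for label, t in DATASET if label == "+" and item in t)
    neg = sum(1 for label, t in DATASET if label == "-" and item in t)
    n_pos = sum(1 for label, _ in DATASET if label == "+")
    return 2.0 * pos / (n_pos + pos + neg)


items = sorted({item for _, t in DATASET for item in t})
# Higher-scoring items come first (younger); ties are broken alphabetically.
order = sorted(items, key=lambda item: (-item_f_score(item), item))
rank = {item: r for r, item in enumerate(order)}

# Rewrite every transaction so that its items follow the new order.
reordered = [(label, sorted(t, key=rank.__getitem__)) for label, t in DATASET]

print(" < ".join(order))
for label, t in reordered:
    print(label, t)
```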
