Advances in algorithms based on CbO
Petr Krajca, Jan Outrata, Vilem Vychodil
Palacky University, Olomouc, Czech Republic
Concept Lattices and Their Applications, Sevilla, 2010
Krajca, Outrata, Vychodil (UP Olomouc; CZ) Advances in algorithms based on CbO CLA 2010 1 / 21
Contribution
Three topics of interest:
1. improved canonicity test
   - additional canonicity test for Close-by-One
   - result: reduction of the number of concepts computed multiple times
2. parallelization
   - simultaneous computation of disjoint sets of concepts
   - focus on various workload distributions
3. data preprocessing
   - role of attribute permutations
   - experimental observations: efficiency of algorithms w.r.t. number of inversions
Related Work
Next Closure, Close-by-One:
- Ganter B.: Two basic algorithms in concept analysis. Technical Report FB4-Preprint No. 831, TH Darmstadt, 1984.
- Kuznetsov S.: A fast algorithm for computing all intersections of objects in a finite semi-lattice. Autom. Docum. Math. Ling., 27(5) (1993), 11–21.
Parallel and Distributed CbO:
- Krajca, Outrata, Vychodil: Parallel algorithm for computing fixpoints of Galois connections. AMAI (to appear), DOI 10.1007/s10472-010-9199-5.
- Krajca, Vychodil: Distributed algorithm for computing formal concepts using map-reduce framework. In: Proc. IDA 2009, LNCS 5772 (2009), 333–344.
- Krajca, Vychodil: Comparison of data structures for computing formal concepts. In: Proc. MDAI 2009, LNAI 5861 (2009), 114–125.
Inside CbO: The Canonicity Test
Assumptions:
- X = {0, 1, ..., m} (objects); Y = {0, 1, ..., n} (attributes);
- ⟨A, B⟩ is the current formal concept;
- y ∈ Y is an attribute such that y ∉ B;
- put C = (B ∪ {y})↓ and D = (B ∪ {y})↓↑.
The Test
For ⟨A, B⟩ and ⟨C, D⟩ check whether D ∩ {0, 1, ..., y − 1} = B ∩ {0, 1, ..., y − 1}:
- yes (success) ⟹ proceed with ⟨C, D⟩
- no (failure) ⟹ skip ⟨C, D⟩
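The test above can be illustrated with a minimal Python sketch. The toy context, the helper names `down`, `up`, `closure_with`, and `canonicity_test`, and the representation of objects as frozensets of attribute indices are all assumptions made for illustration; they do not come from the slides.

```python
# Hypothetical toy context: one frozenset of attribute indices per object.
context = [frozenset({0, 1}), frozenset({1}), frozenset({0, 1, 2})]
N_ATTRS = 3  # attributes are 0, 1, 2


def down(attrs):
    """Objects that have every attribute in attrs (the operator written as a down arrow)."""
    return frozenset(x for x, row in enumerate(context) if attrs <= row)


def up(objs):
    """Attributes shared by every object in objs (the operator written as an up arrow)."""
    return frozenset(y for y in range(N_ATTRS) if all(y in context[x] for x in objs))


def closure_with(B, y):
    """D = closure of B extended by attribute y."""
    return up(down(B | {y}))


def canonicity_test(B, D, y):
    """True iff D and B agree on the attributes 0, ..., y - 1."""
    below = frozenset(range(y))
    return D & below == B & below


B = up(down(frozenset()))          # intent of the topmost concept, here {1}
passes = canonicity_test(B, closure_with(B, 0), 0)   # adding 0 yields {0, 1}
fails = canonicity_test(B, closure_with(B, 2), 2)    # adding 2 also pulls in 0 < 2
```

In this context attribute 2 only occurs together with attribute 0, so adding 2 to {1} introduces the smaller attribute 0 and the test rejects the resulting concept.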
CbO Represented by Recursive Procedure GenerateFrom
Procedure GenerateFrom
input: formal concept ⟨A, B⟩, first attribute y to be added to ⟨A, B⟩
output: all formal concepts with intents containing B

procedure GenerateFrom(⟨A, B⟩, y)
  output ⟨A, B⟩
  if B ≠ Y then
    for j from y upto n do
      if j ∉ B then
        C ← A ∩ {j}↓
        D ← C↑
        if D ∩ {0, 1, ..., j − 1} = B ∩ {0, 1, ..., j − 1} then
          call GenerateFrom(⟨C, D⟩, j + 1)

Initially called with ⟨∅↓, ∅↓↑⟩ and y = 0.
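A minimal Python sketch of the recursive procedure follows. It assumes a context given as a list of frozensets of attribute indices and takes the number of attributes `n_attrs` as a parameter (the slides index attributes 0, ..., n instead); the function names `cbo` and `generate_from` are illustrative. The extent C is computed with the loop attribute j, i.e. as A restricted to the objects that have j.

```python
def cbo(context, n_attrs):
    """List all formal concepts (extent, intent) in Close-by-One order."""
    all_objs = frozenset(range(len(context)))
    all_attrs = frozenset(range(n_attrs))
    concepts = []

    def up(objs):
        # Attributes shared by every object in objs.
        return frozenset(y for y in range(n_attrs)
                         if all(y in context[x] for x in objs))

    def generate_from(A, B, y):
        concepts.append((A, B))              # "output <A, B>"
        if B == all_attrs:
            return
        for j in range(y, n_attrs):
            if j in B:
                continue
            C = frozenset(x for x in A if j in context[x])   # A restricted to objects with j
            D = up(C)
            below = frozenset(range(j))
            if D & below == B & below:       # canonicity test
                generate_from(C, D, j + 1)

    generate_from(all_objs, up(all_objs), 0)  # topmost concept, y = 0
    return concepts


# Hypothetical toy context with three objects and three attributes.
toy = [frozenset({0, 1}), frozenset({1}), frozenset({0, 1, 2})]
intents = [sorted(B) for _, B in cbo(toy, 3)]
```

For this toy context the closure of {1, 2} is computed twice, once under {0, 1} (accepted) and once directly under {1} (rejected by the test), which is the kind of repeated work the slides go on to reduce.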
Performance Issues
Canonicity Test:
- the same closure is computed multiple times (but returned once)
- the canonicity test is performed after the closure is computed
Proposed Solution:
- reuse information about canonicity test failures
- perform an additional test before computing the closure (if possible)
CbO Tree: a call tree for procedure GenerateFrom
1. nodes represent computed closures; there are two types:
   - ⟨B_i, y⟩: represents an invocation of GenerateFrom with arguments B_i and y
   - B_i (leaf, no attribute): B_i is computed but fails the canonicity test
2. edges connect nodes and are labeled by attributes
   - the edge between B_i and B_j is labeled by y whenever (B_i ∪ {y})↓↑ = B_j
Example (CbO Tree)
[Figure: a formal context over attributes 0–5 and its CbO tree, rooted at ⟨B_1, 0⟩. Invocation nodes: ⟨B_2, 1⟩, ⟨B_3, 2⟩, ⟨B_4, 3⟩, ⟨B_5, 4⟩, ⟨B_6, 5⟩, ⟨B_7, 3⟩, ⟨B_8, 4⟩, ⟨B_9, 5⟩, ⟨B_10, 2⟩, ⟨B_11, 3⟩, ⟨B_12, 3⟩. Leaves labeled B_5, B_6, B_8, and B_9 mark closures that fail the canonicity test; edges are labeled by attributes.]
Additional Test
Under the notation
  B ⊖ j = ((B ∪ {j})↓↑ \ B) ∩ {0, 1, ..., j − 1}
we have:
Lemma (On Test Failure Propagation)
Let B ⊆ Y, j ∉ B, and B ⊖ j ≠ ∅. Then, for each B′ ⊇ B such that j ∉ B′ and B ⊖ j ⊈ B′, we have B′ ⊖ j ≠ ∅.
Consequences:
- if the original test fails for intent B, attribute j ∉ B and D = (B ∪ {j})↓↑, then it fails for any intent B′ ⊇ B and D′ = (B′ ∪ {j})↓↑ provided that j ∉ B′ and D ∩ {0, 1, ..., j − 1} ⊈ B′ ∩ {0, 1, ..., j − 1}
- closure D′ = (B′ ∪ {j})↓↑ need not be computed (!!)
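The lemma can be checked on a small instance. The sketch below is only an illustration under assumptions: the four-object context and the names `closure` and `new_smaller_attrs` are hypothetical, with `new_smaller_attrs(B, j)` computing the set defined above, namely the attributes smaller than j that the closure of B ∪ {j} adds to B.

```python
# Hypothetical context: one frozenset of attribute indices per object.
context = [frozenset({0}), frozenset({1, 3}), frozenset({0, 1, 3}), frozenset({2})]
N_ATTRS = 4


def closure(attrs):
    """Attributes shared by all objects that have every attribute in attrs."""
    rows = [row for row in context if attrs <= row]
    return frozenset(y for y in range(N_ATTRS) if all(y in row for row in rows))


def new_smaller_attrs(B, j):
    """Attributes below j that closing B with j adds; nonempty means the test fails."""
    return (closure(B | {j}) - B) & frozenset(range(j))


B = frozenset()            # intent of the topmost concept in this context
B_prime = frozenset({0})   # a larger intent that contains neither 3 nor 1

witness = new_smaller_attrs(B, 3)               # closing {3} pulls in attribute 1
propagated = new_smaller_attrs(B_prime, 3)      # so closing {0, 3} must fail too
```

Because the witness {1} is not contained in B′ = {0}, the lemma predicts the failure for B′ without computing the closure of {0, 3}; the last line computes it anyway to confirm the prediction.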
Example (Pruned CbO Tree)
[Figure: the CbO tree from the previous example, rooted at ⟨B_1, 0⟩, with the nodes skipped by the additional test removed.]
M_3 = M_5 = {0, 2, 3, 4, 5} = B_8; M_4 = {0, 4} = B_9
2 ∈ M_3 and 2 ∉ B_2 = {0} ⟹ test fails for j = 3
Procedure FastGenerateFrom
additional input: (pointers to) sets of attributes {N_y | y ∈ Y} (initially empty)

procedure FastGenerateFrom(⟨A, B⟩, y, {N_y | y ∈ Y})
  output ⟨A, B⟩
  if B ≠ Y then
    for j from y upto n do
      M_j ← N_j
      if j ∉ B and N_j ∩ {0, 1, ..., j − 1} ⊆ B ∩ {0, 1, ..., j − 1} then
        C ← A ∩ {j}↓
        D ← C↑
        if D ∩ {0, 1, ..., j − 1} = B ∩ {0, 1, ..., j − 1} then
          put ⟨⟨C, D⟩, j⟩ to queue
        else
          M_j ← D
    while get ⟨⟨C, D⟩, j⟩ from queue
      call FastGenerateFrom(⟨C, D⟩, j + 1, {M_y | y ∈ Y})
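The procedure can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the context representation, the names `fcbo` and `fast_generate_from`, the closure counter, and the choice to copy the whole list of failure sets per call (instead of sharing pointers) are assumptions.

```python
from collections import deque


def fcbo(context, n_attrs):
    """Return (concepts in listing order, number of closures computed)."""
    all_objs = frozenset(range(len(context)))
    all_attrs = frozenset(range(n_attrs))
    concepts = []
    stats = {"closures": 0}

    def up(objs):
        stats["closures"] += 1
        return frozenset(y for y in range(n_attrs)
                         if all(y in context[x] for x in objs))

    def fast_generate_from(A, B, y, N):
        concepts.append((A, B))
        if B == all_attrs:
            return
        M = list(N)                      # M_j starts as N_j for every j
        queue = deque()
        for j in range(y, n_attrs):
            below = frozenset(range(j))
            # Additional test: skip j if a recorded failure still applies to B.
            if j not in B and N[j] & below <= B & below:
                C = frozenset(x for x in A if j in context[x])
                D = up(C)
                if D & below == B & below:   # original canonicity test
                    queue.append((C, D, j))
                else:
                    M[j] = D                 # remember the failed closure
        while queue:
            C, D, j = queue.popleft()
            fast_generate_from(C, D, j + 1, M)

    fast_generate_from(all_objs, up(all_objs), 0, [frozenset()] * n_attrs)
    return concepts, stats["closures"]


# Hypothetical toy context with four objects and four attributes.
toy = [frozenset({0}), frozenset({1, 3}), frozenset({0, 1, 3}), frozenset({2})]
concepts, n_closures = fcbo(toy, 4)
intents = [sorted(B) for _, B in concepts]
```

The returned closure count includes the initial closure of the empty attribute set. In this toy run the failure recorded for attribute 3 at the root lets the calls for intents {0} and {2} skip attribute 3 without computing a closure.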
Soundness, Complexity and Relationship to Other Algorithms
soundness:
- procedure FastGenerateFrom is sound
- can be proved using unique derivations (paths in the FCbO tree)
complexity:
- polynomial time delay O(|X| · |Y|^3)
relationship to other algorithms:
- lists concepts in the same order as CbO (FCbO = CbO in the worst case)
- lists concepts in the same order as NextClosure (but faster) if the main loop runs "from n downto y"
Outrata, Vychodil: Fast algorithm for computing fixpoints of Galois connections induced by object-attribute relational data (in preparation).
Performance evaluation I
Test Conditions:
- algorithms: CbO, FCbO
- performance indicator: number of computed closures
Results (ratio = concepts / computed closures):

dataset       concepts   closures (CbO)   closures (FCbO)   ratio (CbO)   ratio (FCbO)
mushroom      238,710    4,006,498        426,563           5.9 %         55.9 %
anon. web     129,009    27,949,552       1,475,341         0.4 %         8.7 %
debian tags   38,977     12,045,680       679,911           0.3 %         5.7 %
tic-tac-toe   59,505     221,608          128,434           26.8 %        46.3 %
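The ratio columns can be reproduced from the counts in the table. The short sketch below assumes that each ratio is the number of concepts divided by the number of computed closures, truncated (not rounded) to one decimal place; that reading is inferred from the figures, and the names `results` and `ratio_percent` are illustrative.

```python
# (concepts, closures computed by CbO, closures computed by FCbO), copied from the table.
results = {
    "mushroom": (238_710, 4_006_498, 426_563),
    "anon. web": (129_009, 27_949_552, 1_475_341),
    "debian tags": (38_977, 12_045_680, 679_911),
    "tic-tac-toe": (59_505, 221_608, 128_434),
}


def ratio_percent(concepts, closures):
    """Share of closures that are distinct concepts, in percent, truncated to 0.1."""
    return (1000 * concepts // closures) / 10


ratios = {
    name: (ratio_percent(c, cbo_closures), ratio_percent(c, fcbo_closures))
    for name, (c, cbo_closures, fcbo_closures) in results.items()
}
```

A higher ratio means fewer wasted closure computations; for example, on the mushroom data more than half of the closures computed by FCbO are distinct concepts, versus about 6 % for CbO.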
Performance evaluation II
Test Conditions:
- algorithms: CbO, FCbO, NextClosure
- performance indicator: computation time in seconds
Results:

              mushroom      tic-tac-toe   debian tags    anon. web
size          8,124 × 119   958 × 29      14,315 × 475   32,710 × 295
density       19 %          34 %          < 1 %          1 %
FCbO          0.23          0.02          0.10           0.15
CbO           4.34          0.06          5.31           27.14
NextClosure   685.00        1.86          1,432.25       8,236.85