
Advances in algorithms based on CbO

Petr Krajca, Jan Outrata, Vilem Vychodil. Palacky University, Olomouc, Czech Republic. Concept Lattices and Their Applications (CLA), Sevilla, 2010.


  1. Advances in algorithms based on CbO
     Petr Krajca, Jan Outrata, Vilem Vychodil
     Palacky University, Olomouc, Czech Republic
     Concept Lattices and Their Applications, Sevilla, 2010
     Krajca, Outrata, Vychodil (UP Olomouc; CZ), Advances in algorithms based on CbO, CLA 2010

  2. Contribution
     Three topics of interest:
     1. improved canonicity test
        - additional canonicity test for Close-by-One
        - result: reduction of the number of concepts computed multiple times
     2. parallelization
        - simultaneous computation of disjoint sets of concepts
        - focus on various workload distributions
     3. data preprocessing
        - role of attribute permutations
        - experimental observations: efficiency of algorithms with respect to the number of inversions

  3. Related Work
     Next Closure, Close-by-One:
     - Ganter B.: Two basic algorithms in concept analysis. (Technical Report FB4-Preprint No. 831). TH Darmstadt, 1984.
     - Kuznetsov S.: A fast algorithm for computing all intersections of objects in a finite semi-lattice. Autom. Docum. Math. Ling., 27(5) (1993), 11–21.
     Parallel and Distributed CbO:
     - Krajca, Outrata, Vychodil: Parallel algorithm for computing fixpoints of Galois connections. AMAI (to appear), DOI 10.1007/s10472-010-9199-5.
     - Krajca, Vychodil: Distributed algorithm for computing formal concepts using map-reduce framework. In: Proc. IDA 2009, LNCS 5772 (2009), 333–344.
     - Krajca, Vychodil: Comparison of data structures for computing formal concepts. In: Proc. MDAI 2009, LNAI 5861 (2009), 114–125.

  4. Inside CbO: The Canonicity Test
     Assumptions:
     - X = {0, 1, ..., m} (objects); Y = {0, 1, ..., n} (attributes);
     - ⟨A, B⟩: current formal concept;
     - attribute y ∈ Y such that y ∉ B;
     - put C = (B ∪ {y})↓ and D = (B ∪ {y})↓↑.
     The Test
     For ⟨A, B⟩ and ⟨C, D⟩, check whether
       D ∩ {0, 1, ..., y − 1} = B ∩ {0, 1, ..., y − 1}
     is true:
     - yes (success) ⟹ proceed with ⟨C, D⟩
     - no (failure) ⟹ skip ⟨C, D⟩
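The test on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the four-object context `ROWS` and the helper names `down`, `up`, and `canonicity_test` are assumptions made for the example.

```python
# Hypothetical toy context: ROWS[x] is the set of attributes of object x.
ROWS = [{0, 1, 3}, {0, 2, 3}, {1, 2}, {0}]
N_ATTRS = 4  # attributes are 0, 1, 2, 3


def down(attrs):
    """Objects that have every attribute in attrs (the down-arrow operator)."""
    return {x for x, row in enumerate(ROWS) if attrs <= row}


def up(objs):
    """Attributes shared by every object in objs (the up-arrow operator)."""
    result = set(range(N_ATTRS))
    for x in objs:
        result &= ROWS[x]
    return result


def canonicity_test(B, D, y):
    """True when D and B agree on all attributes smaller than y."""
    below = set(range(y))
    return D & below == B & below


B = set()                # intent of the current concept
D1 = up(down(B | {1}))   # closure after adding attribute 1
D3 = up(down(B | {3}))   # closure after adding attribute 3
```

In this toy context, adding attribute 1 to the empty intent passes the test, while adding attribute 3 fails it because the closure also pulls in the smaller attribute 0.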

  5. CbO Represented by Recursive Procedure GenerateFrom
     Procedure GenerateFrom
     input: formal concept ⟨A, B⟩, first attribute y to be added to ⟨A, B⟩
     output: all formal concepts with intents containing B

         procedure GenerateFrom(⟨A, B⟩, y)
           output ⟨A, B⟩
           if B ≠ Y then
             for j from y upto n do
               if j ∉ B then
                 C ← A ∩ {j}↓
                 D ← C↑
                 if D ∩ {0, 1, ..., j − 1} = B ∩ {0, 1, ..., j − 1} then
                   call GenerateFrom(⟨C, D⟩, j + 1)

     Initially called with ⟨∅↓, ∅↓↑⟩ and y = 0.
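A direct Python transcription of the recursive procedure might look like the sketch below. The toy context `ROWS` and the names `generate_from` and `all_intents` are illustrative assumptions; sets of integers stand in for whatever bit-level representation a real implementation would use.

```python
# Hypothetical toy context: ROWS[x] is the set of attributes of object x.
ROWS = [{0, 1, 3}, {0, 2, 3}, {1, 2}, {0}]
N_ATTRS = 4
Y = set(range(N_ATTRS))


def down(attrs):
    """Objects that have every attribute in attrs."""
    return {x for x, row in enumerate(ROWS) if attrs <= row}


def up(objs):
    """Attributes shared by every object in objs."""
    result = set(Y)
    for x in objs:
        result &= ROWS[x]
    return result


def generate_from(A, B, y, out):
    """Record the intent of <A, B>, then recurse into canonical successors."""
    out.append(sorted(B))
    if B == Y:
        return
    for j in range(y, N_ATTRS):
        if j in B:
            continue
        C = A & down({j})   # new extent
        D = up(C)           # new intent (closure of B with j added)
        below = set(range(j))
        if D & below == B & below:   # canonicity test
            generate_from(C, D, j + 1, out)


def all_intents():
    """Run CbO from the least concept and return intents in listing order."""
    A = down(set())
    B = up(A)
    out = []
    generate_from(A, B, 0, out)
    return out
```

Calling `all_intents()` lists each intent of the toy context exactly once, in the depth-first order induced by the recursion.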

  6. Performance Issues
     Canonicity Test:
     - the same closure is computed multiple times (but returned once)
     - the canonicity test is performed after the closure is computed
     Proposed Solution:
     - reuse information about canonicity test failures
     - perform an additional test before computing the closure (if possible)
     CbO Tree: a call tree for procedure GenerateFrom
     - nodes represent computed closures; there are two types:
       1. ⟨B_i, y⟩: represents an invocation of GenerateFrom with arguments B_i and y
       2. B_i shown as a leaf: B_i is computed but fails the canonicity test
     - edges between nodes are labeled by attributes: the edge between B_i and B_j is labeled by y whenever (B_i ∪ {y})↓↑ = B_j

  7. Example (CbO Tree)
     [Figure: a formal context over attributes 0–5 and its full CbO tree. Root ⟨B_1, 0⟩; inner nodes ⟨B_2, 1⟩, ⟨B_3, 2⟩, ⟨B_4, 3⟩, ⟨B_5, 4⟩, ⟨B_6, 5⟩, ⟨B_7, 3⟩, ⟨B_8, 4⟩, ⟨B_9, 5⟩, ⟨B_10, 2⟩, ⟨B_11, 3⟩, ⟨B_12, 3⟩; leaves for failed canonicity tests labeled B_5, B_6, B_8, B_9; edges labeled by attributes.]

  8. Additional Test
     Under the notation
       B ⊘ j = ((B ∪ {j})↓↑ \ B) ∩ {0, 1, ..., j − 1}
     we have:
     Lemma (On Test Failure Propagation). Let B ⊆ Y, j ∉ B, and B ⊘ j ≠ ∅. Then, for each B′ ⊇ B such that j ∉ B′ and B ⊘ j ⊈ B′, we have B′ ⊘ j ≠ ∅.
     Consequences:
     - if the original test fails for intent B, attribute j ∉ B and D = (B ∪ {j})↓↑, then it fails for any intent B′ ⊇ B and D′ = (B′ ∪ {j})↓↑ provided that j ∉ B′ and D ∩ {0, 1, ..., j − 1} ⊈ B′ ∩ {0, 1, ..., j − 1}
     - the closure D′ = (B′ ∪ {j})↓↑ need not be computed (!!)
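The lemma can be checked by brute force on a small example. The sketch below assumes a hypothetical four-attribute context and a helper `failure_witnesses(B, j)` that returns the attributes below j that the closure of B ∪ {j} adds to B (the set the lemma requires to be non-empty). It is a sanity check on one context, not a proof.

```python
from itertools import combinations

# Hypothetical toy context: ROWS[x] is the set of attributes of object x.
ROWS = [{0, 1, 3}, {0, 2, 3}, {1, 2}, {0}]
Y = set(range(4))


def closure(attrs):
    """Attributes shared by all objects that have every attribute in attrs."""
    result = set(Y)
    for row in ROWS:
        if attrs <= row:
            result &= row
    return result


def failure_witnesses(B, j):
    """Attributes below j that closing B with j adds on top of B."""
    return (closure(B | {j}) - B) & set(range(j))


def subsets(universe):
    """All subsets of a finite set, as Python sets."""
    items = sorted(universe)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield set(combo)


def lemma_holds():
    """Check failure propagation for every B, j and superset B' without j."""
    for B in subsets(Y):
        for j in Y - B:
            witnesses = failure_witnesses(B, j)
            if not witnesses:
                continue  # the test succeeds for B and j; nothing to propagate
            for extra in subsets(Y - B - {j}):
                B_prime = B | extra
                if not witnesses <= B_prime and not failure_witnesses(B_prime, j):
                    return False
    return True
```

For instance, in this context the test fails for the empty intent and j = 3 with witness set {0}, and it fails again for the larger set {1} and the same j.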

  9. Example (Pruned CbO Tree)
     [Figure: the formal context over attributes 0–5 and the first level of the CbO tree: root ⟨B_1, 0⟩, nodes ⟨B_2, 1⟩, ⟨B_10, 2⟩, ⟨B_12, 3⟩, and failure leaves B_8, B_8, B_9.]
     M_3 = M_5 = {0, 2, 3, 4, 5} = B_8; M_4 = {0, 4} = B_9

  10. Example (Pruned CbO Tree)
     [Figure: the same tree extended by the subtree of ⟨B_2, 1⟩, with nodes ⟨B_3, 2⟩, ⟨B_7, 3⟩, ⟨B_9, 5⟩ and two failure leaves B_8.]
     M_3 = M_5 = {0, 2, 3, 4, 5} = B_8; M_4 = {0, 4} = B_9
     2 ∈ M_3 and 2 ∉ B_2 = {0} ⟹ test fails for j = 3

  11. Example (Pruned CbO Tree)
     [Figure: the complete CbO tree for the same context, from root ⟨B_1, 0⟩ down to ⟨B_5, 4⟩, including all failure leaves.]
     M_3 = M_5 = {0, 2, 3, 4, 5} = B_8; M_4 = {0, 4} = B_9
     2 ∈ M_3 and 2 ∉ B_2 = {0} ⟹ test fails for j = 3

  12. Procedure FastGenerateFrom
     additional input: (pointers to) sets of attributes {N_y | y ∈ Y} (initially empty)

         procedure FastGenerateFrom(⟨A, B⟩, y, {N_y | y ∈ Y})
           output ⟨A, B⟩
           if B ≠ Y then
             for j from y upto n do
               M_j ← N_j
               if j ∉ B and N_j ∩ {0, 1, ..., j − 1} ⊆ B ∩ {0, 1, ..., j − 1} then
                 C ← A ∩ {j}↓
                 D ← C↑
                 if D ∩ {0, 1, ..., j − 1} = B ∩ {0, 1, ..., j − 1} then
                   put ⟨⟨C, D⟩, j⟩ to queue
                 else
                   M_j ← D
             while get ⟨⟨C, D⟩, j⟩ from queue
               call FastGenerateFrom(⟨C, D⟩, j + 1, {M_y | y ∈ Y})
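The procedure above can be sketched in Python as follows. The toy context, the `stats` counter, and the `use_extra_test` flag are assumptions added for illustration: switching the flag off disables the additional test, so the same function behaves like plain CbO and the two closure counts can be compared.

```python
from collections import deque

# Hypothetical toy context: ROWS[x] is the set of attributes of object x.
ROWS = [{0, 1, 3}, {0, 2, 3}, {1, 2}, {0}]
N_ATTRS = 4
Y = set(range(N_ATTRS))


def down(attrs):
    return {x for x, row in enumerate(ROWS) if attrs <= row}


def up(objs):
    result = set(Y)
    for x in objs:
        result &= ROWS[x]
    return result


def fast_generate_from(A, B, y, N, out, stats, use_extra_test):
    out.append(sorted(B))
    if B == Y:
        return
    M = list(N)       # M_j <- N_j for every j (the sets are never mutated)
    queue = deque()
    for j in range(y, N_ATTRS):
        below = set(range(j))
        if j in B:
            continue
        # Additional test: skip the closure if a recorded failure still applies.
        if use_extra_test and not ((N[j] & below) <= (B & below)):
            continue
        C = A & down({j})
        D = up(C)
        stats["closures"] += 1
        if D & below == B & below:   # original canonicity test
            queue.append((C, D, j))
        else:
            M[j] = D                 # remember the failed closure for descendants
    while queue:
        C, D, j = queue.popleft()
        fast_generate_from(C, D, j + 1, M, out, stats, use_extra_test)


def run(use_extra_test=True):
    """Return (intents in listing order, number of closures computed in loops)."""
    A = down(set())
    B = up(A)
    out, stats = [], {"closures": 0}
    empty = [set() for _ in range(N_ATTRS)]
    fast_generate_from(A, B, 0, empty, out, stats, use_extra_test)
    return out, stats["closures"]
```

On this toy context both settings list the same nine intents in the same order; with the additional test enabled the loops compute 9 closures instead of 12.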

  13. Soundness, Complexity and Relationship to Other Algorithms
     soundness:
     - procedure FastGenerateFrom is sound
     - this can be proved using unique derivations (paths in the FCbO tree)
     complexity:
     - polynomial time delay O(|X| · |Y|³)
     relationship to other algorithms:
     - lists concepts in the same order as CbO (FCbO = CbO in the worst case)
     - lists concepts in the same order as NextClosure (but faster) if the main loop runs "from n downto y"
     Outrata, Vychodil: Fast algorithm for computing fixpoints of Galois connections induced by object-attribute relational data (in preparation).

  14. Performance evaluation I
     Test Conditions:
     - algorithms: CbO, FCbO
     - performance indicator: computed closures
     Results:

     | dataset     | concepts | closures (CbO) | closures (FCbO) | ratio (CbO) | ratio (FCbO) |
     |-------------|----------|----------------|-----------------|-------------|--------------|
     | mushroom    | 238,710  | 4,006,498      | 426,563         | 5.9 %       | 55.9 %       |
     | anon. web   | 129,009  | 27,949,552     | 1,475,341       | 0.4 %       | 8.7 %        |
     | debian tags | 38,977   | 12,045,680     | 679,911         | 0.3 %       | 5.7 %        |
     | tic-tac-toe | 59,505   | 221,608        | 128,434         | 26.8 %      | 46.3 %       |
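The ratio columns are consistent with reading them as the number of concepts divided by the number of computed closures. The short check below assumes that reading, and also assumes the slide truncates (rather than rounds) to one decimal place; the names `RESULTS`, `ratio_percent`, and `ratios` are illustrative.

```python
import math

# (concepts, closures computed by CbO, closures computed by FCbO), from the table.
RESULTS = {
    "mushroom": (238_710, 4_006_498, 426_563),
    "anon. web": (129_009, 27_949_552, 1_475_341),
    "debian tags": (38_977, 12_045_680, 679_911),
    "tic-tac-toe": (59_505, 221_608, 128_434),
}


def ratio_percent(concepts, closures):
    """Concepts per computed closure, in percent, truncated to one decimal."""
    return math.floor(1000 * concepts / closures) / 10


def ratios():
    """Map each dataset to its (CbO ratio, FCbO ratio) in percent."""
    return {
        name: (ratio_percent(c, cbo), ratio_percent(c, fcbo))
        for name, (c, cbo, fcbo) in RESULTS.items()
    }
```

Under these assumptions the computed values match the table, for example 5.9 % and 55.9 % for mushroom. A higher ratio means fewer closures are computed and then discarded.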

  15. Performance evaluation II
     Test Conditions:
     - algorithms: CbO, FCbO, NextClosure
     - performance indicator: computation time in seconds
     Results:

     |                 | mushroom    | tic-tac-toe | debian tags  | anon. web    |
     |-----------------|-------------|-------------|--------------|--------------|
     | size            | 8,124 × 119 | 958 × 29    | 14,315 × 475 | 32,710 × 295 |
     | density         | 19 %        | 34 %        | < 1 %        | 1 %          |
     | FCbO (s)        | 0.23        | 0.02        | 0.10         | 0.15         |
     | CbO (s)         | 4.34        | 0.06        | 5.31         | 27.14        |
     | NextClosure (s) | 685.00      | 1.86        | 1,432.25     | 8,236.85     |
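To put the timings in relative terms, the sketch below divides each baseline time by the FCbO time for the same dataset. The numbers are copied from the table; the names `TIMES` and `speedup` are illustrative, and the resulting factors are only as precise as the two-decimal timings allow.

```python
# Computation times in seconds, copied from the results table.
TIMES = {
    "mushroom": {"FCbO": 0.23, "CbO": 4.34, "NextClosure": 685.00},
    "tic-tac-toe": {"FCbO": 0.02, "CbO": 0.06, "NextClosure": 1.86},
    "debian tags": {"FCbO": 0.10, "CbO": 5.31, "NextClosure": 1432.25},
    "anon. web": {"FCbO": 0.15, "CbO": 27.14, "NextClosure": 8236.85},
}


def speedup(dataset, baseline):
    """How many times faster FCbO is than the given baseline on a dataset."""
    return TIMES[dataset][baseline] / TIMES[dataset]["FCbO"]


if __name__ == "__main__":
    for dataset in TIMES:
        print(dataset,
              round(speedup(dataset, "CbO"), 1),
              round(speedup(dataset, "NextClosure"), 1))
```

The gap is smallest on the dense tic-tac-toe data and largest on the sparse anon. web data, in line with the closure counts reported on the previous slide.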
