
Formal Concept Analysis II: Closure Systems and Implications



  1. Formal Concept Analysis II: Closure Systems and Implications
     Robert Jäschke, Asmelash Teka Hadgu
     FG Wissensbasierte Systeme / L3S Research Center, Leibniz Universität Hannover
     Slides based on a lecture by Prof. Gerd Stumme
     Footer on every slide: Robert Jäschke (FG KBS), Formal Concept Analysis

  2. Agenda
     4 Implications
     - Attribute Logic
     - Concept Intents and Implications
     - Implications and Closure Systems
     - Pseudo-Intents and the Stem Base
     - Computing the Stem Base With Next Closure
     - Bases of Association Rules

  3. Implications
     Def.: An implication $X \to Y$ holds in a context if every object that has all attributes from $X$ also has all attributes from $Y$.
     [Figure: concept lattice. Attribute labels: Bicycle Trail, NPS Guided Tours, Hiking, Fishing, Horseback Riding, Swimming, Cross Country Ski Trail, Boating. Object labels: Muir Woods, John Muir, Pinnacles, Lava Beds, Fort Point, Joshua Tree, Cabrillo, Channel Islands, Death Valley, Devils Postpile, Kings Canyon, Redwood, Sequoia, Golden Gate, Point Reyes, Lassen Volcanic, Santa Monica Mountains, Yosemite, Whiskeytown-Shasta-Trinity.]
     Examples:
     - $\{\text{Swimming}\} \to \{\text{Hiking}\}$
     - $\{\text{Boating}\} \to \{\text{Swimming, Hiking, NPS Guided Tours, Fishing, Horseback Riding}\}$
     - $\{\text{Bicycle Trail, NPS Guided Tours}\} \to \{\text{Swimming, Hiking, Horseback Riding}\}$

  4. Attribute Logic
     [Figure: attributes shown are overlap, disjoint, parallel, common vertex, common segment, common edge.]
     We are dealing with implications over a possibly infinite set of objects!

  5. Concept Intents and Implications
     Def.: A subset $T \subseteq M$ respects an implication $A \to B$ if $A \not\subseteq T$ or $B \subseteq T$ holds. (We then also say that $T$ is a model of $A \to B$.) $T$ respects a set $\mathcal{L}$ of implications if $T$ respects every implication in $\mathcal{L}$.
     Lemma: An implication $A \to B$ holds in a context iff $B \subseteq A''$ ($\Leftrightarrow A' \subseteq B'$). It is then respected by all concept intents.
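
The two notions on this slide can be checked mechanically. The following is a minimal sketch, assuming a hypothetical three-object toy context (not taken from the slides); the names `CONTEXT`, `extent`, `intent`, `respects` and `holds` are illustrative only.

```python
# Hypothetical toy context (object -> set of attributes); not from the slides.
CONTEXT = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"a", "b", "c"}}
ATTRIBUTES = {"a", "b", "c"}

def extent(attrs):
    """A': all objects that have every attribute in attrs."""
    return {g for g, row in CONTEXT.items() if attrs <= row}

def intent(objs):
    """Attributes shared by all objects in objs (all of M if objs is empty)."""
    shared = set(ATTRIBUTES)
    for g in objs:
        shared &= CONTEXT[g]
    return shared

def respects(T, A, B):
    """T respects A -> B iff A is not a subset of T, or B is a subset of T."""
    return not A <= T or B <= T

def holds(A, B):
    """A -> B holds in the context iff B is a subset of A''."""
    return B <= intent(extent(A))

print(holds({"a"}, {"b"}))  # True: every object with a also has b
print(holds({"b"}, {"a"}))  # False: object 2 has b but not a
# Object intents are concept intents, so they respect every valid implication.
print(all(respects(intent({g}), {"a"}, {"b"}) for g in CONTEXT))  # True
```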

  6. Implications and Closure Systems
     Lemma: If $\mathcal{L}$ is a set of implications in $M$, then $\mathrm{Mod}(\mathcal{L}) := \{X \subseteq M \mid X \text{ respects } \mathcal{L}\}$ is a closure system on $M$.
     The respective closure operator $X \mapsto \mathcal{L}(X)$ is constructed in the following way: for a set $X \subseteq M$, let
     $X^{\mathcal{L}} := X \cup \bigcup \{B \mid A \to B \in \mathcal{L},\ A \subseteq X\}$.
     We form the sets $X^{\mathcal{L}}, X^{\mathcal{L}\mathcal{L}}, X^{\mathcal{L}\mathcal{L}\mathcal{L}}, \ldots$ until a set $\mathcal{L}(X) := X^{\mathcal{L}\ldots\mathcal{L}}$ is obtained with $\mathcal{L}(X)^{\mathcal{L}} = \mathcal{L}(X)$ (i.e., a fixpoint). $\mathcal{L}(X)$ is then the closure of $X$ for the closure system $\mathrm{Mod}(\mathcal{L})$.
     Footnote: If $M$ is infinite, this may require infinitely many iterations.
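
The fixpoint construction can be sketched directly for finite $M$. This is an illustrative sketch, not code from the lecture: the function names are made up, and the implication list is the stem base printed on slide 12 of this deck (with the spelling "Non-Aligned").

```python
def apply_once(X, implications):
    """One step X -> X^L: add each conclusion B whose premise A is a subset of X."""
    result = set(X)
    for premise, conclusion in implications:
        if premise <= X:
            result |= conclusion
    return result

def close(X, implications):
    """Iterate X^L, X^LL, ... until nothing changes; returns L(X)."""
    current = set(X)
    while True:
        nxt = apply_once(current, implications)
        if nxt == current:
            return current
        current = nxt

# Implications as listed on slide 12 (developing countries context).
STEM_BASE = [
    ({"OPEC"}, {"Group of 77", "Non-Aligned"}),
    ({"MSAC"}, {"Group of 77"}),
    ({"Non-Aligned"}, {"Group of 77"}),
    ({"Group of 77", "Non-Aligned", "MSAC", "OPEC"}, {"LLDC", "AKP"}),
    ({"Group of 77", "Non-Aligned", "LLDC", "OPEC"}, {"MSAC", "AKP"}),
]

print(sorted(close({"MSAC"}, STEM_BASE)))          # ['Group of 77', 'MSAC']
print(sorted(close({"OPEC", "MSAC"}, STEM_BASE)))  # all six attributes
```

The second call needs two passes: the first adds Group of 77 and Non-Aligned, and only then does the premise of the fourth implication become a subset of the current set.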

  7. Implications and Closure Systems
     Def.: An implication $A \to B$ follows (semantically) from a set $\mathcal{L}$ of implications in $M$ if each subset of $M$ respecting $\mathcal{L}$ also respects $A \to B$. A family $\mathcal{L}$ of implications is called closed if every implication following from $\mathcal{L}$ is already contained in $\mathcal{L}$.
     Lemma: A set $\mathcal{L}$ of implications in $M$ is closed iff the following conditions (Armstrong Rules) are satisfied for all $W, X, Y, Z \subseteq M$:
     1. $X \to X \in \mathcal{L}$,
     2. If $X \to Y \in \mathcal{L}$, then $X \cup Z \to Y \in \mathcal{L}$,
     3. If $X \to Y \in \mathcal{L}$ and $Y \cup Z \to W \in \mathcal{L}$, then $X \cup Z \to W \in \mathcal{L}$.
     Remark: You should know these rules from the database lecture!
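
Semantic entailment can be tested by brute force straight from the definition: enumerate all subsets of $M$ and look for a model of $\mathcal{L}$ that violates $A \to B$. The sketch below assumes a small hypothetical attribute set and implication list; it is exponential in $|M|$ and meant only to make the definition concrete.

```python
from itertools import combinations

def respects(T, A, B):
    """T respects A -> B iff A is not a subset of T, or B is a subset of T."""
    return not A <= T or B <= T

def subsets(M):
    """Yield every subset of M as a set."""
    items = sorted(M)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

def follows(M, implications, A, B):
    """True iff every subset of M that respects all implications respects A -> B."""
    for T in subsets(M):
        is_model = all(respects(T, P, C) for P, C in implications)
        if is_model and not respects(T, A, B):
            return False
    return True

# Hypothetical example data (not from the slides).
M = {"a", "b", "c"}
L = [({"a"}, {"b"}), ({"b"}, {"c"})]

print(follows(M, L, {"a"}, {"c"}))       # True, an instance of rule 3 with Z empty
print(follows(M, L, {"a", "c"}, {"b"}))  # True, an instance of rule 2 with Z = {c}
print(follows(M, L, {"c"}, {"a"}))       # False, the model {c} violates it
```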

  8. Pseudo-Intents and the Stem Base
     Def.: A set $\mathcal{L}$ of implications of a context $(G, M, I)$ is called complete if every implication that holds in $(G, M, I)$ follows from $\mathcal{L}$. A set $\mathcal{L}$ of implications is called non-redundant if no implication in $\mathcal{L}$ follows from the other implications in $\mathcal{L}$.
     Def.: $P \subseteq M$ is called a pseudo-intent of $(G, M, I)$ if $P \neq P''$ and $Q'' \subseteq P$ holds for every pseudo-intent $Q \subsetneq P$.
     Theorem: The set of implications $\mathcal{L} := \{P \to P'' \mid P \text{ is a pseudo-intent}\}$ is non-redundant and complete. We call $\mathcal{L}$ the stem base.
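
Because the definition is recursive over proper subsets, pseudo-intents of a small context can be found by visiting subsets in order of increasing size. The sketch below is a brute-force illustration on a hypothetical four-attribute context (not from the slides); it is exponential and not the algorithm the lecture uses.

```python
from itertools import combinations

# Hypothetical toy context; attribute d is shared by no object.
CONTEXT = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"c"}}
ATTRIBUTES = ["a", "b", "c", "d"]

def double_prime(X):
    """X'': attributes shared by all objects that have every attribute in X."""
    shared = set(ATTRIBUTES)
    for row in CONTEXT.values():
        if X <= row:
            shared &= row
    return frozenset(shared)

def pseudo_intents():
    """Collect pseudo-intents by increasing size, so smaller ones are known first."""
    found = []
    for size in range(len(ATTRIBUTES) + 1):
        for combo in combinations(ATTRIBUTES, size):
            P = frozenset(combo)
            if double_prime(P) == P:
                continue  # P is a concept intent, hence not a pseudo-intent
            if all(double_prime(Q) <= P for Q in found if Q < P):
                found.append(P)
    return found

for P in pseudo_intents():
    print(sorted(P), "->", sorted(double_prime(P)))
```

On this context the sketch reports the pseudo-intents {b}, {d} and {a, b, c}. The set {b, c} is not closed either, but it fails the second condition because the pseudo-intent {b} has closure {a, b}, which is not contained in {b, c}.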

  9. Pseudo-Intents and the Stem Base
     Example: membership of developing countries in supranational groups (Source: Lexikon Dritte Welt. Rowohlt-Verlag, Reinbek 1993)


  12. Pseudo-Intents and the Stem Base
     Stem base of the developing countries context:
     - $\{\text{OPEC}\} \to \{\text{Group of 77, Non-Aligned}\}$
     - $\{\text{MSAC}\} \to \{\text{Group of 77}\}$
     - $\{\text{Non-Aligned}\} \to \{\text{Group of 77}\}$
     - $\{\text{Group of 77, Non-Aligned, MSAC, OPEC}\} \to \{\text{LLDC, AKP}\}$
     - $\{\text{Group of 77, Non-Aligned, LLDC, OPEC}\} \to \{\text{MSAC, AKP}\}$

  13. Computing the Stem Base With Next Closure
     The computation is based on the following theorem:
     Theorem: The set of all concept intents and pseudo-intents is a closure system. The corresponding closure operator is given by: starting with a set $X$ we compute
     $X^{\mathcal{L}^\bullet} := X \cup \bigcup \{B \mid A \to B \in \mathcal{L},\ A \subset X,\ A \neq X\}$
     $X^{\mathcal{L}^\bullet \mathcal{L}^\bullet} := X^{\mathcal{L}^\bullet} \cup \bigcup \{B \mid A \to B \in \mathcal{L},\ A \subset X^{\mathcal{L}^\bullet},\ A \neq X^{\mathcal{L}^\bullet}\}$
     etc., until we reach a set $\mathcal{L}^\bullet(X)$ with $\mathcal{L}^\bullet(X) = \mathcal{L}^\bullet(X)^{\mathcal{L}^\bullet}$. This is then the wanted intent or pseudo-intent.
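
The only difference from the ordinary closure under implications is that a premise must be a proper subset of the current set. A minimal sketch, with an illustrative function name and the implication list printed on slide 12 (spelled "Non-Aligned" here):

```python
def pseudo_close(X, implications):
    """L-bullet(X): apply A -> B only while A is a PROPER subset of the current set."""
    current = set(X)
    while True:
        step = set(current)
        for premise, conclusion in implications:
            if premise < current:  # '<' on sets means proper subset
                step |= conclusion
        if step == current:
            return current
        current = step

# Implications as listed on slide 12 (developing countries context).
STEM_BASE = [
    ({"OPEC"}, {"Group of 77", "Non-Aligned"}),
    ({"MSAC"}, {"Group of 77"}),
    ({"Non-Aligned"}, {"Group of 77"}),
    ({"Group of 77", "Non-Aligned", "MSAC", "OPEC"}, {"LLDC", "AKP"}),
    ({"Group of 77", "Non-Aligned", "LLDC", "OPEC"}, {"MSAC", "AKP"}),
]

print(sorted(pseudo_close({"OPEC"}, STEM_BASE)))          # ['OPEC']
print(sorted(pseudo_close({"OPEC", "MSAC"}, STEM_BASE)))
```

The first call returns {OPEC} unchanged, because the premise {OPEC} is not a proper subset of itself; this is why pseudo-intents are fixpoints of this operator. The second call stops at {Group of 77, MSAC, Non-Aligned, OPEC}, which is exactly the premise of the fourth implication.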

  14. Computing the Stem Base With Next Closure
     The algorithm Next Closure to compute all concept intents and the stem base:
     1. The set $\mathcal{L}$ of all implications is initialized to $\mathcal{L} = \emptyset$.
     2. The lectically first concept intent or pseudo-intent is $\emptyset$.
     3. If $A$ is an intent or a pseudo-intent, the lectically next intent/pseudo-intent is computed by checking all $i \in M \setminus A$ in descending order, until $A <_i \mathcal{L}^\bullet(A + i)$ holds. Then $\mathcal{L}^\bullet(A + i)$ is the next intent or pseudo-intent.
     4. If $\mathcal{L}^\bullet(A + i) = (\mathcal{L}^\bullet(A + i))''$ holds, then $\mathcal{L}^\bullet(A + i)$ is a concept intent, otherwise it is a pseudo-intent and the implication $\mathcal{L}^\bullet(A + i) \to (\mathcal{L}^\bullet(A + i))''$ is added to $\mathcal{L}$.
     5. If $\mathcal{L}^\bullet(A + i) = M$, finish. Else, set $A \leftarrow \mathcal{L}^\bullet(A + i)$ and continue with Step 3.
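
The five steps can be sketched as follows. This is an illustrative reading of the slide, not the lecture's own code. It assumes the usual Next Closure conventions, which this slide does not restate: attributes are linearly ordered, $A + i := (A \cap \{1, \ldots, i-1\}) \cup \{i\}$, and $A <_i B$ means that $i \in B \setminus A$ and $A$, $B$ agree on all attributes smaller than $i$. The context and all function names are hypothetical.

```python
def double_prime(context, attributes, X):
    """X'': attributes shared by all objects that have every attribute in X."""
    shared = set(attributes)
    for row in context.values():
        if X <= row:
            shared &= row
    return shared

def pseudo_close(X, implications):
    """L-bullet(X): apply A -> B only while A is a proper subset of the current set."""
    current = set(X)
    while True:
        step = set(current)
        for premise, conclusion in implications:
            if premise < current:
                step |= conclusion
        if step == current:
            return current
        current = step

def next_closure_stem_base(context, attributes):
    """Return (concept intents, stem base), visiting sets in lectic order."""
    order = list(attributes)          # position in this list is the attribute order
    implications, intents = [], []    # step 1
    A = set()                         # step 2
    while True:
        hull = double_prime(context, order, A)
        if hull == A:                 # step 4: intent or pseudo-intent?
            intents.append(A)
        else:
            implications.append((A, hull))
        if len(A) == len(order):      # step 5: A = M, finished
            return intents, implications
        for i in reversed(range(len(order))):   # step 3: descending order
            if order[i] in A:
                continue
            seed = {m for m in order[:i] if m in A} | {order[i]}   # A + i
            candidate = pseudo_close(seed, implications)
            # A <_i candidate: no attribute smaller than i was newly added
            if all(m in A for m in order[:i] if m in candidate):
                A = candidate
                break

# Hypothetical toy context (not the example from slide 15).
CONTEXT = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"c"}}
intents, stem_base = next_closure_stem_base(CONTEXT, ["a", "b", "c", "d"])
for premise, conclusion in stem_base:
    print(sorted(premise), "->", sorted(conclusion))
```

For this context the sketch visits six intents and emits three implications with premises {d}, {b} and {a, b, c}, in that order.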

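  15. Computing the Stem Base With Next Closure
     Example: a context with objects 1, 2, 3 and attributes a, b, c, e. Objects 1 and 2 each have two crosses and object 3 has three; the column positions of the crosses are not recoverable from the transcript.
     The run of the algorithm is traced in a table with the columns: $A$ | $i$ | $A + i$ | $\mathcal{L}^\bullet(A + i)$ | $A <_i \mathcal{L}^\bullet(A + i)$? | $(\mathcal{L}^\bullet(A + i))''$ | $\mathcal{L}$ | new intent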

  16. Agenda
     4 Implications
     - Attribute Logic
     - Concept Intents and Implications
     - Implications and Closure Systems
     - Pseudo-Intents and the Stem Base
     - Computing the Stem Base With Next Closure
     - Bases of Association Rules

  17. Bases of Association Rules
     Example: $\{\text{veil color: white, gill spacing: close}\} \to \{\text{gill attachment: free}\}$, support: 78.52%, confidence: 99.60%
     The input data to compute association rules can be represented as a formal context $(G, M, I)$: $M$ is a set of items (things, products of a market basket), $G$ contains the transaction ids, and the relation $I$ is the list of transactions.

  18. Bases of Association Rules
     Example: $\{\text{veil color: white, gill spacing: close}\} \to \{\text{gill attachment: free}\}$, support: 78.52%, confidence: 99.60%
     The support of an implication is the fraction of all objects that have all attributes from the premise and the conclusion. (Repetition: the support of an attribute set $X \subseteq M$ is $\mathrm{supp}(X) := \frac{|X'|}{|G|}$.)
     Def.: The support of a rule $X \to Y$ is given by $\mathrm{supp}(X \to Y) := \mathrm{supp}(X \cup Y)$.
     The confidence is the fraction of all objects that fulfill both the premise and the conclusion among those objects that fulfill the premise.
     Def.: The confidence of a rule $X \to Y$ is given by $\mathrm{conf}(X \to Y) := \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}$.
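
Both definitions translate directly into code. The sketch below uses a hypothetical four-transaction basket (not the data behind the slide's example) and exact fractions; `conf` is undefined when the premise has support 0, which the sketch does not guard against.

```python
from fractions import Fraction

# Hypothetical transactions: transaction id (object) -> items (attributes).
TRANSACTIONS = {
    "t1": {"bread", "milk"},
    "t2": {"bread", "butter"},
    "t3": {"bread", "milk", "butter"},
    "t4": {"milk"},
}

def supp(itemset):
    """supp(X) = |X'| / |G|: fraction of transactions containing every item of X."""
    covering = [t for t, items in TRANSACTIONS.items() if itemset <= items]
    return Fraction(len(covering), len(TRANSACTIONS))

def rule_supp(X, Y):
    """supp(X -> Y) = supp(X u Y)."""
    return supp(X | Y)

def conf(X, Y):
    """conf(X -> Y) = supp(X u Y) / supp(X); requires supp(X) > 0."""
    return supp(X | Y) / supp(X)

print(rule_supp({"bread"}, {"milk"}))  # 1/2
print(conf({"bread"}, {"milk"}))       # 2/3
```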

  19. Bases of Association Rules
     Example: $\{\text{veil color: white, gill spacing: close}\} \to \{\text{gill attachment: free}\}$, support: 78.52%, confidence: 99.60%
     Classical data mining task: Find for given $\mathit{minsupp}, \mathit{minconf} \in [0, 1]$ all rules with a support and confidence above these bounds.
     Our task: finding a base of rules, i.e., a minimal set of rules from which all other rules follow.

  20. Bases of Association Rules
     From $B' = B'''$ follows $\mathrm{supp}(B) = \frac{|B'|}{|G|} = \frac{|B'''|}{|G|} = \mathrm{supp}(B'')$.
     Theorem: $X \to Y$ and $X'' \to Y''$ have the same support and the same confidence.
     To compute all association rules it is thus sufficient to compute the support of all frequent sets with $B = B''$ (i.e., the intents of the iceberg concept lattice).
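
The theorem follows in a few lines from $B' = B'''$ together with the standard identity $(A \cup B)' = A' \cap B'$ for attribute sets. The derivation below is a sketch of that argument, not a transcription from the slides.

```latex
\begin{align*}
(X \cup Y)' &= X' \cap Y' = X''' \cap Y''' = (X'' \cup Y'')' \\[4pt]
\mathrm{supp}(X \to Y) &= \frac{|(X \cup Y)'|}{|G|}
  = \frac{|(X'' \cup Y'')'|}{|G|} = \mathrm{supp}(X'' \to Y'') \\[4pt]
\mathrm{conf}(X \to Y) &= \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X)}
  = \frac{\mathrm{supp}(X'' \cup Y'')}{\mathrm{supp}(X'')} = \mathrm{conf}(X'' \to Y'')
\end{align*}
```

The last line uses $\mathrm{supp}(X) = \mathrm{supp}(X'')$ from the first statement on this slide.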

  21. Bases of Association Rules
     The Benefit of Iceberg Concept Lattices (Compared to Frequent Itemsets)
     [Figure: iceberg concept lattice for minsupp = 70%. Attribute labels: veil type: partial, gill attachment: free, ring number: one, veil color: white, gill spacing: close. Node supports shown: 100 %, 92.30 %, 97.43 %, 97.62 %, 81.08 %, 90.02 %, 97.34 %, 76.81 %, 78.80 %, 89.92 %, 78.52 %, 74.52 %.]
     32 frequent itemsets are represented by 12 frequent concept intents:
     ➞ more efficient computation (e.g., Titanic)
     ➞ fewer rules (without loss of information!)

  22. Bases of Association Rules
     The Benefit of Iceberg Concept Lattices (Compared to Frequent Itemsets)
     [Figure: iceberg concept lattice. Attribute labels: gill attachment: free, veil type: partial, veil color: white, gill spacing: close, ring number: one. Confidence labels shown: 97.6%, 97.4%, 97.2%, 97.5%, 99.9%, 99.7%, 99.6%, 99.9%, 97.0%.]
     Association rules can be visualized in the (iceberg) concept lattice:
     - exact association rules (implications): $\mathrm{conf} = 100\%$
     - (approximate) association rules: $\mathrm{conf} < 100\%$
