

  1. Preprocessing input data for machine learning by FCA
Jan OUTRATA
Dept. of Computer Science, Palacký University, Olomouc, Czech Republic
CLA 2010, Oct 19–21, Sevilla

  2. Outline
– introduction and related work
– preliminaries on Boolean Factor Analysis (BFA) and decision trees
– preprocessing input data using BFA
– example
– experimental evaluation
– conclusions and future research

  3. Introduction to the problem
– FCA is often used for data preprocessing for (other) DM or ML methods to improve their results
– results of DM and ML methods depend on the structure of the data = on the attributes, in the case of object-attribute data
– data preprocessing . . . transformation of attributes
Our approach:
– formal concepts are used to create new attributes
– which ones? → factor concepts obtained by Boolean Factor Analysis (BFA, described via FCA by Belohlavek, Vychodil, 2006)
– new attributes = factors, either
  1 added to the original attributes, or
  2 replacing the original attributes . . . reduction of the dimensionality of the data (fewer factors)
Main question: can factors describe input data better for DM/ML methods?



  6. Related work (focused on decision tree induction)
– constructive induction / feature construction . . . new attributes as conjunctions/disjunctions, arithmetic operations, etc. of original attributes
– oblique decision trees . . . multiple attributes used in a splitting condition (e.g. linear combinations)
– work utilizing FCA? → construction of the whole learning model (lattice-based/concept-based learning, Mephu Nguifo et al., Kuznetsov and others)

  7. Boolean Factor Analysis (BFA)
= decomposition of a (binary) object-attribute data matrix I into the Boolean product of an object-factor matrix A and a factor-attribute matrix B:

I_ij = (A ◦ B)_ij = ⋁_{l=1}^{k} A_il · B_lj

– A_il = 1 . . . factor l applies to object i
– B_lj = 1 . . . attribute j is one of the manifestations of factor l
– (A ◦ B)_ij . . . “object i has attribute j if and only if there is a factor l such that l applies to i and j is one of the manifestations of l”
– factors ≈ new attributes
Problem: find a number k of factors as small as possible
(slide: example decomposition of a 0/1 matrix I into the Boolean product A ◦ B)
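The Boolean product above can be sketched directly from its definition. This is a minimal illustration with a hypothetical toy decomposition, not the matrices from the slide:

```python
# Boolean matrix product: (A o B)_ij = OR over l of (A_il AND B_lj).
# Matrices are represented as lists of 0/1 rows.

def boolean_product(A, B):
    k = len(B)       # number of factors
    m = len(B[0])    # number of attributes
    return [[int(any(A[i][l] and B[l][j] for l in range(k)))
             for j in range(m)]
            for i in range(len(A))]

# Object-factor matrix A (3 objects, 2 factors) and
# factor-attribute matrix B (2 factors, 3 attributes).
A = [[1, 0],
     [0, 1],
     [1, 1]]
B = [[1, 1, 0],
     [0, 1, 1]]

I = boolean_product(A, B)
print(I)  # [[1, 1, 0], [0, 1, 1], [1, 1, 1]]
```

Object 3 (last row of A) has both factors, so its attribute row is the component-wise OR of both rows of B.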

  8. Boolean Factor Analysis – solution using FCA
Belohlavek R., Vychodil V.: Discovery of optimal factors in binary data via a novel method of matrix decomposition. J. Comput. System Sci. 76(1)(2010), 3–20.
Matrices A and B can be constructed from a set F of formal concepts of the input data ⟨X, Y, I⟩, the so-called factor concepts:
F = {⟨A_1, B_1⟩, . . . , ⟨A_k, B_k⟩} ⊆ B(X, Y, I)
– l-th column of A_F = characteristic vector of A_l
– l-th row of B_F = characteristic vector of B_l
Decomposition using formal concepts to determine factors is optimal:
Theorem. Let I = A ◦ B for n × k and k × m binary matrices A and B. Then there exists a set F ⊆ B(X, Y, I) of formal concepts of I with |F| ≤ k such that for the n × |F| and |F| × m binary matrices A_F and B_F we have I = A_F ◦ B_F.

  9. Transformations between attribute and factor spaces
– object . . . vector in the Boolean space {0, 1}^m of original attributes, a row of I
  . . . vector in the Boolean space {0, 1}^k of factors, a row of A
= mappings g : {0, 1}^m → {0, 1}^k and h : {0, 1}^k → {0, 1}^m:

(g(P))_l = ⋀_{j=1}^{m} (B_lj → P_j)        (h(Q))_j = ⋁_{l=1}^{k} (Q_l · B_lj)

– (g(P))_l = 1 iff the l-th row of B is included in P
– (h(Q))_j = 1 iff attribute j is a manifestation of at least one factor from Q
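The two mappings can be sketched as follows; the factor-attribute matrix B here is a hypothetical example, assumed to be given as 0/1 row lists:

```python
# g maps an attribute vector P to a factor vector; h maps back.

def g(P, B):
    # (g(P))_l = 1 iff every attribute of factor l (l-th row of B) is in P.
    return [int(all(P[j] for j in range(len(P)) if B_l[j]))
            for B_l in B]

def h(Q, B):
    # (h(Q))_j = 1 iff attribute j belongs to at least one factor in Q.
    m = len(B[0])
    return [int(any(Q[l] and B[l][j] for l in range(len(B))))
            for j in range(m)]

B = [[1, 1, 0],   # factor 0 manifests as attributes 0 and 1
     [0, 1, 1]]   # factor 1 manifests as attributes 1 and 2

P = [1, 1, 0]     # object with attributes 0 and 1
Q = g(P, B)
print(Q)          # [1, 0]: only factor 0's attributes are all present
print(h(Q, B))    # [1, 1, 0]: back to the original attribute vector
```

For this P the round trip h(g(P)) returns P exactly; in general h(g(P)) ⊆ P, since g keeps only fully contained factors.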

 10. The ML method: decision tree induction
Decision tree . . . approximate representation of a (finite-valued) function over (finite-valued) attributes
. . . the function is described by an assignment of class labels to vectors of attribute values
– used for classification of vectors (objects) into classes

A      B    C      f(A, B, C)
good   yes  false  yes
good   no   false  no
bad    no   false  no
good   no   true   yes
bad    yes  true   yes

(slide also shows a decision tree for f, rooted at attribute A)
– non-leaf tree node . . . test on a splitting attribute . . . the covered collection of objects is split according to the possible outcomes of the test (= values of the splitting attribute)
– leaf tree node . . . covers (a majority of) objects with the same class label
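Classification with such a tree just follows one branch per test. The tree below is a hypothetical reconstruction consistent with the sample rows above, not necessarily the exact tree from the slide:

```python
# Each non-leaf node is (attribute, {value: subtree}); a leaf is a class label.

tree = ("A", {                       # root tests attribute A
    "good": ("B", {                  # then attribute B
        "yes": "yes",
        "no": ("C", {"false": "no", "true": "yes"}),
    }),
    "bad": ("C", {"false": "no", "true": "yes"}),
})

def classify(node, obj):
    if isinstance(node, str):        # leaf: return the class label
        return node
    attr, branches = node
    return classify(branches[obj[attr]], obj)

print(classify(tree, {"A": "good", "B": "yes", "C": "false"}))  # yes
print(classify(tree, {"A": "bad", "B": "no", "C": "false"}))    # no
```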

 11. Decision tree induction problem & algorithms
Decision tree induction problem . . . construct a decision tree that
1 approximates well the function described by the (few) objects (training data)
2 classifies well “unseen” objects (testing data)
Algorithms:
– common strategy: recursively splitting tree nodes (collections of objects) based on splitting attributes
– the problem of selecting a splitting attribute ⇒ a local optimization problem
– selection criteria . . . based on measures defined in terms of the class distribution of objects in nodes before and after splitting → entropy and information gain measures, Gini index, classification error, etc.
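The entropy-based criterion mentioned above can be sketched as information gain: the node's entropy minus the weighted entropy of the child nodes after the split. The tiny dataset is illustrative only:

```python
from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy of a class-label distribution.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    # Entropy before the split minus weighted entropy after splitting on attr.
    parts = {}
    for row, y in zip(rows, labels):
        parts.setdefault(row[attr], []).append(y)
    after = sum(len(p) / len(labels) * entropy(p) for p in parts.values())
    return entropy(labels) - after

rows = [{"gb": "yes"}, {"gb": "yes"}, {"gb": "no"}, {"gb": "no"}]
labels = ["mammal", "mammal", "other", "other"]
print(information_gain(rows, labels, "gb"))  # 1.0: a perfect split
```

A greedy inducer would evaluate this gain for every candidate attribute at a node and split on the best one.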

 12. Transformation of input data
– in ML: logical, categorical (nominal), ordinal, numerical, . . . attributes
– in FCA: logical – binary (yes/no) or graded attributes
→ transformation . . . conceptual scaling (Ganter, Wille)
– note: we need not transform the class attribute

 13. Example: transformation of input data

Name        body temp.  gives birth  four-legged  hibernates  mammal
cat         warm        yes          yes          no          yes
bat         warm        yes          no           yes         yes
salamander  cold        no           yes          yes         no
eagle       warm        no           no           no          no
guppy       cold        yes          no           no          no

↓

Name        bt cold  bt warm  gb no  gb yes  fl no  fl yes  hb no  hb yes  mammal
cat         0        1        0      1       0      1       1      0       yes
bat         0        1        0      1       1      0       0      1       yes
salamander  1        0        1      0       0      1       0      1       no
eagle       0        1        1      0       1      0       1      0       no
guppy       1        0        0      1       1      0       1      0       no

mammal . . . class label
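The nominal scaling above can be sketched generically: each many-valued attribute is replaced by one binary attribute per observed value. Column order here is simply sorted (attribute, value) pairs, which may differ from the slide's ordering:

```python
# Nominal conceptual scaling of a many-valued table into a binary one.

animals = {
    "cat":        {"bt": "warm", "gb": "yes", "fl": "yes", "hb": "no"},
    "bat":        {"bt": "warm", "gb": "yes", "fl": "no",  "hb": "yes"},
    "salamander": {"bt": "cold", "gb": "no",  "fl": "yes", "hb": "yes"},
    "eagle":      {"bt": "warm", "gb": "no",  "fl": "no",  "hb": "no"},
    "guppy":      {"bt": "cold", "gb": "yes", "fl": "no",  "hb": "no"},
}

def scale(table):
    # One binary column per (attribute, value) pair occurring in the data.
    values = sorted({(a, v) for row in table.values() for a, v in row.items()})
    cols = [f"{a} {v}" for a, v in values]
    binary = {name: [int(row[a] == v) for a, v in values]
              for name, row in table.items()}
    return cols, binary

cols, binary = scale(animals)
print(cols)           # ['bt cold', 'bt warm', 'fl no', 'fl yes', ...]
print(binary["cat"])  # [0, 1, 0, 1, 0, 1, 1, 0]
```

Exactly one of the binary columns derived from each original attribute is 1 in every row, as in the scaled table above.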

 14. Extending the collection of attributes
Recall: new attributes (= factors) are added to the original attributes
1 decompose the input data matrix I into a matrix A describing objects X by factors F and a matrix B explaining factors F by attributes Y
2 new attributes Y′ = Y ∪ F
3 extended data table I′ ⊆ X × Y′: I′ ∩ (X × Y) = I and I′ ∩ (X × F) = A
Original decomposition (using FCA):
– decomposition aim: the number of factors as small as possible
– existing approximation algorithm (Belohlavek, Vychodil): greedy search for factor concepts which cover the largest area of still-uncovered 1s in the input data table
– function of optimality of a factor concept = “cover ability”
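The greedy search above can be sketched on a tiny context by naive enumeration; the actual Belohlavek-Vychodil algorithm grows candidate concepts incrementally instead of enumerating all of them:

```python
# Greedy cover of the 1s of a small 0/1 matrix by formal concepts.
from itertools import combinations

def concepts(I):
    # Naive enumeration: every concept is (S'', S') for some object set S.
    n, m = len(I), len(I[0])
    found = set()
    for r in range(n + 1):
        for objs in combinations(range(n), r):
            intent = frozenset(j for j in range(m)
                               if all(I[i][j] for i in objs))
            extent = frozenset(i for i in range(n)
                               if all(I[i][j] for j in intent))
            found.add((extent, intent))
    return found

def greedy_factors(I):
    # Repeatedly pick the concept covering the most still-uncovered 1s.
    uncovered = {(i, j) for i, row in enumerate(I)
                 for j, v in enumerate(row) if v}
    factors = []
    while uncovered:
        ext, inte = max(concepts(I),
                        key=lambda c: len({(i, j) for i in c[0]
                                           for j in c[1]} & uncovered))
        factors.append((sorted(ext), sorted(inte)))
        uncovered -= {(i, j) for i in ext for j in inte}
    return factors

I = [[1, 1, 0],
     [0, 1, 1],
     [1, 1, 1]]
print(greedy_factors(I))  # two factor concepts cover all seven 1s
```

Here two factor concepts suffice, i.e. the 3x3 input decomposes as a product of 3x2 and 2x3 matrices.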
