Association Rules: Extracting Patterns from Large Data Sets
Content
• Introduction to Pattern and Rule Analysis
• A-priori Algorithm
• Generalized Rule Induction
• Sequential Patterns
• Other WEKA algorithms
• Outlook
Introduction
• Finding unusual patterns and rules in large data sets
• Examples
  • 10% of the customers buy wine and cheese
  • If someone buys wine and cheese today, they will buy sparkling water tomorrow
  • If alarms A and B occur within 30 seconds, then alarm C occurs within 60 seconds with probability 0.5
  • If someone visits derstandard.at, there is a 60% chance that the person will visit faz.net as well
  • If players X and Y played together as strikers, the team won 90% of the games
• Application areas: unlimited
• Question: How can we find such patterns?
General Considerations
• Rule representation
  • Left-hand side proposition (antecedent)
  • Right-hand side proposition (consequent)
• Probabilistic rule
  • Consequent is true with probability p given that the antecedent is true
  • Conditional probability
• Scale level
  • Especially suited for categorical data
  • Continuous data require setting thresholds
• Advantages
  • Easy to compute
  • Easy to understand
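Setting a threshold for continuous data can be sketched as below. The attribute, its values, and the cut-off are hypothetical and only illustrate how a continuous variable becomes a 0/1 item that a rule can use.

```python
# Hypothetical continuous attribute: amount spent on wine per customer.
wine_spend = [0.0, 12.5, 3.2, 40.0, 0.0, 18.9]

# Assumed cut-off; in practice it would be chosen from domain knowledge or the data.
THRESHOLD = 10.0


def binarize(values, threshold):
    """Map each value to 1 if it exceeds the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]


buys_much_wine = binarize(wine_spend, THRESHOLD)
print(buys_much_wine)
```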
Example
Basket ID  Milk  Bread  Water  Coffee  Kleenex
1          1     0      0      0       0
2          1     1      1      1       0
3          1     0      1      0       1
4          0     0      1      0       0
5          0     1      1      1       0
6          1     1      1      0       0
7          1     0      1      1       0
8          0     1      1      0       1
9          1     0      0      1       0
10         0     1      1      0       1
Example of a market basket: The aim is to find itemsets in order to accurately predict (i.e. with high confidence) a consequent from one or more antecedents.
• Algorithms: A-Priori, Tertius and GRI
Mathematical Notations
• General notation
  • $p$ variables $X_1, X_2, \ldots, X_p$; $N$ persons
  • Itemset: $\theta^{(k)} = (X_{(1)} = 1 \wedge \ldots \wedge X_{(k)} = 1)$, $k < p$
  • Rule: $\theta^{(k)} = (X_{(1)} = 1 \wedge \ldots \wedge X_{(k)} = 1) \Rightarrow (X_{(k+1)} = 1) = \varphi$
• Identification of frequent itemsets
  • Itemset frequency: $fr(\theta^{(k)})$
  • Support: $s = fr(\theta^{(k)} \wedge \varphi)$
  • Accuracy (confidence): $c(\theta^{(k)} \Rightarrow \varphi) = p(\varphi = 1 \mid \theta^{(k)} = 1) = \frac{fr(\theta^{(k)} \wedge \varphi)}{fr(\theta^{(k)})}$
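The frequency, support, and confidence definitions can be tried out on the market-basket example with a minimal sketch. Here fr is taken to be a count of baskets, the function names are illustrative, and the first basket is read as containing only milk, which is an assumption; neither rule evaluated below depends on that basket.

```python
# Ten baskets from the market-basket example (1 = item bought).
# Assumption: basket 1 is read as containing only milk.
ITEMS = ["Milk", "Bread", "Water", "Coffee", "Kleenex"]
BASKETS = [
    (1, 0, 0, 0, 0),
    (1, 1, 1, 1, 0),
    (1, 0, 1, 0, 1),
    (0, 0, 1, 0, 0),
    (0, 1, 1, 1, 0),
    (1, 1, 1, 0, 0),
    (1, 0, 1, 1, 0),
    (0, 1, 1, 0, 1),
    (1, 0, 0, 1, 0),
    (0, 1, 1, 0, 1),
]


def fr(itemset):
    """Count the baskets in which every item of the itemset equals 1."""
    idx = [ITEMS.index(name) for name in itemset]
    return sum(all(row[i] == 1 for i in idx) for row in BASKETS)


def support(antecedent, consequent):
    """s = fr(theta AND phi)."""
    return fr(list(antecedent) + [consequent])


def confidence(antecedent, consequent):
    """c(theta => phi) = fr(theta AND phi) / fr(theta)."""
    denom = fr(antecedent)
    if denom == 0:
        raise ValueError("antecedent never occurs")
    return support(antecedent, consequent) / denom


print(support(["Bread"], "Water"), confidence(["Bread"], "Water"))
print(support(["Milk", "Water"], "Coffee"), confidence(["Milk", "Water"], "Coffee"))
```

In this data every basket with bread also contains water, so the rule Bread ⇒ Water has confidence 1, while {Milk, Water} ⇒ Coffee holds in half of the baskets that match its antecedent.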
A-priori Algorithm *
• Identification of frequent itemsets
  • Start with one variable, i.e. $\theta^{(1)}$, then $\theta^{(2)}, \theta^{(3)}, \ldots$
  • Compute the support and keep itemsets with $s > s_{\min}$
  • → List of frequent itemsets
• Rule generation
  • Split the itemset into antecedents $A$ and consequent $C$
  • Compute evaluation measure
• Evaluation measures
  • Prior confidence: $c_{prior} = \frac{s_C}{N}$
  • Posterior confidence: $c_{post} = \frac{s_{A \wedge C}}{s_A}$ (the rule confidence)
* Agrawal & Srikant, 1994
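The two phases can be sketched in a few lines. This is a simplified level-wise search written for illustration, not the original implementation by Agrawal & Srikant or the one in WEKA. The function names and the threshold value are assumptions, supports are counts, and basket 1 of the example is read as containing only milk.

```python
from itertools import combinations

# Market-basket example as sets of items (assumption: basket 1 is milk only).
BASKETS = [
    {"Milk"},
    {"Milk", "Bread", "Water", "Coffee"},
    {"Milk", "Water", "Kleenex"},
    {"Water"},
    {"Bread", "Water", "Coffee"},
    {"Milk", "Bread", "Water"},
    {"Milk", "Water", "Coffee"},
    {"Bread", "Water", "Kleenex"},
    {"Milk", "Coffee"},
    {"Bread", "Water", "Kleenex"},
]


def count(itemset):
    """Number of baskets that contain every item of the itemset."""
    return sum(itemset <= basket for basket in BASKETS)


def frequent_itemsets(s_min):
    """Level-wise search: return {itemset: count} for all itemsets with count > s_min."""
    level = {frozenset([item]) for item in set().union(*BASKETS)}
    frequent = {}
    while level:
        size = len(next(iter(level))) + 1
        kept = {c: n for c in level if (n := count(c)) > s_min}
        frequent.update(kept)
        # Join step: unite pairs of frequent itemsets into candidates one item larger.
        candidates = {a | b for a in kept for b in kept if len(a | b) == size}
        # Prune step: every subset one item smaller must itself be frequent.
        level = {
            c for c in candidates
            if all(frozenset(sub) in kept for sub in combinations(c, size - 1))
        }
    return frequent


def rules(frequent, n):
    """Split each frequent itemset into antecedents A and one consequent C."""
    out = []
    for itemset, s_ac in frequent.items():
        if len(itemset) < 2:
            continue
        for c in itemset:
            a = itemset - {c}
            c_prior = frequent[frozenset([c])] / n  # s_C / N
            c_post = s_ac / frequent[a]             # s_{A and C} / s_A
            out.append((a, c, c_prior, c_post))
    return out


freq = frequent_itemsets(s_min=2)
for a, c, c_prior, c_post in rules(freq, len(BASKETS)):
    print(sorted(a), "=>", c, round(c_prior, 2), round(c_post, 2))
```

Comparing the posterior with the prior confidence shows whether knowing the antecedent changes the chance of the consequent at all. For Bread ⇒ Water the prior is already 0.8, so the posterior of 1.0 is a smaller gain than it first appears.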
Further Algorithms in WEKA
• Predictive Apriori
  • Rules sorted by expected predictive accuracy
• Tertius
  • Confirmation values
  • TP-/FP-rate
  • Rules with and/or concatenations
Generalized Rule Induction (GRI) *
• Quantitative measure of interestingness
  • Ranks competing rules according to this measure
  • Information-theoretic entropy calculation
• Rule generation
  • Basically works like the A-priori algorithm
  • For each rule, compute the J-statistic and the specialized $J_s$ obtained by adding more antecedents
• The J-measure
  • Entropy: information measure (small): $H(p) = -p \log_2 p - (1 - p) \log_2 (1 - p) \in [0; 1]$
  • J-measure: $J(x \mid y) = p(y) \left[ p(x \mid y) \log_2 \left( \frac{p(x \mid y)}{p(x)} \right) + (1 - p(x \mid y)) \log_2 \left( \frac{1 - p(x \mid y)}{1 - p(x)} \right) \right]$
  • $J_s$-measure: $J_s = \max \left[ p(y)\, p(x \mid y) \log_2 \left( \frac{1}{p(x)} \right),\; p(y)\, (1 - p(x \mid y)) \log_2 \left( \frac{1}{1 - p(x)} \right) \right]$
* Smyth & Goodman, 1992
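The three formulas can be evaluated directly, as in the minimal sketch below. The function names are illustrative, x denotes the consequent and y the antecedent, and the input probabilities are those of the rule Bread ⇒ Water in the market-basket example (p(y) = 0.5, p(x) = 0.8, p(x|y) = 1). The convention 0 · log 0 = 0 is assumed, and p(x) must lie strictly between 0 and 1.

```python
from math import log2


def entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)


def _term(q, r):
    """q * log2(q / r), using the convention 0 * log 0 = 0."""
    return 0.0 if q == 0 else q * log2(q / r)


def j_measure(p_y, p_x, p_x_given_y):
    """J(x|y) = p(y) [ p(x|y) log2(p(x|y)/p(x)) + (1-p(x|y)) log2((1-p(x|y))/(1-p(x))) ]."""
    return p_y * (_term(p_x_given_y, p_x) + _term(1 - p_x_given_y, 1 - p_x))


def j_s(p_y, p_x, p_x_given_y):
    """J_s = max[ p(y) p(x|y) log2(1/p(x)), p(y) (1-p(x|y)) log2(1/(1-p(x))) ]."""
    return max(
        p_y * p_x_given_y * log2(1 / p_x),
        p_y * (1 - p_x_given_y) * log2(1 / (1 - p_x)),
    )


# Rule Bread => Water from the market-basket example.
j = j_measure(p_y=0.5, p_x=0.8, p_x_given_y=1.0)
bound = j_s(p_y=0.5, p_x=0.8, p_x_given_y=1.0)
print(entropy(0.5), j, bound)
```

In Smyth & Goodman's formulation, J_s bounds the J-measure that any specialization of the rule can reach, so a rule whose J_s is below the current best J need not be specialized further.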
Sequential Patterns *
Customer  Time 1    Time 2  Time 3  Time 4
1         Cheese    Wine    Beer    -
2         Wine      Beer    Cheese  -
3         Bread     Wine    Cheese  -
4         Crackers  Wine    Beer    -
5         Beer      Cheese  Bread   Cheese
6         Crackers  Bread   -       -
• Observations over time
• Itemsets within each time point
• Customer performs transaction
• Sequence notation: X > Y (i.e. Y occurs after X)
• Rule generation
  • Compute s by successively adding time points
  • CARMA algorithm as before
* Agrawal & Srikant, 1995
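The notation X > Y can be made concrete by counting the customers whose history contains X at some time point and Y at a later one. The sketch below is a plain support count over the example table, not the CARMA algorithm or the method of Agrawal & Srikant; the function names are illustrative.

```python
# Customer histories from the example table: one item per time point, in time order.
HISTORIES = {
    1: ["Cheese", "Wine", "Beer"],
    2: ["Wine", "Beer", "Cheese"],
    3: ["Bread", "Wine", "Cheese"],
    4: ["Crackers", "Wine", "Beer"],
    5: ["Beer", "Cheese", "Bread", "Cheese"],
    6: ["Crackers", "Bread"],
}


def contains_sequence(history, pattern):
    """True if the items of pattern occur in history in this order (gaps allowed)."""
    remaining = iter(history)
    # "item in remaining" advances the iterator past the first match,
    # so each later item has to be found at a later time point.
    return all(item in remaining for item in pattern)


def sequence_support(pattern):
    """Number of customers whose history contains the pattern."""
    return sum(contains_sequence(h, pattern) for h in HISTORIES.values())


def sequence_confidence(antecedent, consequent):
    """Share of customers with the antecedent who show the consequent afterwards."""
    return sequence_support(antecedent + [consequent]) / sequence_support(antecedent)


print(sequence_support(["Wine", "Beer"]))       # Wine > Beer
print(sequence_confidence(["Wine"], "Beer"))
```

Wine > Beer is found for customers 1, 2 and 4, that is, for three of the four customers who bought wine at all.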
Outlook
• Decision Trees
  • CART (Breiman et al., 1984)
  • C5.0 (Quinlan, 1996)