[Figure: annotated event log fragment] Callouts:
• extensions loaded
• every trace has a name
• every event has a name and a transition (i.e. classifier = name + transition)
• start of trace (i.e. process instance)
• name of trace
• resource
• timestamp
• name of event (activity name)
• transition
PAGE 48
[Figure: annotated event log fragment, continued] Callouts:
• end of trace (i.e. process instance)
• start of trace
• name of trace
• resource
• timestamp
• name of event (activity name)
• data associated to event
PAGE 49
Example log
• Minimal information in log: case id’s and task id’s.
• Additional information: event type, time, resources, and data.
• Log (events in order of occurrence):
  case 1 : task A
  case 2 : task A
  case 3 : task A
  case 3 : task B
  case 1 : task B
  case 1 : task C
  case 2 : task C
  case 4 : task A
  case 2 : task B
  case 2 : task D
  case 5 : task E
  case 4 : task C
  case 1 : task D
  case 3 : task C
  case 3 : task D
  case 4 : task B
  case 5 : task F
  case 4 : task D
• Sequences:
  • 1: ABCD
  • 2: ACBD
  • 3: ABCD
  • 4: ACBD
  • 5: EF
• So in this log there are three possible sequences:
  • ABCD
  • ACBD
  • EF
PAGE 50
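The grouping of events into per-case sequences can be sketched in a few lines of Python. This is only an illustration: the `events` list is a hand transcription of the log on this slide, and the function name `traces_by_case` is a hypothetical helper, not part of any tool mentioned in the slides.

```python
from collections import defaultdict

# Events in the order they appear in the example log: (case id, task).
events = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"),
    (2, "C"), (4, "A"), (2, "B"), (2, "D"), (5, "E"), (4, "C"),
    (1, "D"), (3, "C"), (3, "D"), (4, "B"), (5, "F"), (4, "D"),
]

def traces_by_case(events):
    """Group events per case id, preserving their order in the log."""
    grouped = defaultdict(list)
    for case, task in events:
        grouped[case].append(task)
    # One string per case, one character per task.
    return {case: "".join(tasks) for case, tasks in grouped.items()}

traces = traces_by_case(events)
distinct = sorted(set(traces.values()))

for case, trace in sorted(traces.items()):
    print(case, trace)
print("distinct sequences:", distinct)
```

Only the case id and the task id are used here, which matches the "minimal information" bullet; timestamps or resources would be extra fields on each event.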
>, →, ||, # relations
• Direct succession: x>y iff for some case x is directly followed by y.
• Causality: x → y iff x>y and not y>x.
• Parallel: x||y iff x>y and y>x.
• Choice: x#y iff not x>y and not y>x.
For the example log (sequences ABCD, ACBD, EF):
• A>B, A>C, B>C, B>D, C>B, C>D, E>F
• A → B, A → C, B → D, C → D, E → F
• B||C, C||B
PAGE 51
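The four relations can be derived mechanically from the distinct sequences. The sketch below assumes traces are strings with one character per task; the function name `footprint` and the representation of each relation as a set of pairs are illustrative choices.

```python
from itertools import product

def footprint(traces):
    """Derive the >, ->, || and # relations from a collection of trace strings."""
    tasks = sorted({t for trace in traces for t in trace})
    # Direct succession: x>y iff x is directly followed by y in some trace.
    succ = {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}
    # Causality: x>y and not y>x.
    causal = {(x, y) for (x, y) in succ if (y, x) not in succ}
    # Parallel: x>y and y>x.
    parallel = {(x, y) for (x, y) in succ if (y, x) in succ}
    # Choice: neither x>y nor y>x (includes x#x when x never follows itself).
    choice = {(x, y) for x, y in product(tasks, repeat=2)
              if (x, y) not in succ and (y, x) not in succ}
    return succ, causal, parallel, choice

succ, causal, parallel, choice = footprint(["ABCD", "ACBD", "EF"])
print("direct succession:", sorted(succ))
print("causality:", sorted(causal))
print("parallel:", sorted(parallel))
```

Every ordered pair of tasks falls into exactly one of →, ←, || or #, so the choice relation is simply what remains once direct succession has been accounted for.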
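Basic idea (1): x → y [Figure: x and y in sequence, connected by a place] PAGE 52
Basic idea (2): x → y, x → z, and y||z [Figure: x followed by y and z in parallel (AND-split)] PAGE 53
Basic idea (3): x → y, x → z, and y#z [Figure: x followed by a choice between y and z (XOR-split)] PAGE 54
Basic idea (4): x → z, y → z, and x||y [Figure: parallel x and y joined before z (AND-join)] PAGE 55
Basic idea (5): x → z, y → z, and x#y [Figure: either x or y is followed by z (XOR-join)] PAGE 56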
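It is not that simple! Basic Alpha algorithm
Let $W$ be a workflow log over $T$. $\alpha(W)$ is defined as follows.
1. $T_W = \{ t \in T \mid \exists_{\sigma \in W}\ t \in \sigma \}$,
2. $T_I = \{ t \in T \mid \exists_{\sigma \in W}\ t = \mathit{first}(\sigma) \}$,
3. $T_O = \{ t \in T \mid \exists_{\sigma \in W}\ t = \mathit{last}(\sigma) \}$,
4. $X_W = \{ (A,B) \mid A \subseteq T_W \wedge A \neq \emptyset \wedge B \subseteq T_W \wedge B \neq \emptyset \wedge \forall_{a \in A} \forall_{b \in B}\ a \rightarrow_W b \wedge \forall_{a_1,a_2 \in A}\ a_1 \#_W a_2 \wedge \forall_{b_1,b_2 \in B}\ b_1 \#_W b_2 \}$,
5. $Y_W = \{ (A,B) \in X_W \mid \forall_{(A',B') \in X_W}\ A \subseteq A' \wedge B \subseteq B' \Rightarrow (A,B) = (A',B') \}$,
6. $P_W = \{ p_{(A,B)} \mid (A,B) \in Y_W \} \cup \{ i_W, o_W \}$,
7. $F_W = \{ (a, p_{(A,B)}) \mid (A,B) \in Y_W \wedge a \in A \} \cup \{ (p_{(A,B)}, b) \mid (A,B) \in Y_W \wedge b \in B \} \cup \{ (i_W, t) \mid t \in T_I \} \cup \{ (t, o_W) \mid t \in T_O \}$, and
8. $\alpha(W) = (P_W, T_W, F_W)$.
PAGE 57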
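The eight steps translate fairly directly into code. What follows is a minimal sketch, assuming the log is a set of non-empty trace strings with one character per task. The names `alpha` and `nonempty_subsets` and the string labels `"i_W"` and `"o_W"` are illustrative; each place p(A,B) is represented simply by the pair (A, B). Candidate sets are enumerated by brute force, so this is only practical for small task sets.

```python
from itertools import combinations

def nonempty_subsets(items):
    """All non-empty subsets of items, as frozensets."""
    items = sorted(items)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def alpha(log):
    """Return (P_W, T_W, F_W) for a log given as a set of trace strings."""
    T_W = {t for trace in log for t in trace}            # step 1
    T_I = {trace[0] for trace in log}                    # step 2
    T_O = {trace[-1] for trace in log}                   # step 3
    succ = {(x, y) for trace in log for x, y in zip(trace, trace[1:])}

    def causal(a, b):
        return (a, b) in succ and (b, a) not in succ

    def choice(a, b):
        return (a, b) not in succ and (b, a) not in succ

    # Step 4: sets whose members are pairwise in #, linked by causality.
    candidates = [S for S in nonempty_subsets(T_W)
                  if all(choice(a, b) for a in S for b in S)]
    X_W = [(A, B) for A in candidates for B in candidates
           if all(causal(a, b) for a in A for b in B)]
    # Step 5: keep only the maximal pairs.
    Y_W = [(A, B) for (A, B) in X_W
           if not any((A, B) != (A2, B2) and A <= A2 and B <= B2
                      for (A2, B2) in X_W)]
    # Step 6: one place per maximal pair, plus source and sink place.
    P_W = set(Y_W) | {"i_W", "o_W"}
    # Step 7: arcs into and out of each place.
    F_W = ({(a, (A, B)) for (A, B) in Y_W for a in A}
           | {((A, B), b) for (A, B) in Y_W for b in B}
           | {("i_W", t) for t in T_I}
           | {(t, "o_W") for t in T_O})
    return P_W, T_W, F_W                                 # step 8

places, transitions, flow = alpha({"ABCD", "ACBD", "EF"})
print(len(places), "places,", len(transitions), "transitions,", len(flow), "arcs")
```

The quadratic loop over `candidates` mirrors the definition of X_W literally; practical implementations usually grow the maximal pairs incrementally instead of enumerating every subset.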
Example revisited
W: the example log (cases 1 to 5 with sequences ABCD, ACBD, ABCD, ACBD, EF)
α(W): [Figure: Petri net in which A is followed by B and C in parallel, then D; E is followed by F]
• A → B, A → C, B → D, C → D, E → F
• A>B, A>C, B>C, B>D, C>B, C>D, E>F
• B||C, C||B
PAGE 58
Exercise (1)
• What does the Alpha algorithm produce for a log consisting only of the following traces?
  • ABCD
  • ACBD
  • AED
Recall that α(W) = (P_W, T_W, F_W), built in steps 1 to 8 of the basic Alpha algorithm from the relations:
• Direct succession: x>y iff for some case x is directly followed by y.
• Causality: x → y iff x>y and not y>x.
• Parallel: x||y iff x>y and y>x.
• Choice: x#y iff not x>y and not y>x.
PAGE 59
Another example taken step-by-step ... PAGE 60
• A>B, A>C, A>E, B>C, B>D, C>B, C>D, E>D
• A → B, A → C, A → E, B → D, C → D, E → D
• B||C, C||B
PAGE 61
PAGE 62
[Figure with # entries] A and B need to be non-empty.
• A>B, A>C, A>E, B>C, B>D, C>B, C>D, E>D
• A → B, A → C, A → E, B → D, C → D, E → D
• B||C, C||B
PAGE 63
PAGE 64
PAGE 65
Exercise (2)
• What does the Alpha algorithm produce for a log consisting only of the following traces?
  • ACD
  • BCE
Recall that α(W) = (P_W, T_W, F_W), built in steps 1 to 8 of the basic Alpha algorithm from the relations:
• Direct succession: x>y iff for some case x is directly followed by y.
• Causality: x → y iff x>y and not y>x.
• Parallel: x||y iff x>y and y>x.
• Choice: x#y iff not x>y and not y>x.
PAGE 66
Exercise (3)
• What does the Alpha algorithm produce for a log consisting only of the following traces?
  • ACEG
  • AECG
  • BDFG
  • BFDG
Recall that α(W) = (P_W, T_W, F_W), built in steps 1 to 8 of the basic Alpha algorithm from the relations:
• Direct succession: x>y iff for some case x is directly followed by y.
• Causality: x → y iff x>y and not y>x.
• Parallel: x||y iff x>y and y>x.
• Choice: x#y iff not x>y and not y>x.
PAGE 67
More on Process Discovery
Examples of process discovery techniques
• Algorithmic techniques
  • Alpha miner
  • Alpha+, Alpha++, Alpha#
  • FSM miner
  • Fuzzy miner
  • Heuristic miner
  • Multi phase miner
• Genetic process mining
  • Single/duplicate tasks
  • Distributed GM
• Region-based process mining
  • State-based regions
  • Language based regions
• Classical approaches not dealing with concurrency
  • Inductive inference (Mark Gold, Dana Angluin et al.)
  • Sequence mining
PAGE 69
Genetic Mining (Ana Karla Alves de Medeiros et al.)
1. initial population
2. fitness test
3. select best parents
4. crossover
5. children
6. mutation
7. new population
[Figure: cycle through steps 1 to 7, illustrated with example process models over activities A to M]
PAGE 70
Design choices
• representation
• fitness
• mutation
• crossover
[Figure: the same cycle: 1. initial population, 2. fitness test, 3. select best parents, 4. crossover, 5. children, 6. mutation, 7. new population]
PAGE 71
Properties of Genetic Mining • Requires a lot of computing power. • Can deal with noise, infrequent behavior, duplicate tasks, invisible tasks, etc. • Allows for incremental improvement and combinations with other approaches (heuristics post-optimization, etc.). PAGE 72
Challenge: Balancing Between Underfitting and Overfitting PAGE 73
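The essence
[Figure: an event log (ABCD, ACBD, AED, ABCD, ABCD, AED, ACBD, ...) next to a process model over activities A, B, C, D, and E]
PAGE 74
But ...
[Figure: a model with start, end, and activities A, B, C, D, and E]
Any log containing activities A, B, C, D, and E.
PAGE 75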
Finding a balance
[Figure: four models (a) to (d) over activities A, B, C, D, and E, arranged along arrows labeled "more behavior"]
• Log with traces ACD, BCE, ...
• Log with traces ACD, ACE, BCE, BCD, ...
PAGE 76
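[Figure: two models over activities A, B, C, D, and E] Trace frequencies:
• ACD 99
• ACE 0
• BCE 85
• BCD 0
PAGE 77
[Figure: two models over activities A, B, C, D, and E] Trace frequencies:
• ACD 99
• ACE 88
• BCE 85
• BCD 78
PAGE 78
[Figure: two models over activities A, B, C, D, and E] Trace frequencies:
• ACD 99
• ACE 2
• BCE 85
• BCD 3
PAGE 79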
Evaluating process mining results
• Fitness: Is the event log possible according to the model?
• Precision: Is the model not underfitting (allow for too much)?
• Generalization: Is the model not overfitting (only allow for the “accidental” examples)?
• Structure: Is this the simplest model (Occam's Razor)?
PAGE 80
PAGE 81
Representing process models PAGE 82
[Figure: large process model of a travel request and trip expense process, with labels A to E; steps include Entry of a travel request, Approval of travel request, Advance payment, Entry of trip facts, Approval of trip facts, and Cancellation] PAGE 83
More significant nodes are emphasized Highlights more important paths PAGE 84
More to learn from maps...
• Abstraction: Removing isolated, less significant structures
• Aggregation: Clustering of coherent, less significant structures
PAGE 85
Fuzzy miner PAGE 86
Showing reality PAGE 87
Back to the future …
[Figure: process mining overview]
• “world”: business processes, people, machines, components, organizations
• software system: supports/controls the “world”; records events, e.g., messages, transactions, etc.
• (process) model: models and analyzes the “world”; specifies, configures, implements, and analyzes the software system
• event logs and (process) model are linked by discovery, conformance, and extension
PAGE 89
• Recommend: How to get home ASAP? Take a left turn!
• Detect: You drive too fast!
• Predict: When will I be home? At 11.26!
PAGE 90
Operational Support: Detect, Predict, and Recommend
[Figure: current data and models feed three activities: detect (alerts), predict (predictions), and recommend (recommendations); the (simulation) models are learned (discover and enhance) from historic data]
PAGE 91
Operational Support and Conformance Checking Based on Replay
Play Out (Classical use of models)
[Figure: Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E]
Generated traces: AED, AED, ABCD, ACBD, ABCD, ACBD, AED, ACBD
PAGE 93
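Play-out can be illustrated by enumerating every firing sequence of a small Petri net. The arc structure in `NET` below is an assumption: the extracted slide only preserves the place and transition names, so the encoding follows the usual reading of this net (A splits into p1 and p2, B and C run in parallel, E bypasses both, D joins). The function name `play_out` is likewise illustrative, and the sketch assumes a safe, acyclic net.

```python
# Assumed encoding of the net: transition -> (input places, output places).
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
    "D": ({"p3", "p4"}, {"end"}),
}

def play_out(net, marking, final, prefix=""):
    """Enumerate all firing sequences leading from marking to final.

    Markings are frozensets of marked places, so at most one token per place.
    """
    if marking == final:
        return {prefix}
    traces = set()
    for t, (inputs, outputs) in net.items():
        if inputs <= marking:  # t is enabled: all input places are marked
            successor = (marking - inputs) | outputs
            traces |= play_out(net, successor, final, prefix + t)
    return traces

generated = play_out(NET, frozenset({"start"}), frozenset({"end"}))
print(sorted(generated))
```

Play-in goes the other way: a discovery algorithm starts from traces like these and tries to reconstruct a net that generates them.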
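Play In (Process Discovery)
Event log: ABCD, ACBD, AED, ACBD, AED, ABCD, …
A process discovery algorithm like the α algorithm turns this log into a model.
[Figure: Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E]
PAGE 94
Replay
Trace ABCD is replayed on the model.
[Figure: Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E]
PAGE 95
Replay can detect problems
Trace ACD is replayed on the model. Problem! Token left behind. Problem! Missing token.
[Figure: Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E]
PAGE 96
Replay can extract timing information
Trace with timestamps: A 5, B 8, C 9, D 13
[Figure: Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E, annotated with times obtained by replay]
PAGE 97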
Example: Conformance Checker
Conformance checker (Anne Rozinat et al.) How to quantify this? PAGE 99
Fitness by replay m=missing,r=remaining,c=consumed,p=produced PAGE 100
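The four counters can be collected with a simple token replay. The sketch below rests on two assumptions not stated on the slide. First, the net `NET` is the same assumed encoding of the running example (start, p1 to p4, end; transitions A to E). Second, fitness is computed with the commonly used token-based formula, fitness = ½(1 − m/c) + ½(1 − r/p). The function names `replay` and `fitness` are illustrative.

```python
# Assumed encoding of the net: transition -> (input places, output places).
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
    "D": ({"p3", "p4"}, {"end"}),
}

def replay(net, trace, initial="start", final="end"):
    """Replay one trace and return the counters (m, r, c, p)."""
    tokens = {initial: 1}  # the environment produces the initial token
    p, c, m = 1, 0, 0
    for t in trace:
        inputs, outputs = net[t]
        for place in inputs:
            if tokens.get(place, 0) == 0:
                m += 1               # record a missing token and add it
                tokens[place] = 1
            tokens[place] -= 1
            c += 1
        for place in outputs:
            tokens[place] = tokens.get(place, 0) + 1
            p += 1
    # The environment consumes the token from the final place.
    if tokens.get(final, 0) == 0:
        m += 1
        tokens[final] = 1
    tokens[final] -= 1
    c += 1
    r = sum(tokens.values())         # tokens left behind
    return m, r, c, p

def fitness(m, r, c, p):
    """Token-based fitness: 1.0 means the trace replays without problems."""
    return 0.5 * (1 - m / c) + 0.5 * (1 - r / p)

for trace in ["ABCD", "ACD"]:
    m, r, c, p = replay(NET, trace)
    print(trace, "m =", m, "r =", r, "c =", c, "p =", p,
          "fitness =", round(fitness(m, r, c, p), 3))
```

Under these assumptions ABCD replays with m = 0 and r = 0, while ACD needs one missing token (for D) and leaves one token behind (B never fires), which is the situation described on the "Replay can detect problems" slide.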
No problem (m=0, r=0) PAGE 101
Another (impossible) trace PAGE 102
PAGE 103
Recommend
More recommend