Processes @ your Service: Using Process Mining to Turn Big Data into Real Value. prof.dr.ir. Wil van der Aalst PAGE 0
Web Engineering: model, specify, configure, implement processes; observe behavior (e.g., event data) PAGE 1
PAGE 2
Big Data? PAGE 3
Big … or fast and efficient? PAGE 4
• What is process mining? • What are the main pitfalls of process modeling? • What are the main research challenges? • Why is process discovery difficult? • The future is bright, but how to get started? PAGE 5
• What is process mining? • What are the main pitfalls of process modeling? • What are the main research challenges? • Why is process discovery difficult? • The future is bright, but how to get started? PAGE 6
[Figure: several process models of the same request-handling process, with activities register request, examine thoroughly, examine casually, check ticket, decide, pay compensation, reject request, and reinitiate request]
• enormous investments in process models • large collections of "dead" process models • not taken seriously, unrelated to reality PAGE 8
problem #1 Aiming for one model that suits all purposes PAGE 9
PAGE 10
problem #2 Straitjacketing smaller interacting processes into one monolithic model PAGE 11
[Figure: process model with activities register request, examine thoroughly, examine casually, check ticket, decide, pay compensation, reject request, and reinitiate request] PAGE 12
What is the process instance?
• Order: OrderID : OrderID, Customer : CustID, Amount : Euro, Created : DateTime, Paid : DateTime, Completed : DateTime
• Orderline: OrderLineID : OrderLineID, OrderID : OrderID, Product : ProdID, NofItems : PosInt, TotalWeight : Weight, Entered : DateTime, BackOrdered : DateTime, Secured : DateTime, DelID : DelID
• Delivery: DelID : DelID, DelAddress : Address, Contact : PhoneNo
• Attempt: DelID : DelID, When : DateTime, Successful : Bool
• Associations: Order (1) to Orderline (1..*); Orderline (1..*) to Delivery (0..1); Delivery (1) to Attempt (0..*)
PAGE 13
problem #3 Using a static hierarchical decomposition as the only abstraction mechanism PAGE 14
• most process modeling notations assume a fixed hierarchy • no seamless zoom-in and zoom-out! • traditional hierarchy concepts don't support "Google Maps" abstraction PAGE 15
problem #4 Modeling humans as if they are machines doing a single task PAGE 16
"My processes are unique, my people are artists!" PAGE 17
? PAGE 18
problem #5 Being vague about vagueness PAGE 19
[Figure: process model with activities register request, examine thoroughly, examine casually, check ticket, decide, pay compensation, reject request, and reinitiate request] PAGE 20
problem #6 Abstracting from the things that matter PAGE 21
PAGE 22
• What is process mining? • What are the main pitfalls of process modeling? • What are the main research challenges? • Why is process discovery difficult? • The future is bright, but how to get started? PAGE 23
Positioning Process Mining
• process model analysis (simulation, verification, optimization, gaming, etc.)
• data-oriented analysis (data mining, machine learning, business intelligence)
• process mining sits between the two: performance-oriented questions, problems and solutions; compliance-oriented questions, problems and solutions
PAGE 24
www.olifantenpaadjes.nl PAGE 25
PAGE 26
Let us take a step back and see how models and behavior relate: Let's play! PAGE 27
Play-Out: process model → event log PAGE 28
Play-Out (Classical use of models)
[Figure: Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E]
Generated traces: AED, AED, ABCD, ACBD, ABCD, ACBD, AED, ACBD
PAGE 29
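Play-out can be sketched in a few lines of Python. The arcs of the net are not recoverable from the slide text, so the `NET` dictionary below is an assumption: A splits `start` into p1 and p2, B and C move tokens to p3 and p4, E does both at once, and D joins p3 and p4 into `end`. The function name `play_out` is illustrative, and markings are plain sets, which only works because this assumed net never holds more than one token per place.

```python
# Assumed Petri net: transition -> (input places, output places).
# The arcs are a reconstruction, not taken from the slide.
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "D": ({"p3", "p4"}, {"end"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
}


def play_out(net, marking=frozenset({"start"}), final=frozenset({"end"}), prefix=()):
    """Yield every firing sequence from the initial to the final marking."""
    if marking == final:
        yield prefix
        return
    for name, (inputs, outputs) in sorted(net.items()):
        if inputs <= marking:  # transition is enabled
            next_marking = (marking - inputs) | outputs
            yield from play_out(net, next_marking, final, prefix + (name,))


traces = ["".join(t) for t in play_out(NET)]
print(traces)
```

Under these assumptions the net allows exactly the three distinct traces that appear in the log on the slide.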
Play-In: event log → process model PAGE 30
Play-In
Event log: AED, AED, ABCD, ACBD, ABCD, ACBD, AED, ACBD
[Figure: discovered Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E]
PAGE 31
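A common first step of play-in is counting how often one activity directly follows another. The sketch below does only that for the eight traces on the slide. It is not a complete discovery algorithm, and the name `directly_follows` is illustrative.

```python
from collections import Counter

# The eight traces shown on the slide, one string per trace.
LOG = ["AED", "AED", "ABCD", "ACBD", "ABCD", "ACBD", "AED", "ACBD"]


def directly_follows(log):
    """Count how often activity x is immediately followed by activity y."""
    counts = Counter()
    for trace in log:
        for x, y in zip(trace, trace[1:]):
            counts[(x, y)] += 1
    return counts


dfg = directly_follows(LOG)
for (x, y), n in sorted(dfg.items()):
    print(f"{x} -> {y}: {n}")
```

B and C follow each other in both orders, which hints at concurrency, while E never co-occurs with B or C, which hints at a choice.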
Example Process Discovery (Vestia, Dutch housing agency, 208 cases, 5987 events) PAGE 32
Example Process Discovery (ASML, test process lithography systems, 154966 events) PAGE 33
Example Process Discovery (AMC, 627 gynecological oncology patients, 24331 events) PAGE 34
Replay: event log + process model → • extended model showing times, frequencies, etc. • diagnostics • predictions • recommendations PAGE 35
Replay of trace ABCD [Figure: Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E] PAGE 36
Replay of trace AED [Figure: Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E] PAGE 37
Replay can detect problems: replaying trace ACD reveals a missing token (Problem!) and a token left behind (Problem!) [Figure: Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E] PAGE 38
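Token-based replay can be sketched as follows. The net structure is again an assumption (A splits into p1 and p2, B and C are concurrent, E is the alternative, D joins), and the fitness formula used here, 0.5·(1 − missing/consumed) + 0.5·(1 − remaining/produced), is the usual token-replay fitness. The slides do not say which fitness measure lies behind their reported f values.

```python
from collections import Counter

# Assumed Petri net: transition -> (input places, output places).
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "D": ({"p3", "p4"}, {"end"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
}


def replay(trace, net, start="start", end="end"):
    """Replay one trace, counting produced, consumed, missing, remaining tokens."""
    marking = Counter({start: 1})
    produced, consumed = 1, 0  # the environment produces the initial token
    missing_places = []
    # Treat the final consumption of the end token like one more transition.
    steps = [net[a] for a in trace] + [({end}, set())]
    for inputs, outputs in steps:
        for place in inputs:
            if marking[place] == 0:
                missing_places.append(place)  # token had to be added artificially
                marking[place] += 1
            marking[place] -= 1
            consumed += 1
        for place in outputs:
            marking[place] += 1
            produced += 1
    remaining_places = sorted(p for p, n in marking.items() if n > 0)
    remaining = sum(marking.values())
    fitness = (0.5 * (1 - len(missing_places) / consumed)
               + 0.5 * (1 - remaining / produced))
    return missing_places, remaining_places, fitness


ok = replay("ABCD", NET)
bad = replay("ACD", NET)
print(ok)
print(bad)
```

With the assumed arcs, replaying ACD reports the missing token in p3 (D fires without B) and the token left behind in p1.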
Conformance Checking (WOZ objections Dutch municipality, 745 objections, 9583 events, f = 0.988) PAGE 39
Replay can extract timing information: trace with timestamps A (5), B (8), C (9), D (13) [Figure: Petri net with places start, p1, p2, p3, p4, end and transitions A, B, C, D, E, annotated with times derived from replay] PAGE 40
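As a rough illustration of how replay yields timing information, the sketch below replays the timestamped trace A (5), B (8), C (9), D (13) and records how long each token stays in a place. The net arcs are an assumption, the time unit is unspecified, and the helper `place_durations` is hypothetical. It assumes a fitting trace and at most one token per place.

```python
# Assumed Petri net: transition -> (input places, output places).
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "D": ({"p3", "p4"}, {"end"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
}

TIMED_TRACE = [("A", 5), ("B", 8), ("C", 9), ("D", 13)]


def place_durations(timed_trace, net):
    """Return, per place, the time between token production and consumption."""
    produced_at = {}
    durations = {}
    for activity, time in timed_trace:
        inputs, outputs = net[activity]
        for place in inputs:
            if place in produced_at:  # the start token has no known creation time
                durations[place] = time - produced_at.pop(place)
        for place in outputs:
            produced_at[place] = time
    return durations


durations = place_durations(TIMED_TRACE, NET)
print(durations)
```

Aggregating such durations over all traces in a log gives the per-place waiting times that a performance-annotated model displays.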
Performance Analysis Using Replay (WOZ objections Dutch municipality, 745 objections, 9583 events, f = 0.988) PAGE 41
Models are like the glasses required to see and understand event data! PAGE 42
PAGE 43
Alignments are essential! • conformance checking to diagnose deviations • squeezing reality into the model to do model-based analysis PAGE 44
Aligning event log and process model: • synchronous move • move on log only • move on model only PAGE 45
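The three kinds of moves can be illustrated with a small sketch. It assumes the model is given as a finite list of complete runs (`MODEL_RUNS`, taken from the ABCD/ACBD/AED example) and charges a unit cost for every non-synchronous move. Practical alignment techniques search the state space of the model instead of enumerating runs, so this is only feasible for tiny acyclic models. All names here are illustrative.

```python
from functools import lru_cache

MODEL_RUNS = ["ABCD", "ACBD", "AED"]  # assumed complete runs of the model
SKIP = ">>"  # marks "no move" on one side of the alignment


def align(trace, run):
    """Cheapest alignment of one trace with one model run."""

    @lru_cache(maxsize=None)
    def best(i, j):
        if i == len(trace) and j == len(run):
            return 0, ()
        options = []
        if i < len(trace) and j < len(run) and trace[i] == run[j]:
            cost, moves = best(i + 1, j + 1)  # synchronous move, cost 0
            options.append((cost, ((trace[i], run[j]),) + moves))
        if i < len(trace):
            cost, moves = best(i + 1, j)  # move on log only, cost 1
            options.append((cost + 1, ((trace[i], SKIP),) + moves))
        if j < len(run):
            cost, moves = best(i, j + 1)  # move on model only, cost 1
            options.append((cost + 1, ((SKIP, run[j]),) + moves))
        return min(options, key=lambda option: option[0])

    return best(0, 0)


def best_alignment(trace, runs):
    """Pick the cheapest alignment over all model runs."""
    return min((align(trace, run) for run in runs), key=lambda result: result[0])


cost, moves = best_alignment("ACD", MODEL_RUNS)
for log_step, model_step in moves:
    print(f"{log_step:>2} | {model_step:>2}")
print("cost:", cost)
```

For the trace ACD the cheapest alignment needs a single move on model only (the skipped B), with all three observed events matched by synchronous moves.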
Example: BPI Challenge 2012 (Dutch financial institute, doi:10.4121/uuid:3926db30-f712-4394-aebc-75976070e91f)
• "O_DECLINED" and "W_Wijzigen contractgegevens" are often skipped
• Loops of "W_Completeren aanvraag" and "W_Nabellen offertes" are often performed
• Many moves on log of "O_CANCELLED", "O_CREATED", "O_SELECTED", "O_SENT" occurred with the same frequency value (i.e. 60) before the parallel branch
• Many moves on log of "W_Afhandelen leads" (> 2200 times) occurred at the end of traces
Work of Arya Adriansyah (Replay project) PAGE 46
• "O_DECLINED" and "W_Wijzigen contractgegevens" are often skipped
• Loops of "W_Completeren aanvraag" and "W_Nabellen offertes" are often performed
• Synchronous moves of "Completeren aanvraag"
• Move on log of "Completeren aanvraag"
• The average waiting time for the input place of "W_Nabellen offertes+START" is very long (2.83 days) compared to the average waiting time of other places
• Many moves on log of "O_CANCELLED", "O_CREATED", "O_SELECTED", "O_SENT" occurred with the same frequency value (i.e. 60) before the parallel branch
• Move on log of "O_CANCELLED" and "A_CANCELLED"
• Many moves on log of "W_Afhandelen leads" (> 2200 times) occurred at the end of traces
• Moves on model towards end of traces
• "O_ACCEPTED" has an average sojourn time of 27.07 minutes, while "A_REGISTERED", "A_ACTIVATED", and "A_APPROVED" have an average sojourn time of 29.56 minutes
• Activity "W_Wijzigen contractgegevens" is the bottleneck, but it occurred rarely (only 4 times)
PAGE 47
• What is process mining? • What are the main pitfalls of process modeling? • What are the main research challenges? • Why is process discovery difficult? • The future is bright, but how to get started? PAGE 48
Language identification in the limit (Mark Gold 1967)
Observed sentences: abc, abd, ad, abbc, ac, …
Successive hypotheses: abc? ab(c|d)? (ad)|(ab(c|d))? ab*(c|d)?
A class of languages is learnable in the limit if there exists a perfect child that, for every language in the class, changes its hypothesis only finitely many times and ends with a correct one. PAGE 49
Language identification in the limit by E Mark Gold, Information and Control, 10(5):447–474, 1967.
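The guessing game can be made concrete with a short sketch that checks which candidate hypotheses remain consistent as more sentences arrive. The order of the sentences and the pairing with hypotheses are an assumed reading of the slide, and the helper `consistent` is illustrative.

```python
import re

# Assumed reading of the slide: sentences in order of arrival, and the
# regular expressions proposed as hypotheses along the way.
SENTENCES = ["abc", "abd", "ad", "abbc", "ac"]
HYPOTHESES = ["abc", "ab(c|d)", "(ad)|(ab(c|d))", "ab*(c|d)"]


def consistent(pattern, sentences):
    """True if the pattern accepts every sentence seen so far."""
    return all(re.fullmatch(pattern, s) is not None for s in sentences)


history = []
for n in range(1, len(SENTENCES) + 1):
    seen = SENTENCES[:n]
    surviving = [h for h in HYPOTHESES if consistent(h, seen)]
    history.append(surviving)
    print(seen, "->", surviving)
```

Each new sentence can only refute hypotheses. With positive examples alone nothing ever refutes a hypothesis that is too general, which is one reason the learning problem is hard.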
Learning is not easy … • Even simple languages like regular languages are not learnable in the limit from positive examples alone. • Many settings: evil or well-behaved mothers, with or without negative examples, frequencies, etc. • Analogy: sentence = trace in event log; language = process model PAGE 50
• At the start of the century, process mining emerged as a new research topic. • Remarkable progress over a relatively short period. See keynote at Process Mining Camp 2013, http://fluxicon.com/camp/2013/ PAGE 51
Process discovery challenge (oversimplified: no resources, data, etc.)
Activities: a = register request, b = examine file, c = check ticket, d = decide, e = reinitiate request, f = send acceptance letter, g = pay compensation, h = send rejection letter
Event log: a,c,d,f,g; a,b,c,d,e,c,d,g,f; a,c,d,h
[Figure: process model, a Petri net with places start, c1 to c9, end, transitions t1 to t11, and activity labels a to h]
PAGE 52
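To show what a discovery technique has to work with, the sketch below derives ordering relations from the three traces on the slide: x causally precedes y if x is directly followed by y but never the reverse, and x and y are parallel if both orders occur. These are the footprint relations used by, for example, the α-algorithm. The slide does not name a discovery algorithm, so this choice is an illustration only.

```python
# The three traces from the slide, using the activity letters a to h.
LOG = [
    ["a", "c", "d", "f", "g"],
    ["a", "b", "c", "d", "e", "c", "d", "g", "f"],
    ["a", "c", "d", "h"],
]


def footprint(log):
    """Split directly-follows pairs into causal and parallel relations."""
    follows = {(x, y) for trace in log for x, y in zip(trace, trace[1:])}
    causal = {(x, y) for (x, y) in follows if (y, x) not in follows}
    parallel = {(x, y) for (x, y) in follows if (y, x) in follows}
    return causal, parallel


causal, parallel = footprint(LOG)
print("causal:  ", sorted(causal))
print("parallel:", sorted(parallel))
```

Three traces are enough to suggest that f and g are concurrent and that e loops back to c, but far too few to rule out many other models, which is the point of the challenge.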