PRETSA: Event Log Sanitization for Privacy-aware Process Discovery Stephan A. Fahrenkrog-Petersen, Han van der Aa & Matthias Weidlich
Motivation hu-berlin.de/pda
Related Work
[Figure: positioning of the contribution. Diagram labels: Individual, Information System, Data Extraction, Event Data Artifact, Event Log Sanitization, Sanitized Process Mining, Privatized Process Mining, Process Mining, Contribution. Works cited in the diagram: [Sweeney et al., 2002], [Monreale et al., 2014], [Mannhardt et al., 2019].]
Research Problem
• Use case: process discovery with performance data
• Privacy issue: surveillance of individual process workers, which is illegal in, e.g., Germany
• Goal: preserve as much utility as possible
Attack Model
• Trace Linkage Attack
• Link trace with background knowledge
• Identity Disclosure
• Membership Disclosure
• Attribute Disclosure
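A trace linkage attack can be pictured with a small sketch. It assumes, for illustration only, that the background knowledge is a known prefix of activities; the case identifiers, the `LOG` contents, and the function name `candidate_cases` are hypothetical and not taken from the slides.

```python
# Hypothetical event log: case id -> sequence of activities.
LOG = {
    "case_1": ("create_po", "update_po", "receive_gd", "check_in", "pay_in"),
    "case_2": ("create_po", "update_po", "receive_gd", "check_in", "reject_in"),
    "case_3": ("create_po", "receive_gd", "update_po", "check_in", "pay_in"),
    "case_4": ("create_po", "receive_gd", "update_po", "update_po", "check_in", "pay_in"),
}


def candidate_cases(log, known_prefix):
    """Return the ids of all cases whose trace starts with the known prefix."""
    known_prefix = tuple(known_prefix)
    return [
        case_id
        for case_id, trace in log.items()
        if trace[: len(known_prefix)] == known_prefix
    ]


# Assumed background knowledge: the order was updated twice after goods receipt.
matches = candidate_cases(LOG, ("create_po", "receive_gd", "update_po", "update_po"))
# A single remaining candidate means the trace has been linked to one individual.
linked = len(matches) == 1
```

The fewer cases share the known prefix, the easier the linkage becomes, which is why a lower bound on the size of each group of traces is useful.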
Background: k-anonymity
Background: t-closeness
• Extension of k-anonymity
• Limits the difference between the global and the local distribution
• Earth Mover's Distance as measure
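The t-closeness check can be illustrated with a minimal sketch. It assumes a numeric sensitive attribute (for example, activity durations), uses the one-dimensional Earth Mover's Distance between two empirical samples, and normalizes by the global value range. The normalization and the function names are assumptions of this sketch, not details given on the slides.

```python
from bisect import bisect_right


def emd_1d(sample_a, sample_b):
    """1-D Earth Mover's Distance: area between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Both CDFs are constant on [left, right), so evaluate them at `left`.
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def satisfies_t_closeness(class_values, all_values, t):
    """Check whether a local distribution stays within t of the global one.

    Assumption: the distance is divided by the global value range so that
    t can be chosen between 0 and 1.
    """
    value_range = max(all_values) - min(all_values)
    if value_range == 0:
        return True
    return emd_1d(class_values, all_values) / value_range <= t
```

For instance, `satisfies_t_closeness([1, 2], [1, 2, 9, 10], t=0.2)` is false in this sketch, because the class only contains the short durations of the global distribution.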
PRETSA: PREfix-Tree based event log SAnitization
[Figure: Event Log → PRETSA → Event Log with k-anonymity & t-closeness → Process Discovery → Process Model with Performance Data]
PRETSA - Walkthrough
ID  Sequence variant                                               #
σ1  create po, update po, receive gd, check in, pay in             10
σ2  create po, update po, receive gd, check in, reject in          5
σ3  create po, receive gd, update po, check in, pay in             7
σ4  create po, receive gd, update po, check in, reject in          5
σ5  create po, receive gd, update po, update po, check in, pay in  1
• Example with an Order-to-Cash process
• Assume k=8
PRETSA - Prefix tree
• PRETSA generates a prefix tree from an event log
• Each node in the tree is an equivalence class
Root
  create_po (28)
    update_po (15)
      receive_gd (15)
        check_in (15)
          pay_in (10)
          reject_in (5)
    receive_gd (13)
      update_po (13)
        check_in (12)
          pay_in (7)
          reject_in (5)
        update_po (1)
          check_in (1)
            pay_in (1)
PRETSA - Walkthrough (k=8)
• Go through the tree until a violation is found
[Figure: the prefix tree of the previous slide, checked against k=8]
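The prefix tree and the search for a violation can be sketched as follows. This is a minimal illustration built on the five example variants, not the authors' implementation; the `Node` class, the function names, and the depth-first visiting order are assumptions of the sketch.

```python
from dataclasses import dataclass, field

# The five example variants with their frequencies (sigma_1 .. sigma_5).
VARIANTS = {
    ("create_po", "update_po", "receive_gd", "check_in", "pay_in"): 10,
    ("create_po", "update_po", "receive_gd", "check_in", "reject_in"): 5,
    ("create_po", "receive_gd", "update_po", "check_in", "pay_in"): 7,
    ("create_po", "receive_gd", "update_po", "check_in", "reject_in"): 5,
    ("create_po", "receive_gd", "update_po", "update_po", "check_in", "pay_in"): 1,
}


@dataclass
class Node:
    activity: str
    count: int = 0                      # number of traces sharing this prefix
    children: dict = field(default_factory=dict)


def build_prefix_tree(variants):
    """Insert every variant into the tree, adding its frequency to each prefix."""
    root = Node("Root")
    for trace, frequency in variants.items():
        node = root
        node.count += frequency
        for activity in trace:
            node = node.children.setdefault(activity, Node(activity))
            node.count += frequency
    return root


def first_violation(node, k, path=()):
    """Depth-first search for the first node shared by fewer than k traces."""
    for child in node.children.values():
        child_path = path + (child.activity,)
        if child.count < k:
            return child_path, child.count
        found = first_violation(child, k, child_path)
        if found is not None:
            return found
    return None


tree = build_prefix_tree(VARIANTS)
violation = first_violation(tree, k=8)
```

With k=8, this traversal first reports the `reject_in` node below `create_po, update_po, receive_gd, check_in`, which is shared by only 5 traces.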
PRETSA - Walkthrough (k=8)
• PRETSA deletes the branch with the violation
• It moves the affected traces into the most similar branch
Root
  create_po (28)
    update_po (15)
      receive_gd (15)
        check_in (15)
          pay_in (15)
    receive_gd (13)
      update_po (13)
        check_in (12)
          pay_in (7)
          reject_in (5)
        update_po (1)
          check_in (1)
            pay_in (1)
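The merge step can be sketched on the level of variants. The slide only says "most similar branch"; this sketch assumes that similarity is measured with the Levenshtein (edit) distance between activity sequences, and the function names are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two activity sequences (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        current = [i]
        for j, y in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute x by y
            ))
        previous = current
    return previous[-1]


def merge_variant(variants, violating):
    """Remove the violating variant and add its traces to the closest one."""
    remaining = {t: n for t, n in variants.items() if t != violating}
    target = min(remaining, key=lambda t: levenshtein(violating, t))
    remaining[target] += variants[violating]
    return remaining


SIGMA_1 = ("create_po", "update_po", "receive_gd", "check_in", "pay_in")
SIGMA_2 = ("create_po", "update_po", "receive_gd", "check_in", "reject_in")
VARIANTS = {
    SIGMA_1: 10,
    SIGMA_2: 5,
    ("create_po", "receive_gd", "update_po", "check_in", "pay_in"): 7,
    ("create_po", "receive_gd", "update_po", "check_in", "reject_in"): 5,
    ("create_po", "receive_gd", "update_po", "update_po", "check_in", "pay_in"): 1,
}

merged = merge_variant(VARIANTS, SIGMA_2)
```

In this sketch the five traces of σ2 are moved to σ1, which then covers 15 traces. That agrees with the `pay_in (15)` node on the slide. The total number of traces stays at 28, which is where the utility gain over simply deleting traces comes from.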
PRETSA - Result (k=8)
• Resulting tree
Root
  create_po (28)
    update_po (15)
      receive_gd (15)
        check_in (15)
          pay_in (15)
    receive_gd (13)
      update_po (13)
        check_in (13)
          pay_in (13)
Evaluation Setup
• Utility benefit?
• PRETSA vs. Baseline
• Datasets: Traffic fines, Sepsis & CoSeLog
Experimental Setup
• Compare…
• …generated event logs: number of variants
• …fitness/precision of process models
• …relative error of performance annotations
Utility Evaluation - Baseline
ID  Sequence variant                                               #
σ1  create po, update po, receive gd, check in, pay in             10
σ2  create po, update po, receive gd, check in, reject in          5
σ3  create po, receive gd, update po, check in, pay in             7
σ4  create po, receive gd, update po, check in, reject in          5
σ5  create po, receive gd, update po, update po, check in, pay in  1
• Only release variants that fulfill:
• k-anonymity
• t-closeness
• Delete all other variants
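The baseline can be sketched as a simple filter over the variants. The sketch assumes that a variant fulfills k-anonymity when it occurs at least k times; the t-closeness check is passed in as a hypothetical predicate that accepts everything by default, since the slide does not spell it out.

```python
VARIANTS = {
    ("create_po", "update_po", "receive_gd", "check_in", "pay_in"): 10,
    ("create_po", "update_po", "receive_gd", "check_in", "reject_in"): 5,
    ("create_po", "receive_gd", "update_po", "check_in", "pay_in"): 7,
    ("create_po", "receive_gd", "update_po", "check_in", "reject_in"): 5,
    ("create_po", "receive_gd", "update_po", "update_po", "check_in", "pay_in"): 1,
}


def baseline_filter(variants, k, fulfills_t_closeness=lambda trace: True):
    """Keep only variants with at least k traces that also pass the t-closeness check.

    `fulfills_t_closeness` is a placeholder predicate (assumption of this sketch).
    """
    return {
        trace: frequency
        for trace, frequency in variants.items()
        if frequency >= k and fulfills_t_closeness(trace)
    }


released = baseline_filter(VARIANTS, k=8)
released_traces = sum(released.values())
```

Under these assumptions and k=8, only σ1 is released, so 10 of the 28 traces survive. Every other variant is deleted.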
Evaluation - Event Logs
Evaluation - Process Models
Evaluation - Performance Annotations
PRETSA…
…ensures privacy (k-anonymity & t-closeness) for event logs
…uses a prefix tree representation of the event log
…provides event logs with high utility for process discovery
…is available on GitHub under MIT license: github.com/samadeusfp/PRETSA
Questions? Reach out to fahrenks@hu-berlin.de