Modeling Recurrent Distributions in Streams using Possible Worlds
Michael Geilke, Andreas Karwath, and Stefan Kramer
Johannes Gutenberg-Universität Mainz, Germany
October 20, 2015
Query 1 [marginalize]: What is the data distribution of the sensors in the living room?
Query 2 [setting hard evidence]: Two residents are in the living room. What is the probability that they watch TV?
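The two query types above can be illustrated on a small discrete joint distribution. This is only a minimal sketch: the variables (number of residents in the living room, TV on or off), the probability values, and the helper names `marginalize` and `condition` are assumptions made up for illustration and are not taken from the talk or from EDDO.

```python
from collections import defaultdict

# Hypothetical joint distribution over (residents_in_living_room, tv_on).
joint = {
    (0, False): 0.40,
    (1, False): 0.15,
    (1, True): 0.15,
    (2, False): 0.05,
    (2, True): 0.25,
}


def marginalize(joint, keep):
    """Sum out every variable whose index is not listed in `keep`."""
    result = defaultdict(float)
    for assignment, probability in joint.items():
        result[tuple(assignment[i] for i in keep)] += probability
    return dict(result)


def condition(joint, evidence):
    """Set hard evidence {variable_index: value} and renormalize."""
    consistent = {
        assignment: probability
        for assignment, probability in joint.items()
        if all(assignment[i] == value for i, value in evidence.items())
    }
    total = sum(consistent.values())
    if total == 0:
        raise ValueError("evidence has probability zero")
    return {a: p / total for a, p in consistent.items()}


# Query 1 style: distribution of the residents variable alone.
residents_marginal = marginalize(joint, keep=(0,))

# Query 2 style: two residents are present; how likely is it that the TV is on?
given_two = condition(joint, evidence={0: 2})
p_tv_given_two = given_two[(2, True)]
```

A density estimator that supports both operations directly can answer such queries without enumerating the joint table, which is what makes the estimate usable as a queryable summary of the stream.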
[Figure labels: EDDO, f, Smart, Alg 1, Alg 2, POEt, Out 1, Out 2, Itemsets]
Recurrences
• day and night
• working days and weekends
• seasons
• pattern could be more complex
• may only affect a part of the house
Goal: a representation that
• is constantly updated
• represents current and historical data distributions
• is able to represent recurrences
• provides a query mechanism
Tasks for proposed method
1. recognize regions of drift
2. represent density of data stream segments
3. identify recurrences on the density level
4. identify recurrences between parts of different densities
do all of that in an online fashion
Recognize Regions of Drift
[Figure: stream windows A, B, C with density estimate f]
Window-based approach
• extension of an approach by Dries and Rückert
• compute density values with current estimate f
• perform drift detection with Wilcoxon rank-sum test
• update f with clean instances only
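The window-based idea above can be sketched as follows: score the instances of a reference window and of the current window with the current estimate f, then compare the two sets of density values with a Wilcoxon rank-sum test. This is a rough sketch under stated assumptions, not the method from the talk: f is a fixed one-dimensional Gaussian, the test uses the normal approximation without tie correction, and the names `gaussian_density`, `rank_sum_p_value`, `drift_detected` and the threshold `alpha` are hypothetical.

```python
import math
import random


def gaussian_density(mean, std):
    """Stand-in for the current density estimate f (assumption: 1-D Gaussian)."""
    def f(x):
        z = (x - mean) / std
        return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))
    return f


def rank_sum_p_value(xs, ys):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, no tie correction)."""
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    n, m = len(xs), len(ys)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values at 1-based ranks i+1 .. j all receive the average rank.
        average_rank = (i + 1 + j) / 2.0
        rank_sum_x += average_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    expected = n * (n + m + 1) / 2.0
    std = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (rank_sum_x - expected) / std
    return math.erfc(abs(z) / math.sqrt(2.0))


def drift_detected(f, reference, current, alpha=0.01):
    """Flag drift if the density values of the two windows differ significantly."""
    reference_scores = [f(x) for x in reference]
    current_scores = [f(x) for x in current]
    return rank_sum_p_value(reference_scores, current_scores) < alpha


rng = random.Random(0)
f = gaussian_density(mean=0.0, std=1.0)
reference = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(3.0, 1.0) for _ in range(200)]

drift_on_shift = drift_detected(f, reference, shifted)
drift_on_same = drift_detected(f, reference, reference)
```

Testing the one-dimensional density values rather than the raw instances keeps the test univariate even when the stream is multivariate. Instances from a window flagged as drifting would then be withheld from the update of f, in line with the "clean instances only" bullet.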
Recurrences of Densities
[Figure: segment C, pool of density estimates f1 to f5, Wilcoxon]
Recurrent or new?
• compare with pool of existing density estimates
• use statistical test we proposed earlier
• reactivate estimate if one is found
• initialize a new one otherwise
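The "recurrent or new?" decision above amounts to a lookup in a pool of estimates. The sketch below shows only that control flow. The class `DensityPool` and its parameters `matches` and `fit` are hypothetical, and the toy test used in the example (comparing segment means) is merely a placeholder for the statistical test mentioned on the slide, which is not specified here.

```python
class DensityPool:
    """Pool of density estimates that are reactivated when a segment recurs."""

    def __init__(self, matches, fit):
        self.matches = matches  # hypothetical test: (estimate, segment) -> bool
        self.fit = fit          # hypothetical fitter: segment -> new estimate
        self.estimates = []

    def activate(self, segment):
        """Return (index, is_new) for the estimate that should model `segment`."""
        for index, estimate in enumerate(self.estimates):
            if self.matches(estimate, segment):
                return index, False  # recurrence: reactivate existing estimate
        self.estimates.append(self.fit(segment))
        return len(self.estimates) - 1, True  # nothing matched: start a new one


def mean(values):
    return sum(values) / len(values)


# Toy stand-ins (assumptions): an "estimate" is just the segment mean, and two
# segments match if their means are closer than 0.5.
pool = DensityPool(
    matches=lambda estimate, segment: abs(estimate - mean(segment)) < 0.5,
    fit=mean,
)

first = pool.activate([0.0, 0.1, -0.1])   # new distribution
second = pool.activate([5.0, 5.2])        # another new distribution
third = pool.activate([0.05, -0.05])      # looks like the first one again
```

Keeping the test and the fitter as parameters means the same loop works with any estimator and any two-sample test.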
Recurrences of Density Parts
Introduction of modules
$f(X_1, X_2, \ldots, X_8) = f_1(X_1, X_3, X_8) \cdot f_2(X_2, X_4, X_5) \cdot f_3(X_6) \cdot f_4(X_7)$
If the $f_i$ cannot be decomposed any further, then $f_1, f_2, f_3, f_4$ are called the modules of $f$.
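A density that factorizes into modules can be evaluated as a product of independent parts, each defined over its own group of variables. The sketch below mirrors the grouping of the eight variables shown above, with zero-based indices. The binary domain, the concrete module functions, and the names `modules` and `joint_density` are assumptions chosen only so that the example is checkable.

```python
from itertools import product
from math import prod

# Each module is (indices of its variables, function over those variables).
# Assumption: all eight variables are binary.
modules = [
    ((0, 2, 7), lambda a, b, c: 1.0 / 8.0),          # f1 over X1, X3, X8: uniform
    ((1, 3, 4), lambda a, b, c: 1.0 / 8.0),          # f2 over X2, X4, X5: uniform
    ((5,), lambda x6: 0.9 if x6 == 1 else 0.1),      # f3 over X6
    ((6,), lambda x7: 0.5),                          # f4 over X7: uniform
]


def joint_density(modules, x):
    """Evaluate f(x) as the product of its modules."""
    return prod(fn(*(x[i] for i in indices)) for indices, fn in modules)


x = (0, 1, 0, 1, 1, 1, 0, 1)
value = joint_density(modules, x)

# Sanity check: the product of normalized modules sums to one over all inputs.
total = sum(joint_density(modules, assignment) for assignment in product((0, 1), repeat=8))

# A change that only concerns X6 replaces a single module; the others are reused.
changed = list(modules)
changed[2] = ((5,), lambda x6: 0.2 if x6 == 1 else 0.8)
changed_value = joint_density(changed, x)
```

This is why a module-level representation fits recurrences that affect only a part of the house: a single module can drift or recur while the rest of the estimate stays in place.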
Query Mechanism
• probabilistic extension of possible worlds semantics
• requires density estimators supporting inference tasks
[Figure: possible worlds, labels p1, p2, p3, p4]
Query 3 [over multiple worlds]: Given world W, what is the probability that the resident will switch on the light in the office room?
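The slides do not spell out how a query over multiple worlds is evaluated, so the following is only one plausible reading, stated as an assumption: each world reachable from W carries a probability, each world has its own density estimate that can answer the event query, and the overall answer is the probability-weighted mixture. The world names, the numbers, and the function `query_over_worlds` are hypothetical.

```python
def query_over_worlds(reachable, event_probability):
    """Mix per-world answers, weighted by the probability of reaching each world.

    reachable:         {world: probability of moving from W to that world}
    event_probability: {world: P(event) under that world's density estimate}
    """
    total = sum(reachable.values())
    if total <= 0:
        raise ValueError("no reachable world with positive probability")
    return sum(p / total * event_probability[w] for w, p in reachable.items())


# Hypothetical numbers: three worlds reachable from W, and the probability of
# "light switched on in the office room" according to each world's estimate.
reachable_from_w = {"W1": 0.5, "W2": 0.3, "W3": 0.2}
light_on = {"W1": 0.1, "W2": 0.6, "W3": 0.9}

answer = query_over_worlds(reachable_from_w, light_on)
```

Each per-world value would in practice come from an inference call on that world's density estimate, which is why the estimators need to support inference tasks.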
Evaluation: Modules
• evaluation on synthetic and real-world datasets
• without modules performance is better in many cases, but only slightly
• more explicit representation that enables detection of recurrences
Datasets
Synthetic: Bayesian networks with
• different numbers of nodes
• different numbers of instances
• different numbers of variable groups
Real-World: Electricity, Shuttle, Waterlevel, Covertype