High-dimensional, multiscale online changepoint detection
Richard J. Samworth, University of Cambridge
Virtual Mathematical Methods of Modern Statistics 2, CIRM Luminy, 04 June 2020
Collaborators Yudong Chen Tengyao Wang Online changepoint detection 2/28
Changepoint problems
◮ Modern technology has facilitated the real-time monitoring of many types of evolving processes.
◮ Very often, a key feature of interest for data streams is a changepoint.
From offline to online
◮ The vast majority of the changepoint literature concerns the offline problem (Killick et al., 2012; Wang and Samworth, 2018; Wang et al., 2018; Baranowski et al., 2019; Liu, Gao and Samworth, 2019).
◮ Univariate online changepoints have been studied within the well-established field of statistical process control (Duncan, 1952; Page, 1954; Barnard, 1959; Fearnhead and Liu, 2007; Oakland, 2007).
◮ There has been much less work on multivariate, online changepoint problems (Tartakovsky et al., 2006; Mei, 2010; Zou et al., 2015). Several methods involve scanning a moving window of fixed size for changes (Xie and Siegmund, 2013; Soh and Chandrasekaran, 2017; Chan, 2017).
Online algorithm
Key definition of an online algorithm:
Definition. An algorithm is online if the computational complexity for processing a new observation depends only on the number of bits needed to represent it.
◮ For the purposes of this definition, all real numbers are considered as floating point numbers.
◮ Importantly, the computational complexity is not allowed to depend on the number of previously observed data points.
Problem setting
We consider a high-dimensional online changepoint detection problem for independent random vectors $(X_n)_{n \in \mathbb{N}}$:
◮ Data generating mechanism: for some unknown, deterministic time $z \in \mathbb{N} \cup \{0\}$, we have
$$X_1, \ldots, X_z \sim N_p(0, I_p) \quad \text{and} \quad X_{z+1}, X_{z+2}, \ldots \sim N_p(\theta, I_p).$$
◮ $\theta = 0$: data generated under the null, i.e. no change.
◮ $\theta \neq 0$: data generated under the alternative, i.e. there exists a change.
◮ Assume $\vartheta := \|\theta\|_2$ is at least a known lower bound $\beta > 0$.
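To make the data generating mechanism concrete, here is a minimal simulation sketch. The function name `simulate_stream` and its arguments (`n_total`, `seed`) are illustrative assumptions, not part of the talk; only the model $N_p(0, I_p)$ before time $z$ and $N_p(\theta, I_p)$ afterwards comes from the slide.

```python
import random
from typing import Iterator, List, Sequence


def simulate_stream(theta: Sequence[float], z: int, n_total: int,
                    seed: int = 0) -> Iterator[List[float]]:
    """Yield X_1, ..., X_{n_total} with a mean change from 0 to theta after time z.

    Hypothetical helper: coordinates are independent N(mean, 1) draws,
    which matches the identity covariance I_p of the model.
    """
    rng = random.Random(seed)
    p = len(theta)
    for n in range(1, n_total + 1):
        # Observations 1..z have mean 0; observations z+1, z+2, ... have mean theta.
        mean = theta if n > z else [0.0] * p
        yield [rng.gauss(m, 1.0) for m in mean]


# Usage: p = 3, change at z = 50 affecting only the first coordinate.
stream = list(simulate_stream([5.0, 0.0, 0.0], z=50, n_total=100, seed=1))
```

Setting `theta` to the zero vector gives data under the null.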
Example of an online algorithm (Page, 1954)
Let $p = 1$ and assume $\theta > 0$. Page's procedure:
$$R_n := \max_{0 \le h \le n} \sum_{i=n-h+1}^{n} \beta (X_i - \beta/2) = \max\bigl\{ R_{n-1} + \beta (X_n - \beta/2),\, 0 \bigr\}.$$
$R_n$ is compared against a threshold $T \equiv T_\beta$ for changepoint declaration.
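The recursive form is what makes Page's procedure online: each update costs a constant number of operations regardless of $n$. The sketch below computes $R_n$ both ways so the two sides of the identity can be compared; the function names are illustrative assumptions.

```python
from typing import List, Sequence


def page_statistics(xs: Sequence[float], beta: float) -> List[float]:
    """Return R_1, ..., R_n via the recursion, starting from R_0 = 0."""
    r = 0.0
    out = []
    for x in xs:
        # O(1) update: only the previous statistic and the new observation are used.
        r = max(r + beta * (x - beta / 2.0), 0.0)
        out.append(r)
    return out


def page_bruteforce(xs: Sequence[float], beta: float, n: int) -> float:
    """Return R_n from the definition: max over tail lengths h = 0, ..., n."""
    # xs[n - h:n] holds X_{n-h+1}, ..., X_n; h = 0 gives the empty sum 0.
    return max(sum(beta * (x - beta / 2.0) for x in xs[n - h:n])
               for h in range(n + 1))


xs = [0.3, -1.2, 2.0, 1.5, -0.4]
recursive = page_statistics(xs, beta=1.0)
direct = [page_bruteforce(xs, 1.0, n) for n in range(1, len(xs) + 1)]
```

The brute-force version is included only as a check; it revisits all past observations and so is not online.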
Example of an online algorithm?
Let $p = 1$ and assume $\theta > 0$. Scanning window-based method with window width $w > 0$:
$$W_n := \sum_{i=n-w+1}^{n} \beta (X_i - \beta/2).$$
– Window size $w$ needs to increase when $\beta$ decreases.
– Computational complexity depends on $\beta$.
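A small sketch of the window statistic shows where the dependence on $w$ (and hence on $\beta$) enters. The function name is an assumption, and so is the convention that for $n < w$ the sum runs over the observations available so far.

```python
from collections import deque
from typing import List, Sequence


def window_statistics(xs: Sequence[float], beta: float, w: int) -> List[float]:
    """Return W_1, ..., W_n for a moving window of width w.

    Assumption: while fewer than w observations have arrived,
    the sum is taken over all observations seen so far.
    """
    buf = deque()  # the last (at most) w summands must be kept in memory
    total = 0.0
    out = []
    for x in xs:
        term = beta * (x - beta / 2.0)
        buf.append(term)
        total += term
        if len(buf) > w:
            # Drop the summand that has just left the window.
            total -= buf.popleft()
        out.append(total)
    return out


ws = window_statistics([0.3, -1.2, 2.0, 1.5, -0.4], beta=1.0, w=3)
```

In this sketch the buffer holds up to `w` values, so the resources needed grow with the window width, unlike the single number carried by Page's recursion.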
Example of a non-online algorithm
Let $p = 1$ and assume $\theta > 0$. Shiryaev–Roberts procedure (Shiryaev, 1963; Roberts, 1966):
$$\mathrm{SR}_n := \sum_{i=1}^{n} \prod_{h=i}^{n} e^{b (X_h - b/2)}.$$
The statistics cannot be defined recursively, so this is a sequential algorithm but not an online algorithm.
Procedures and performance measures
A sequential changepoint procedure is an extended stopping time $N$ (w.r.t. the natural filtration) taking values in $\mathbb{N} \cup \{\infty\}$.
◮ The patience of a sequential changepoint procedure $N$ is $E_0(N)$; also known as the average run length to false alarm.
◮ Two types of response delays:
– (Average case) response delay
$$\bar{E}_\theta(N) := \sup_{z \in \mathbb{N}} E_{z,\theta}\bigl\{ (N - z) \vee 0 \bigr\};$$
– Worst case response delay
$$\bar{E}^{\mathrm{wc}}_\theta(N) := \sup_{z \in \mathbb{N}} \operatorname{ess\,sup} E_{z,\theta}\bigl\{ (N - z) \vee 0 \mid X_1, \ldots, X_z \bigr\}.$$
Thus, $\bar{E}_\theta(N) \le \bar{E}^{\mathrm{wc}}_\theta(N)$.
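As a rough illustration of these quantities, the sketch below takes $N$ to be the stopping time of Page's procedure in the case $p = 1$ and estimates $E_{z,\theta}\{(N - z) \vee 0\}$ by Monte Carlo for one fixed $z$. Several things here are assumptions rather than part of the talk: the function names, the rule "stop when $R_n \ge T$", and the truncation of each run at a finite `horizon`. The supremum over $z$ in the definition is not computed.

```python
import random
from typing import Iterable, Optional


def page_stopping_time(xs: Iterable[float], beta: float,
                       threshold: float) -> Optional[int]:
    """First n with R_n >= threshold, or None if no alarm is raised (N = infinity)."""
    r = 0.0
    for n, x in enumerate(xs, start=1):
        r = max(r + beta * (x - beta / 2.0), 0.0)
        if r >= threshold:  # assumed declaration rule
            return n
    return None


def mean_delay(z: int, theta: float, beta: float, threshold: float,
               reps: int, horizon: int, seed: int = 0) -> float:
    """Monte Carlo estimate of E_{z,theta}{(N - z) v 0} for a single z."""
    rng = random.Random(seed)
    total = 0
    for _ in range(reps):
        xs = (rng.gauss(theta if n > z else 0.0, 1.0)
              for n in range(1, horizon + 1))
        n_stop = page_stopping_time(xs, beta, threshold)
        if n_stop is None:
            n_stop = horizon  # truncation: biases the estimate downwards
        total += max(n_stop - z, 0)
    return total / reps


delay = mean_delay(z=10, theta=1.0, beta=1.0, threshold=5.0,
                   reps=200, horizon=500, seed=1)
```

Running the same code with `theta=0.0` and `z=0` gives a truncated estimate of the patience $E_0(N)$, which is only meaningful if `horizon` is large relative to the true run length.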
A high-dimensional, multiscale online algorithm: ocd
Diagonal statistics
◮ Write $X_i = (X_i^1, \ldots, X_i^p)^\top \in \mathbb{R}^p$. For $n \in \mathbb{N}$, $b \in \mathbb{R} \setminus \{0\}$ and $j \in [p]$, define
$$R^j_{n,b} := \max_{0 \le h \le n} \sum_{i=n-h+1}^{n} b (X_i^j - b/2),$$
$$t^j_{n,b} := \operatorname*{argmax}_{0 \le h \le n} \sum_{i=n-h+1}^{n} b (X_i^j - b/2).$$
◮ $(R^j_{n,b})_{j \in [p]}$ are called the diagonal statistics.
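Both $R^j_{n,b}$ and $t^j_{n,b}$ can be updated coordinate by coordinate with the same recursion as in Page's procedure. The sketch below is one way to do this; the function name is hypothetical, and ties in the argmax are broken by taking the smallest maximiser, which is an assumption since the slide does not specify it.

```python
from typing import List, Sequence, Tuple


def update_diagonal(R: Sequence[float], t: Sequence[int],
                    x: Sequence[float], b: float) -> Tuple[List[float], List[int]]:
    """One O(p) update from (R_{n-1}, t_{n-1}) to (R_n, t_n) given X_n = x."""
    new_R, new_t = [], []
    for r_prev, t_prev, xj in zip(R, t, x):
        cand = r_prev + b * (xj - b / 2.0)
        if cand > 0:
            # The best tail extends the previous best tail by one observation.
            new_R.append(cand)
            new_t.append(t_prev + 1)
        else:
            # The empty tail (h = 0) is optimal; smallest maximiser on ties.
            new_R.append(0.0)
            new_t.append(0)
    return new_R, new_t


# Usage with p = 2 and b = 1, starting from R_0 = 0 and t_0 = 0.
R, t = [0.0, 0.0], [0, 0]
for x in [[0.3, 1.0], [-1.2, 1.0], [2.0, -3.0], [1.5, 0.0], [-0.4, 0.0]]:
    R, t = update_diagonal(R, t, x, b=1.0)
```

Only the current `R` and `t` are stored, so the cost per observation does not depend on $n$.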
Off-diagonal statistics
◮ For each $j \in [p]$, compute tail partial sums of length $t^j_{n,b}$ in all coordinates $j' \in [p]$:
$$A^{j',j}_{n,b} := \sum_{i=n-t^j_{n,b}+1}^{n} X_i^{j'}.$$
◮ We aggregate to form an off-diagonal statistic anchored at coordinate $j$:
$$Q^j_{n,b} := \sum_{j' \in [p]:\, j' \neq j} \frac{(A^{j',j}_{n,b})^2}{t^j_{n,b} \vee 1}\, \mathbb{1}_{\bigl\{ |A^{j',j}_{n,b}| \ge a \sqrt{t^j_{n,b}} \bigr\}}.$$
◮ Different values of $a$ can be chosen to detect dense or sparse signals.
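Because each tail length either grows by one or resets to zero, the tail partial sums can also be maintained recursively. The sketch below assumes that behaviour of $t^j_{n,b}$ and reads the indicator threshold as $a\sqrt{t^j_{n,b}}$; the function names and the storage layout `A[j][k]` for $A^{k,j}_{n,b}$ are illustrative choices.

```python
import math
from typing import List, Sequence


def update_tail_sums(A: List[List[float]], t_new: Sequence[int],
                     x: Sequence[float]) -> List[List[float]]:
    """Update A[j][k] = A^{k,j}, the sum of coordinate k over the last t^j observations.

    Assumes each t_new[j] is either 0 or the previous tail length plus one.
    """
    p = len(x)
    for j in range(p):
        if t_new[j] == 0:
            A[j] = [0.0] * p  # empty tail: all partial sums reset
        else:
            A[j] = [A[j][k] + x[k] for k in range(p)]  # tail grew by one
    return A


def off_diagonal(A: Sequence[Sequence[float]], t: Sequence[int],
                 a: float) -> List[float]:
    """Return (Q^j)_{j in [p]}, thresholding each |A^{k,j}| at a * sqrt(t^j)."""
    p = len(t)
    Q = []
    for j in range(p):
        cut = a * math.sqrt(t[j])
        Q.append(sum((A[j][k] ** 2 / max(t[j], 1)
                      for k in range(p) if k != j and abs(A[j][k]) >= cut), 0.0))
    return Q


# Usage with p = 2: tail lengths are supplied by hand for illustration.
A = [[0.0, 0.0], [0.0, 0.0]]
A = update_tail_sums(A, [1, 0], [1.0, 2.0])
A = update_tail_sums(A, [2, 1], [3.0, -1.0])
Q_dense = off_diagonal(A, [2, 1], a=0.0)   # no thresholding
Q_sparse = off_diagonal(A, [2, 1], a=1.0)  # small partial sums are dropped
```

With `a=0.0` every coordinate contributes, which suits dense signals; a larger `a` keeps only coordinates with large partial sums, which suits sparse signals.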