
High-dimensional, multiscale online changepoint detection

High-dimensional, multiscale online changepoint detection. Richard J. Samworth, University of Cambridge. Virtual Mathematical Methods of Modern Statistics 2, CIRM Luminy, 04 June 2020. Collaborators: Yudong Chen, Tengyao Wang.


  1. High-dimensional, multiscale online changepoint detection. Richard J. Samworth, University of Cambridge. Virtual Mathematical Methods of Modern Statistics 2, CIRM Luminy, 04 June 2020.

  2. Collaborators: Yudong Chen, Tengyao Wang. Online changepoint detection 2/28

  3. Changepoint problems
      ◮ Modern technology has facilitated the real-time monitoring of many types of evolving processes.
      ◮ Very often, a key feature of interest for data streams is a changepoint.

  5. From offline to online
      ◮ The vast majority of the changepoint literature concerns the offline problem (Killick et al., 2012; Wang and Samworth, 2018; Wang et al., 2018; Baranowski et al., 2019; Liu, Gao and Samworth, 2019).
      ◮ Univariate online changepoints have been studied within the well-established field of statistical process control (Duncan, 1952; Page, 1954; Barnard, 1959; Fearnhead and Liu, 2007; Oakland, 2007).
      ◮ Much less work on multivariate, online changepoint problems (Tartakovsky et al., 2006; Mei, 2010; Zou et al., 2015). Several methods involve scanning a moving window of fixed size for changes (Xie and Siegmund, 2013; Soh and Chandrasekaran, 2017; Chan, 2017).

  8. Online algorithm
      Key definition of an online algorithm:
      Definition. The computational complexity for processing a new observation depends only on the number of bits needed to represent it.
      ◮ For the purposes of this definition, all real numbers are considered as floating point numbers.
      ◮ Importantly, the computational complexity is not allowed to depend on the number of previously observed data points.

  11. Problem setting
      We consider a high-dimensional online changepoint detection problem for independent random vectors $(X_n)_{n \in \mathbb{N}}$:
      ◮ Data generating mechanism: for some unknown, deterministic time $z \in \mathbb{N} \cup \{0\}$, we have
      $$X_1, \ldots, X_z \sim N_p(0, I_p) \quad \text{and} \quad X_{z+1}, X_{z+2}, \ldots \sim N_p(\theta, I_p).$$
      ◮ $\theta = 0$: data generated under the null, i.e. no change.
      ◮ $\theta \neq 0$: data generated under the alternative, i.e. there exists a change.
      ◮ Assume $\vartheta := \|\theta\|_2$ is at least a known lower bound $\beta > 0$.
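
The data generating mechanism on this slide can be simulated directly. The following is a minimal sketch using only the Python standard library; the names `simulate_stream` and `l2_norm`, the finite horizon `n` and the `seed` argument are illustrative assumptions, not part of the talk.

```python
import math
import random


def l2_norm(theta: list[float]) -> float:
    """Euclidean norm of theta, i.e. the signal strength vartheta on the slide."""
    return math.sqrt(sum(v * v for v in theta))


def simulate_stream(n: int, z: int, theta: list[float], seed: int = 0) -> list[list[float]]:
    """Draw X_1, ..., X_n independently (hypothetical helper).

    X_i ~ N_p(0, I_p) for i <= z and X_i ~ N_p(theta, I_p) for i > z.
    Setting theta to the zero vector gives data under the null.
    """
    rng = random.Random(seed)
    p = len(theta)
    stream = []
    for i in range(1, n + 1):
        mean = theta if i > z else [0.0] * p
        # Identity covariance: coordinates are independent with unit variance.
        stream.append([rng.gauss(mu, 1.0) for mu in mean])
    return stream
```

For example, `simulate_stream(n=2000, z=1000, theta=[2.0, 0.0, 0.0])` produces a stream whose first coordinate shifts by 2 after time 1000, with $\vartheta = 2$.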

  14. Example of an online algorithm (Page, 1954)
      Let $p = 1$ and assume $\theta > 0$. Page’s procedure:
      $$R_n := \max_{0 \le h \le n} \sum_{i=n-h+1}^{n} \beta (X_i - \beta/2) = \max\bigl\{ R_{n-1} + \beta (X_n - \beta/2),\, 0 \bigr\}.$$
      Threshold $T \equiv T_\beta$ for changepoint declaration.
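
The recursion on this slide is what makes Page's procedure online: each new observation costs a constant amount of work. The sketch below computes $R_n$ both recursively and from the defining maximum over tail sums, so the two can be compared. The function names are illustrative, and declaring a change as soon as $R_n \ge T$ is an assumption about how the threshold is applied.

```python
from typing import Optional


def page_statistics(xs: list[float], beta: float) -> list[float]:
    """R_1, ..., R_n via the recursion R_n = max(R_{n-1} + beta * (X_n - beta/2), 0)."""
    r = 0.0  # R_0 = 0 (empty sum)
    out = []
    for x in xs:
        r = max(r + beta * (x - beta / 2.0), 0.0)
        out.append(r)
    return out


def page_brute_force(xs: list[float], beta: float) -> float:
    """R_n from the definition: the maximum over tail sums of length h = 0, ..., n."""
    best, tail = 0.0, 0.0  # h = 0 contributes the empty sum 0
    for x in reversed(xs):
        tail += beta * (x - beta / 2.0)
        best = max(best, tail)
    return best


def page_stopping_time(xs: list[float], beta: float, threshold: float) -> Optional[int]:
    """First n with R_n >= threshold (assumed rule), or None if no alarm is raised."""
    for n, r in enumerate(page_statistics(xs, beta), start=1):
        if r >= threshold:
            return n
    return None
```

The brute-force version revisits the whole history at every step, whereas the recursive version only needs the previous value $R_{n-1}$.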

  15. Example of an online algorithm?
      Let $p = 1$ and assume $\theta > 0$. Scanning window-based method with window width $w > 0$:
      $$W_n := \sum_{i=n-w+1}^{n} \beta (X_i - \beta/2).$$
      – Window size $w$ needs to increase when $\beta$ decreases.
      – Computational complexity depends on $\beta$.
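
To see where the dependence on $w$ (and hence on $\beta$) enters, consider the sketch below. It maintains $W_n$ with a running total, but still has to retain the last $w$ increments in order to drop the oldest one. The class name `WindowScan` is illustrative, and summing over the available observations while $n < w$ is an assumption, since the slide does not specify the start-up behaviour.

```python
from collections import deque


class WindowScan:
    """Moving-window statistic W_n over the last w observations (illustrative sketch)."""

    def __init__(self, beta: float, w: int) -> None:
        self.beta = beta
        self.w = w
        self.window: deque[float] = deque()  # last (at most) w increments
        self.total = 0.0

    def update(self, x: float) -> float:
        """Process one observation and return the current window sum."""
        inc = self.beta * (x - self.beta / 2.0)
        self.window.append(inc)
        self.total += inc
        if len(self.window) > self.w:
            # Drop the increment that has just left the window.
            self.total -= self.window.popleft()
        return self.total
```

The stored state grows linearly with `w`, so a smaller $\beta$, which calls for a larger window, makes each instance of the procedure more expensive.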

  18. Example of a non-online algorithm
      Let $p = 1$ and assume $\theta > 0$. Shiryaev–Roberts procedure (Shiryaev, 1963; Roberts, 1966):
      $$\mathrm{SR}_n := \sum_{i=1}^{n} \prod_{h=i}^{n} e^{b (X_h - b/2)}.$$
      The statistics cannot be defined recursively, so this is a sequential algorithm but not an online algorithm.

  19. Procedures and performance measures
      A sequential changepoint procedure is an extended stopping time $N$ (w.r.t. the natural filtration) taking values in $\mathbb{N} \cup \{\infty\}$.
      ◮ The patience of a sequential changepoint procedure $N$ is $\mathbb{E}_0(N)$; also known as the average run length to false alarm.
      ◮ Two types of response delays:
      – (Average case) response delay
      $$\bar{\mathbb{E}}_\theta(N) := \sup_{z \in \mathbb{N}} \mathbb{E}_{z,\theta}\bigl\{ (N - z) \vee 0 \bigr\};$$
      – Worst case response delay
      $$\bar{\mathbb{E}}^{\mathrm{wc}}_\theta(N) := \sup_{z \in \mathbb{N}} \operatorname{ess\,sup} \mathbb{E}_{z,\theta}\bigl\{ (N - z) \vee 0 \mid X_1, \ldots, X_z \bigr\}.$$
      Thus, $\bar{\mathbb{E}}_\theta(N) \le \bar{\mathbb{E}}^{\mathrm{wc}}_\theta(N)$.
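
Both quantities can be approximated by simulation. The sketch below does this for the univariate Page procedure, as a rough Monte Carlo illustration only. It estimates $\mathbb{E}_{z,\theta}\{(N - z) \vee 0\}$ for a single fixed $z$ rather than the supremum over $z$, and it truncates every run at a hypothetical cap `max_n`, which biases the patience estimate downwards. All function names and default values are assumptions.

```python
import random


def page_run_length(rng: random.Random, beta: float, threshold: float,
                    z: int, theta: float, max_n: int = 100_000) -> int:
    """Simulate one stream and return the alarm time N of Page's procedure (capped at max_n)."""
    r = 0.0
    for n in range(1, max_n + 1):
        mean = theta if n > z else 0.0  # mean shifts from 0 to theta after time z
        x = rng.gauss(mean, 1.0)
        r = max(r + beta * (x - beta / 2.0), 0.0)
        if r >= threshold:
            return n
    return max_n


def estimate_patience(beta: float, threshold: float, reps: int = 200, seed: int = 0) -> float:
    """Monte Carlo estimate of E_0(N): average run length when there is no change."""
    rng = random.Random(seed)
    runs = [page_run_length(rng, beta, threshold, z=0, theta=0.0) for _ in range(reps)]
    return sum(runs) / reps


def estimate_delay(beta: float, threshold: float, z: int, theta: float,
                   reps: int = 200, seed: int = 1) -> float:
    """Monte Carlo estimate of E_{z,theta}{(N - z) v 0} for one fixed changepoint z."""
    rng = random.Random(seed)
    delays = [max(page_run_length(rng, beta, threshold, z, theta) - z, 0) for _ in range(reps)]
    return sum(delays) / reps
```

Raising the threshold increases both estimates, which is the trade-off that a choice of $T$ has to balance.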

  20. A high-dimensional, multiscale online algorithm: ocd

  21. Diagonal statistics
      ◮ Write $X_i = (X_i^1, \ldots, X_i^p)^\top \in \mathbb{R}^p$. For $n \in \mathbb{N}$, $b \in \mathbb{R} \setminus \{0\}$ and $j \in [p]$, define
      $$R^j_{n,b} := \max_{0 \le h \le n} \sum_{i=n-h+1}^{n} b (X_i^j - b/2),$$
      $$t^j_{n,b} := \operatorname*{argmax}_{0 \le h \le n} \sum_{i=n-h+1}^{n} b (X_i^j - b/2).$$
      ◮ $(R^j_{n,b})_{j \in [p]}$ are called the diagonal statistics.
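
As with Page's procedure, the diagonal statistics and their tail lengths can be tracked recursively, one coordinate at a time. The sketch below is one way to do this and checks the result against the definition. It assumes that ties in the argmax are broken in favour of the smallest $h$, which the slide does not state, and the function names are illustrative.

```python
def update_diagonal(R: list[float], t: list[int], x: list[float], b: float) -> None:
    """One online step: update R[j] = R^j_{n,b} and t[j] = t^j_{n,b} in place for all j."""
    for j, xj in enumerate(x):
        candidate = R[j] + b * (xj - b / 2.0)
        if candidate > 0.0:
            # The best tail extends by one observation.
            R[j] = candidate
            t[j] += 1
        else:
            # The empty tail (h = 0) is optimal again: reset.
            R[j] = 0.0
            t[j] = 0


def brute_force_diagonal(xs: list[list[float]], j: int, b: float) -> tuple[float, int]:
    """(R^j_{n,b}, t^j_{n,b}) from the definition, scanning every tail length h."""
    best_val, best_h, tail = 0.0, 0, 0.0
    n = len(xs)
    for h in range(1, n + 1):
        tail += b * (xs[n - h][j] - b / 2.0)
        if tail > best_val:  # strict inequality keeps the smallest maximiser
            best_val, best_h = tail, h
    return best_val, best_h
```

Each call to `update_diagonal` costs $O(p)$ operations for a given scale $b$, regardless of how many observations have already been seen.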

  22. Off-diagonal statistics
      ◮ For each $j \in [p]$, compute tail partial sums of length $t^j_{n,b}$ in all coordinates $j' \in [p]$:
      $$A^{j',j}_{n,b} := \sum_{i=n-t^j_{n,b}+1}^{n} X_i^{j'}.$$
      ◮ We aggregate to form an off-diagonal statistic anchored at coordinate $j$:
      $$Q^j_{n,b} := \sum_{j' \in [p] :\, j' \neq j} \frac{(A^{j',j}_{n,b})^2}{t^j_{n,b} \vee 1} \, \mathbb{1}_{\bigl\{ |A^{j',j}_{n,b}| \ge a \sqrt{t^j_{n,b}} \bigr\}}.$$
      ◮ Different values of $a$ can be chosen to detect dense or sparse signals.
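
The tail partial sums can be carried along with the diagonal recursion: when the tail anchored at coordinate $j$ extends, the new observation is added to column $j$ of the partial sums, and when it resets, that column is zeroed. The sketch below follows the formulas on this slide for a single scale $b$. The class name `ScaleState` and its layout are illustrative assumptions, not the authors' implementation, and ties in the argmax are again assumed to be broken towards the smallest tail length.

```python
import math


class ScaleState:
    """Running state for one scale b: R[j], t[j] and A[jp][j] = A^{j',j}_{n,b} (sketch)."""

    def __init__(self, p: int, b: float, a: float) -> None:
        self.p, self.b, self.a = p, b, a
        self.R = [0.0] * p
        self.t = [0] * p
        self.A = [[0.0] * p for _ in range(p)]  # rows: j', columns: anchor j

    def update(self, x: list[float]) -> None:
        """Process one observation in O(p^2) operations, independent of n."""
        for j in range(self.p):
            candidate = self.R[j] + self.b * (x[j] - self.b / 2.0)
            if candidate > 0.0:
                self.R[j] = candidate
                self.t[j] += 1
                for jp in range(self.p):
                    self.A[jp][j] += x[jp]  # extend every tail sum anchored at j
            else:
                self.R[j] = 0.0
                self.t[j] = 0
                for jp in range(self.p):
                    self.A[jp][j] = 0.0  # empty tail: reset the partial sums

    def off_diagonal(self) -> list[float]:
        """Return [Q^1_{n,b}, ..., Q^p_{n,b}] with hard threshold a * sqrt(t^j)."""
        Q = []
        for j in range(self.p):
            tj = self.t[j]
            total = 0.0
            for jp in range(self.p):
                if jp == j:
                    continue
                s = self.A[jp][j]
                if abs(s) >= self.a * math.sqrt(tj):
                    total += s * s / max(tj, 1)
            Q.append(total)
        return Q
```

With `a = 0` every off-diagonal coordinate contributes, which suits dense signals; a larger `a` keeps only coordinates with large tail sums, which suits sparse signals.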
