

  1. Restarted Bayesian Online Change-point Detector achieves Optimal Detection Delay. Reda Alami, joint work with Odalric Maillard and Raphaël Féraud. reda.alami@total.com. Presented at ICML 2020.

  2–9. Overview
  ◮ A pruning version of the Bayesian Online Change-point Detector.
  ◮ High-probability guarantees in terms of:
    ◮ false alarm rate;
    ◮ detection delay.
  ◮ The detection delay is asymptotically optimal (it reaches the existing lower bound of [Lai and Xing, 2010]).
  ◮ Empirical comparisons with the original BOCPD [Fearnhead and Liu, 2007] and the Improved Generalized Likelihood Ratio test [Maillard, 2019].

  10–15. Setting & Notations
  ◮ $\mathcal{B}(\mu_t)$: Bernoulli distribution of mean $\mu_t \in [0, 1]$.
  ◮ Piece-wise stationary process: $\forall c \in [1, C],\ \forall t \in \mathcal{T}_c = [\tau_c, \tau_{c+1})$, $\mu_t = \theta_c$.
  ◮ Sequence of observations: $x_{s:t} = (x_s, \dots, x_t)$.
  ◮ Length: $n_{s:t} = t - s + 1$.
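  To make the setting concrete, here is a minimal Python sketch (not from the talk) that samples such a piece-wise stationary Bernoulli process; the change-points `taus` and segment means `thetas` are illustrative choices, not values from the paper.

```python
import numpy as np

def piecewise_bernoulli(taus, thetas, T, seed=0):
    """Sample x_1, ..., x_T with x_t ~ Bernoulli(theta_c) on the segment
    T_c = [tau_c, tau_{c+1}).  `taus` are the 1-indexed change-points
    (taus[0] = 1) and `thetas` the per-segment means."""
    rng = np.random.default_rng(seed)
    x = np.empty(T, dtype=int)
    for c, start in enumerate(taus):
        end = taus[c + 1] - 1 if c + 1 < len(taus) else T
        x[start - 1:end] = rng.binomial(1, thetas[c], size=end - start + 1)
    return x

# Example: C = 3 segments, with changes at t = 301 and t = 601.
x = piecewise_bernoulli(taus=[1, 301, 601], thetas=[0.2, 0.7, 0.4], T=900)
```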

  16–20. Bayesian Online Change-point Detector: Runlength inference
  Runlength $r_t$: number of time steps since the last change-point. For all $r_t \in [0, t-1]$,
  $$\underbrace{p(r_t \mid x_{1:t})}_{\text{runlength distribution at } t} \;\propto\; \sum_{r_{t-1} \in [0,\, t-2]} \underbrace{p(r_t \mid r_{t-1})}_{\text{hazard}}\; \underbrace{p(x_t \mid r_{t-1}, x_{1:t-1})}_{\text{UPM}}\; p(r_{t-1} \mid x_{1:t-1}).$$
  Under a constant hazard rate $h \in (0, 1)$ (geometric inter-arrival times between change-points), this becomes
  $$\begin{cases} p(r_t = r_{t-1} + 1 \mid x_{1:t}) \;\propto\; (1-h)\, p(x_t \mid r_{t-1}, x_{1:t-1})\, p(r_{t-1} \mid x_{1:t-1}), \\[1mm] p(r_t = 0 \mid x_{1:t}) \;\propto\; h \displaystyle\sum_{r_{t-1}} p(x_t \mid r_{t-1}, x_{1:t-1})\, p(r_{t-1} \mid x_{1:t-1}). \end{cases}$$
  The UPM $p(x_t \mid r_{t-1}, x_{1:t-1})$ is computed with the Laplace predictor (a smoothed maximum-likelihood estimate):
  $$\mathrm{Lp}(x_{t+1} \mid x_{s:t}) := \begin{cases} \dfrac{\sum_{i=s}^{t} x_i + 1}{n_{s:t} + 2} & \text{if } x_{t+1} = 1, \\[3mm] \dfrac{\sum_{i=s}^{t} (1 - x_i) + 1}{n_{s:t} + 2} & \text{if } x_{t+1} = 0. \end{cases}$$
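  The recursion lends itself to a simple message-passing implementation: each runlength hypothesis keeps the sufficient statistics of its segment, the Laplace predictor scores the new observation, and the hazard $h$ splits the mass between growing the run and restarting it. The following Python sketch of one update step follows the equations above; it is an illustration with my own variable names, not the authors' implementation.

```python
import numpy as np

def laplace_pred(ones, n, x):
    """Laplace predictor: P(x = 1) = (#ones + 1) / (n + 2) for a segment
    that has produced `ones` ones among `n` observations."""
    p_one = (ones + 1.0) / (n + 2.0)
    return p_one if x == 1 else 1.0 - p_one

def bocpd_step(post, ones, lengths, x, h):
    """One step of the runlength recursion on the slide.

    post[r]            : p(r_{t-1} = r | x_{1:t-1})
    ones[r], lengths[r]: sufficient statistics of the segment of runlength r
    x                  : new observation x_t in {0, 1}
    h                  : constant hazard rate in (0, 1)
    """
    # UPM term p(x_t | r_{t-1}, x_{1:t-1}) via the Laplace predictor.
    upm = np.array([laplace_pred(o, n, x) for o, n in zip(ones, lengths)])
    grow = (1.0 - h) * upm * post        # r_t = r_{t-1} + 1 (unnormalised)
    restart = h * np.sum(upm * post)     # r_t = 0           (unnormalised)
    post = np.concatenate(([restart], grow))
    post /= post.sum()
    # The new runlength-0 hypothesis starts its segment with x_t.
    ones = np.concatenate(([x], ones + x))
    lengths = np.concatenate(([1], lengths + 1))
    return post, ones, lengths

# Toy run on a stream with a change at t = 51; post.argmax() is the MAP runlength.
rng = np.random.default_rng(0)
xs = np.r_[rng.binomial(1, 0.2, 50), rng.binomial(1, 0.8, 50)]
post, ones, lengths = np.array([1.0]), np.array([xs[0]]), np.array([1])
for x in xs[1:]:
    post, ones, lengths = bocpd_step(post, ones, lengths, x, h=0.01)
```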

  21–25. Bayesian Online Change-point Detector: Forecaster learning
  Instead of the runlength $r_t \in [0, t-1]$, use the notion of forecasters, indexed by the starting time $s$ of the candidate segment.
  Forecaster weight: $\forall s \in [1, t]$, $v_{s,t} := p(r_t = t - s \mid x_{s:t})$, updated recursively as
  $$v_{s,t} = \begin{cases} (1-h)\, \exp(-\ell_{s,t})\, v_{s,t-1} & \forall s < t, \\[1mm] h \displaystyle\sum_{i=1}^{t-1} \exp(-\ell_{i,t})\, v_{i,t-1} & s = t, \end{cases}$$
  with instantaneous loss $\ell_{s,t} := -\log \mathrm{Lp}(x_t \mid x_{s:t-1})$. In closed form,
  $$v_{s,t} = \begin{cases} (1-h)^{n_{s:t}}\, h^{\mathbb{I}\{s \neq 1\}}\, \exp(-L_{s:t})\, V_s & \forall s < t, \\[1mm] h\, V_t & s = t, \end{cases}$$
  where $L_{s:t} := \sum_{s'=s}^{t} \ell_{s,s'}$ is the cumulative loss and $V_t = \sum_{s=1}^{t} v_{s,t}$.
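  In the forecaster view the same computation is indexed by the starting time $s$ of each candidate segment rather than by the runlength. A minimal illustrative sketch of the weight recursion follows (again not the authors' code; the per-step normalisation is added only for numerical convenience and does not change which forecaster is heaviest).

```python
import numpy as np

def forecaster_weights(xs, h):
    """Run the recursion for v_{s,t} over a binary stream xs.

    Forecaster s predicts x_t with the Laplace predictor on x_{s:t-1};
    its instantaneous loss is l_{s,t} = -log Lp(x_t | x_{s:t-1}).
    Returns the normalised weights v_{s,T}, s = 1..T (index s-1)."""
    v = np.array([1.0])                      # v_{1,1}: single forecaster at s = 1
    ones = np.array([xs[0]], dtype=float)    # per-forecaster segment statistics
    lens = np.array([1.0])
    for x in xs[1:]:
        # exp(-l_{s,t}) = Lp(x_t | x_{s:t-1}) for every active forecaster s < t.
        p_one = (ones + 1.0) / (lens + 2.0)
        pred = p_one if x == 1 else 1.0 - p_one
        grown = (1.0 - h) * pred * v         # v_{s,t}, s < t
        newborn = h * np.sum(pred * v)       # v_{t,t}
        v = np.append(grown, newborn)
        v /= v.sum()                         # keep the weights as a distribution
        ones = np.append(ones + x, x)
        lens = np.append(lens + 1, 1)
    return v

# With a clean change at t = 41, the heaviest weight should sit near s = 41.
v = forecaster_weights(np.r_[np.zeros(40, int), np.ones(40, int)], h=0.01)
print(v.argmax() + 1)   # most plausible start time of the current segment
```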

  26–29. Main difficulty in providing the theoretical guarantees
  Lemma (Computing the initial weight $V_t$)
  $$V_t = (1-h)^{t-2} \sum_{k=1}^{t-1} \left(\frac{h}{1-h}\right)^{k-1} \tilde{V}_{k:t},$$
  where
  $$\tilde{V}_{k:t} = \sum_{i_1=1}^{t-k}\; \sum_{i_2=i_1+1}^{t-(k-1)} \cdots \sum_{i_{k-1}=i_{k-2}+1}^{t-2} \exp\!\left(-L_{1:i_1}\right) \times \prod_{j=1}^{k-2} \exp\!\left(-L_{i_j+1:\, i_{j+1}}\right) \times \exp\!\left(-L_{i_{k-1}+1:\, t-1}\right),$$
  with
  $$\sum_{i_1=1}^{t-k}\; \sum_{i_2=i_1+1}^{t-(k-1)} \cdots \sum_{i_{k-1}=i_{k-2}+1}^{t-2} 1 = \binom{t-2}{k-1}.$$
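  The inner sums of $\tilde{V}_{k:t}$ range over all ways of cutting $[1, t-1]$ into $k$ contiguous segments, which is where the binomial count in the lemma comes from. A small, purely illustrative Python check of that counting identity:

```python
from itertools import combinations
from math import comb

def count_splits(t, k):
    """Count the tuples 1 <= i_1 < ... < i_{k-1} <= t - 2 indexing the
    nested sums of the lemma, i.e. the ways of cutting [1, t-1] into k
    contiguous segments."""
    return sum(1 for _ in combinations(range(1, t - 1), k - 1))

# The number of terms in V~_{k:t} matches binom(t - 2, k - 1).
for t in range(4, 12):
    for k in range(1, t):
        assert count_splits(t, k) == comb(t - 2, k - 1)
```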
