Bayesian Minimal Description Lengths for Multiple Changepoint Detection
Yingbo Li, Dept of Mathematical Sciences, Clemson University
Co-authors: Robert Lund (Clemson University), Anuradha Hewaarachchi (University of Kelaniya)
Yingbo Li (Clemson) Bayesian MDL Changepoint Detection QPRC 2017 1 / 26
A Motivating Example
Monthly maximum temperature series in Tuscaloosa, AL
[Figure: observed monthly Tmax (°F) against time, 1900 to 2010.]
A Motivating Example
Monthly maximum temperature series in Tuscaloosa, AL
[Figure: seasonally adjusted Tmax (°F), i.e. observed data minus sample seasonal mean, against time, 1900 to 2010. The times 1957 Mar and 1990 Jan are labeled on the plot, and five x marks appear along the time axis.]
Metadata (station history logs): more likely to induce mean shifts
Station relocations: 1921 Nov, 1939 Mar, 1956 Jun, 1987 May
Instrumentation changes: 1956 Nov, 1987 May
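The seasonal adjustment named in the plot (observed data minus sample seasonal mean) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `seasonally_adjust`, the assumption that the first observation falls in season 0, and the assumption that the series covers at least one full period are all introduced here.

```python
import numpy as np

def seasonally_adjust(x, period=12):
    """Subtract the sample mean of each season from a 1-D series.

    Assumptions of this sketch: observations are equally spaced, the first
    entry falls in season 0, and every season is observed at least once.
    period=12 corresponds to monthly data.
    """
    x = np.asarray(x, dtype=float)
    season = np.arange(x.size) % period  # season index of each time point
    # Sample mean of all observations that share the same season index.
    means = np.array([x[season == s].mean() for s in range(period)])
    return x - means[season]
```

For a monthly series, each calendar month has its own long-run mean removed, so the adjusted series is centered at zero within every month and level shifts become easier to see.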
A Motivating Example
Monthly maximum and minimum temperature series
[Figure: two panels of observed data minus sample seasonal mean against time, 1900 to 2010. Top panel: Tmax (°F), with 1957 Mar and 1990 Jan labeled. Bottom panel: Tmin (°F), with 1918 Feb, 1957 Jul, and 1990 Jan labeled.]
Tmax and Tmin are likely to shift at the same time.
A Brief Review of MDL
MDL for Changepoint Detection in Piecewise AR Series
Automatic MDL by Davis et al. [2006]: a penalized likelihood, with the penalty being the code length (CL) of the parameters:
$$\underbrace{\log(m)}_{m} + \underbrace{(m+1)\log(N)}_{\text{regime lengths}} + \underbrace{\sum_{r=1}^{m+1} \log p_r}_{\text{AR orders}} + \underbrace{\sum_{r=1}^{m+1} \frac{p_r+2}{2} \log N_r}_{\text{AR coefficients}}$$
Automatic MDL rules:
◮ CL of an unbounded positive integer $I$: $\log(I)$
◮ CL of a positive integer bounded above by $U$: $\log(U)$
◮ CL of the MLE of a real-valued parameter estimated from $N$ observations: $\frac{1}{2}\log N$
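To make the penalty concrete, the sketch below evaluates the four code-length terms for a given segmentation. It is an illustration under stated assumptions rather than the implementation of Davis et al.: the function name `auto_mdl_penalty` is hypothetical, natural logarithms are used, and terms whose argument is zero (no changepoints, or an AR order of zero) are taken to contribute nothing.

```python
import math

def auto_mdl_penalty(regime_lengths, ar_orders):
    """Code length of the parameters of a piecewise AR model.

    regime_lengths: N_r for r = 1, ..., m + 1
    ar_orders:      p_r for r = 1, ..., m + 1
    Assumption of this sketch: log(0) terms are replaced by 0.
    """
    if len(regime_lengths) != len(ar_orders):
        raise ValueError("need one AR order per regime")

    def log_pos(k):
        return math.log(k) if k > 0 else 0.0

    m = len(regime_lengths) - 1          # number of changepoints
    N = sum(regime_lengths)              # series length
    penalty = log_pos(m)                 # CL of m
    penalty += (m + 1) * math.log(N)     # CL of the regime lengths
    for N_r, p_r in zip(regime_lengths, ar_orders):
        penalty += log_pos(p_r)                      # CL of the AR order
        penalty += (p_r + 2) / 2 * math.log(N_r)     # CL of the (p_r + 2) real parameters
    return penalty
```

For example, `auto_mdl_penalty([100, 100], [1, 1])` sums the contributions for one changepoint and two AR(1) regimes of length 100; adding a changepoint raises the penalty, so it must be paid for by a better fit.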
A Brief Review of MDL
Model Selection Using MDL Principle
Description length [Rissanen, 1989, Hansen and Yu, 2001]: the number of storage units needed to transmit a random dataset. In model selection, the true model has the smallest MDL.
Two-part MDL:
$$L(X, \theta) = \underbrace{L(X \mid \theta)}_{\text{transmit } X} + \underbrace{L(\theta)}_{\text{transmit } \theta} = -\log f(X \mid \theta) - \log \pi(\theta)$$
Mixture MDL:
$$L(X) = -\log \underbrace{\int f(X \mid \theta)\, \pi(\theta)\, d\theta}_{\text{marginal likelihood}}$$
◮ If the prior is $\pi(\theta \mid \tau)$, combine with two-part MDL:
$$L(X, \hat{\tau}) = -\log \int f(X \mid \theta)\, \pi(\theta \mid \hat{\tau})\, d\theta - \log \pi(\hat{\tau})$$
Closely related to the empirical Bayes marginal likelihood.
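A toy example may help show how the mixture MDL and its combination with a two-part code fit together. The model below is not from the slides: it assumes $X_i \mid \theta \sim N(\theta, 1)$ independently with $\theta \sim N(0, \tau^2)$, so the marginal likelihood is available in closed form, and it assumes $\tau$ is chosen from a finite grid with a uniform prior, so transmitting $\hat{\tau}$ costs the log of the grid size. The function names are hypothetical.

```python
import math

def mixture_mdl_gaussian(x, tau):
    """-log marginal likelihood for X_i | theta ~ N(theta, 1), theta ~ N(0, tau^2).

    Marginally X ~ N(0, I + tau^2 * 11'), whose determinant is 1 + n * tau^2
    and whose quadratic form is sum(x^2) - tau^2 * sum(x)^2 / (1 + n * tau^2).
    """
    n = len(x)
    s = sum(x)
    ss = sum(v * v for v in x)
    shrink = tau ** 2 / (1 + n * tau ** 2)
    return 0.5 * (n * math.log(2 * math.pi)
                  + math.log(1 + n * tau ** 2)
                  + ss - shrink * s ** 2)

def combined_mdl(x, tau_grid):
    """Mixture MDL at the best tau on the grid, plus the cost of sending tau_hat.

    Assumption of this sketch: a uniform prior on tau_grid, so
    -log pi(tau_hat) = log(len(tau_grid)).
    """
    tau_hat = min(tau_grid, key=lambda t: mixture_mdl_gaussian(x, t))
    return tau_hat, mixture_mdl_gaussian(x, tau_hat) + math.log(len(tau_grid))
```

Minimizing the first term over $\tau$ is the same as maximizing the marginal likelihood, which is where the link to empirical Bayes comes from.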
Bayesian MDL Method
Bayesian MDL
Observed time series: $X_{1:N} = (X_1, X_2, \ldots, X_N)'$
Suppose $m$ changepoints $1 \le \tau_1 < \tau_2 < \cdots < \tau_m \le N$ partition the timeline into $m + 1$ distinct regimes (segments).
Time $t$ is in regime $r$ $\iff$ $\tau_{r-1} \le t < \tau_r$
Any time $t \in \{p+1, p+2, \ldots, N\}$ can be a changepoint.
Definition: Denote a multiple changepoint configuration as an indicator vector $\eta = (\eta_{p+1}, \eta_{p+2}, \ldots, \eta_N)'$, such that
$$\eta_t = \begin{cases} 1, & \text{if time } t \text{ is a changepoint} \\ 0, & \text{if time } t \text{ is not a changepoint} \end{cases}$$
Number of changepoints in $\eta$: $m = \sum_{t=p+1}^{N} \eta_t$
Total number of models: $2^{N-p}$
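The correspondence between the indicator vector $\eta$ and the changepoint times $\tau_1, \ldots, \tau_m$ can be sketched in a few lines. The helper names are hypothetical, and the boundary conventions $\tau_0 = 1$ and $\tau_{m+1} = N + 1$ used in `regime_of` are an assumption made here to close the first and last regimes.

```python
import numpy as np

def taus_to_eta(taus, p, N):
    """Indicator vector (eta_{p+1}, ..., eta_N) for changepoint times taus."""
    eta = np.zeros(N - p, dtype=int)
    for tau in taus:
        if not p + 1 <= tau <= N:
            raise ValueError("changepoints must lie in {p+1, ..., N}")
        eta[tau - p - 1] = 1  # position 0 of eta corresponds to time p + 1
    return eta

def eta_to_taus(eta, p):
    """Changepoint times recovered from the indicator vector."""
    return (np.flatnonzero(np.asarray(eta)) + p + 1).tolist()

def regime_of(t, taus):
    """1-based regime of time t under tau_{r-1} <= t < tau_r.

    Assumes tau_0 = 1 and tau_{m+1} = N + 1 (a convention of this sketch).
    """
    return 1 + sum(tau <= t for tau in taus)
```

With `p = 2` and `N = 10`, the configuration with changepoints at times 5 and 9 is one of `2 ** (N - p) = 256` candidate models, which is why exhaustive search quickly becomes infeasible as $N$ grows.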
Bayesian MDL Method
Prior distribution $\pi(\eta)$: Beta-Binomial
If time $t$ is not in metadata:
$$\eta_t \mid \rho^{(1)} \overset{iid}{\sim} \text{Bernoulli}\big(\rho^{(1)}\big), \quad \rho^{(1)} \sim \text{Beta}\big(a, b^{(1)}\big)$$
If time $t$ is in metadata:
$$\eta_t \mid \rho^{(2)} \overset{iid}{\sim} \text{Bernoulli}\big(\rho^{(2)}\big), \quad \rho^{(2)} \sim \text{Beta}\big(a, b^{(2)}\big)$$
$\pi(\eta)$ has a closed form:
$$\pi(\eta) = \prod_{k=1}^{2} \int_0^1 \Big\{ \prod_{t^{(k)}} \pi\big(\eta_{t^{(k)}} \mid \rho^{(k)}\big) \Big\}\, \pi\big(\rho^{(k)}\big)\, d\rho^{(k)}$$
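Each integral above is a standard Beta-Bernoulli integral. If group $k$ contains $n^{(k)}$ candidate times of which $m^{(k)}$ are changepoints, it evaluates to $B\big(a + m^{(k)},\, b^{(k)} + n^{(k)} - m^{(k)}\big) / B\big(a, b^{(k)}\big)$, where $B$ is the beta function. The sketch below computes $\log \pi(\eta)$ this way; the symbols $n^{(k)}$ and $m^{(k)}$, the function names, and any hyperparameter values passed in are introduced here for illustration and are not taken from the slides.

```python
from math import lgamma

def log_beta(x, y):
    """log of the beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def log_prior_eta(eta, in_metadata, a, b1, b2):
    """log pi(eta) under the two-group Beta-Binomial prior.

    eta:         0/1 indicators, one per candidate time
    in_metadata: booleans of the same length, True if the time is in the metadata
    a, b1, b2:   Beta hyperparameters (b1 for non-metadata, b2 for metadata times)
    """
    if len(eta) != len(in_metadata):
        raise ValueError("eta and in_metadata must have the same length")
    total = 0.0
    for flag, b in ((False, b1), (True, b2)):
        group = [e for e, meta in zip(eta, in_metadata) if bool(meta) == flag]
        n_k, m_k = len(group), sum(group)  # group size and its changepoint count
        total += log_beta(a + m_k, b + n_k - m_k) - log_beta(a, b)
    return total
```

Choosing `b2` smaller than `b1` gives metadata times a larger prior changepoint probability, since the prior mean of $\rho^{(k)}$ is $a / (a + b^{(k)})$.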