Autoregressive Models

J. McNames Portland State University ECE 539/639 Autoregressive Models Ver. 1.01

Autoregressive Models Overview
• Direct structures
• Types of estimators
• Parametric spectral estimation
• Parametric time-frequency analysis
• Order selection criteria
• Lattice structures?
• Maximum entropy
• Excitation with line spectra

Direct Structures
$$x(n) = -\sum_{k=1}^{P} a_k\, x(n-k) + w(n)$$
• Notation differs (again) from text
• Essentially all of the techniques that we discussed for FIR filters can be applied
• Many ways to estimate
– How to estimate the autocorrelation matrix?
– Degree of windowing (full, pre-, post-, no/short)
– Weighted least squares?

Properties
$$x(n) = -\sum_{k=1}^{P} a_k\, x(n-k) + w(n)$$
• If $\hat{R}$ is positive definite (and Toeplitz?), the model will be
– Stable
– Causal
– Minimum phase
– Invertible
• True of all the estimators except the proposed “unbiased” technique

Problem Unification
$$x(n) = -\sum_{k=1}^{P} a_k\, x(n-k) + w(n)$$
• AR modeling is approximately equivalent to several other useful problems
– Estimating the coefficients of a whitening filter
– One-step ahead prediction
– Maximum entropy signal modeling
• In the MSE case, if the process is minimum phase, these are exactly equivalent
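The direct-structure model and the stability property above can be illustrated with a minimal sketch. It assumes the autocorrelation (full windowing) method with a biased autocorrelation estimate, which is only one of the "many ways to estimate" listed on the slides. The helper names `biased_autocorr` and `yule_walker`, the test coefficients `a_true`, and the sample size are hypothetical choices for illustration, not taken from the lecture.

```python
import numpy as np

def biased_autocorr(x, max_lag):
    """Biased estimate r(l) = (1/N) * sum_n x(n) x(n-l) for l = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    return np.array([np.dot(x[l:], x[:N - l]) / N for l in range(max_lag + 1)])

def yule_walker(x, P):
    """Estimate a_1..a_P in x(n) = -sum_k a_k x(n-k) + w(n)."""
    r = biased_autocorr(x, P)
    # Toeplitz autocorrelation matrix with entries R[i, j] = r(|i - j|)
    lags = np.abs(np.arange(P)[:, None] - np.arange(P)[None, :])
    R = r[lags]
    # Normal equations: sum_k a_k r(i - k) = -r(i), i = 1..P
    return np.linalg.solve(R, -r[1:])

# Simulate an AR(2) process (assumed example: poles at radius sqrt(0.8))
rng = np.random.default_rng(0)
a_true = np.array([-1.5, 0.8])
N = 5000
w = rng.standard_normal(N)
x = np.zeros(N)
for n in range(N):
    past = sum(a_true[k] * x[n - 1 - k] for k in range(len(a_true)) if n - 1 - k >= 0)
    x[n] = -past + w[n]

a_hat = yule_walker(x, P=2)

# Poles of 1/A(z); a positive definite Toeplitz R-hat should keep them inside the unit circle
poles = np.roots(np.concatenate(([1.0], a_hat)))

# Whitening filter / one-step prediction error: e(n) = x(n) + sum_k a_hat_k x(n-k)
e = np.convolve(np.concatenate(([1.0], a_hat)), x)[:N]

print("a_hat =", a_hat)
print("pole magnitudes =", np.abs(poles))
print("residual variance =", np.var(e))
```

The same coefficient vector serves as the model, the whitening filter, and the one-step predictor, which is the sense in which the slides call these problems equivalent.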

Windowing
We have discussed three types of windowing
• Data Windowing: $x_w(n) = w(n)\,x(n)$
– Used to reduce spectral leakage of nonparametric PSD estimators
• Correlation Windowing: $r_w(\ell) = \hat{r}(\ell)\,w(\ell)$
– Used to reduce variance of nonparametric PSD estimators
• Weighted Least Squares: $E_e = \sum_{n=0}^{N-1} w_n^2\, |y(n) - \mathbf{c}^T \mathbf{x}(n)|^2$
– Used to weight the influence of some observations more than others
– Optimal when error variance is non-constant in the deterministic data matrix case

Autoregressive Estimation Windowing
• Parametric windowing is mostly reserved for nonstationary applications
– Time-frequency analysis
• Text seems to implicitly suggest using data windowing
– This is a bad idea!
– Biases the estimate
– No obvious gain
• It is much better to perform a weighted least squares
– Each row of the data matrix and output still corresponds to a specific time
– Estimate is unbiased
– Permits you to weight the influence of points near the center of the observation window (block) more heavily
– Can be used with any non-negative window (need not be positive definite)

Frequency Domain Representation of the Error Signal
$$x(n) = h(n) * w(n)$$
$$e(n) = \hat{w}(n) = \hat{h}_i(n) * x(n)$$
$$E(e^{j\omega}) = \hat{H}_i(e^{j\omega})\, X(e^{j\omega}) = \frac{X(e^{j\omega})}{\hat{H}(e^{j\omega})}$$
$$E = \sum_{n=-\infty}^{\infty} |e(n)|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{|X(e^{j\omega})|^2}{|\hat{H}(e^{j\omega})|^2}\, d\omega$$
• In the AR case, $H(z) = \frac{1}{A(z)}$
• Note the frequency domain equations only exist if the signals are energy signals
– Segments of stationary processes
• This means solving for the coefficients that minimize the error also minimizes the integral of the ratio of the ESDs

AR Spectral Estimation
$$E = \sum_{n=-\infty}^{\infty} |e(n)|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{|X(e^{j\omega})|^2}{|\hat{H}(e^{j\omega})|^2}\, d\omega = \frac{1}{2\pi} \int_{-\pi}^{\pi} |X(e^{j\omega})|^2\, |\hat{A}(e^{j\omega})|^2\, d\omega$$
• This means solving for the coefficients that minimize the error also minimizes the integral of the ratio of the ESDs
• Why can’t we just make $\hat{H}(e^{j\omega})$ large or $\hat{A}(e^{j\omega})$ small at all frequencies?
– Recall the constraint $a_0 = 1$ in $\hat{\mathbf{a}} = [\,1 \;\; \hat{a}_1 \;\; \dots \;\; \hat{a}_P\,]^T$
$$a_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} A(e^{j\omega})\, e^{j\omega k}\, d\omega \qquad a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} A(e^{j\omega})\, d\omega = 1$$
– Thus $A(e^{j\omega})$ is constrained to have unit area (with the $1/2\pi$ normalization)
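The error-energy identity and the unit-area constraint above can be checked numerically. This is a small sketch under stated assumptions: a random finite-length segment stands in for the energy signal, the inverse filter coefficients in `a` are arbitrary illustrative values with $a_0 = 1$, and the integrals over $\omega$ are evaluated as averages over `M` uniformly spaced DFT frequencies, which is exact here because the FFT length exceeds the length of $e(n)$.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(64)        # finite segment, hence an energy signal
a = np.array([1.0, -0.9, 0.5])     # hypothetical A(z) coefficients with a_0 = 1

# Time domain: e(n) = sum_k a_k x(n-k), all nonzero samples kept
e = np.convolve(a, x)
time_energy = np.sum(np.abs(e) ** 2)

# Frequency domain: average of |X|^2 |A|^2 over M frequencies stands in for
# (1/2pi) * integral over [-pi, pi); M >= len(e) avoids time aliasing
M = 256
X = np.fft.fft(x, M)
A = np.fft.fft(a, M)
freq_energy = np.mean(np.abs(X) ** 2 * np.abs(A) ** 2)

# (1/2pi) * integral of A(e^{jw}) dw recovers a_0, so it is pinned to 1
mean_A = np.mean(A)

print(time_energy, freq_energy)
print(mean_A)
```

Because the average of $A(e^{j\omega})$ is fixed at 1, the minimization cannot shrink $|\hat{A}(e^{j\omega})|$ everywhere; it can only redistribute it across frequency.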

AR Spectral Estimation Properties
$$E = \sum_{n=-\infty}^{\infty} |e(n)|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{|X(e^{j\omega})|^2}{|\hat{H}(e^{j\omega})|^2}\, d\omega = \frac{1}{2\pi} \int_{-\pi}^{\pi} |X(e^{j\omega})|^2\, |\hat{A}(e^{j\omega})|^2\, d\omega$$
• The frequency domain representation of the error makes it clear that the impact of the error across the full frequency range is uniform
– There is no benefit to fitting lower or higher frequency ranges more accurately, in general
• The regions where $|X(e^{j\omega})| > |\hat{H}(e^{j\omega})|$ contribute more to the total error
– The error will be minimized if $|\hat{H}(e^{j\omega})|$ is larger in these regions
– This is part of the reason the estimate is more accurate near spectral peaks than near valleys
– Nonparametric estimators were also more accurate near peaks, but for very different reasons (spectral leakage)

Bias-Variance Tradeoff
• The order of the parametric model is the key parameter that controls the tradeoff between bias and variance
• P too large
– The variance is manifest differently than in nonparametric estimators
– Spectrum may contain spurious peaks
– Also possible for a single frequency component to be split into distinct peaks
• P too small
– Insufficient peaks
– Peaks that are present are too wide or have the wrong shape
– Can only do so much with a pair of complex-conjugate poles

Order Selection Problem
Unbiased: $$\widehat{\mathrm{MSE}}_u = \frac{1}{N_f - N_i + 1 - P} \sum_{n=N_i}^{N_f} |e(n)|^2$$
Biased: $$\widehat{\mathrm{MSE}}_b = \frac{1}{N_f - N_i + 1} \sum_{n=N_i}^{N_f} |e(n)|^2$$
• Goal: Select the value of P that minimizes the MSE
• We have two estimates of the MSE
• One from Chapter 8 and one from Chapter 9
• The one from Chapter 8 was unbiased
• Why don’t we use it to obtain the best value of P?
– Only holds in the deterministic data matrix case
– The data matrix is always stochastic with AR models

Order Selection Methods Concept
• There are many order selection criteria
• All try to obtain an approximately unbiased estimate of the MSE
• All essentially add a penalty that grows with the model order P, since the unscaled error alone keeps decreasing as P increases
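The two MSE estimates above can be compared on simulated data. The sketch below assumes a least squares fit with no windowing over a fixed range $n = N_i, \dots, N_f$, so every order uses the same rows; the helper names `simulate_ar` and `residual_energy`, the AR(2) test process, and `P_max` are hypothetical illustration choices. It shows why the biased estimate alone cannot select P: with nested models on the same rows it never increases with the order.

```python
import numpy as np

def simulate_ar(a, N, rng):
    """Generate x(n) = -sum_k a[k-1] x(n-k) + w(n) with unit-variance white w(n)."""
    w = rng.standard_normal(N)
    x = np.zeros(N)
    for n in range(N):
        acc = w[n]
        for k in range(1, len(a) + 1):
            if n - k >= 0:
                acc -= a[k - 1] * x[n - k]
        x[n] = acc
    return x

def residual_energy(x, P, Ni):
    """Sum of |e(n)|^2 for n = Ni..N-1 from an order-P least squares fit."""
    N = len(x)
    y = x[Ni:]
    if P == 0:
        return float(np.sum(y ** 2))
    # Row for time n holds x(n-1), ..., x(n-P)
    D = np.column_stack([x[Ni - k:N - k] for k in range(1, P + 1)])
    c, *_ = np.linalg.lstsq(D, y, rcond=None)
    return float(np.sum((y - D @ c) ** 2))

rng = np.random.default_rng(2)
x = simulate_ar([-1.5, 0.8], N=200, rng=rng)

P_max = 10
Ni = P_max                      # first usable time index for every order
Nt = len(x) - Ni                # N_f - N_i + 1
orders = np.arange(0, P_max + 1)
energy = np.array([residual_energy(x, P, Ni) for P in orders])

mse_biased = energy / Nt
mse_unbiased = energy / (Nt - orders)

for P, b, u in zip(orders, mse_biased, mse_unbiased):
    print(f"P={P:2d}  biased={b:.4f}  unbiased={u:.4f}")
```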

Possible Order Selection Methods
Let $N_t \triangleq N_f - N_i + 1$. Then
Model Error: $$\hat{\sigma}_e^2 = \frac{1}{N_t} \sum_{n=N_i}^{N_f} |e(n)|^2$$
Final Prediction Error: $$\mathrm{FPE}(P) = \frac{N_t + P}{N_t - P}\, \hat{\sigma}_e^2$$
Akaike Information Criterion: $$\mathrm{AIC}(P) = N_t \log \hat{\sigma}_e^2 + 2P$$
Minimum Description Length: $$\mathrm{MDL}(P) = N_t \log \hat{\sigma}_e^2 + P \log N_t$$
Criterion Autoregressive Transfer: $$\mathrm{CAT}(P) = \frac{1}{N_t} \sum_{k=1}^{P} \frac{N_t - k}{N_t\, \hat{\sigma}_{e,k}^2} - \frac{N_t - P}{N_t\, \hat{\sigma}_{e,P}^2}$$
where $\hat{\sigma}_{e,k}^2$ denotes the model error $\hat{\sigma}_e^2$ of the order-$k$ model.

Order Selection Methods Comments
• The order selection decision is only critical when P is on the same order as $N_t$, say $N_t/10 \le P < N_t$
– This is the difficult case of too little data to make a good decision
– None of the order selection criteria work well in this case
• Otherwise
– Similar performance will be obtained for a wide range of values of P
– Can simply pick the value where the estimated parameter vector stops changing with increasing values of P
• Text suggests looking at diagnostic plots (residuals, ACF, PACS, etc.) and selecting the parameter manually
– Works well in many applications

Maximum Entropy Motivation
• This discussion is based on [1]
• Suppose again that we have N observations of a random process $x(n)$
• If we knew the autocorrelation, we could calculate the AR parameters exactly (recall chapters 4 and 6)
• With only N observations, we can estimate $r(\ell)$ at most for $|\ell| < N$
• Many signals have autocorrelations that are nonzero for $|\ell| \ge N$
• The segmentation may significantly impair the accuracy of the estimated parameters
– Especially true of narrowband processes
• Nonparametric PSD estimation methods simply extrapolate the estimate of $r(\ell)$ with zeros: $\hat{r}(\ell) = 0$ for $|\ell| \ge N$
• Can we do better?

Maximum Entropy Problem
• Suppose we know (i.e., have estimated) the autocorrelation of a WSS process for $|\ell| \le P$
• How do we extrapolate the estimated autocorrelation sequence for $|\ell| > P$?
• Let us denote the extrapolated values by $r_e(\ell)$
• Then the estimated PSD is given by
$$\hat{R}_x(e^{j\omega}) = \sum_{\ell=-P}^{P} r_x(\ell)\, e^{-j\ell\omega} + \sum_{|\ell| > P} r_e(\ell)\, e^{-j\ell\omega}$$
• We would like $\hat{R}_x(e^{j\omega})$ to have the same properties as a real PSD
– Real-valued
– Nonnegative
• These conditions are not sufficient for a unique extrapolation
• Need an additional constraint
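Returning to the order selection criteria listed on the Possible Order Selection Methods slide, FPE, AIC, and MDL can be evaluated directly from the model error. The following is a minimal sketch, assuming a least squares fit without windowing on a fixed row range so that $N_t$ is the same for every order; `order_criteria`, `model_error`, the simulated AR(2) process, and `P_max` are hypothetical names and values used only for illustration, and CAT is omitted for brevity.

```python
import numpy as np

def order_criteria(sigma2, P, Nt):
    """Return (FPE, AIC, MDL) for model error sigma2, order P, and Nt residuals."""
    fpe = (Nt + P) / (Nt - P) * sigma2
    aic = Nt * np.log(sigma2) + 2 * P
    mdl = Nt * np.log(sigma2) + P * np.log(Nt)
    return fpe, aic, mdl

def model_error(x, P, Ni):
    """sigma_e^2 = (1/Nt) * sum |e(n)|^2 from an order-P least squares fit on n = Ni..N-1."""
    N = len(x)
    y = x[Ni:]
    D = np.column_stack([x[Ni - k:N - k] for k in range(1, P + 1)])
    c, *_ = np.linalg.lstsq(D, y, rcond=None)
    return float(np.mean((y - D @ c) ** 2))

# Simulated AR(2) data (assumed example): x(n) = 1.5 x(n-1) - 0.8 x(n-2) + w(n)
rng = np.random.default_rng(3)
N = 500
w = rng.standard_normal(N)
x = np.zeros(N)
for n in range(2, N):
    x[n] = 1.5 * x[n - 1] - 0.8 * x[n - 2] + w[n]

P_max = 10
Ni = P_max
Nt = N - Ni
orders = np.arange(1, P_max + 1)

scores = np.array([order_criteria(model_error(x, P, Ni), P, Nt) for P in orders])
best = {name: int(orders[np.argmin(scores[:, j])])
        for j, name in enumerate(("FPE", "AIC", "MDL"))}
print(best)
```

With plenty of data relative to the order, as here, the criteria tend to agree closely, which matches the comment on the slides that the decision is only critical when P is comparable to $N_t$.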
