Machine Learning for Signal Processing
Regression and Prediction
Class 16. 29 Oct 2015
Instructor: Bhiksha Raj
11755/18797
A Common Problem
• Can you spot the glitches?
How to fix this problem?
• "Glitches" in audio
  – Must be detected – How?
  – Then what?
• Glitches must be "fixed"
  – Delete the glitch
    • Results in a "hole"
  – Fill in the hole – How?
Interpolation..
• "Extend" the curve on the left to "predict" the values in the "blank" region
  – Forward prediction
• Extend the blue curve on the right leftwards to predict the blank region
  – Backward prediction
• How?
  – Regression analysis..
Detecting the Glitch
[Figure: two reconstructions, labeled "OK" and "NOT OK"]
• Regression-based reconstruction can be done anywhere
• Reconstructed value will not match actual value
• Large error of reconstruction identifies glitches
What is a regression
• Analyzing relationship between variables
• Expressed in many forms
• Wikipedia:
  – Linear regression, Simple regression, Ordinary least squares, Polynomial regression, General linear model, Generalized linear model, Discrete choice, Logistic regression, Multinomial logit, Mixed logit, Probit, Multinomial probit, ...
• Generally a tool to predict variables
Regressions for prediction
• $y = f(x; \Theta) + e$
• Different possibilities:
  – y is a scalar
    • y is real
    • y is categorical (classification)
  – y is a vector
  – x is a vector
    • x is a set of real-valued variables
    • x is a set of categorical variables
    • x is a combination of the two
  – f(.) is a linear or affine function
  – f(.) is a non-linear function
  – f(.) is a time-series model
A linear regression
[Figure: scatter of points in the X–Y plane]
• Assumption: relationship between variables is linear
  – A linear trend may be found relating x and y
  – y = dependent variable
  – x = explanatory variable
  – Given x, y can be predicted as an affine function of x
An imaginary regression..
• http://pages.cs.wisc.edu/~kovar/hall.html
• "Check this shit out (Fig. 1). That's bonafide, 100%-real data, my friends. I took it myself over the course of two weeks. And this was not a leisurely two weeks, either; I busted my ass day and night in order to provide you with nothing but the best data possible. Now, let's look a bit more closely at this data, remembering that it is absolutely first-rate. Do you see the exponential dependence? I sure don't. I see a bunch of crap. Christ, this was such a waste of my time. Banking on my hopes that whoever grades this will just look at the pictures, I drew an exponential through my noise. I believe the apparent legitimacy is enhanced by the fact that I used a complicated computer program to make the fit. I understand this is the same process by which the top quark was discovered."
Linear Regressions
• $y = a^T x + b + e$
  – e = prediction error
• Given a "training" set of {x, y} values: estimate a and b
  – $y_1 = a^T x_1 + b + e_1$
  – $y_2 = a^T x_2 + b + e_2$
  – $y_3 = a^T x_3 + b + e_3$
  – ...
• If a and b are well estimated, prediction error will be small
Linear Regression to a scalar
• $y_1 = a^T x_1 + b + e_1$, $y_2 = a^T x_2 + b + e_2$, $y_3 = a^T x_3 + b + e_3$, ...
• Define:
  $$\mathbf{y} = [y_1 \; y_2 \; y_3 \; \cdots], \quad
    \mathbf{e} = [e_1 \; e_2 \; e_3 \; \cdots], \quad
    A = \begin{bmatrix} a \\ b \end{bmatrix}, \quad
    X = \begin{bmatrix} x_1 & x_2 & x_3 & \cdots \\ 1 & 1 & 1 & \cdots \end{bmatrix}$$
• Rewrite: $\mathbf{y} = A^T X + \mathbf{e}$
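A minimal numpy sketch of this stacking (the toy data and variable names are my own, not from the slides):

```python
import numpy as np

# Toy training set: three 2-dimensional inputs x_i and scalar targets y_i.
x = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.5]])  # rows are x_1, x_2, x_3
y = np.array([3.0, 2.5, 4.0])                        # y = [y_1 y_2 y_3]

# Stack the x_i as columns and append a row of ones, so that the
# intercept b folds into the parameter vector A = [a; b].
X = np.vstack([x.T, np.ones(len(y))])                # shape (dim+1, num_points)

# The model is now simply y = A^T X + e.
```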
Learning the parameters
• Model: $\mathbf{y} = A^T X + \mathbf{e}$; assuming no error, $\hat{\mathbf{y}} = A^T X$
• Given training data: several (x, y) pairs
• Can define a "divergence" $D(\mathbf{y}, \hat{\mathbf{y}})$
  – Measures how much $\hat{\mathbf{y}}$ differs from $\mathbf{y}$
  – Ideally, if the model is accurate, this should be small
• Estimate a, b (i.e. A) to minimize $D(\mathbf{y}, \hat{\mathbf{y}})$
The prediction error as divergence
• $y_1 = a^T x_1 + b + e_1$, $y_2 = a^T x_2 + b + e_2$, $y_3 = a^T x_3 + b + e_3$, ...; in matrix form, $\mathbf{y} = A^T X + \mathbf{e}$
• Define the divergence as the sum of the squared errors in predicting y:
  $$D(\mathbf{y}, \hat{\mathbf{y}}) = E = e_1^2 + e_2^2 + e_3^2 + \cdots
    = (y_1 - a^T x_1 - b)^2 + (y_2 - a^T x_2 - b)^2 + (y_3 - a^T x_3 - b)^2 + \cdots
    = \|\mathbf{y} - A^T X\|^2 = (\mathbf{y} - A^T X)(\mathbf{y} - A^T X)^T$$
Prediction error as divergence
• $y = A^T x + e$
  – e = prediction error
  – Find the "slope" a such that the total squared length of the error lines is minimized
Solving a linear regression
• $\mathbf{y} = A^T X + \mathbf{e}$
• Minimize the squared error: $E = \|\mathbf{y} - A^T X\|^2$
• Solution:
  $$A^T = \mathbf{y} \, \text{pinv}(X), \qquad A = \text{pinv}(X)^T \mathbf{y}^T$$
More Explicitly
• Minimize the squared error:
  $$E = \|\mathbf{y} - A^T X\|^2 = (\mathbf{y} - A^T X)(\mathbf{y} - A^T X)^T
      = \mathbf{y}\mathbf{y}^T + A^T X X^T A - 2\,\mathbf{y} X^T A$$
• Differentiating w.r.t. A and equating to 0:
  $$\frac{dE}{dA} = 2 A^T X X^T - 2\,\mathbf{y} X^T = 0$$
  $$A^T = \mathbf{y} X^T (X X^T)^{-1} = \mathbf{y} \, \text{pinv}(X), \qquad A = (X X^T)^{-1} X \mathbf{y}^T$$
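As a sanity check, a short numpy sketch of this closed form against `np.linalg.lstsq` (the synthetic data and variable names are my own):

```python
import numpy as np

rng = np.random.default_rng(0)
a_true, b_true = np.array([2.0, -1.0]), 0.5

x = rng.normal(size=(100, 2))                        # 100 training points
y = x @ a_true + b_true + 0.1 * rng.normal(size=100)

X = np.vstack([x.T, np.ones(100)])                   # append the ones row

# Closed form from the slide: A^T = y X^T (X X^T)^{-1} = y pinv(X).
A = y @ np.linalg.pinv(X)                            # ~ [2.0, -1.0, 0.5] = [a; b]

# numpy's least-squares solver on the same system gives the same answer.
A_ls, *_ = np.linalg.lstsq(X.T, y, rcond=None)
assert np.allclose(A, A_ls)
```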
Regression in multiple dimensions
• $y_i = A^T x_i + b + e_i$, where each $y_i$ is a vector
  – Also called multiple regression
• Notation:
  – $y_{ij}$ = jth component of vector $y_i$
  – $a_i$ = ith column of A
  – $b_j$ = jth component of b
• Equivalent to saying:
  $y_{i1} = a_1^T x_i + b_1 + e_{i1}$
  $y_{i2} = a_2^T x_i + b_2 + e_{i2}$
  $y_{i3} = a_3^T x_i + b_3 + e_{i3}$
• Fundamentally no different from N separate single regressions
  – But we can use the relationship between the ys to our benefit
Multiple Regression
• Define:
  $$Y = [y_1 \; y_2 \; y_3 \; \cdots], \quad
    E = [e_1 \; e_2 \; e_3 \; \cdots], \quad
    \hat{A} = \begin{bmatrix} A \\ b^T \end{bmatrix}, \quad
    X = \begin{bmatrix} x_1 & x_2 & x_3 & \cdots \\ 1 & 1 & 1 & \cdots \end{bmatrix}$$
• Then $Y = \hat{A}^T X + E$, with divergence
  $$\text{DIV} = \sum_i \|y_i - \hat{A}^T \bar{x}_i\|^2 \qquad (\bar{x}_i = \text{column } i \text{ of } X)$$
• Minimizing:
  $$\hat{A}^T = Y \, \text{pinv}(X) = Y X^T (X X^T)^{-1}, \qquad \hat{A} = (X X^T)^{-1} X Y^T$$
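A numpy sketch of the same estimator with vector outputs (all names and the synthetic data are my own); the final lines show the prediction step applied to a new input:

```python
import numpy as np

rng = np.random.default_rng(1)
A_true = rng.normal(size=(4, 3))     # maps 4-dim x to 3-dim y
b_true = rng.normal(size=3)

x = rng.normal(size=(200, 4))
Y = (x @ A_true + b_true + 0.05 * rng.normal(size=(200, 3))).T  # (3, N)

X = np.vstack([x.T, np.ones(200)])   # (5, N); the ones row absorbs b

# A_hat^T = Y pinv(X); column j of A_hat is the regression for output j.
A_hat = (Y @ np.linalg.pinv(X)).T    # shape (5, 3) = [A; b^T]

# Predicting an output: append a 1 to the new x and apply A_hat^T.
x_new = np.append(rng.normal(size=4), 1.0)
y_hat = A_hat.T @ x_new              # 3-dimensional prediction
```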
A Different Perspective
• y is a noisy reading of $A^T x$:
  $$y = A^T x + e$$
• The error e is Gaussian:
  $$e \sim N(0, \sigma^2 I)$$
• Estimate A from observations
  $$Y = [y_1 \; y_2 \; \cdots \; y_N], \quad X = [x_1 \; x_2 \; \cdots \; x_N]$$
The Likelihood of the data
• $y = A^T x + e$, $e \sim N(0, \sigma^2 I)$
• Probability of observing a specific y, given x, for a particular matrix A:
  $$P(y \mid x; A) = N(y; A^T x, \sigma^2 I)$$
• Probability of the collection $Y = [y_1 \cdots y_N]$, $X = [x_1 \cdots x_N]$:
  $$P(Y \mid X; A) = \prod_i N(y_i; A^T x_i, \sigma^2 I)$$
• Assuming IID for convenience (not necessary)
A Maximum Likelihood Estimate
• $y = A^T x + e$, $e \sim N(0, \sigma^2 I)$, $Y = [y_1 \cdots y_N]$, $X = [x_1 \cdots x_N]$
  $$P(Y \mid X) = \prod_i \frac{1}{(2\pi\sigma^2)^{D/2}} \exp\left( -\frac{\|y_i - A^T x_i\|^2}{2\sigma^2} \right)$$
  $$\log P(Y \mid X; A) = C - \frac{1}{2\sigma^2} \sum_i \|y_i - A^T x_i\|^2$$
• Maximizing the log probability is identical to minimizing the squared error
  – Identical to the least-squares solution:
  $$A^T = Y X^T (X X^T)^{-1} = Y \, \text{pinv}(X), \qquad A = (X X^T)^{-1} X Y^T$$
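A small numerical check of this equivalence, as a sketch on assumed synthetic data (the helper `log_lik` is my own): the least-squares estimate also maximizes the Gaussian log-likelihood, so randomly perturbing it can only lower the likelihood.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 0.1
X = rng.normal(size=(3, 50))                   # x_i as columns
A_true = rng.normal(size=(3, 2))
Y = A_true.T @ X + sigma * rng.normal(size=(2, 50))

def log_lik(A):
    # log P(Y | X; A) = C - (1 / 2 sigma^2) * sum_i ||y_i - A^T x_i||^2
    C = -0.5 * Y.size * np.log(2 * np.pi * sigma**2)
    return C - np.sum((Y - A.T @ X) ** 2) / (2 * sigma**2)

A_ls = (Y @ np.linalg.pinv(X)).T               # least-squares estimate
for _ in range(5):
    A_pert = A_ls + 0.01 * rng.normal(size=A_ls.shape)
    assert log_lik(A_pert) < log_lik(A_ls)     # perturbing never helps
```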
Predicting an output
• From a collection of training data, we have learned A
• Given x for a new instance, but not y, what is y?
• Simple solution: $\hat{y} = A^T \bar{x}$ (with $\bar{x}$ = x extended by a 1, as in training)
Applying it to our problem
• Prediction by regression
• Forward regression:
  $$x_t = a_1 x_{t-1} + a_2 x_{t-2} + \cdots + a_K x_{t-K} + e_t$$
• Backward regression:
  $$x_t = b_1 x_{t+1} + b_2 x_{t+2} + \cdots + b_K x_{t+K} + e_t$$
Applying it to our problem
• Forward prediction:
  $$\begin{bmatrix} x_t \\ x_{t-1} \\ \vdots \\ x_{K+1} \end{bmatrix} =
    \begin{bmatrix} x_{t-1} & x_{t-2} & \cdots & x_{t-K} \\
                    x_{t-2} & x_{t-3} & \cdots & x_{t-K-1} \\
                    \vdots  & \vdots  &        & \vdots \\
                    x_K     & x_{K-1} & \cdots & x_1 \end{bmatrix} a +
    \begin{bmatrix} e_t \\ e_{t-1} \\ \vdots \\ e_{K+1} \end{bmatrix}$$
  i.e. $\bar{x} = X a + e$
• Solution: $a = \text{pinv}(X) \, \bar{x}$
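A runnable sketch of this construction (the AR(2) test signal and the function name are my own, for illustration):

```python
import numpy as np

def forward_predictor(x, K):
    # Fit x_t ≈ a_1 x_{t-1} + ... + a_K x_{t-K} over a training segment:
    # each row of X holds the K samples preceding one target sample.
    T = len(x)
    X = np.array([x[t-K:t][::-1] for t in range(K, T)])  # [x_{t-1} ... x_{t-K}]
    x_bar = x[K:]                                        # targets
    return np.linalg.pinv(X) @ x_bar                     # a = pinv(X) x_bar

# Example: a noisy AR(2) recursion; the fitted taps recover its coefficients.
rng = np.random.default_rng(3)
x = np.zeros(500)
for t in range(2, 500):
    x[t] = 1.5 * x[t-1] - 0.8 * x[t-2] + 0.01 * rng.normal()
print(forward_predictor(x, 2))   # ~ [1.5, -0.8]
```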
Applying it to our problem
• Backward prediction:
  $$\begin{bmatrix} x_{t-K-1} \\ x_{t-K-2} \\ \vdots \\ x_1 \end{bmatrix} =
    \begin{bmatrix} x_{t-K}   & x_{t-K+1} & \cdots & x_{t-1} \\
                    x_{t-K-1} & x_{t-K}   & \cdots & x_{t-2} \\
                    \vdots    & \vdots    &        & \vdots \\
                    x_2       & x_3       & \cdots & x_{K+1} \end{bmatrix} b +
    \begin{bmatrix} e_{t-K-1} \\ e_{t-K-2} \\ \vdots \\ e_1 \end{bmatrix}$$
  i.e. $\bar{x} = X b + e$
• Solution: $b = \text{pinv}(X) \, \bar{x}$
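The mirrored sketch for the backward taps (again with my own names); note that it is equivalent to running the forward fit on the time-reversed signal:

```python
import numpy as np

def backward_predictor(x, K):
    # Fit x_t ≈ b_1 x_{t+1} + ... + b_K x_{t+K}: the same least-squares
    # construction, with each row built from the K samples that follow.
    T = len(x)
    X = np.array([x[t+1:t+K+1] for t in range(T - K)])   # [x_{t+1} ... x_{t+K}]
    x_bar = x[:T-K]                                      # targets
    return np.linalg.pinv(X) @ x_bar                     # b = pinv(X) x_bar

# Equivalently: backward prediction is forward prediction on x reversed, i.e.
# np.allclose(backward_predictor(x, K), forward_predictor(x[::-1], K)) holds.
```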
Finding the burst
• At each time t:
  – Learn a "forward" predictor
  – Predict the next sample: $x_t^{est} = \sum_k a_{t,k} \, x_{t-k}$
  – Compute the forward error: $ferr_t = |x_t - x_t^{est}|^2$
  – Learn a "backward" predictor and compute the backward error $berr_t$ the same way
• Compute the average prediction error over a window and threshold it
  – If the error exceeds the threshold, identify a burst
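Pulling the pieces together, one possible end-to-end sketch (it reuses the `forward_predictor`/`backward_predictor` helpers above; for brevity it fits a single pair of predictors on the whole signal rather than per-time, and the tap count, window size, and median-based threshold are my own choices, not the slides'):

```python
import numpy as np

def prediction_errors(x, a):
    # Squared forward-prediction error at each t >= K for fixed taps a.
    K = len(a)
    preds = np.array([a @ x[t-K:t][::-1] for t in range(K, len(x))])
    return (x[K:] - preds) ** 2

def find_bursts(x, K=16, win=64, factor=8.0):
    a = forward_predictor(x, K)
    b = backward_predictor(x, K)
    ferr = prediction_errors(x, a)
    berr = prediction_errors(x[::-1], b)[::-1]   # backward errors, restored to signal order
    n = min(len(ferr), len(berr))
    err = ferr[:n] + berr[:n]
    # Average the error over a sliding window, then threshold relative
    # to the typical (median) windowed error of the whole signal.
    win_err = np.convolve(err, np.ones(win) / win, mode="same")
    return np.flatnonzero(win_err > factor * np.median(win_err))
```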