Super-resolution using Gaussian Process Regression
Final Year Project Interim Report
He He
Department of Electronic and Information Engineering
The Hong Kong Polytechnic University
December 30, 2010
Outline
1 Introduction
2 Gaussian Process Regression
  Multivariate Normal Distribution
  Gaussian Process Regression
  Training
3 GPR for Super-resolution
  Framework
  Covariance Function
The goal of super-resolution (SR) is to estimate a high-resolution (HR) image from one or a set of low-resolution (LR) images. It is widely applied in face recognition, medical imaging, HDTV, etc.
Figure: Face recognition in video.
Figure: Super-resolution in medical imaging.
Super-resolution Methods
Interpolation-based methods: fast, but the HR image is usually blurred. E.g., bicubic interpolation, NEDI.
Learning-based methods: hallucinate textures from an HR/LR image pair database.
Reconstruction-based methods: formulate an optimization problem constrained by the LR image with various priors.
Multivariate Normal Distribution: Definition
A random vector X = (X_1, X_2, ..., X_p) is said to be multivariate normally (MVN) distributed if every linear combination of its components Y = a^T X has a univariate normal distribution. Real-world random variables can often be approximated as following a multivariate normal distribution.
The probability density function of X is
f(x) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right)    (1)
where µ is the mean of X and Σ is the covariance matrix.
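As a quick check of Eq. (1), the density can be evaluated directly with NumPy. This is a minimal sketch; the function name mvn_pdf is an illustrative choice and not part of the report.

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Evaluate the multivariate normal density of Eq. (1) at a point x."""
    p = len(mu)
    diff = x - mu
    norm_const = (2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(Sigma))
    quad = diff @ np.linalg.solve(Sigma, diff)   # (x - mu)^T Sigma^{-1} (x - mu)
    return np.exp(-0.5 * quad) / norm_const

# Example: the bivariate case on the next slide (mu = [1, 1], Sigma = I)
print(mvn_pdf(np.array([1.0, 1.0]), np.array([1.0, 1.0]), np.eye(2)))
```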
Multivariate Normal Distribution: Example
Bivariate normal distribution with
\mu = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad \Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.
Multivariate Normal Distribution: Property 1
The joint distribution of two MVN random variables is also an MVN distribution.
Given X_1 ∼ N(µ_1, Σ_11), X_2 ∼ N(µ_2, Σ_22) and X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, we have X ∼ N_p(µ, Σ) with
\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \quad \Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}.
Multivariate Normal Distribution: Property 2
The conditional distributions of the components of an MVN are (multivariate) normal. The distribution of X_1, given that X_2 = x_2, is normal and has
Mean = \mu_1 + \Sigma_{12} \Sigma_{22}^{-1} (x_2 - \mu_2)    (2)
Covariance = \Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21}    (3)
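The conditional mean and covariance of Eqs. (2) and (3) translate directly into a few lines of linear algebra. A minimal NumPy sketch, with an illustrative helper name, assuming the block means and covariances are given as arrays:

```python
import numpy as np

def conditional_mvn(mu1, mu2, S11, S12, S21, S22, x2):
    """Mean and covariance of X1 given X2 = x2, following Eqs. (2) and (3)."""
    solve = np.linalg.solve                      # avoids forming Sigma_22^{-1} explicitly
    cond_mean = mu1 + S12 @ solve(S22, x2 - mu2)
    cond_cov = S11 - S12 @ solve(S22, S21)
    return cond_mean, cond_cov
```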
Gaussian Process: Definition
A Gaussian process (GP) defines a distribution over functions f, where f is a mapping from the input space X to R, such that for any finite subset of X the marginal distribution P(f(x_1), f(x_2), ..., f(x_n)) is a multivariate normal distribution:
f | X ∼ N(m(X), K(X, X))    (4)
where
X = {x_1, x_2, ..., x_n}    (5)
m(x) = E[f(x)]    (6)
k(x_i, x_j) = E[(f(x_i) - m(x_i))(f(x_j) - m(x_j))]    (7)
and K(X, X) denotes the covariance matrix such that K_{ij} = k(x_i, x_j).
Gaussian Process
Formally, we write the Gaussian process as
f(x) ∼ GP(m(x), k(x_i, x_j))    (8)
Without loss of generality, the mean is usually taken to be zero.
The GP is parameterized by the mean function m(x) and the covariance function k(x_i, x_j).
Inference is carried out directly in function space.
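To make the function-space view concrete, one can draw sample functions from a zero-mean GP prior. The sketch below uses the squared exponential covariance that appears later in Eq. (29); the helper name se_kernel and the jitter term are illustrative assumptions.

```python
import numpy as np

def se_kernel(X1, X2, sigma_f=1.0, ell=1.0):
    """Squared exponential covariance between two sets of 1-D inputs (see Eq. (29))."""
    d = X1[:, None] - X2[None, :]
    return sigma_f**2 * np.exp(-0.5 * d**2 / ell**2)

# Draw three sample functions from the zero-mean GP prior f | X ~ N(0, K(X, X))
X = np.linspace(0.0, 5.0, 100)
K = se_kernel(X, X)
jitter = 1e-9 * np.eye(len(X))            # keeps the covariance numerically positive definite
samples = np.random.multivariate_normal(np.zeros(len(X)), K + jitter, size=3)
```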
Gaussian Process Regression
Model:
f(x) ∼ GP(m(x), k(x_i, x_j))    (9)
Given the test inputs X_*, the prior on the test outputs f_* is
f_* ∼ N(0, K(X_*, X_*))    (10)
Under the Gaussian prior, the joint distribution of the training outputs f and the test outputs f_* is
\begin{bmatrix} f \\ f_* \end{bmatrix} \sim N\left( 0, \begin{bmatrix} K(X, X) & K(X, X_*) \\ K(X_*, X) & K(X_*, X_*) \end{bmatrix} \right).    (11)
Noisy Model
In reality, we do not have access to true function values but rather noisy observations. Assuming independent, identically distributed noise, we have the noisy model
y = f(x) + ε,  ε ∼ N(0, \sigma_n^2)    (12)
f(x) ∼ GP(m(x), K(X, X))    (13)
Var(y) = Var(f(x)) + Var(ε) = K(X, X) + \sigma_n^2 I    (14)
Thus, the joint distribution for prediction is
\begin{bmatrix} y \\ f_* \end{bmatrix} \sim N\left( 0, \begin{bmatrix} K(X, X) + \sigma_n^2 I & K(X, X_*) \\ K(X_*, X) & K(X_*, X_*) \end{bmatrix} \right).    (15)
Prediction
Referring to the previous property of the conditional distribution, we can obtain
f_* ∼ N(\bar{f}_*, V(f_*))    (16)
\bar{f}_* = K(X_*, X) [K(X, X) + \sigma_n^2 I]^{-1} y    (17)
V(f_*) = K(X_*, X_*) - K(X_*, X) [K(X, X) + \sigma_n^2 I]^{-1} K(X, X_*)    (18)
where y are the training outputs and f_* are the test outputs, which are predicted by the mean \bar{f}_*.
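A minimal NumPy sketch of the predictive equations (17) and (18), reusing the se_kernel helper sketched earlier; gp_predict is an illustrative name and the default noise level is an assumed example value.

```python
import numpy as np

def gp_predict(X_train, y_train, X_test, kernel, sigma_n=0.1):
    """Predictive mean and covariance of Eqs. (17) and (18)."""
    K = kernel(X_train, X_train) + sigma_n**2 * np.eye(len(X_train))   # K(X, X) + sigma_n^2 I
    K_s = kernel(X_test, X_train)                                      # K(X_*, X)
    K_ss = kernel(X_test, X_test)                                      # K(X_*, X_*)
    alpha = np.linalg.solve(K, y_train)                                # [K + sigma_n^2 I]^{-1} y
    mean = K_s @ alpha                                                 # Eq. (17)
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)                       # Eq. (18)
    return mean, cov
```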
Marginal Likelihood
GPR model:
y = f + ε    (19)
f ∼ GP(m(x), K)    (20)
ε ∼ N(0, \sigma_n^2 I)    (21)
y is an n-dimensional vector of observations. Without loss of generality, let m(x) = 0. Thus y | X follows a normal distribution with
E(y | X) = 0    (22)
Var(y | X) = K(X, X) + \sigma_n^2 I    (23)
Marginal Likelihood
Let K_y = Var(y | X). Then
p(y | X) = \frac{1}{(2\pi)^{n/2} |K_y|^{1/2}} \exp\left( -\frac{1}{2} y^T K_y^{-1} y \right)    (24)
The log marginal likelihood is
L = \log p(y | X) = -\frac{1}{2} y^T K_y^{-1} y - \frac{1}{2} \log |K_y| - \frac{n}{2} \log 2\pi    (25)
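Eq. (25) is typically evaluated through a Cholesky factorization of K_y rather than an explicit inverse. A minimal NumPy sketch under that standard assumption; the function name is illustrative.

```python
import numpy as np

def log_marginal_likelihood(X, y, kernel, sigma_n):
    """Log marginal likelihood of Eq. (25) for a zero-mean GP."""
    n = len(y)
    K_y = kernel(X, X) + sigma_n**2 * np.eye(n)
    L = np.linalg.cholesky(K_y)                          # K_y = L L^T (numerically stable)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K_y^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(L)))           # log |K_y|
    return -0.5 * y @ alpha - 0.5 * log_det - 0.5 * n * np.log(2.0 * np.pi)
```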
Maximum a posteriori
Matrix derivatives:
\frac{\partial}{\partial \theta_i} Y^{-1} = -Y^{-1} \frac{\partial Y}{\partial \theta_i} Y^{-1}    (26)
\frac{\partial}{\partial \theta_i} \log |Y| = \mathrm{tr}\left( Y^{-1} \frac{\partial Y}{\partial \theta_i} \right)    (27)
Gradient ascent:
\frac{\partial L}{\partial \theta_i} = \frac{1}{2} y^T K^{-1} \frac{\partial K}{\partial \theta_i} K^{-1} y - \frac{1}{2} \mathrm{tr}\left( K^{-1} \frac{\partial K}{\partial \theta_i} \right)    (28)
where \partial K / \partial \theta_i is the matrix of element-wise derivatives.
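Eq. (28) gives the gradient used in each ascent step. A minimal sketch, assuming the current kernel matrix and its element-wise derivative with respect to one hyperparameter are already available; the function name lml_gradient is illustrative.

```python
import numpy as np

def lml_gradient(y, K_y, dK_dtheta):
    """Gradient of the log marginal likelihood w.r.t. one hyperparameter, Eq. (28).

    K_y       : K(X, X) + sigma_n^2 I at the current hyperparameters
    dK_dtheta : element-wise derivative of K_y w.r.t. the hyperparameter theta_i
    """
    K_inv = np.linalg.inv(K_y)
    alpha = K_inv @ y
    return 0.5 * alpha @ dK_dtheta @ alpha - 0.5 * np.trace(K_inv @ dK_dtheta)
```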
Graphical Representation
Model: y = f(x) + ε
Squares: observed pixels.
Circles: unknown Gaussian field.
Inputs (x): neighbors (predictors) of the target pixel.
Outputs (y): pixel at the center of each 3×3 patch.
Thick horizontal line: a set of fully connected nodes.
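To illustrate how such input/output pairs could be collected, the sketch below forms one training pair per interior 3×3 patch, taking the 8 neighbors as the input vector and the center pixel as the target. This is a minimal NumPy sketch of the idea described on this slide, not the report's actual implementation.

```python
import numpy as np

def patch_training_pairs(image):
    """Build (neighbors, center) training pairs from every interior 3x3 patch."""
    H, W = image.shape
    X, y = [], []
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            patch = image[i - 1:i + 2, j - 1:j + 2].flatten()
            X.append(np.delete(patch, 4))   # the 8 neighbors (drop the center, index 4)
            y.append(patch[4])              # the center pixel is the regression target
    return np.array(X), np.array(y)
```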
Workflow
Stage 1: interpolation
- Input LR patch.
- Sample training targets.
- SR based on bicubic interpolation.
Stage 2: deblurring
- Sample training targets.
- Obtain neighbors from the downsampled patch.
- SR based on the simulated blurring process.
Covariance Function
The covariance function defines the similarity between two points (vectors) and determines the underlying distribution of functions in the GP.
Squared exponential covariance function:
k(x_i, x_j) = \sigma_f^2 \exp\left( -\frac{(x_i - x_j)^T (x_i - x_j)}{2 \ell^2} \right)    (29)
\sigma_f^2 represents the signal variance and ℓ defines the characteristic length scale.
Given an image I, the covariance between two pixels I_{i,j} and I_{m,n} is calculated as k(I_{(i,j),N}, I_{(m,n),N}), where N denotes the 8 nearest pixels around the pixel. Therefore, the similarity is based on the Euclidean distance between the pixels' neighborhoods.
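A minimal sketch of this neighborhood-based covariance, assuming grayscale pixels at interior coordinates; the helper name pixel_covariance and the default hyperparameter values are illustrative.

```python
import numpy as np

def pixel_covariance(image, p, q, sigma_f=1.0, ell=1.0):
    """SE covariance of Eq. (29) between pixels p = (i, j) and q = (m, n),
    computed on their 8-pixel neighborhoods (both pixels must be interior)."""
    def neighbors(i, j):
        patch = image[i - 1:i + 2, j - 1:j + 2].flatten()
        return np.delete(patch, 4)          # drop the center pixel, keep the 8 neighbors
    d = neighbors(*p) - neighbors(*q)
    return sigma_f**2 * np.exp(-0.5 * (d @ d) / ell**2)
```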
Covariance Function
(a) Test point  (b) Training patch  (c) Covariance matrix
Local similarity: high responses (red regions) in the training patch are concentrated on edges.
Global similarity: high-response regions also include other similar edges within the patch.
Conclusion: pixels whose neighborhoods are embedded in a structure similar to that of the target pixel tend to receive higher weights during prediction.
Hyperparameter Adaptation
Hyperparameters:
\sigma_f^2: signal variance
\sigma_n^2: noise variance
ℓ: characteristic length scale
(a) Test  (b) Training  (c) ℓ = 0.50, σ_n = 0.01  (d) ℓ = 0.05, σ_n = 0.001  (e) ℓ = 1.65, σ_n = 0.14
(c): MAP estimation
(d): Quickly varying field with low noise
(e): Slowly varying field with high noise