

  1. Lattice Gaussian Sampling with Markov Chain Monte Carlo (MCMC)
     Cong Ling, Imperial College London
     Joint work with Zheng Wang (Huawei Technologies Shanghai)
     September 20, 2016

  2. Outline
     1. Background
     2. Markov Chain Monte Carlo (MCMC)
     3. Convergence Analysis
     4. Open Questions

  3. Lattice Gaussian Distribution
     Lattice: $\Lambda = \mathcal{L}(\mathbf{B}) = \{\mathbf{B}\mathbf{x} : \mathbf{x} \in \mathbb{Z}^n\}$
     Continuous Gaussian distribution: $\rho_{\sigma,\mathbf{c}}(\mathbf{z}) = \left(\tfrac{1}{\sqrt{2\pi}\,\sigma}\right)^n e^{-\frac{\|\mathbf{z}-\mathbf{c}\|^2}{2\sigma^2}}$
     Discrete Gaussian distribution over the lattice $\Lambda$:
     $D_{\Lambda,\sigma,\mathbf{c}}(\mathbf{x}) = \frac{\rho_{\sigma,\mathbf{c}}(\mathbf{B}\mathbf{x})}{\rho_{\sigma,\mathbf{c}}(\Lambda)} = \frac{e^{-\frac{1}{2\sigma^2}\|\mathbf{B}\mathbf{x}-\mathbf{c}\|^2}}{\sum_{\mathbf{x} \in \mathbb{Z}^n} e^{-\frac{1}{2\sigma^2}\|\mathbf{B}\mathbf{x}-\mathbf{c}\|^2}}$,
     where $\rho_{\sigma,\mathbf{c}}(\Lambda) \triangleq \sum_{\mathbf{B}\mathbf{x} \in \Lambda} \rho_{\sigma,\mathbf{c}}(\mathbf{B}\mathbf{x})$.
     [Fig. 1: Discrete Gaussian distribution $D_{\mathbb{Z}^2,\sigma}$ over $\mathbb{Z}^2$.]
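A minimal sketch (not from the slides) of how $D_{\Lambda,\sigma,\mathbf{c}}$ can be evaluated numerically on a small lattice: the normalizing constant $\rho_{\sigma,\mathbf{c}}(\Lambda)$ is approximated by truncating the sum over $\mathbb{Z}^2$ to a box. The basis, $\sigma$, $\mathbf{c}$ and the truncation radius are illustrative assumptions.

```python
# Illustrative sketch: evaluate the discrete Gaussian D_{Lambda,sigma,c} on a
# 2-D lattice by truncating the normalizing sum to a box of integer vectors.
import numpy as np

def rho(B, x, sigma, c):
    """Unnormalized Gaussian weight rho_{sigma,c}(Bx) = exp(-||Bx - c||^2 / (2 sigma^2))."""
    v = B @ x - c
    return np.exp(-v @ v / (2.0 * sigma ** 2))

def discrete_gaussian_pmf(B, sigma, c, radius=20):
    """Truncated normalization: sum rho over integer vectors x with |x_i| <= radius."""
    coords = np.arange(-radius, radius + 1)
    pmf, Z = {}, 0.0
    for x1 in coords:
        for x2 in coords:
            x = np.array([x1, x2], dtype=float)
            w = rho(B, x, sigma, c)
            pmf[(x1, x2)] = w
            Z += w
    return {k: v / Z for k, v in pmf.items()}

if __name__ == "__main__":
    B = np.array([[1.0, 0.5], [0.0, 1.0]])   # example basis (assumption)
    pmf = discrete_gaussian_pmf(B, sigma=2.0, c=np.array([0.3, -0.7]))
    print(sum(pmf.values()))                  # ~1.0 by construction
```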

  4. Why Does It Matter? Decoding
     The shape of $D_{\Lambda,\sigma,\mathbf{c}}(\mathbf{x})$ suggests that a lattice point $\mathbf{B}\mathbf{x}$ closer to $\mathbf{c}$ is sampled with higher probability, so sampling can be used to:
     - solve the CVP and SVP problems [Aggarwal et al. 2015, Stephens-Davidowitz 2016]
     - decode MIMO systems [Liu, Ling and Stehlé 2011]
     Closest Vector Problem (CVP): given a lattice basis $\mathbf{B} \in \mathbb{R}^{n \times n}$ and a target point $\mathbf{c} \in \mathbb{R}^n$, find the lattice point $\mathbf{B}\mathbf{x}$ closest to $\mathbf{c}$.
     Shortest Vector Problem (SVP): given a lattice basis $\mathbf{B} \in \mathbb{R}^{n \times n}$, find a shortest nonzero vector of the lattice $\mathcal{L}(\mathbf{B})$.
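For intuition only, a toy brute-force CVP search over a small box of integer coordinates (real CVP solvers use enumeration or sieving; the basis, target and search radius here are assumptions):

```python
# Toy brute-force CVP: exhaustively search integer vectors x with |x_i| <= radius
# and return the one minimizing ||Bx - c||. Exponential in n; for illustration only.
import itertools
import numpy as np

def brute_force_cvp(B, c, radius=5):
    n = B.shape[1]
    best_x, best_d = None, np.inf
    for x in itertools.product(range(-radius, radius + 1), repeat=n):
        x = np.array(x, dtype=float)
        d = np.linalg.norm(B @ x - c)
        if d < best_d:
            best_x, best_d = x, d
    return best_x, best_d

B = np.array([[2.0, 1.0], [0.0, 1.5]])      # example basis (assumption)
x, d = brute_force_cvp(B, np.array([0.7, 2.3]))
print(x, d)
```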

  5. Why Does It Matter?
     Mathematics: prove the transference theorem of lattices [Banaszczyk 1993]
     Coding: obtain the full shaping gain in lattice coding [Forney, Wei 1989, Kschischang, Pasupathy 1993]; capacity-achieving distributions in information theory: Gaussian channel [Ling, Belfiore 2013], Gaussian wiretap channel [Ling, Luzzi, Belfiore and Stehlé 2013], fading and MIMO channels [Campello, Ling, Belfiore 2016]...
     Cryptography: propose lattice-based cryptosystems based on worst-case hardness assumptions [Micciancio, Regev 2004]; underpin fully homomorphic encryption for cloud computing [Gentry 2009]
     How to sample from the lattice Gaussian distribution? That is the problem lattice Gaussian sampling aims to solve.

  6. State of the Art
     Klein's algorithm [Klein 2000]: works if $\sigma \geq \omega(\sqrt{\log n}) \cdot \max_{1 \leq i \leq n} \|\tilde{\mathbf{b}}_i\|$, where the $\tilde{\mathbf{b}}_i$ are the Gram-Schmidt vectors [Gentry, Peikert, Vaikuntanathan 2008].
     Aggarwal et al. 2015: works for arbitrary $\sigma$; exponential time, exponential space.
     Markov chain Monte Carlo [Wang, Ling 2014]: arbitrary $\sigma$, polynomial space; how much time?
     For special lattices: Construction A, B, etc., very fast [Campello, Belfiore 2016]; polar lattices, quasilinear complexity [Yan et al. 2014].

  7. Klein Sampling
     By sequentially sampling from the 1-dimensional conditional Gaussian distribution $D_{\mathbb{Z},\sigma_i,\tilde{x}_i}$ in backward order from $x_n$ to $x_1$, the sampling probability of Klein's algorithm is
     $P_{\text{Klein}}(\mathbf{x}) = \prod_{i=1}^{n} D_{\mathbb{Z},\sigma_i,\tilde{x}_i}(x_i) = \frac{\rho_{\sigma,\mathbf{c}}(\mathbf{B}\mathbf{x})}{\prod_{i=1}^{n} \rho_{\sigma_i,\tilde{x}_i}(\mathbb{Z})}$  (1)
     Klein's Algorithm
     Input: $\mathbf{B}, \sigma, \mathbf{c}$; Output: $\mathbf{B}\mathbf{x} \in \Lambda$
     1. let $\mathbf{B} = \mathbf{Q}\mathbf{R}$ and $\mathbf{c}' = \mathbf{Q}^T \mathbf{c}$
     2. for $i = n, \ldots, 1$ do
     3.   let $\tilde{x}_i = \frac{c'_i - \sum_{j=i+1}^{n} r_{i,j} x_j}{r_{i,i}}$, $\sigma_i = \frac{\sigma}{|r_{i,i}|}$
     4.   sample $x_i$ from $D_{\mathbb{Z},\sigma_i,\tilde{x}_i}$
     5. end for
     6. return $\mathbf{B}\mathbf{x}$
     $P_{\text{Klein}}(\mathbf{x})$ has been shown in [GPV 2008] to be within a negligible statistical distance of $D_{\Lambda,\sigma,\mathbf{c}}(\mathbf{x})$ if $\sigma \geq \omega(\sqrt{\log n}) \cdot \max_{1 \leq i \leq n} \|\tilde{\mathbf{b}}_i\|$.
     Klein's algorithm has polynomial complexity $O(n^2)$, excluding the QR decomposition.
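A Python sketch of the pseudocode above. The slides leave the 1-D discrete Gaussian step abstract; here it is filled in with a simple truncated-table sampler, and the function names, truncation parameter `tau` and example basis are illustrative assumptions rather than part of the original algorithm statement.

```python
# Sketch of Klein's algorithm: QR-decompose B, then sample x_n, ..., x_1 from
# 1-D discrete Gaussians centered at the conditional means x~_i with widths sigma_i.
import numpy as np

def sample_dgauss_1d(center, sigma, rng, tau=10):
    """Sample from D_{Z,sigma,center}, truncated to center +/- tau*sigma
    (truncation error is negligible for tau around 10)."""
    lo = int(np.floor(center - tau * sigma))
    hi = int(np.ceil(center + tau * sigma))
    support = np.arange(lo, hi + 1)
    w = np.exp(-(support - center) ** 2 / (2.0 * sigma ** 2))
    return rng.choice(support, p=w / w.sum())

def klein_sample(B, sigma, c, rng=None):
    """Return one lattice point Bx with x distributed (approximately) as P_Klein."""
    rng = rng or np.random.default_rng()
    n = B.shape[1]
    Q, R = np.linalg.qr(B)          # B = QR
    c_prime = Q.T @ c
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):  # backward order x_n, ..., x_1
        center = (c_prime[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
        sigma_i = sigma / abs(R[i, i])
        x[i] = sample_dgauss_1d(center, sigma_i, rng)
    return B @ x, x

B = np.array([[1.0, 0.4], [0.2, 1.3]])      # example basis (assumption)
v, x = klein_sample(B, sigma=3.0, c=np.array([0.5, -1.0]))
print(x, v)
```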

  8. Markov Chain Monte Carlo (MCMC)
     MCMC methods were introduced into lattice Gaussian sampling for the range of $\sigma$ beyond the reach of Klein's algorithm [Wang, Ling and Hanrot, 2014].
     MCMC methods sample from an intractable target distribution by building a Markov chain that randomly generates the next sample conditioned on the previous one.

  9. Gibbs Sampling
     Gibbs sampling: at each Markov move, perform sampling over a single component of $\mathbf{x}$:
     $P(x_i^{t+1} \mid \mathbf{x}^t_{[-i]}) = \frac{e^{-\frac{1}{2\sigma^2}\|\mathbf{B}\mathbf{x}^{t+1}-\mathbf{c}\|^2}}{\sum_{x_i^{t+1} \in \mathbb{Z}} e^{-\frac{1}{2\sigma^2}\|\mathbf{B}\mathbf{x}^{t+1}-\mathbf{c}\|^2}}$,
     where $\mathbf{x}^t_{[-i]} = (x^t_1, \ldots, x^t_{i-1}, x^t_{i+1}, \ldots, x^t_n)$.
     Gibbs-Klein sampling algorithm [Wang, Ling and Hanrot, 2014]: at each Markov move, perform the sampling over a block of $m$ components of $\mathbf{x}$, while keeping the complexity at the same level as that of componentwise sampling:
     $P(\mathbf{x}^{t+1}_{\text{block}} \mid \mathbf{x}^t_{[-\text{block}]}) = \frac{e^{-\frac{1}{2\sigma^2}\|\mathbf{B}\mathbf{x}^{t+1}-\mathbf{c}\|^2}}{\sum_{\mathbf{x}^{t+1}_{\text{block}} \in \mathbb{Z}^m} e^{-\frac{1}{2\sigma^2}\|\mathbf{B}\mathbf{x}^{t+1}-\mathbf{c}\|^2}}$
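A sketch (not from the slides) of a single componentwise Gibbs move: with the other coordinates fixed, $\|\mathbf{B}\mathbf{x}-\mathbf{c}\|^2$ is a 1-D quadratic in $x_i$, so the conditional can be enumerated around its minimizer. The candidate window size is an assumption made for tractable enumeration.

```python
# One Gibbs move over coordinate i: resample x_i from P(x_i | x_{[-i]})
# for the lattice Gaussian, enumerating integers near the conditional mean.
import numpy as np

def gibbs_move(B, x, c, sigma, i, rng, window=40):
    b_i = B[:, i]
    r = c - B @ x + b_i * x[i]                 # residual with x_i's contribution removed
    center = (b_i @ r) / (b_i @ b_i)            # minimizer of ||b_i*t - r||^2 over real t
    cand = np.arange(int(np.floor(center)) - window,
                     int(np.ceil(center)) + window + 1)
    # ||Bx(t) - c||^2 = ||b_i*t - r||^2; subtract the min for numerical stability
    d2 = np.array([np.sum((b_i * t - r) ** 2) for t in cand])
    w = np.exp(-(d2 - d2.min()) / (2.0 * sigma ** 2))
    x_new = x.copy()
    x_new[i] = rng.choice(cand, p=w / w.sum())
    return x_new
```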

 10. Metropolis-Hastings Sampling
     In the 1970s, the original Metropolis sampling was extended to a more general scheme known as Metropolis-Hastings (MH) sampling, which can be summarized as follows:
     Given the current state $\mathbf{x}$ of the Markov chain $\mathbf{X}^t$, a candidate state $\mathbf{y}$ for the next move $\mathbf{X}^{t+1}$ is generated from the proposal distribution $q(\mathbf{x}, \mathbf{y})$.
     Then the acceptance ratio $\alpha$ for $\mathbf{y}$ is computed:
     $\alpha(\mathbf{x}, \mathbf{y}) = \min\left\{1, \frac{\pi(\mathbf{y})\, q(\mathbf{y}, \mathbf{x})}{\pi(\mathbf{x})\, q(\mathbf{x}, \mathbf{y})}\right\}$,  (2)
     where $\pi(\mathbf{x})$ is the target invariant distribution.
     $\mathbf{y}$ and $\mathbf{x}$ are accepted as the state of $\mathbf{X}^{t+1}$ with probability $\alpha$ and $1-\alpha$, respectively.
     In MH sampling, $q(\mathbf{x}, \mathbf{y})$ can be any fixed distribution. However, as the dimension grows, finding a suitable $q(\mathbf{x}, \mathbf{y})$ can be difficult.
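A generic MH step implementing Eq. (2); the callable arguments `pi`, `q_sample` and `q_density` are illustrative placeholders, not names from the slides.

```python
# One Metropolis-Hastings step: propose y ~ q(x, .), accept with probability
# min{1, pi(y) q(y,x) / (pi(x) q(x,y))}, otherwise stay at x.
import numpy as np

def mh_step(x, pi, q_sample, q_density, rng):
    y = q_sample(x)                                   # candidate from q(x, .)
    alpha = min(1.0, (pi(y) * q_density(y, x)) / (pi(x) * q_density(x, y)))
    return y if rng.random() < alpha else x
```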

 11. Independent Metropolis-Hastings-Klein Sampling
     In [Wang, Ling 2015], Klein's sampling is used to generate the candidate state $\mathbf{y}$ for the Markov move $\mathbf{X}^{t+1}$, namely $q(\mathbf{x}, \mathbf{y}) = P_{\text{Klein}}(\mathbf{y})$.
     The generation of $\mathbf{y}$ for $\mathbf{X}^{t+1}$ does not depend on the previous state $\mathbf{X}^t$; $q(\mathbf{x}, \mathbf{y}) = q(\mathbf{y})$ is a special case of MH sampling known as independent MH sampling [Tierney, 1991].
     Independent MHK sampling algorithm:
     1. Sample from the independent proposal distribution through Klein's algorithm to obtain the candidate state $\mathbf{y}$ for $\mathbf{X}^{t+1}$:
        $q(\mathbf{x}, \mathbf{y}) = q(\mathbf{y}) = P_{\text{Klein}}(\mathbf{y}) = \frac{\rho_{\sigma,\mathbf{c}}(\mathbf{B}\mathbf{y})}{\prod_{i=1}^{n} \rho_{\sigma_i,\tilde{y}_i}(\mathbb{Z})}$, where $\mathbf{y} \in \mathbb{Z}^n$.
     2. Calculate the acceptance ratio:
        $\alpha(\mathbf{x}, \mathbf{y}) = \min\left\{1, \frac{\pi(\mathbf{y})\, q(\mathbf{y}, \mathbf{x})}{\pi(\mathbf{x})\, q(\mathbf{x}, \mathbf{y})}\right\} = \min\left\{1, \frac{\pi(\mathbf{y})\, q(\mathbf{x})}{\pi(\mathbf{x})\, q(\mathbf{y})}\right\}$, where $\pi = D_{\Lambda,\sigma,\mathbf{c}}$.
     3. Make a decision for $\mathbf{X}^{t+1}$ based on $\alpha(\mathbf{x}, \mathbf{y})$: accept $\mathbf{X}^{t+1} = \mathbf{y}$ or not.
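A sketch of one independent MHK move under stated assumptions: it reuses `klein_sample` from the Klein sketch above (assumed to be in scope), evaluates $P_{\text{Klein}}$ in the log domain via a truncated 1-D normalizer, and the helper names and truncation parameter `tau` are illustrative.

```python
# One move of the independent Metropolis-Hastings-Klein chain:
# propose y ~ P_Klein, accept with probability min{1, pi(y) q(x) / (pi(x) q(y))}.
import numpy as np

def log_rho(v, sigma):
    return -(v @ v) / (2.0 * sigma ** 2)

def klein_log_prob(B, sigma, c, x, tau=10):
    """log P_Klein(x): run Klein's recursion deterministically on a given x."""
    Q, R = np.linalg.qr(B)
    c_prime = Q.T @ c
    logp = 0.0
    for i in range(B.shape[1] - 1, -1, -1):
        center = (c_prime[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
        sigma_i = sigma / abs(R[i, i])
        support = np.arange(int(np.floor(center - tau * sigma_i)),
                            int(np.ceil(center + tau * sigma_i)) + 1)
        w = np.exp(-(support - center) ** 2 / (2.0 * sigma_i ** 2))
        logp += -(x[i] - center) ** 2 / (2.0 * sigma_i ** 2) - np.log(w.sum())
    return logp

def mhk_step(B, sigma, c, x, rng):
    _, y = klein_sample(B, sigma, c, rng)          # candidate y ~ P_Klein (defined earlier)
    log_pi = lambda z: log_rho(B @ z - c, sigma)   # target, up to a constant
    log_alpha = min(0.0, log_pi(y) + klein_log_prob(B, sigma, c, x)
                         - log_pi(x) - klein_log_prob(B, sigma, c, y))
    return y if np.log(rng.random()) < log_alpha else x
```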

 12. Ergodicity
     A Markov chain is ergodic if there exists a limiting distribution $\pi(\cdot)$ such that
     $\lim_{t\to\infty} \|P^t(\mathbf{x};\, \cdot) - \pi(\cdot)\|_{TV} = 0$,
     where $\|\cdot\|_{TV}$ is the total variation distance.
     All the aforementioned Markov chains are ergodic.
     Modes of ergodicity ($f(t)$ a polynomial function of $t$, $M < \infty$, $0 < \delta < 1$):
     Polynomial ergodicity: $\|P^t(\mathbf{x};\, \cdot) - \pi(\cdot)\|_{TV} \leq M \cdot \frac{1}{f(t)}$
     Uniform ergodicity: $\|P^t(\mathbf{x};\, \cdot) - \pi(\cdot)\|_{TV} \leq M (1-\delta)^t$
     Geometric ergodicity: $\|P^t(\mathbf{x};\, \cdot) - \pi(\cdot)\|_{TV} \leq M(\mathbf{x}) (1-\delta)^t$
     Mixing time of a Markov chain: $t_{\text{mix}}(\epsilon) = \min\{t : \max_{\mathbf{x}} \|P^t(\mathbf{x}, \cdot) - \pi(\cdot)\|_{TV} \leq \epsilon\}$.
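For concreteness, a tiny helper (an addition, not from the slides) illustrating the total variation distance used in these definitions, $\|\mu - \nu\|_{TV} = \tfrac{1}{2}\sum_x |\mu(x) - \nu(x)|$ for discrete distributions:

```python
# Total variation distance between two discrete distributions given as dicts.
def tv_distance(mu, nu):
    keys = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(k, 0.0) - nu.get(k, 0.0)) for k in keys)

# e.g. tv_distance({0: 0.6, 1: 0.4}, {0: 0.5, 1: 0.5}) == 0.1
```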

 13. Ergodicity of Independent MHK
     The transition probability $P(\mathbf{x}, \mathbf{y})$ of the independent MHK algorithm is
     $P(\mathbf{x}, \mathbf{y}) = q(\mathbf{x}, \mathbf{y})\,\alpha(\mathbf{x}, \mathbf{y}) = \begin{cases} \min\left\{q(\mathbf{y}),\, \frac{\pi(\mathbf{y})\, q(\mathbf{x})}{\pi(\mathbf{x})}\right\} & \text{if } \mathbf{y} \neq \mathbf{x}, \\ q(\mathbf{x}) + \sum_{\mathbf{z} \neq \mathbf{x}} \max\left\{0,\, q(\mathbf{z}) - \frac{\pi(\mathbf{z})\, q(\mathbf{x})}{\pi(\mathbf{x})}\right\} & \text{if } \mathbf{y} = \mathbf{x}. \end{cases}$  (3)
     Lemma 1. Given the invariant distribution $D_{\Lambda,\sigma,\mathbf{c}}$, the Markov chain induced by the independent MHK algorithm is ergodic:
     $\lim_{t\to\infty} \|P^t(\mathbf{x};\, \cdot) - D_{\Lambda,\sigma,\mathbf{c}}\|_{TV} = 0$ for all states $\mathbf{x} \in \mathbb{Z}^n$.
     For a Markov chain on a countably infinite state space, ergodicity is achieved by irreducibility, aperiodicity and reversibility.
     Proof: the Markov chain produced by the proposed algorithm is inherently reversible; for $\mathbf{x} \neq \mathbf{y}$,
     $\pi(\mathbf{x}) P(\mathbf{x}, \mathbf{y}) = \pi(\mathbf{x}) q(\mathbf{x}, \mathbf{y}) \alpha(\mathbf{x}, \mathbf{y}) = \min\{\pi(\mathbf{x}) q(\mathbf{y}),\, \pi(\mathbf{y}) q(\mathbf{x})\} = \pi(\mathbf{y}) P(\mathbf{y}, \mathbf{x})$.  (4)
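As a sanity check of (3) and (4), a toy numerical verification of detailed balance for an independent MH kernel on a small finite state space; the distributions `pi` and `q` below are arbitrary illustrative choices, not taken from the slides.

```python
# Build the independent MH transition matrix of Eq. (3) on {0,1,2} and verify
# detailed balance pi(x) P(x,y) = pi(y) P(y,x), as in Eq. (4).
import numpy as np

pi = np.array([0.5, 0.3, 0.2])      # target distribution (assumption)
q = np.array([0.4, 0.4, 0.2])       # independent proposal (assumption)
n = len(pi)

P = np.zeros((n, n))
for x in range(n):
    for y in range(n):
        if y != x:
            P[x, y] = min(q[y], pi[y] * q[x] / pi[x])
    P[x, x] = 1.0 - P[x].sum()       # remaining mass: rejection keeps the chain at x

M = pi[:, None] * P                  # detailed balance <=> M is symmetric
assert np.allclose(M, M.T)
print("detailed balance holds")
```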

 14. Uniform Ergodicity
     Lemma 2. In the independent MHK algorithm for lattice Gaussian sampling, there exists a constant $\delta > 0$ such that $\frac{q(\mathbf{x})}{\pi(\mathbf{x})} \geq \delta$ for all $\mathbf{x} \in \mathbb{Z}^n$.
     Proof:
     $\frac{q(\mathbf{x})}{\pi(\mathbf{x})} = \frac{\rho_{\sigma,\mathbf{c}}(\mathbf{B}\mathbf{x}) \cdot \rho_{\sigma,\mathbf{c}}(\Lambda)}{\prod_{i=1}^{n} \rho_{\sigma_i,\tilde{x}_i}(\mathbb{Z}) \cdot \rho_{\sigma,\mathbf{c}}(\mathbf{B}\mathbf{x})} = \frac{\rho_{\sigma,\mathbf{c}}(\Lambda)}{\prod_{i=1}^{n} \rho_{\sigma_i,\tilde{x}_i}(\mathbb{Z})} \geq \frac{\rho_{\sigma,\mathbf{c}}(\Lambda)}{\prod_{i=1}^{n} \rho_{\sigma_i}(\mathbb{Z})}$,  (5)
     where the inequality uses $\rho_{\sigma_i,\tilde{x}_i}(\mathbb{Z}) \leq \rho_{\sigma_i}(\mathbb{Z})$ (a shifted 1-D Gaussian sum over $\mathbb{Z}$ is maximized when its center is an integer), and the right-hand side (RHS) of (5) is completely independent of $\mathbf{x}$.
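A quick numerical check (an addition, not from the slides) of the key step behind the inequality in (5): the 1-D Gaussian sum over $\mathbb{Z}$ is maximized at an integer center, so $\rho_{\sigma_i,\tilde{x}_i}(\mathbb{Z}) \leq \rho_{\sigma_i}(\mathbb{Z})$ and the lower bound $\delta$ does not depend on $\mathbf{x}$. The value of $\sigma$ and the truncation parameter are assumptions.

```python
# Verify numerically that rho_{sigma,c}(Z) <= rho_{sigma}(Z) for shifted centers c.
import numpy as np

def rho_Z(sigma, center, tau=12):
    z = np.arange(int(np.floor(center - tau * sigma)),
                  int(np.ceil(center + tau * sigma)) + 1)
    return np.exp(-(z - center) ** 2 / (2.0 * sigma ** 2)).sum()

sigma = 1.3
centered = rho_Z(sigma, 0.0)
assert all(rho_Z(sigma, c) <= centered + 1e-12 for c in np.linspace(-0.5, 0.5, 11))
print("rho_{sigma,c}(Z) <= rho_{sigma}(Z) for all tested centers")
```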
