Prospects of Lattice Field Theory Simulations powered by Deep Neural Networks
Julian Urban, ITP Heidelberg, 2019/11/06
"this will never work" / "this is revolutionary"
Overview

• Stochastic estimation of Euclidean path integrals
• Overrelaxation with Generative Adversarial Networks (GAN)*
• Ergodic sampling with Invertible Neural Networks (INN)†
• Some results for real, scalar φ⁴-theory in d = 2

* Urban, Pawlowski (2018), "Reducing Autocorrelation Times in Lattice Simulations with Generative Adversarial Networks", arXiv:1811.03533
† Albergo, Kanwar, Shanahan (2019), "Flow-based generative models for Markov chain Monte Carlo in lattice field theory", arXiv:1904.12072
Markov Chain Monte Carlo

⟨O(φ)⟩_{φ ∼ e^{−S(φ)}} = ∫Dφ e^{−S(φ)} O(φ) / ∫Dφ e^{−S(φ)} ≈ (1/N) Σ_{i=1}^{N} O(φ_i)

[Diagram: Markov chain proposal φ → φ′]

• accept φ′ with probability T_A(φ′|φ) = min(1, e^{−ΔS})
• autocorrelation function: C_O(t) = ⟨O_i O_{i+t}⟩ − ⟨O_i⟩⟨O_{i+t}⟩
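A minimal NumPy sketch of this accept/reject step and the autocorrelation estimator; `action` is a placeholder for S(φ) (one concrete choice follows on the next slide), and all names are illustrative, not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis_step(phi, action, step_size=0.5):
    """Propose a local Gaussian update at a random site; accept with T_A = min(1, e^{-dS})."""
    site = tuple(rng.integers(0, n) for n in phi.shape)
    proposal = phi.copy()
    proposal[site] += step_size * rng.normal()
    dS = action(proposal) - action(phi)
    if dS <= 0 or rng.random() < np.exp(-dS):
        return proposal, True
    return phi, False

def autocorr(obs, t):
    """C_O(t) = <O_i O_{i+t}> - <O_i><O_{i+t}>, estimated from a chain of measurements (t > 0)."""
    obs = np.asarray(obs, dtype=float)
    return np.mean(obs[:-t] * obs[t:]) - np.mean(obs[:-t]) * np.mean(obs[t:])
```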
Real, Scalar φ⁴-Theory on the Lattice

• φ(x) ∈ ℝ discretized on a d-cubic Euclidean lattice with volume V = L^d and periodic boundary conditions

S = Σ_x [ −2κ Σ_{μ=1}^{d} φ(x)φ(x+μ̂) + (1 − 2λ) φ(x)² + λ φ(x)⁴ ]

• magnetization M = (1/V) Σ_x φ(x)
• connected susceptibility χ₂ = V (⟨M²⟩ − ⟨M⟩²)
• connected two-point correlation function G(x, y) = ⟨φ(x)φ(y)⟩ − ⟨φ(x)⟩⟨φ(y)⟩
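A hedged NumPy sketch of this action and the observables, handling periodic boundaries with np.roll; this is illustrative code, not the implementation behind the results shown later.

```python
import numpy as np

def action(phi, kappa, lam):
    """S = sum_x [ -2*kappa * sum_mu phi(x)*phi(x+mu_hat) + (1-2*lam)*phi(x)^2 + lam*phi(x)^4 ]"""
    # np.roll with shift -1 along axis mu gives phi(x + mu_hat) under periodic BCs
    hop = sum(phi * np.roll(phi, -1, axis=mu) for mu in range(phi.ndim))
    return float(np.sum(-2.0 * kappa * hop + (1.0 - 2.0 * lam) * phi**2 + lam * phi**4))

def magnetization(phi):
    """M = (1/V) sum_x phi(x)"""
    return float(phi.mean())

def susceptibility(mags, volume):
    """chi_2 = V * (<M^2> - <M>^2), from a sequence of magnetization measurements"""
    mags = np.asarray(mags, dtype=float)
    return volume * (np.mean(mags**2) - np.mean(mags)**2)
```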
Real, Scalar φ⁴-Theory on the Lattice

[Figure: phase diagram of ⟨|M|⟩ in the (κ, λ) plane for d = 2]
Real, Scalar φ⁴-Theory on the Lattice

d = 2, V = 8², λ = 0.02

[Figure: connected susceptibility χ₂ vs. κ, with ⟨|M|⟩ vs. κ shown as an inset]
Independent (Black-Box) Sampling

Replace p(φ) by an approximate distribution q(φ) generated from a function g: ℝ^V → ℝ^V, χ ↦ φ, where the components of χ are i.i.d. random variables (commonly N(0, 1)).

Theoretical / computational requirements:
• ergodic in p(φ)
  • p(φ) ≠ 0 ⇒ q(φ) ≠ 0
• sufficient overlap between q and p for practical use on human timescales (see the diagnostic sketched below)
• balanced and asymptotically exact
  • statistical selection or weighting procedure for asymptotically unbiased estimation, similar to an accept/reject correction
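The overlap requirement can be monitored in practice with a standard importance-sampling diagnostic, the effective sample size of the weights w = p/q; this addition is for illustration and is not part of the original slides.

```python
import numpy as np

def effective_sample_size(log_w):
    """ESS = (sum_i w_i)^2 / sum_i w_i^2, computed stably from unnormalized log-weights.

    Close to N for near-perfect overlap between q and p; close to 1 when a few
    samples dominate, signaling insufficient overlap for practical estimation.
    """
    log_w = np.asarray(log_w, dtype=float)
    log_w = log_w - log_w.max()          # subtract max before exponentiating
    w = np.exp(log_w)
    return w.sum()**2 / (w**2).sum()
```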
Overrelaxation

[Diagram: proposals φ → φ′ with S(φ′) = S(φ), interleaved with n_MC ordinary MC steps]

T_A(φ′|φ) = 1 for ΔS = 0

• sampling on hypersurfaces of constant S
• ergodicity through normal MC steps
• requirements:
  • ability to reproduce all possible S
  • symmetric a priori selection probability
Generative Adversarial Networks

[Diagram: random numbers → Generator → fake samples; fake and real samples → Discriminator → loss]

• overrelaxation step: find χ s.t. S[g(χ)] = S[φ]
• iterative gradient descent solution of χ′ = argmin_χ |S[g(χ)] − S[φ]| (sketched below)
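The latent-space search in the second bullet can be written as plain gradient descent in PyTorch; `generator` and `action` are assumed to be differentiable callables, and all names and hyperparameters are placeholders, not the talk's actual setup.

```python
import torch

def find_matching_latent(generator, action, S_target, latent_dim, n_steps=500, lr=1e-2):
    """Search for chi with S[g(chi)] close to S_target by minimizing |S[g(chi)] - S_target|."""
    chi = torch.randn(1, latent_dim, requires_grad=True)   # random starting point in latent space
    opt = torch.optim.Adam([chi], lr=lr)
    for _ in range(n_steps):
        opt.zero_grad()
        loss = (action(generator(chi)) - S_target).abs()   # |S[g(chi)] - S[phi]|
        loss.backward()
        opt.step()
    return chi.detach()
```

Note that this inner optimization runs once per overrelaxation proposal, which is one reason the slide on problems below calls the latent space search computationally demanding.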
Sample Examples

d = 2, V = 32², κ = 0.21, λ = 0.022

[Figure: heatmaps of φ(x) for one real (MC) configuration and three GAN-generated configurations]
Magnetization & Action Distributions

[Figure: histograms of M (top) and S (bottom), comparing HMC vs. GAN samples (left) and HMC vs. HMC + GAN (right)]
Reduced Autocorrelations

[Figure: autocorrelation function C_M(t) for local updates and HMC, and for GAN overrelaxation with n_H = 1, 2, 3]
Problems with this Approach

• GAN
  • relies on the existence of an exhaustive dataset
  • no direct access to sample probability
  • adversarial learning complicates quantitative error assessment
  • convergence/stability issues such as mode collapse
• Overrelaxation
  • still relies on traditional MC algorithms
  • symmetry of the selection probability
  • little effect on autocorrelations of observables coupled to S
  • latent space search is computationally rather demanding
Proper Reweighting to Model Distribution

⟨O⟩_{φ ∼ p(φ)} = ∫Dφ p(φ) O(φ) = ∫Dφ q(φ) (p(φ)/q(φ)) O(φ) = ⟨(p(φ)/q(φ)) O(φ)⟩_{φ ∼ q(φ)}

Generate q(φ) through a parametrizable, invertible function g(χ|ω) with tractable Jacobian determinant:

q(φ) = r(χ(φ)) |det ∂g⁻¹(φ)/∂φ|

Optimal choice for q(φ) ←→ minimal relative entropy / Kullback-Leibler divergence:

D_KL(q‖p) = −∫Dφ q(φ) log(p(φ)/q(φ)) = −⟨log(p(φ)/q(φ))⟩_{φ ∼ q(φ)}
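In code, the reweighted estimator and the reverse-KL objective might look as follows; this is a sketch assuming the flow exposes log q(φ) via its change-of-variables formula. Since p(φ) = e^{−S(φ)}/Z, the unknown log Z cancels in the self-normalized weights and only shifts the KL loss by a constant.

```python
import torch

def reweighted_mean(obs, log_q, minus_S):
    """Self-normalized estimate of <O>_p = <(p/q) O>_q, with log w = -S(phi) - log q(phi)."""
    log_w = minus_S - log_q
    w = torch.exp(log_w - log_w.max())      # stabilize before exponentiating (log Z cancels)
    return (w * obs).sum() / w.sum()

def kl_loss(log_q, S):
    """D_KL(q||p) up to a constant: E_{phi~q}[ log q(phi) + S(phi) ].

    Minimizing this over the flow parameters omega maximizes overlap with p.
    """
    return (log_q + S).mean()
```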
INN / Real NVP Flow

Ardizzone, Klessen, Köthe, Kruse, Maier-Hein, Pellegrini, Rahner, Rother, Wirkert (2018), "Analyzing Inverse Problems with Invertible Neural Networks", arXiv:1808.04730
Ardizzone, Köthe, Kruse, Lüth, Rother, Wirkert (2019), "Guided Image Generation with Conditional Invertible Neural Networks", arXiv:1907.02392
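For concreteness, a minimal affine coupling layer in the real NVP style, sketched in PyTorch; the INN architectures of the cited papers are more elaborate (permutations, clamping of the scale, conditioning), so this only illustrates the core idea: exact invertibility with a tractable Jacobian log-determinant.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """Splits the input in two halves; one half parametrizes an affine map of the other."""

    def __init__(self, dim, hidden=512):
        super().__init__()
        self.half = dim // 2
        # subnetwork producing scale s and translation t from the frozen half
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=1)
        y2 = x2 * torch.exp(s) + t              # affine transform of the second half
        log_det = s.sum(dim=1)                  # tractable log |det Jacobian|
        return torch.cat([x1, y2], dim=1), log_det

    def inverse(self, y):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.net(y1).chunk(2, dim=1)     # same subnetwork, inverted affine map
        x2 = (y2 - t) * torch.exp(-s)
        return torch.cat([y1, x2], dim=1)
```

Stacking several such layers (alternating which half is frozen) yields g(χ|ω); summing the log-determinants gives log q(φ) from the density r(χ) of the latent prior.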
Advantages of this Approach

• learning is completely data-independent
• improved error metrics
  • Metropolis-Hastings acceptance rate
  • convergence properties of D_KL
• ergodicity & balance + asymptotic exactness satisfied a priori (via the accept/reject step sketched below)
• no latent space deformation required

Objective: maximization of overlap between q(φ) and p(φ).
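The a priori exactness rests on an independence Metropolis-Hastings correction with acceptance probability min(1, q(φ) p(φ′) / (q(φ′) p(φ))); a minimal sketch, assuming the sampler returns log q and S alongside each proposal (names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

def mh_independence_step(phi, log_q_phi, S_phi, sample_q):
    """One accept/reject step; `sample_q` draws (phi', log q(phi'), S(phi')) from the flow."""
    phi_new, log_q_new, S_new = sample_q()
    # log acceptance = [log q(phi) - S(phi')] - [log q(phi') - S(phi)]; log Z cancels
    log_acc = (log_q_phi - S_new) - (log_q_new - S_phi)
    if np.log(rng.random()) < log_acc:
        return phi_new, log_q_new, S_new
    return phi, log_q_phi, S_phi
```

Because proposals are drawn independently of the current state, the chain's autocorrelations are governed by the acceptance rate alone, which is why it doubles as an error metric above.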
Comparison with HMC Results

d = 2, V = 8², λ = 0.02; INN: 8 layers, 4 hidden layers, 512 neurons / layer

[Figure: χ₂ vs. κ for HMC, bare, weighted, and Metropolis samples, with ⟨|M|⟩ vs. κ shown as an inset]
Comparison with HMC Results

κ = 0.2

[Figure: two-point correlator G(s) vs. separation s for HMC, bare, weighted, and Metropolis samples]
Potential Applications & Future Work

• accelerated simulations of physically interesting theories (QCD, Yukawa, Gauge-Higgs, Condensed Matter)
• additional conditioning (cINN) to encode arbitrary couplings κ, λ
• tackling sign problems with generalized thimble / path optimization approaches by latent space disentanglement
• efficient minimization of D_KL in terms of the ground state energy of an interacting hybrid classical-quantum system
Challenges & Problems

• scalability to higher dimensions / larger volumes / more d.o.f. (e.g. QCD: ~10⁹ floats per configuration)
  • multi-GPU parallelization
  • progressive growing to successively larger volumes
• architectures that intrinsically respect symmetries and topological properties of the theory
  • gauge symmetry / equivariance
• critical slowing down
Thank you!