
ZOOM, ENHANCE, SYNTHESIZE! MAGIC UPSCALING AND MATERIAL SYNTHESIS - PowerPoint PPT Presentation



  1. ZOOM, ENHANCE, SYNTHESIZE! MAGIC UPSCALING AND MATERIAL SYNTHESIS USING DEEP LEARNING Tuesday, 9 May 2017 Andrew Edelsten - NVIDIA Developer Technologies

  2. DEEP LEARNING FOR ART: Active R&D but ready now
     ▪ Style transfer
     ▪ Generative networks creating images and voxels
     ▪ Adversarial networks (DCGAN) - still early but promising
     ▪ DL & ML based tools from NVIDIA and partners:
       ▪ NVIDIA
       ▪ Artomatix
       ▪ Allegorithmic
       ▪ Autodesk

  3. STYLE TRANSFER: Something Fun
     ▪ Doodle a masterpiece! Uses a CNN to take the "style" from one image and apply it to the content of another
     ▪ Sept 2015: A Neural Algorithm of Artistic Style by Gatys et al.
     ▪ Dec 2015: neural-style (github)
     ▪ Mar 2016: neural-doodle (github); texture-nets (github)
     ▪ Oct 2016: fast-neural-style (github)
     ▪ 2 May 2017 (last week!): Deep Image Analogy (arXiv)
     ▪ Also numerous services: Vinci, Prisma, Artisto, Ostagram
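The "style" these methods transfer is usually captured as Gram-matrix statistics of CNN feature maps, following Gatys et al. A minimal NumPy sketch of that style distance, assuming the feature maps have already been extracted (the CNN itself is omitted here):

```python
import numpy as np

def gram_matrix(features):
    """Channel-by-channel correlations of a feature map of shape (C, H, W)."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (h * w)

def style_loss(features_a, features_b):
    """Mean squared difference between the two Gram matrices."""
    ga, gb = gram_matrix(features_a), gram_matrix(features_b)
    return float(np.mean((ga - gb) ** 2))
```

Minimizing this loss over the generated image (while also matching the content image's raw features) is the core of the neural-style family of methods.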

  4. http://ostagram.ru/static_pages/lenta

  5. STYLE TRANSFER: Something Useful
     ▪ Game remaster & texture enhancement
     ▪ Try Neural Style and use a real-world photo for the "style"
     ▪ For stylized or anime up-rez try https://github.com/nagadomi/waifu2x
     ▪ Experiment with art styles: dream or power-up sequences
     ▪ "Come Swim" by Kristen Stewart - https://arxiv.org/pdf/1701.04928v1.pdf

  6. GAMEWORKS: MATERIALS & TEXTURES - Using DL for Game Development & Content Creation
     ▪ Set of tools targeting the game industry using machine learning and deep learning
     ▪ Launched at the Game Developers Conference in March; the tools run as a web service
     ▪ Sign up for the beta at: https://gwmt.nvidia.com
     ▪ Tools in this initial release:
       ▪ Photo to Material: 2Shot
       ▪ Texture Multiplier
       ▪ Super-Resolution

  7. PHOTO TO MATERIAL: The 2Shot Tool
     ▪ From two photos of a surface, generate a "material"
     ▪ Based on a SIGGRAPH 2015 paper by NVIDIA Research & Aalto University (Finland): "Two-Shot SVBRDF Capture for Stationary Materials"
       https://mediatech.aalto.fi/publications/graphics/TwoShotSVBRDF/
     ▪ Input is pixel-aligned "flash" and "guide" photographs: use a tripod and remote shutter or bracket, or align later
     ▪ Use for flat surfaces with repeating patterns

  8. MATERIAL SYNTHESIS FROM TWO PHOTOS
     (Figure: flash and guide input images alongside the synthesized maps: diffuse albedo, specular, normals, glossiness, anisotropy)

  9. TEXTURE MULTIPLIER: Organic variations of textures
     ▪ Put simply: texture in, new texture out
     ▪ Inspired by Gatys, Ecker & Bethge, "Texture Synthesis Using Convolutional Neural Networks": https://arxiv.org/pdf/1505.07376.pdf
     ▪ Artomatix has a similar product, "Texture Mutation": https://artomatix.com/

  10. SUPER RESOLUTION

  11. SUPER RESOLUTION: Zoom... ENHANCE!
     "Can you zoom in on the license plate?" - "Sure!" "Enhance that?" - "OK!"

  12. SUPER RESOLUTION: The task at hand
     ▪ Given a low-resolution image (W x H), construct a high-resolution image (n*W x n*H)
     ▪ Upscale (magic?)

  13. UPSCALE: CREATE MORE PIXELS - An ill-posed task?
     (Figure: the known pixels of the given image scattered among the unknown "?" pixels of the upscaled image)

  14. TRADITIONAL APPROACH
     ▪ Interpolation (bicubic, lanczos, etc.)
     ▪ Interpolation + sharpening (and other filtration)
     ▪ Rough estimation of the data behavior -> too general
     ▪ Too many possibilities (an 8x8 grayscale patch has 256^(8*8) ≈ 10^154 pixel combinations!)
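The traditional interpolate-then-sharpen pipeline can be sketched in plain NumPy. The bilinear filter and 3x3 Laplacian sharpener below are simple stand-ins for the bicubic/Lanczos and other filters named on the slide:

```python
import numpy as np

def bilinear_upscale(img, n):
    """Upscale a 2-D grayscale image by an integer factor n via bilinear interpolation."""
    h, w = img.shape
    ys = np.linspace(0, h - 1, h * n)
    xs = np.linspace(0, w - 1, w * n)
    y0 = np.floor(ys).astype(int); y1 = np.minimum(y0 + 1, h - 1)
    x0 = np.floor(xs).astype(int); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy

def sharpen(img, amount=1.0):
    """Boost high frequencies by adding a 3x3 Laplacian response (edge-padded)."""
    p = np.pad(img, 1, mode='edge')
    lap = (4 * p[1:-1, 1:-1] - p[:-2, 1:-1] - p[2:, 1:-1]
           - p[1:-1, :-2] - p[1:-1, 2:])
    return img + amount * lap
```

As the slide notes, these filters encode only a rough, generic prior about images, which is exactly the limitation the learned approaches below address.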

  15. A NEW APPROACH - First: narrow the possible set
     ▪ Within the set of all possible images sits the much smaller domain of "natural images" (photos, textures, ...)
     ▪ Focus on the domain of natural images

  16. A NEW APPROACH - Second: place the image in the domain, then reconstruct
     ▪ Data from natural images is sparse: it is compressible in some domain
     ▪ Then "reconstruct" images (rather than create new ones): compress -> reconstruct, using prior information and constraints

  17. PATCH-BASED MAPPING: TRAINING
     ▪ Training images provide (LR, HR) pairs of patches
     ▪ A mapping from low-resolution patch to high-resolution patch is learned: the model parameters come from training
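Those (LR, HR) training pairs are typically generated by downscaling the training images themselves. A sketch, with average pooling standing in for whatever downscaling filter a real pipeline would use:

```python
import numpy as np

def downscale(img, n):
    """Average-pool downscale by an integer factor n (dimensions must divide evenly)."""
    h, w = img.shape
    return img.reshape(h // n, n, w // n, n).mean(axis=(1, 3))

def make_patch_pairs(hr_image, scale=2, hr_patch=8, stride=8):
    """Cut a training image into HR patches and pair each with its downscaled LR version."""
    pairs = []
    h, w = hr_image.shape
    for y in range(0, h - hr_patch + 1, stride):
        for x in range(0, w - hr_patch + 1, stride):
            hr = hr_image[y:y + hr_patch, x:x + hr_patch]
            pairs.append((downscale(hr, scale), hr))
    return pairs
```

The learned mapping then tries to invert the downscaling: predict each HR patch from its LR counterpart.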

  18. PATCH-BASED MAPPING
     LR patch x_L -> Encode -> high-level information about the patch -> Decode -> HR patch x_H

  19. PATCH-BASED MAPPING: SPARSE CODING
     LR patch x_L -> Encode -> high-level information about the patch ("features", a sparse code) -> Decode -> HR patch x_H

  20. PATCH FEATURES & RECONSTRUCTION
     ▪ An image patch can be reconstructed as a sparse linear combination of features
     ▪ Features are learned from the dataset over time
     ▪ x = D*α = α_1*d_1 + ... + α_L*d_L   (D - dictionary, x - patch, α - sparse code)
     ▪ Example: x = 0.8*d_47 + 0.3*d_53 + 0.5*d_74
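The sparse reconstruction on this slide is just a matrix-vector product with a mostly-zero code vector. A NumPy sketch; note the dictionary here is random noise purely for illustration, whereas in the real method it is learned from the training data:

```python
import numpy as np

rng = np.random.default_rng(1)
D = rng.normal(size=(64, 128))   # dictionary: 128 atoms, each a flattened 8x8 patch
alpha = np.zeros(128)            # sparse code: only three active atoms
alpha[[47, 53, 74]] = [0.8, 0.3, 0.5]

x = D @ alpha                    # reconstructed patch

# The same thing written out, matching the slide's example:
manual = 0.8 * D[:, 47] + 0.3 * D[:, 53] + 0.5 * D[:, 74]
```

Sparsity is the prior: only a few atoms combine to explain any given natural-image patch.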

  21. GENERALIZED PATCH-BASED MAPPING
     LR patch -> Mapping -> high-level representation of the LR patch ("features") -> Mapping in feature space -> high-level representation of the HR patch -> Mapping -> HR patch

  22. GENERALIZED PATCH-BASED MAPPING
     LR patch -> Mapping (W_1) -> Mapping in feature space (W_2) -> Mapping (W_3) -> HR patch
     ▪ W_1, W_2, W_3 are trainable parameters

  23. MAPPING OF THE WHOLE IMAGE: Using Convolutions
     ▪ The three stages (mapping, mapping in feature space, mapping) become convolutional operators applied across the whole LR image to produce the HR image
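Applying one learned patch mapping at every pixel position is exactly what a convolutional layer does. A plain-NumPy 'same'-size sliding-window filter as a sketch (any deep-learning framework provides an optimized version of this):

```python
import numpy as np

def conv2d(img, kernel):
    """'Same'-size 2-D filtering: the patch mapping applied at every pixel (edge-padded)."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)), mode='edge')
    out = np.zeros_like(img, dtype=float)
    h, w = img.shape
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out
```

Stacking a few such layers with nonlinearities between them gives the three-stage mapping of the previous slide, applied to the whole image at once.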

  24. AUTO-ENCODERS
     input -> [network] -> output ≈ input

  25. AUTO-ENCODER
     input -> Encode -> features -> Decode -> output ≈ input

  26. AUTO-ENCODER
     ▪ Parameters: W
     ▪ Inference: y = F_W(x)
     ▪ Training: W = argmin_W Σ_j Dist(x_j, F_W(x_j)), where {x_j} is the training set
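The training rule W = argmin_W Σ_j Dist(x_j, F_W(x_j)) can be demonstrated end-to-end with a tiny linear auto-encoder trained by gradient descent. This is purely illustrative (real models are deep and nonlinear), with mean squared error as Dist:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))             # training set {x_j}
A = rng.normal(scale=0.1, size=(3, 8))   # encoder weights: 8 -> 3 features
B = rng.normal(scale=0.1, size=(8, 3))   # decoder weights: 3 -> 8

def loss(A, B):
    """Dist = mean squared reconstruction error over the training set."""
    return float(np.mean((X - X @ A.T @ B.T) ** 2))

loss_before = loss(A, B)
lr = 0.02
for _ in range(300):                     # approximate the argmin by gradient descent
    Z = X @ A.T                          # encode all samples
    E = Z @ B.T - X                      # reconstruction error
    gB = E.T @ Z / len(X)                # dLoss/dB (up to a constant factor)
    gA = (E @ B).T @ X / len(X)          # dLoss/dA (up to a constant factor)
    A -= lr * gA
    B -= lr * gB
loss_after = loss(A, B)
```

Because the 3-dimensional feature vector cannot hold everything in an 8-dimensional input, the reconstruction error never reaches zero, which leads directly to the lossy-by-definition point on the next slide.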

  27. AUTO-ENCODER
     ▪ Our encoder is LOSSY by definition: encoding the input loses information

  28. SUPER-RESOLUTION AUTO-ENCODER
     ▪ Parameters: W
     ▪ Inference: y = F_W(x)
     ▪ Training: W = argmin_W Σ_j Dist(x_j, F_W(x_j)), where {x_j} is the training set

  29. SUPER RESOLUTION AE: TRAINING
     Ground-truth HR image x -> Downscaling D -> LR image -> SR AE F_W -> reconstructed HR image
     ▪ Training: W = argmin_W Σ_j Dist(x_j, F_W(D(x_j))), where {x_j} is the training set

  30. SUPER RESOLUTION AE: INFERENCE
     Given LR image x̂ -> SR AE F_W -> constructed HR image y
     ▪ Inference: y = F_W(x̂)

  31. SUPER-RESOLUTION: ILL-POSED TASK?

  32. THE LOSS FUNCTION

  33. THE LOSS FUNCTION: Measuring the "distance" from a good result
     ▪ The distance function is a key element to obtaining good results
     ▪ W = argmin_W Σ_j Dist(x_j, F_W(x_j))
     ▪ Choice of the loss function is an important decision

  34. LOSS FUNCTION: MSE
     ▪ Mean Squared Error: MSE = (1/N) * ||x - F(x)||^2

  35. LOSS FUNCTION: PSNR
     ▪ Mean Squared Error: MSE = (1/N) * ||x - F(x)||^2
     ▪ Peak Signal-to-Noise Ratio: PSNR = 10 * log10(MAX^2 / MSE)

  36. LOSS FUNCTION: HFEN
     ▪ Mean Squared Error: MSE = (1/N) * ||x - F(x)||^2
     ▪ Peak Signal-to-Noise Ratio: PSNR = 10 * log10(MAX^2 / MSE)
     ▪ High Frequency Error Norm (a perceptual loss, see Ref A): HFEN = ||HP(x - F(x))||_2, where HP is a high-pass filter
     ▪ Ref A: http://ieeexplore.ieee.org/document/5617283/
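All three metrics are short NumPy one-liners. In this sketch a 3x3 Laplacian stands in for the high-pass filter HP (the HFEN paper uses a Laplacian-of-Gaussian), and `peak` is the maximum possible pixel value:

```python
import numpy as np

def mse(x, y):
    """Mean Squared Error between two images."""
    return float(np.mean((x - y) ** 2))

def psnr(x, y, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB; higher is better."""
    return float(10 * np.log10(peak ** 2 / mse(x, y)))

def hfen(x, y):
    """High Frequency Error Norm: L2 norm of the high-pass-filtered error."""
    e = np.pad(x - y, 1, mode='edge')
    hp = (4 * e[1:-1, 1:-1] - e[:-2, 1:-1] - e[2:, 1:-1]
          - e[1:-1, :-2] - e[1:-1, 2:])
    return float(np.linalg.norm(hp))
```

Note how HFEN ignores a constant brightness offset entirely while MSE penalizes it: the two losses emphasize different kinds of error, which is why they are combined.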

  37. REGULAR LOSS
     (Figure: two 4x upscaling results)

  38. REGULAR LOSS + PERCEPTUAL LOSS
     (Figure: two 4x upscaling results)

  39. WARNING… THIS IS EXPERIMENTAL!

  40. SUPER-RESOLUTION: GAN-BASED LOSS
     LR image x -> Generator -> G(x) -> Discriminator D -> real/fake
     ▪ GAN loss = -ln D(G(x))
     ▪ Total loss = Regular (MSE+PSNR+HFEN) loss + GAN loss
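The combined objective can be sketched with the discriminator score stubbed out as a number in (0, 1]; the `gan_weight` blending factor is an assumption for illustration, not a value from the talk:

```python
import numpy as np

def gan_generator_loss(d_of_g):
    """-ln D(G(x)): small when the discriminator scores the fake as real (D -> 1)."""
    return float(-np.log(d_of_g))

def total_loss(regular, d_of_g, gan_weight=1e-3):
    """Regular (MSE/PSNR/HFEN) loss plus a weighted adversarial term."""
    return regular + gan_weight * gan_generator_loss(d_of_g)
```

The adversarial term pushes the generator toward outputs the discriminator cannot tell from real high-resolution images, while the regular term keeps them faithful to the input.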

  41. QUESTIONS?
     ▪ Extended presentation from the Game Developers Conference 2017: https://developer.nvidia.com/deep-learning-games
     ▪ GameWorks: Materials & Textures: https://gwmt.nvidia.com
