Scene Representation Networks: Continuous - PowerPoint PPT Presentation

Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations. Vincent Sitzmann, Michael Zollhöfer, Gordon Wetzstein

Input: single image, camera pose, intrinsics. Output: novel views, surface normals.

Self-supervised Scene Representation Learning. Latent 3D scenes give rise to observations: image + pose & intrinsics. What can we learn about latent 3D scenes from observations? Vision: Learn rich representations just by watching video!

Self-supervised Scene Representation Learning. Observations → Model → Re-rendered observations, compared with the observations by an image loss.

Self-supervised Scene Representation Learning. Observations → Neural Scene Representation → Re-rendered observations, with an image loss. Neural Scene Representation: persistent feature representation of scene.

Self-supervised Scene Representation Learning. Observations → Neural Scene Representation → Neural Renderer → Re-rendered observations, with an image loss. Neural Scene Representation: persistent feature representation of scene. Neural Renderer: render from different camera perspectives.

2D baseline: Autoencoder. Observations → Conv Encoder → Latent Code + Output Pose → Conv Decoder → Re-rendered observations, with an image loss.

2D baseline: Autoencoder. Latent Code + Output Pose → Conv Decoder → Re-rendered observations, with an image loss against the observations.

Doesn’t capture 3D properties of scenes. Trained on ~2500 ShapeNet cars with 50 observations each. Need 3D inductive bias!

Related Work. Compared on: 3D inductive bias / 3D structure; self-supervised with posed images.
Scene Representation Learning: Tatarchenko et al., 2015; Worrall et al., 2017; Eslami et al., 2018; …
2D Generative Models: Goodfellow et al., 2014; Kingma et al., 2013; Kingma et al., 2018; …
3D Computer Vision: Choy et al., 2016; Huang et al., 2018; Park et al., 2018; …
Voxel-based Representations: Sitzmann et al., 2019; Lombardi et al., 2019; Phuoc et al., 2019; …
• Memory inefficient: O(n^3).
• Doesn’t parameterize scene surfaces smoothly.
• Generalization is hard.

Scene Representation Networks. Observations → Neural Scene Representation → Neural Renderer → Re-rendered observations, with an image loss.

[Figure: a scene containing free space and objects, with sample points marked.]

Model scene as function Φ that maps coordinates to features. Φ: ℝ^3 → ℝ^n. [Figure: points in free space and on objects are each mapped to a feature vector.]

Scene Representation Network parameterizes Φ as MLP. Φ: ℝ^3 → ℝ^n. [Figure: points in free space and on objects are each mapped to a feature vector.]

Scene Representation Network parameterizes Φ as MLP. Φ: ℝ^3 → ℝ^n.
• Can sample anywhere, at arbitrary resolutions.
• Parameterizes scene surfaces smoothly.
• Memory scales with scene complexity.

Scene Representation Networks. Observations → Neural Scene Representation Φ: ℝ^3 → ℝ^n → Neural Renderer → Re-rendered observations, with an image loss.
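As a rough illustration of "Φ as MLP", the sketch below evaluates a small fully connected network at a single 3D coordinate. The layer widths, the 8-dimensional feature size, the ReLU activation, and the names `init_mlp` and `phi` are assumptions made for this example; they are not taken from the slides.

```python
import math
import random

def init_mlp(layer_sizes, seed=0):
    """Return a list of (weights, biases) pairs with small random weights."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params

def phi(params, x):
    """Evaluate the scene function at one 3D point x, returning a feature vector."""
    h = list(x)
    for i, (weights, biases) in enumerate(params):
        h = [sum(w * v for w, v in zip(row, h)) + b
             for row, b in zip(weights, biases)]
        if i < len(params) - 1:  # ReLU on hidden layers only (assumed activation)
            h = [max(0.0, v) for v in h]
    return h

# Assumed sizes: 3D coordinate in, two hidden layers of 16, 8-dim feature out.
params = init_mlp([3, 16, 16, 8])
feature = phi(params, (0.1, -0.4, 2.0))  # any continuous coordinate can be queried
print(len(feature))
```

Because the network takes a continuous coordinate, nothing ties it to a fixed grid: the same `phi` can be queried at as many points as a renderer needs.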

Neural Renderer.

Neural Renderer Step 1: Intersection Testing. Idea: march along the ray until arriving at a surface.

Neural Renderer Step 1: Intersection Testing. World coordinates x_i → Scene Representation Φ: ℝ^3 → ℝ^n → feature vector v_i.

Neural Renderer Step 1: Intersection Testing. World coordinates x_i → Scene Representation Φ: ℝ^3 → ℝ^n → feature vector v_i → Ray Marching LSTM → step length δ_{i+1}, which advances the ray from x_i to x_{i+1} (starting at x_0). Feasible step length: distance to closest scene surface.
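The marching loop can be sketched as follows. This is not the learned renderer: an analytic distance to a sphere stands in for the step length that the Ray Marching LSTM would predict, which matches the "feasible step length" noted on the slide. The function names, the sphere, the starting depth, and the fixed count of 10 steps are all assumptions for illustration.

```python
import math

def sphere_distance(point, center=(0.0, 0.0, 3.0), radius=1.0):
    """Stand-in for the learned step length: exact distance to a sphere surface."""
    return math.dist(point, center) - radius

def march_ray(origin, direction, step_fn, num_steps=10, start_depth=0.05):
    """March along origin + depth * direction, growing depth by step_fn each step."""
    norm = math.sqrt(sum(c * c for c in direction))
    direction = [c / norm for c in direction]
    depth = start_depth
    for _ in range(num_steps):
        point = [o + depth * c for o, c in zip(origin, direction)]
        depth += step_fn(point)  # never overshoots if step <= distance to surface
    point = [o + depth * c for o, c in zip(origin, direction)]
    return depth, point

# Ray from the camera at the origin looking down +z toward the sphere.
depth, hit = march_ray((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_distance)
print(round(depth, 6), hit)
```

In the learned setting, `step_fn` would be replaced by a network that sees the feature vector at the current point, so the step length is predicted rather than computed from known geometry.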

Neural Renderer Step 1: Intersection Testing. Iteration 0

Neural Renderer Step 2: Color Generation Iteration 4

Neural Renderer Step 1: Intersection Testing. Iteration …

Neural Renderer Step 1: Intersection Testing.

Neural Renderer Step 2: Color Generation. Scene Representation Φ: ℝ^3 → ℝ^n → Color MLP.

Can now train end-to-end with posed images only! Observations → Neural Scene Representation Φ: ℝ^3 → ℝ^n → Neural Renderer → Re-rendered observations, with an image loss.

Generalizing across a class of scenes

Each scene represented by its own SRN, with one parameter vector φ_j ∈ ℝ^l per scene.

Each scene represented by its own SRN. The parameters φ_j ∈ ℝ^l live on a k-dimensional subspace of ℝ^l, k < l.

Each scene represented by its own SRN. Represent each scene with a low-dimensional embedding z_j ∈ ℝ^k alongside its parameters φ_j ∈ ℝ^l.

Each scene represented by its own SRN. Hypernetwork Ψ: ℝ^k → ℝ^l, z_j ↦ Ψ(z_j) = φ_j, maps each embedding z_j ∈ ℝ^k to the parameters φ_j ∈ ℝ^l.
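To make the mapping Ψ(z_j) = φ_j concrete, here is a minimal sketch in which a single linear layer plays the role of the hypernetwork and its flat output is unpacked into the weights and biases of a tiny SRN. The linear form of Ψ, the layer sizes, k = 5, and all function names are assumptions chosen to keep the example short.

```python
import random

LAYER_SIZES = [3, 8, 4]  # assumed tiny SRN: 3D coordinate -> 4-dim feature
K = 5                    # assumed embedding dimension k

def count_params(layer_sizes):
    """Number l of SRN parameters (weights plus biases over all layers)."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

def make_hypernetwork(k, l, seed=0):
    """Random linear map standing in for Psi: R^k -> R^l."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 0.1) for _ in range(k)] for _ in range(l)]
    biases = [rng.gauss(0.0, 0.1) for _ in range(l)]
    return weights, biases

def psi(hypernet, z):
    """Map an embedding z to a flat parameter vector phi."""
    weights, biases = hypernet
    return [sum(w * v for w, v in zip(row, z)) + b
            for row, b in zip(weights, biases)]

def unpack(flat, layer_sizes):
    """Split a flat parameter vector into per-layer (weights, biases)."""
    layers, i = [], 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [flat[i + r * n_in: i + (r + 1) * n_in] for r in range(n_out)]
        i += n_in * n_out
        biases = flat[i: i + n_out]
        i += n_out
        layers.append((weights, biases))
    return layers

NUM_PARAMS = count_params(LAYER_SIZES)
hypernet = make_hypernetwork(K, NUM_PARAMS)
phi_a = psi(hypernet, [0.3, -1.2, 0.5, 0.0, 0.9])   # scene a
phi_b = psi(hypernet, [-0.7, 0.4, 0.1, 1.5, -0.2])  # scene b
layers_a = unpack(phi_a, LAYER_SIZES)
print(NUM_PARAMS, len(layers_a))
```

All scenes share the one hypernetwork; only the k-dimensional embedding differs per scene, which is what lets the parameters live on a low-dimensional subspace.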

Results

Novel View Synthesis – Baseline Comparison. ShapeNet v2 – single-shot reconstruction of objects in held-out test set.
Training
• ShapeNet cars / chairs.
• 50 observations per object.
Testing
• Cars / chairs from unseen test set.
• Single observation!
[Figure: input pose, followed by results from Tatarchenko et al. 2015, Worrall et al. 2017, Deterministic GQN (adapted, Eslami et al. 2018), and SRNs (Ours).]

Novel View Synthesis – SRN Output. ShapeNet v2 – single-shot reconstruction of objects in held-out test set. [Figure: input pose and SRN output.]

Sampling at arbitrary resolutions. [Figure: RGB and surface normals rendered at 32x32, 64x64, 128x128, 256x256, and 512x512.]

Generalization to unseen camera poses. [Figure: SRNs under camera close-up and camera roll.]

Generalization to unseen camera poses. [Figure: SRNs and Tatarchenko et al. under camera close-up and camera roll.] Tatarchenko et al.: doesn’t reconstruct geometry in either case.

Latent code interpolation. [Figure: RGB and surface normals.]
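A sketch of what interpolating latent codes could look like, assuming plain linear interpolation between two scene embeddings (the slide does not state the interpolation scheme). The embeddings and the `lerp` helper are made up for illustration; each interpolated code would then be turned into SRN parameters and rendered.

```python
def lerp(z_a, z_b, t):
    """Linearly interpolate between two embeddings for t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]

# Hypothetical 3-dimensional embeddings of two scenes.
z_car_a = [0.0, 1.0, -2.0]
z_car_b = [1.0, 3.0, 2.0]

# Five evenly spaced codes from scene a to scene b.
steps = [lerp(z_car_a, z_car_b, i / 4) for i in range(5)]
for code in steps:
    print(code)
```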

Can represent room-scale scenes, but aren’t compositional. Training set novel-view synthesis on GQN rooms (Eslami et al. 2018) with ShapeNet cars, 50 observations. Work-in-progress: Compositional SRNs generalize to unseen numbers of objects!

Scene Representation Networks: Continuous 3D-structure-aware Neural Scene Representations. Vincent Sitzmann, Michael Zollhöfer, Gordon Wetzstein. Find me at Poster #71! vsitzmann.github.io, @vincesitzmann. Looking for research positions in scene representation learning. Single-shot reconstruction. Interpolation. Camera pose extrapolation.
