grains generative recursive autoencoders for indoor scenes
play

GRAINS: Generative Recursive Autoencoders for INdoor Scenes Manyi Li - PowerPoint PPT Presentation

GRAINS: Generative Recursive Autoencoders for INdoor Scenes Manyi Li 1,2 , Akshay Gadi Patil 2 , Kai Xu 3,4 , Siddhartha Chaudhuri 5,6 , Owais Khan 6 , Ariel Shamir 7 , Changhe Tu 1 , Baoquan Chen 8 , Daniel Cohen-Or 9 , Hao (Richard) Zhang 2 1


  1. GRAINS: Generative Recursive Autoencoders for INdoor Scenes Manyi Li 1,2 , Akshay Gadi Patil 2 , Kai Xu 3,4 , Siddhartha Chaudhuri 5,6 , Owais Khan 6 , Ariel Shamir 7 , Changhe Tu 1 , Baoquan Chen 8 , Daniel Cohen-Or 9 , Hao (Richard) Zhang 2 1 Shandong University 2 Simon Fraser University 3 National University of Defense Technology 4 AICFVE Beijing Film Academy 5 Adobe Research 6 IIT Bombay 7 The Interdisciplinary Center 8 Peking University 9 Tel-Aviv University

  2. Outline • Problem & Related work • Method • Scene representation • Network • Ablation study • Results & Application

  3. Scene generation problem • Generate plausible room layouts automatically, to replace or reduce human work. Planner5d

  4. Related works: data-driven • Graphical model methods [Fisher et al. SIGA 2012], [Kermani et al. SGP 2016], [Qi et al. CVPR 2018] J Respect object-object relations L Too many parameters and rules to tune manually L Time-consuming Object appearance + Object positioning Learning Generation

  5. Related works: data-driven • Graphical model methods • Deep neural networks [Wang et al. SIGGRAPH 2018], [Ritchie et al. CVPR 2019] J Easy-to-use model and better plausibility L No object-object relations Indoor scene Multi-channel image CNN training

  6. Our method • Indoor scene structures are inherently hierarchical. Indoor scenes share some common patterns in the sub-scenes.

  7. Our method • Indoor scene structures are inherently hierarchical. + Z Hierarchical scene Recursive neural network - representation Variational Auto-encoder

  8. Scene Representation Step1: Deciding the merge order

  9. Scene Representation Step1: Deciding the merge order Step2: Construct the nodes floor wall wall of the hierarchy Leaf nodes: objects • wall wall rug cabinet cabinet … bed OBB sizes Object label stand lamp stand lamp Leaf vector

  10. Scene Representation Step1: Deciding the merge root node order Step2: Construct the nodes floor wall wall wall node wall node of the hierarchy Leaf nodes: objects • wall co-oc node wall co-oc node Internal nodes: groups • Support node • Surround node rug cabinet cabinet • sur node Co-occur node • Wall node • supp node supp node bed Root node • stand lamp stand lamp

  11. Scene Representation Step1: Deciding the merge root node order Step2: Construct the nodes floor wall rp rp wall rp wall node rp wall node of the hierarchy Leaf nodes: objects • wall co-oc node rp wall co-oc node rp Internal nodes: groups • Step3: Compute relative rug cabinet cabinet sur node rp rp positions between sibling nodes supp node rp supp node rp bed stand lamp stand lamp rp rp

  12. Our method • Indoor scene structures are inherently hierarchical. + Z Hierarchical scene Recursive neural network - representation Variational Auto-encoder

  13. Network • Network: RvNN-VAE (Recursive Neural Network – Variational Auto-Encoder) Z

  14. Encoding Process root node floor wall rp rp wall rp wall node rp wall node wall co-oc node rp wall co-oc node rp rug cabinet cabinet sur node rp rp supp node rp supp node rp bed stand lamp rp stand lamp rp Input hierarchy Encoder module

  15. Decoding process root node floor wall rp rp wall rp wall node rp wall node wall co-oc node rp wall co-oc node rp rug cabinet cabinet sur node rp rp supp node rp supp node rp bed stand lamp rp stand lamp rp Output hierarchy Decoder module

  16. Generation Pipeline • The network learns to map a random vector to a plausible indoor scene. Generation pipeline

  17. Scene representation matters! Indoor scenes are complex and diverse Appropriate scene representation is the key to learning Our key points: (1) hierarchical structure, (2) relative position format. SUNCG dataset [Song et al. 2017]

  18. Key point 1: Hierarchical structure • We specifically define wall node and root node in our hierarchies. • Reason: walls should serve the role of “grounding” the placement of objects in a room. root floor wall1 wall3 wall node wall node wall2 co-oc node co-oc node wall4 surr node cabinet1 cabinet2 rug bed supp node supp node stand1 table lamp1 stand2 table lamp1 Our hierarchical structure

  19. Ablation Studies: Hierarchical structure Hierarchy 1 Hierarchy 2 Hierarchy 3

  20. Ablation Studies: Hierarchical structure Hierarchy 1 Hierarchy 2 Hierarchy 1: Generated scenes L No “wall node”s L No “root node” Hierarchy 3

  21. Ablation Studies: Hierarchical structure Hierarchy 1 Hierarchy 2 Hierarchy 2: Generated scenes J “wall nodes” L No “root nodes” Hierarchy 3

  22. Ablation Studies: Hierarchical structure It is important to have floors, wall nodes, and their relative positions in the last merge, to “ground” the Hierarchy 1 object positions. Hierarchy 2 Hierarchy 3 (Ours): Generated scenes J “wall nodes” J “root nodes” Hierarchy 3

  23. Key point 2: Relative Position format Absolute pos Relative pos 1 Relative pos 2

  24. Ablation Studies : Relative position format Absolute pos Generated scenes Object’s absolute position in the leaf nodes Relative pos 1 Relative pos 2

  25. Ablation Studies : Relative position format Absolute pos Generated scenes Relative position between the object centers Relative pos 1 Relative pos 2

  26. Ablation Studies : Relative position format Absolute pos Generated scenes Relative position with offsets between closest edges (ours) Relative pos 1 Relative pos 2

  27. Results Generated scenes: • Plausible • Novel • Diverse

  28. Comparison against a graphical model method • For comparison, we select scenes with the same object shapes. • 3-12min / scene. • No guarantee on the exact alignment. • More unreasonable object pairs. • 0.1027sec / scene. • Relative positions with attachment and alignment information.

  29. Comparison: Perceptual studies We ask users to score or select the scenes based on their plausibility. Comparisons are done against (1) the training set, (2) [Kermani et al. 2016] (3) [Wang et al. 2018], (4) [Qi et al. 2018]

  30. Applications • Data augmentation method for deep learning tasks Semantic scene segmentation task: • Network: PointNet [Qi et al. 2017] • Dataset: Indoor scenes as point clouds • Results: More relevant training data, better learning performance and generalization.

  31. Applications • Data augmentation method for deep learning tasks • 2D layout guided 3D scene modeling Goal: 2D box layout to 3D indoor scene • Network: Pre-trained RvNN-VAE on 3D scenes • Result: Transform between multi-modal data • which share the same hierarchical structures.

  32. Applications • Data augmentation method for deep learning tasks • 2D layout guided 3D scene modeling • Hierarchy-guided scene editing Hierarchical indoor scene structure helps designers to edit a scene at the sub-scene level.

  33. Conclusion • We present a generative neural network which enables us to generate plausible 3D indoor scenes in large quantities and varieties, easily and highly efficiently. • We study the influence of different scene representations on the learning ability of generative RvNNs. • We show the applications of our generated scenes with the corresponding hierarchies.

  34. Thank you!

Recommend


More recommend