3d shape attributes

3D Shape Attributes David Fouhey, Abhinav Gupta and Andrew Zisserman


  1. 3D Shape Attributes David Fouhey, Abhinav Gupta and Andrew Zisserman CMU & University of Oxford http://www.robots.ox.ac.uk/~vgg To appear: CVPR 2016

  2. Motivation • How to describe this object? 1. Label: Henry Moore Sculpture, “Oval with Points” 2. Shape description: 3D solid object, smooth for the most part but has pointed/conical parts, has hole, bulbous, rectangular (portrait) aspect ratio, approx. mirror symmetry

  3. Motivation • Objective: represent the shape of 3D objects (in a viewpoint invariant manner) 1. 3D Shape Attributes: • Curvature • Contact • Volumetric • … 2. Vector (embedding) • Address the “open-world” problem • Use sculptures as objects due to their great variety of shape

  4. Motivation 3D shape from single images: • A fundamental goal of computer vision is 3D understanding from images, e.g. Koenderink & Van Doorn, 1971, and work from 1980s: • shape from contour • shape from texture • shape from specularities • …

  5. Motivation 3D shape from single images is somewhat neglected in the ConvNet era, with some exceptions such as: • Regressing pixels -> depth map [Figure: Image, Depth, Normals; Eigen et al. ’15, Wang et al. ’15]. Among many others: Saxena et al. ’07, Barron et al. ’11 – ’15, Karsch ’12, Fouhey ’13, ’14, Eigen ’14, ’15, Ladicky ’14, Liu ’14, Baig ’15, Wang ’15, etc. • Class-specific reconstructions, e.g. Kar et al., "Category specific object reconstruction from a single image", CVPR 2015

  6. 3D Shape Attributes (12 of these)

  7. Examples Positives: Has Planar Surfaces

  8. Examples Negatives: Has Planar Surfaces

  9. Examples Positives: Has Point/Line Contact

  10. Examples Negatives: Has Point/Line Contact

  11. Examples Positives: Has Thin Structures

  12. Examples Negatives: Has Thin Structures

  13. Examples Positives: Has Rough Surfaces

  14. Examples Negatives: Has Rough Surfaces

  15. 3D Shape Attributes (12 of these)

  16. Research Question • Can ConvNets learn to predict these 3D shape attributes, and a 3D embedding, in a viewpoint invariant manner? • And can they also generalize to other (non-sculpture) classes?

  17. Data

  18. Data London Malaga Yorkshire Princeton Columbus Toronto

  19. Data

  20. Data • 242 Artists, 2187 Works, 143K Images in 9352 Viewpoint Clusters • A. Calder: 5 Swords, Gwenfritz, Eagle, … • H. Moore: Two Forms, The Arch, Knife Edge, … • R. Serra: … • …

  21. Data • 242 Artists, 2187 Works, 143K Images in 9352 Viewpoint Clusters • A. Calder: 5 Swords, Gwenfritz, Eagle, … • H. Moore: Two Forms, The Arch, Knife Edge, … • R. Serra: … • …

  22. Data Collection • Artist / Work Vocabulary Construction • Viewpoint Clustering + Cleaning + Query expansion • ~250 Artists, ~2K Works, ~150K Images, ~9K Clusters • Example artists: A. Calder, B. Hepworth, H. Moore, R. Serra • Example works: 5 Swords, Eagle, Gwenfritz, Two Forms, The Arch, Knife Edge

  23. Data Statistics
               Artists   Works   Images
      Train      122      1196     77K
      Val         61       459     31K
      Test        59       532     35K
      Total      242      2187    143K

  24. Training Loss Functions • Multi-task learning 1. Attribute classification loss • Sum of 12 cross-entropy losses, one for each attribute 2. Embedding loss • Triplet loss to match images of the same work

  25. Training Loss Functions 1. Attribute classification loss • Sum of 12 cross-entropy losses, one for each attribute
      $$L(Y, P) = \sum_{i=1}^{N} \sum_{\substack{l=1 \\ Y_{i,l} \neq \emptyset}}^{L} \Big[ Y_{i,l} \log(P_{i,l}) + (1 - Y_{i,l}) \log(1 - P_{i,l}) \Big]$$
      for image $i$ and label $l$, with labels $Y \in \{0, 1, \emptyset\}^{N \times L}$ and predicted probabilities $P \in [0, 1]^{N \times L}$
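The attribute classification loss on this slide skips entries whose label is empty. A minimal sketch of that masking in plain Python follows. The function name `attribute_loss`, the use of `None` for the empty label ∅, the clamping constant `eps`, and the toy data are all illustrative assumptions. The sketch also returns the negated sum so that lower values are better; that sign convention is an assumption, since the slide's formula shows no leading minus.

```python
import math

def attribute_loss(labels, probs, eps=1e-12):
    """Negated sum of per-attribute log-likelihood terms over labeled entries.

    labels[i][l] is 0, 1, or None (None stands in for the empty label);
    probs[i][l] is the predicted probability that attribute l is present.
    """
    total = 0.0
    for y_row, p_row in zip(labels, probs):
        for y, p in zip(y_row, p_row):
            if y is None:
                continue  # unlabeled entries contribute nothing
            p = min(max(p, eps), 1.0 - eps)  # keep log() finite
            total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total

# Toy example: 2 images, 3 attributes, two entries unlabeled.
labels = [[1, 0, None], [None, 1, 1]]
probs = [[0.9, 0.2, 0.5], [0.7, 0.6, 0.99]]
loss = attribute_loss(labels, probs)
```

Because unlabeled entries are skipped, changing the predicted probability at a `None` position leaves the loss unchanged.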

  26. Training Loss Functions 2. Embedding loss • Triplet loss to match images of the same work • CNN encoder Φ maps images into an embedding space R^d • Congruous pair (anchor a, positive p): φ(a) and φ(p) near • Incongruous pair (anchor a, negative n): φ(a) and φ(n) far • Triplet loss as in Schultz and Joachims ’04, Schroff et al. ’14, Wang et al. ’15, Parkhi et al. ’15

  27. Embedding loss • CNN encoder Φ maps images into an embedding space R^d • Congruous pair (a, p): φ(a) and φ(p) near • Incongruous pair (a, n): φ(a) and φ(n) far, separated by a distance margin

  28. Embedding loss • For anchor a, positive p, negative n, with distance margin α:
      $$\|\phi(a) - \phi(p)\|^2 + \alpha \le \|\phi(a) - \phi(n)\|^2$$
      $$\min_{\phi} \sum_{\text{triplets}} \max\left(0,\ \alpha + \|\phi(a) - \phi(p)\|^2 - \|\phi(a) - \phi(n)\|^2\right)$$
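The hinge form of the triplet objective on this slide can be illustrated with a small sketch. It assumes the distance is squared Euclidean distance between already-computed embeddings; the names `squared_distance`, `triplet_loss`, and `total_triplet_loss`, the margin value, and the toy vectors are hypothetical. It evaluates the objective only and does not perform the minimization over the encoder.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def triplet_loss(anchor, positive, negative, margin):
    """Hinge on: margin + d(anchor, positive) - d(anchor, negative)."""
    return max(0.0, margin
               + squared_distance(anchor, positive)
               - squared_distance(anchor, negative))

def total_triplet_loss(triplets, margin):
    """Sum of the per-triplet hinge terms over (a, p, n) embedding triples."""
    return sum(triplet_loss(a, p, n, margin) for a, p, n in triplets)

# Toy 2-D embeddings (illustrative values only).
a, p = [0.0, 0.0], [1.0, 0.0]
satisfied = triplet_loss(a, p, [3.0, 0.0], margin=1.0)  # negative is far
violated = triplet_loss(a, p, [0.5, 0.0], margin=1.0)   # negative too close
```

A triplet contributes zero once the negative is farther from the anchor than the positive by at least the margin, so only violating triplets drive the objective.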

  29. Learning To Predict • Input -> Conv. Layers -> FC Layers (VGG-M) • Outputs: 12D Shape Attributes, 1024D Shape Embedding

  30. Goals of Experiments • How well can we do? • Are we modeling 3D shape? • Does this generalize?

  31. Qualitative Results • Images ranked from most to least: Point/Line Contact, Rough Surface

  32. Qualitative Results • Images ranked from most to least: Thin Structures

  33. Quantitative Results (Mean Area Under ROC)
      Curvature: Planar, Not Planar, Cylinder, Rough; Contact: Point/Line, Multiple
      82.8 77.2 56.9 76.0 74.4 76.4
      Occupancy: Has Hole, Cubic Ratio, Empty, 2+ Pieces, Is Thin, Mirror Sym.
      60.8 60.3 60.4 69.3 85.8 87.0
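The results on this slide are reported as area under the ROC curve. As a rough illustration of how such a number can be obtained from predicted scores and binary labels, the sketch below uses the pairwise-ranking interpretation of AUC (the fraction of positive/negative pairs ranked correctly, with ties counted as half). The function names `roc_auc` and `mean_auc` and the toy data are assumptions for illustration, not the authors' evaluation code.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outscores a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))

def mean_auc(per_attribute):
    """Average AUC over a list of (scores, labels) pairs, one per attribute."""
    aucs = [roc_auc(s, y) for s, y in per_attribute]
    return sum(aucs) / len(aucs)

# Toy example with two attributes (illustrative values only).
auc_a = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
auc_b = roc_auc([0.9, 0.7, 0.2, 0.1], [1, 1, 0, 0])
```

The quadratic loop is fine for a sketch; a sort-based implementation would be the usual choice for large test sets.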

  34. Learning To Predict • Input -> Conv. Layers -> FC Layers • Outputs: 12D Shape Attributes, 1024D Shape Embedding

  35. Mental Rotation Shepard and Metzler 1971, Tarr et al. ’98 • Are two 3D objects related by a rotation?

  36. Mental Rotation Shepard and Metzler 1971, Tarr et al. ‘98 Video credit: Thomas Fulcher

  37. Mental Rotation • Use works from different locations and with different materials • Classify using distance between vector descriptors
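Classifying a pair by the distance between its vector descriptors can be sketched as a simple threshold test. The sketch assumes Euclidean distance and a fixed, hand-picked threshold; the names `euclidean` and `same_object`, the threshold value, and the toy descriptors are hypothetical, and the slides do not say which distance or threshold was used.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def same_object(desc_a, desc_b, threshold):
    """Predict 'same 3D object' when the descriptors are close enough."""
    return euclidean(desc_a, desc_b) < threshold

# Toy 2-D descriptors (illustrative values only).
query = [0.0, 0.0]
near_view = [3.0, 4.0]   # distance 5
far_view = [6.0, 8.0]    # distance 10
is_same_near = same_object(query, near_view, threshold=6.0)
is_same_far = same_object(query, far_view, threshold=6.0)
```

Sweeping the threshold instead of fixing it yields a ROC curve over image pairs, which is one way to summarize this kind of verification task.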

  38. Mental Rotation – Classification Results • 100 million test image pairs • ROC “Easy”: 0.9% positives • ROC “Hard”: 0.3% positives

  39. Does it generalize to other classes?

  40. Synthetic Results – has planar P(Planar)

  41. Synthetic Results – non planar P(Non Planar)

  42. Synthetic Results – roughness P(Rough Surface)

  43. PASCAL VOC Results • Images ranked from most to least: Point/Line Contact, Planarity

  44. PASCAL VOC Results • Images ranked from most to least: Toroidal Pieces, Thin Structures

  45. Summary • Have learnt to predict 3D shape attributes and shape embedding • Dataset to be released • Improvements: binary vs relative attributes
