In Search of Art, by Elliot J. Crowley and Andrew Zisserman


  1. In Search of Art Elliot J. Crowley and Andrew Zisserman Visual Geometry Group Department of Engineering Science University of Oxford

  2. The Goal • An on-the-fly system for searching paintings visually • A user can type in the name of any category • Hundreds of paintings containing that category are then retrieved in a matter of seconds (example query: dog)

  3. Benefits • In many instances, the retrieved paintings were not previously known to contain the category • These are therefore new discoveries for the Art History community (example query: dog)

  4. Why is this good? • Art historians can discover when something first appeared in paintings • They can also observe how things have changed over time

  5. How is this achieved? • Natural images annotated with object categories are everywhere • These can be used to learn object classifiers (example: Google Images results for dog)

  6. Dataset of Paintings • We use ‘Your Paintings’ as the dataset • ‘Your Paintings’ consists of over 210,000 paintings from UK galleries: http://www.bbc.co.uk/arts/yourpaintings/ • The method is independent of the dataset, however • Other datasets can be used, e.g. Rijksmuseum or PrintART

  7. Outline • Methodology • Quantitative Evaluation • Aligning retrieved objects

  8. What do we do? • We crawl Google Images for a given category and learn a CNN-based classifier • This classifier is applied to a dataset of paintings, retrieving paintings containing the category

  9. The Architecture

  10. How do we do this quickly? • The bulk of the data has been pre-processed offline (negative training data, dataset of paintings) • Online processing of Google Images is done in parallel across multiple cores
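
The parallel online step can be pictured as mapping a per-image feature function over the downloaded images with a pool of workers. The sketch below is only illustrative: `toy_feature` is a hypothetical stand-in for the CNN (a byte histogram), and `extract_all` and its parameters are names invented here, not part of the authors' system.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def toy_feature(image_bytes):
    """Hypothetical stand-in for the CNN: a 4-bin histogram of byte values."""
    hist = [0, 0, 0, 0]
    for value in image_bytes:
        hist[value // 64] += 1      # byte values are 0..255, so bins are 64 wide
    total = max(len(image_bytes), 1)  # avoid dividing by zero on empty input
    return [count / total for count in hist]


def extract_all(images, extract=toy_feature,
                executor_cls=ProcessPoolExecutor, workers=4):
    """Compute one feature per image in parallel, preserving input order.

    ProcessPoolExecutor spreads CPU-bound work over several cores;
    ThreadPoolExecutor can be passed instead when the extractor
    releases the GIL or is mostly waiting on I/O.
    """
    with executor_cls(max_workers=workers) as pool:
        return list(pool.map(extract, images))
```

Calling `extract_all(downloads)` from a script's main guard would use separate processes; passing `executor_cls=ThreadPoolExecutor` keeps everything in one process.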

  11. In more detail… • For a given query, the top 200 Google Image Hits are downloaded • For each of these a CNN feature is computed online • This is the positive training data

  12. Negative Training Data • Offline, images are downloaded for Google searches of ‘things’ and ‘photos’ • The features for these are pre-computed

  13. Classification • A Support Vector Machine is used to learn a classifier that discriminates the positive training data from the negative data (example classes: beard vs. not beard)
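
As a rough illustration of this step, a linear SVM can be fitted by minimising the L2-regularised hinge loss over the positive and negative feature vectors. The slides do not say which solver or hyperparameters are used, so the batch subgradient descent below, along with the names `train_linear_svm`, `lam`, `lr` and `iters`, is an assumption made for the sketch.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm(positives, negatives, lam=0.01, lr=0.1, iters=200):
    """Batch subgradient descent on the L2-regularised hinge loss.

    A sketch only: solver, step size and regularisation are assumptions.
    """
    data = [(x, 1.0) for x in positives] + [(x, -1.0) for x in negatives]
    n, dim = len(data), len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(iters):
        grad_w, grad_b = [lam * wi for wi in w], 0.0   # gradient of the regulariser
        for x, y in data:
            if y * (dot(w, x) + b) < 1.0:              # margin violated: hinge is active
                grad_w = [g - y * xi / n for g, xi in zip(grad_w, x)]
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def score(w, b, x):
    """Signed classifier score; larger means more likely to contain the category."""
    return dot(w, x) + b
```

In the system described here, `positives` would be the features of the downloaded Google Images and `negatives` the pre-computed features of the negative pool.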

  14. Retrieval • The classifier is applied to the pre-processed features of ‘Your Paintings’ • Each painting is given a score by the classifier

  15. Retrieval • The paintings are displayed in order of score (example query: beard)
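
Retrieval then amounts to scoring every pre-computed painting feature with the learned classifier and sorting. A minimal sketch, assuming a linear classifier given as weights `w` and bias `b`; the function name `rank_paintings` and the dictionary layout are hypothetical.

```python
def rank_paintings(w, b, painting_features, top_k=None):
    """Return (painting_id, score) pairs sorted from highest to lowest score.

    painting_features maps a painting id to its pre-computed feature vector.
    """
    scored = [
        (pid, sum(wi * xi for wi, xi in zip(w, feat)) + b)
        for pid, feat in painting_features.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_k is None else scored[:top_k]
```

Because the painting features are computed offline, the only online cost of this step is one dot product per painting plus the sort.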

  16. The Architecture - Timings • Stage timings shown on the diagram, in order: 2s, 0.5s, <0.5s, 4.5s, <0.5s

  17. Example Queries: bridge

  18. Example Queries: carriage

  19. Example Queries: flower

  20. Example Queries: house

  21. Outline • Methodology • Quantitative Evaluation • Aligning retrieved objects

  22. Quantitative Evaluation • Evaluating the domain transfer problem of learning classifiers on natural images and applying these to paintings

  23. Test Set • This requires an annotated dataset of paintings • 10,000 paintings in ‘Your Paintings’ have been tagged by the public • These tags and the painting titles are used to form the ‘Paintings Dataset’, with annotations corresponding to classes of PASCAL VOC

  24. The Paintings Dataset • Assume complete annotation in the PASCAL sense • Assess by calculating APs • Paintings per class: Aeroplane 200, Bird 805, Boat 2143, Chair 1202, Cow 625, Dining-table 1201, Dog 1145, Horse 1493, Sheep 751, Train 329 • Example images shown: dog, horse, train
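
For reference, average precision (AP) for one class can be computed from the classifier scores and the binary annotations. The sketch below uses the non-interpolated form (mean of the precision at the rank of each positive); the PASCAL VOC toolkits have used interpolated variants, and the slides do not say which one applies here, so treat this choice as an assumption.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean precision at the rank of each positive item.

    scores: classifier scores, one per painting.
    labels: truthy where the painting is annotated with the class.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, precisions = 0, []
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0
```

A ranking that places every annotated painting ahead of every other painting gives an AP of 1.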

  25. Training Datasets • 4 Datasets of natural images are used for training • VOC12, VOC12+, Net Noisy, Net Curated

  26. Experiments • Features compared: Shallow Features (Fisher Vectors) vs. Deep Features (Convolutional Neural Networks, CNNs)

  27. Experiments - Features • Fisher Vector vs. CNN features • CNN features outperform Fisher Vectors • CNN features have the added advantage of lower dimensionality

  28. Augmentation • No augmentation (dimensions shown in figure: 224, 224) • C+F augmentation (dimensions shown in figure: 224, 256)

  29. Experiments - Augmentation • Sum Pool: the classifier is applied to the mean of the augmented windows • Max Pool: the classifier is applied to each augmented window and the maximum score is recorded • Best performance is augmentation + sum pool, but no augmentation + sum pool is almost as good
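
The two pooling schemes differ only in where the classifier is applied. A minimal sketch for a linear classifier with weights `w` and bias `b`, where `window_features` holds one feature vector per augmented window; the function names are made up for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sum_pool_score(w, b, window_features):
    """Average the window features first, then apply the classifier once."""
    n = len(window_features)
    mean = [sum(column) / n for column in zip(*window_features)]
    return dot(w, mean) + b


def max_pool_score(w, b, window_features):
    """Apply the classifier to every window and keep the highest score."""
    return max(dot(w, feat) + b for feat in window_features)
```

Sum pooling stores a single vector per image, whereas max pooling needs every window's feature at query time.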

  30. Experiments - Dimensionality • 1K-dimensional features perform best • The other dimensionalities are not far behind, however

  31. Experiment Conclusions • For the on-the-fly system, 1K CNN features are used as these performed best • Sum-pooled features are used for ‘Your Paintings’, as time is not a factor in computing these offline • No augmentation is used on the images downloaded from Google (0.3s per image per core without augmentation vs. 2.4s with it)
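
To see why augmentation is skipped online, the per-image costs above can be combined with the 200 downloaded images per query. The sketch below is back-of-the-envelope arithmetic only: it assumes the images are split evenly across cores and ignores download time, and the core count is a hypothetical parameter, since the slides do not state how many cores are used.

```python
import math


def extraction_seconds(n_images, seconds_per_image, cores):
    """Rough wall-clock estimate when images are split evenly across cores."""
    images_per_core = math.ceil(n_images / cores)
    return images_per_core * seconds_per_image
```

With a hypothetical 16 cores, 200 images take about 3.9s without augmentation and about 31.2s with it; on a single core the figures are about 60s and 480s.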

  32. Outline • Methodology • Quantitative Evaluation • Aligning retrieved objects

  33. Alignment • Some objects are automatically aligned (example query: moustache)

  34. The Pencil Moustache • Anonymous Trendsetter, 1565 • Copycats, Now

  35. Alignment • Other objects require some work (example query: train)

  36. Solution • Learn a DPM [1] on either (1) annotated bounding boxes (e.g. PASCAL VOC) or (2) the downloaded Google Images • [1] P. Felzenszwalb, R. Girshick, D. McAllester, D. Ramanan, Object Detection with Discriminatively Trained Part Based Models, IEEE TPAMI, 2010

  37. Auto-alignment: train

  38. Auto-alignment: horse

  39. Conclusion • We provide a system that can find objects in paintings with high precision in very little time • The objects found can be further curated using a DPM

  40. Links • VISOR: Visual Search of BBC News [1] http://www.robots.ox.ac.uk/~vgg/research/on-the-fly/ • CNN code [2] http://www.robots.ox.ac.uk/~vgg/research/deep_eval/ • Our system COMING SHORTLY! [1] K Chatfield, A Zisserman, VISOR: Towards On-the-Fly Large-Scale Object Category Retrieval, ACCV, 2012 [2] Ken Chatfield, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman, Return of the Devil in the Details: Delving Deep into Convolutional Nets, BMVC, 2014

  41. Thank you • Any questions? • Or email elliot@robots.ox.ac.uk
