A Kernel Density Based Approach for Large Scale Image Retrieval and - PowerPoint PPT Presentation

A Kernel Density Based Approach for Large Scale Image Retrieval and Its Application to Tattoo Identification Wei Tong CMU

Outline • Why tattoo retrieval – Used in forensics to identify • Victims, criminals • Content-based image retrieval – Bag of visual words model • Our method – Similar to the language model • Examples • Other interesting applications – Graffiti, trademark… • Summary

Introduction to Tattoos • Tattoo facts – A form of body modification – Embedded deeply into the skin – Over 5000 years

Introduction to Tattoos • Tattoo statistics (2003) – Americans who have at least one tattoo • Young adults (18-29 age group): ~40%

Introduction to Tattoos • Tattoo is a (soft) biometric trait – Primary traits • Identify an individual • May not be available or work well in certain conditions – Soft biometric traits • Provide some identifying information • But lack distinctiveness and permanence to sufficiently differentiate between two individuals • Gender, height, weight, tattoo, scar, birthmark

Introduction to Tattoos • Tattoos for Law Enforcement – Victim identification, e.g. 9/11 attacks • Figure: (a) a tsunami victim, (b) body of an unidentified murdered woman, (c) and (d) body parts found in a Florida state park

Introduction to Tattoos • Tattoos for Law Enforcement • Identify criminals – Often contain hidden meaning about a suspect’s criminal history » e.g. previous convictions, years spent in jail – Gang membership » About 800,000 gang members on the streets nationwide; 100,000 in the greater LA area alone » 18th Street gang, the largest LA-based street gang

Introduction to Tattoos • Law enforcement agencies photograph tattoos – when a suspect is arrested – They have done this for many years

Introduction to Tattoos • Michigan State Police Database ~100,000 images (JPEG, 640x480 color images)

Tattoo Image Matching and Retrieval • ANSI/NIST Tattoo Classes – 8 major classes: Human, Animal, Plant, Flag, Object, Abstract, Symbol, Other – 70 subclasses

Tattoo Image Matching and Retrieval • Existing practice is text search – Class label/keyword based – Tedious manual annotation – Different individuals assign different labels/keywords to the same image – One image, multiple keywords – Non-uniform class/keyword distribution • A single keyword can return thousands of images – Change in keywords requires re-annotation

Tattoo Image Matching and Retrieval • Content-based image retrieval – Given a tattoo image query – Find the most similar tattoos in the database

Introduction to Tattoos • Tattoo-ID system • Prototype tattoo retrieval system • Delivered to FBI in 2011 • Licensed the technology to MorphoTrak

Introduction to Tattoos • A recent story on tattoo identification – This person gave his name as “Darnell Lewis” to a police officer, but the officer noticed that the man had “Frazier”, his real surname, tattooed on his neck. He was arrested on four misdemeanor warrants.

Introduction to Tattoos • The FBI is developing the Next Generation Identification (NGI) system for identifying criminals – $1 billion program – SMT (scar, mark, tattoo) is one of the major components – Expected by 2014

Content-based Image Retrieval • What “content” should be used – Difficult to understand the information needs of a user from a query image

Content-based Image Retrieval • Images with similar color

Content-based Image Retrieval • Images with similar shape

Content-based Image Retrieval • Images with similar semantics

Content-based Image Retrieval • Challenges in CBIR – You get drunk, – REALLY drunk, – Hit over the head, – Kidnapped to another city • in a country on the other side of the world – When you wake up, – You try to figure out what city you are in, and what is going on • That’s what it’s like to be a CBIR system!

Content-based Image Retrieval • Near Duplicate Image Retrieval – Given a query image, identify gallery images with high visual similarity

Content-based Image Retrieval • Bag-of-features – Detect local interest points – Represent each interest point by a descriptor – An image is a collection of those points – In the illustrated example, each descriptor is 5-dimensional • Figure panels: original image, detected key points, descriptors of the key points

Content-based Image Retrieval • Bag-of-words Model: an image vs. a document – A document: a collection of the words in the document – An image: a collection of the key points of the image • What is the difference – The same word appears in many documents – There is no “same key point”, but “similar key points” appear in many images that have similar “visual content” • Group “similar key points” in different images into “visual words”

Content-based Image Retrieval • Bag-of-words Model – Group key points into visual words (b1, b2, …, b8 in the figure) – Represent images by histograms of visual words
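The two steps on this slide can be illustrated with a minimal sketch. It assumes the visual vocabulary (cluster centers) is already given and uses toy 2-D descriptors; the function names `nearest_word` and `bow_histogram` are hypothetical, chosen only for illustration.

```python
import math

def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word (cluster center) closest to a descriptor."""
    return min(range(len(vocabulary)),
               key=lambda j: math.dist(descriptor, vocabulary[j]))

def bow_histogram(descriptors, vocabulary):
    """Count how many key points of one image fall on each visual word."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    return hist

# Toy data (assumed): three 2-D visual words and four key point descriptors.
vocabulary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
descriptors = [(1.0, 1.0), (9.0, 1.0), (0.5, 0.2), (1.0, 8.0)]
histogram = bow_histogram(descriptors, vocabulary)
print(histogram)
```

Note that every descriptor is forced onto its nearest word, however far away that word is.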

Content-based Image Retrieval • Shortcomings of bag-of-words model • Independent steps – Generating “bag-of-words” representation – Image retrieval – Separate steps hurt performance • Computationally expensive – Clustering key points into “visual words” • Inconsistent mapping – Distant keypoints may belong to the same cluster • Influenced by outliers – Every keypoint must be mapped to a cluster center

Our Method • Database – Image -> distribution – Keypoints -> sampled from the distribution – Distribution -> kernel density estimation • Retrieval – Query likelihood • Given a distribution, how likely would it generate the query keypoints ……
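A minimal sketch of the query-likelihood idea, under stated assumptions: each database image is modelled by a kernel density estimate over its own key points, and images are ranked by the log-likelihood of the query key points. The Gaussian kernel, the bandwidth `sigma`, and all function names are illustrative assumptions, not taken from the slides. This direct version scans every image and every key point.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Unnormalized isotropic Gaussian kernel (the shared constant does not change the ranking)."""
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * sigma ** 2))

def log_query_likelihood(query_points, image_points, sigma=1.0, eps=1e-12):
    """Sum over query key points of the log of the image's kernel density estimate."""
    total = 0.0
    for q in query_points:
        density = sum(gaussian_kernel(q, x, sigma) for x in image_points) / len(image_points)
        total += math.log(density + eps)  # eps avoids log(0) for far-away query points
    return total

def rank_images(query_points, database, sigma=1.0):
    """Linear scan: score every image, then sort by decreasing log-likelihood."""
    scores = {name: log_query_likelihood(query_points, pts, sigma)
              for name, pts in database.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy database (assumed): two images with 2-D key point descriptors.
database = {
    "img_a": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    "img_b": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)],
}
query = [(0.1, 0.1), (0.9, 0.1)]
ranking = rank_images(query, database)
print(ranking)
```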

Efficient Kernel Density Estimation • A straightforward way – Kernel density estimation – Retrieval: query likelihood • Two Challenges: – How to efficiently estimate the density function of each image – How to avoid linear scan of the database for retrieval

Efficient Kernel Density Estimation • Weighted mixture model for each image • When is very large

Efficient Kernel Density Estimation • The image is eventually represented by – Estimate by MLE – Problem • is high dimensional • Need to compute for every image

Efficient Kernel Density Estimation • Use a local kernel • is a sparse vector with proper

Efficient Kernel Density Estimation • Updating rule for MLE – It is globally optimal • Approximation – Only update once – Exact optimal solution when are far apart from each other

Efficient Kernel Density Estimation • Algorithm to compute of each image – Range search – Compute
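One way to picture the range-search step is the sketch below. It rests on assumptions that are not confirmed by the slides: the local kernel is taken to be an indicator of distance at most `radius`, the weights start uniform, and a single EM-style update is applied, so each key point splits one unit of mass evenly among the centers within range. The names `range_search` and `sparse_weights` are hypothetical, and the linear scan stands in for whatever spatial index a real system would use.

```python
import math

def range_search(point, centers, radius):
    """Indices of all centers within `radius` of `point` (linear scan for clarity)."""
    return [j for j, c in enumerate(centers) if math.dist(point, c) <= radius]

def sparse_weights(keypoints, centers, radius):
    """One-pass estimate of an image's mixture weights, stored sparsely as {center: weight}."""
    weights = {}
    used = 0
    for x in keypoints:
        near = range_search(x, centers, radius)
        if not near:
            continue  # key point far from every center: dropped as an outlier
        used += 1
        share = 1.0 / len(near)  # split this key point's mass among nearby centers
        for j in near:
            weights[j] = weights.get(j, 0.0) + share
    if used == 0:
        return {}
    # Normalizing by the number of retained key points is an assumption of this sketch.
    return {j: w / used for j, w in weights.items()}

# Toy data (assumed): three centers, four key points, one of them an outlier.
centers = [(0.0, 0.0), (4.0, 0.0), (100.0, 100.0)]
keypoints = [(0.5, 0.0), (2.0, 0.0), (3.8, 0.0), (50.0, 50.0)]
weights = sparse_weights(keypoints, centers, radius=2.5)
print(weights)
```

Only centers that actually receive mass appear in the result, which is what keeps the representation sparse when the radius is small relative to the spread of the centers.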

Regularization • MLE of is poor – with limited number of keypoints • Regularization – global – KL divergence

Regularization • The solution of becomes: Global MLE of • Global is:
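As a stand-in for the regularized solution, the sketch below uses plain linear interpolation between an image's sparse weights and a global weight vector, in the spirit of smoothing in language-model retrieval. This is a swapped-in technique, not the formula from the slide; the mixing parameter `lam` and the function names are assumptions.

```python
def global_weights(all_image_weights):
    """Average the sparse per-image weights into one global weight vector."""
    totals = {}
    for w in all_image_weights:
        for j, v in w.items():
            totals[j] = totals.get(j, 0.0) + v
    n = len(all_image_weights)
    return {j: v / n for j, v in totals.items()}

def smoothed_weight(image_w, global_w, j, lam=0.5):
    """Interpolate the image-specific weight of center j with its global weight."""
    return (1.0 - lam) * image_w.get(j, 0.0) + lam * global_w.get(j, 0.0)

# Toy data (assumed): two images with sparse weights over centers 0 and 1.
images = [{0: 1.0}, {0: 0.5, 1: 0.5}]
g = global_weights(images)
# Image 0 has no key points near center 1, yet its smoothed weight there is non-zero.
w01 = smoothed_weight(images[0], g, 1, lam=0.5)
print(g, w01)
```

The smoothed weights are no longer sparse, since every center with non-zero global weight now has non-zero weight in every image.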

Retrieval • Query Likelihood: • Problem: is not sparse anymore • has two parts Sparse Not sparse

Retrieval • Decompose the query likelihood Constant, independent of Sparse, inverted index can be individual images, ignored in utilized to identify a small set image search of candidate images. …

Retrieval • Algorithm: query – Construct for active centers: – Construct candidate image set – Compute query likelihood for every candidate image
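The three steps can be sketched with an inverted index from centers to images. The sketch assumes an indicator kernel with radius `radius` and interpolated weights (1 - lam) * image weight + lam * global weight; under those assumptions the log-likelihood of a query point splits into a background term shared by all images, which is dropped, and a `log1p` term that is non-zero only for images with weight on an active center. All names and the smoothing form are illustrative assumptions.

```python
import math

def build_inverted_index(image_weights):
    """Map each center id to the (image, weight) pairs with non-zero weight on it."""
    index = {}
    for image_id, weights in image_weights.items():
        for j, w in weights.items():
            index.setdefault(j, []).append((image_id, w))
    return index

def retrieve(query_points, centers, index, global_w, radius, lam=0.5):
    """Score only candidate images that share at least one active center with the query."""
    scores = {}
    for q in query_points:
        # Active centers: those within the kernel radius of this query key point.
        active = [j for j, c in enumerate(centers) if math.dist(q, c) <= radius]
        background = lam * sum(global_w.get(j, 0.0) for j in active)
        if background <= 0.0:
            continue  # no usable center near this query point
        local = {}
        for j in active:
            for image_id, w in index.get(j, []):
                local[image_id] = local.get(image_id, 0.0) + w
        for image_id, s in local.items():
            # Image-dependent part of the log-likelihood; the shared part is omitted.
            scores[image_id] = scores.get(image_id, 0.0) + math.log1p((1.0 - lam) * s / background)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (assumed): two centers and three images with sparse weights.
centers = [(0.0, 0.0), (10.0, 0.0)]
image_weights = {"a": {0: 1.0}, "b": {1: 1.0}, "c": {0: 0.5, 1: 0.5}}
global_w = {0: 0.5, 1: 0.5}
index = build_inverted_index(image_weights)
results = retrieve([(0.2, 0.0), (0.1, 0.1)], centers, index, global_w, radius=1.0)
print(results)
```

Image "b" is never touched for this query because it has no weight on the only active center, which is how the index avoids a linear scan of the database.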

Compare to BoW
• Framework – Our method: unified framework – BoW: BoW generation and retrieval are separated
• Vocabulary – Our method: randomly sample centers (efficient) – BoW: clustering (very slow)
• Mapping – Our method: only close-by keypoints are mapped to nearby centers (consistent mapping) – BoW: distant keypoints may belong to the same cluster (inconsistent mapping)
• Outliers – Our method: keypoints far away from all centers are discarded (robust to outliers) – BoW: every keypoint must be mapped to a center (influenced by outliers)

Experiments • Three datasets
– Tattoo: 101,754 images, 10,843,145 features, 3.4GB database, 995 queries
– Oxford5K: 5,062 images, 14,972,956 features, 4.7GB database, 55 queries
– Oxford5K+Flickr1M: 1,002,805 images, 823,297,045 features, 252.7GB database, 55 queries

Experiments • Tattoo dataset – Provided by Michigan State Police Department – Images are manually cropped prior to feature extraction – Examples of near duplicates

Experiments • Oxford building dataset – From the VGG group at Oxford University – 11 Oxford landmarks

Experiments • Metrics – CMC scores for tattoo dataset • Percentage of queries whose matched images are found in the first k retrieved images – mAP for the other two datasets • AP: the area under precision-recall curve • mAP: mean AP of all the queries
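Both metrics can be computed from ranked result lists, as in the sketch below. It assumes each query comes with a set of ground-truth matches, and it computes AP as the mean of the precision values at each relevant hit, a common approximation of the area under the precision-recall curve; benchmark scripts may integrate the curve slightly differently. Function names are hypothetical.

```python
def cmc_at_k(ranked_lists, relevant_sets, k):
    """Fraction of queries with at least one true match among the first k results."""
    hits = sum(1 for ranked, rel in zip(ranked_lists, relevant_sets)
               if any(item in rel for item in ranked[:k]))
    return hits / len(ranked_lists)

def average_precision(ranked, relevant):
    """Mean of precision@rank taken at the rank of each relevant item."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0

def mean_average_precision(ranked_lists, relevant_sets):
    """Average AP over all queries."""
    aps = [average_precision(r, rel) for r, rel in zip(ranked_lists, relevant_sets)]
    return sum(aps) / len(aps)

# Toy results (assumed): two queries with their ranked lists and ground truth.
ranked_lists = [["a", "x", "b", "y"], ["y", "z", "q"]]
relevant_sets = [{"a", "b"}, {"q"}]
cmc1 = cmc_at_k(ranked_lists, relevant_sets, 1)
cmc3 = cmc_at_k(ranked_lists, relevant_sets, 3)
ap0 = average_precision(ranked_lists[0], relevant_sets[0])
m_ap = mean_average_precision(ranked_lists, relevant_sets)
print(cmc1, cmc3, ap0, m_ap)
```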

Results • Retrieval accuracy
– Our method: Tattoo (CMC rank 10) 0.82, Oxford5K 0.61, Oxford5K+Flickr1M 0.45
– BoW (AKM) + TF-IDF: Tattoo (CMC rank 10) 0.78, Oxford5K 0.57, Oxford5K+Flickr1M 0.39
• Speed – Similar retrieval speed as BoW+TF-IDF – Tattoo: 0.01s/query – Oxford5K+Flickr1M: 0.9s/query

Experiments on Tattoo Dataset • Number of random centers:

Experiments on Tattoo Dataset • Radius for range search: – : average pairwise distance

Experiments on Tattoo Dataset • Not sensitive to
