Shape Context Matching For Efficient OCR
Sudeep Pillai
May 14, 2012
Table of contents
1 Background & Motivation
  Motivation
  Background
2 Shape Context
  What is a Shape Context?
  Matching Shape Contexts
  Similarity Measure
3 Fast Matching
  Dimensionality Reduction
  Matching Shape Contexts via Pyramid Matching
  Efficient Matching
Motivation
- Automatic translation and transcription of handwritten or printed text
- Printed text has several geometric constraints that can be exploited for improved performance
- Work so far has pushed hard on accuracy, with comparatively little attention to optimization
Optical Character Recognition: MNIST database performance
- Digits are size-normalized and centered in a fixed-size image
- 60,000 training examples, 10,000 test examples

Classifier | Preprocessing | Test error rate (%)
Linear classifiers
  Linear classifier (1-layer NN) | None | 12.0
  Pairwise linear classifier | Deskewing | 7.6
K-nearest neighbors
  K-NN, Euclidean (L2) | None | 3.09
  K-NN, L3 | Deskewing, noise removal | 1.22
  K-NN, shape context matching | Shape context extraction | 0.63
Optical Character Recognition: MNIST database performance (continued)

Classifier | Preprocessing | Test error rate (%)
SVMs
  SVM, Gaussian kernel | None | 1.4
  Virtual SVM, deg-9 poly, 2-pixel jittered | None | 0.56
Neural nets
  Deep convex net, unsupervised pre-training | None | 0.83
Convolutional nets
  Committee of 35 conv. nets | Normalization | 0.23
Figure: A few digits from the MNIST database
What is a Shape Context?

Definition (Shape): A shape is represented as a sequence of boundary points:
$$P = \{p_1, \ldots, p_n\}, \quad p_i \in \mathbb{R}^2$$

Definition (Shape Context): The shape context of an interest point $p_i$ is a descriptor given by the histogram
$$h_i(k) = \#\{\, p_j : j \neq i,\ p_j - p_i \in \mathrm{bin}(k) \,\}$$
whose bins are uniformly divided in log-polar space.
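The histogram definition above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation behind these slides: the function name `shape_context`, the 5 x 12 bin layout (chosen only because it gives 60 bins), the radial limits `r_min` and `r_max`, and the choice to clamp out-of-range radii into the nearest bin are all hypothetical.

```python
import math

def shape_context(points, i, n_r=5, n_theta=12, r_min=0.125, r_max=2.0):
    """Log-polar histogram h_i for points[i] (bin layout is an assumption)."""
    # Normalise radii by the mean pairwise distance (scale invariance).
    dists = [math.dist(p, q)
             for a, p in enumerate(points) for q in points[a + 1:]]
    mean_dist = sum(dists) / len(dists)

    log_lo, log_hi = math.log(r_min), math.log(r_max)
    hist = [0] * (n_r * n_theta)
    xi, yi = points[i]
    for j, (xj, yj) in enumerate(points):
        if j == i:
            continue  # the definition excludes the reference point itself
        r = math.hypot(xj - xi, yj - yi) / mean_dist
        # Assumption: clamp radii so every other point lands in some bin.
        r = min(max(r, r_min), r_max)
        theta = math.atan2(yj - yi, xj - xi) % (2 * math.pi)
        # Radial bins are uniform in log(r); angular bins are uniform in theta.
        r_bin = int((math.log(r) - log_lo) / (log_hi - log_lo) * n_r)
        r_bin = min(r_bin, n_r - 1)
        t_bin = int(theta / (2 * math.pi) * n_theta) % n_theta
        hist[r_bin * n_theta + t_bin] += 1
    return hist

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
h = shape_context(square, 0)
print(len(h), sum(h))
```

Because offsets are taken relative to $p_i$ and radii are divided by the mean pairwise distance, translating or uniformly scaling the point set leaves the histogram unchanged in this sketch.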
Shape Context Representation
Figure: Graphical representation of shape context bins
Shape Context Histogram
Figure: Graphical representation of shape context histograms (vectors in $\mathbb{R}^{60}$)
Matching Shape Contexts
- The cost of matching point $p_i$ on the first shape to point $q_j$ on the second shape is the chi-square distance between their histograms:
$$C_{ij} = \frac{1}{2} \sum_{k=1}^{K} \frac{[h_i(k) - h_j(k)]^2}{h_i(k) + h_j(k)}$$
- Minimize the total matching cost over permutations $\pi$:
$$\sum_i C\big(p_i, q_{\pi(i)}\big)$$
- Optimal matching: one possible technique for solving this problem is the Hungarian method, which runs in $O(n^3)$ time
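To make the cost and the minimization concrete, here is a small sketch. The function names `chi2_cost` and `best_matching_bruteforce` are hypothetical, and the exhaustive search over permutations is a deliberate stand-in for the Hungarian method: it returns the same optimum but takes factorial time, so it is only suitable for a handful of points. For realistic point counts an assignment solver such as SciPy's `scipy.optimize.linear_sum_assignment` would be the usual choice. Skipping bins where both histograms are empty is also an assumption, made to avoid dividing by zero.

```python
from itertools import permutations

def chi2_cost(h_i, h_j):
    """C_ij = 1/2 * sum_k (h_i(k) - h_j(k))^2 / (h_i(k) + h_j(k))."""
    total = 0.0
    for a, b in zip(h_i, h_j):
        if a + b > 0:  # assumption: bins empty in both histograms add no cost
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total

def best_matching_bruteforce(hists_p, hists_q):
    """Find the permutation pi minimising sum_i C(p_i, q_pi(i)).

    Exhaustive search, not the Hungarian method; assumes both shapes
    have the same number of points.
    """
    n = len(hists_p)
    cost = [[chi2_cost(hp, hq) for hq in hists_q] for hp in hists_p]
    best_pi, best_cost = None, float("inf")
    for pi in permutations(range(n)):
        c = sum(cost[i][pi[i]] for i in range(n))
        if c < best_cost:
            best_pi, best_cost = pi, c
    return best_pi, best_cost

# Toy histograms: the second shape holds the same descriptors, reordered.
hists_p = [[3, 0, 1], [0, 4, 0], [1, 1, 2]]
hists_q = [[0, 4, 0], [1, 1, 2], [3, 0, 1]]
print(best_matching_bruteforce(hists_p, hists_q))
```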
Properties of shape contexts
- Invariant to translation and scale (distances are normalized by the mean distance over the $n^2$ point pairs)
- Can be made invariant to rotation (by measuring angles relative to the local tangent orientation)
- Tolerant to small affine distortions (log-polar binning gives a spatial blur proportional to $r$)
Similarity Measure

Definition: After applying a cubic spline transformation $T$, the similarity of the two shapes can be measured via a weighted sum
$$D = a D_{ac} + D_{sc} + b D_{be}$$
- $D_{sc}$: shape context distance
- $D_{ac}$: appearance cost
- $D_{be}$: bending energy, or transformation cost
Dimensionality Reduction
- Approximate matching is possible with the full shape context feature
- A low-dimensional feature descriptor is desirable for performance
- With a uniform bin approximation, matching accuracy declines as the feature dimension $d$ grows
- Multiple modalities remain representable even in a reduced subspace
- Use Principal Component Analysis (PCA) to determine the bases that define this shape context subspace
- Approximate matching is faster once all $\mathbb{R}^{60}$ vectors are projected onto $\mathbb{R}^3$
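The projection step can be illustrated with a dependency-free sketch. The names `pca_basis` and `project` are hypothetical, and the basis is found here by power iteration with Gram-Schmidt deflation, a simple substitute for the eigen-solver a numerical library would provide. It is meant to show what "projecting onto a PCA subspace" computes, not how the slides' results were produced.

```python
import math
import random

def pca_basis(vectors, d, iters=200, seed=0):
    """Return (mean, basis); basis holds the top-d principal directions."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[c] for v in vectors) / n for c in range(dim)]
    centered = [[v[c] - mean[c] for c in range(dim)] for v in vectors]
    # Sample covariance matrix of the centred data (dim x dim).
    cov = [[sum(row[a] * row[b] for row in centered) / n for b in range(dim)]
           for a in range(dim)]
    rng = random.Random(seed)
    basis = []
    for _component in range(d):
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _step in range(iters):
            # Power iteration: repeatedly apply the covariance matrix.
            w = [sum(cov[a][b] * w[b] for b in range(dim)) for a in range(dim)]
            # Deflation: remove directions that were already extracted.
            for u in basis:
                dot = sum(x * y for x, y in zip(w, u))
                w = [x - dot * y for x, y in zip(w, u)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # no variance left in the remaining subspace
                w = [0.0] * dim
                break
            w = [x / norm for x in w]
        basis.append(w)
    return mean, basis

def project(v, mean, basis):
    """Coordinates of v in the subspace spanned by the basis."""
    return [sum((x - m) * u_c for x, m, u_c in zip(v, mean, u)) for u in basis]

# Toy data: 3-D points on a single line, reduced to one coordinate.
data = [[float(t), 2.0 * t, -1.0 * t] for t in range(-5, 6)]
mean, basis = pca_basis(data, d=1)
print(project([1.0, 2.0, -1.0], mean, basis))
```

In the setting of the slides, `vectors` would be the $\mathbb{R}^{60}$ shape context histograms and `d` would be 3. The sign of each principal direction is arbitrary, so projected coordinates are only defined up to sign.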
Dimensionality Reduction
Figure: Projecting histograms of contour points onto the shape context subspace. The points on the human figure on the right are colored according to their 3-D shape context subspace feature values.
Dimensionality Reduction
Figure: Visualization of the feature subspace constructed from shape context histograms for two different data sets. The RGB channels of each contour point are set according to its histogram's 3-D PCA coefficient values. Set matching in this feature space means that contour points of similar color have a low matching cost, while highly contrasting colors incur a high matching cost.
Dimensionality Reduction: Tradeoffs
The larger $d$ is:
- the smaller the PCA reconstruction error
- the larger the distortion induced by the L1 embedding
- the larger the complexity of computing the embedding
Do we really need an $\mathbb{R}^{60}$ feature vector to represent a shape?
- Shapes are almost never similar
- Approximate measures make more sense
- Extract only the most discriminating dimensions as the descriptor
Pyramid Matching
- $X$ and $Y$ are two sets of vectors in an $\mathbb{R}^d$ feature space
- Find an approximate correspondence between $X$ and $Y$
Pyramid Matching Overview
Pyramid Matching Kernels
- Construct a sequence of grids at resolutions $0, \ldots, L$, where the grid at resolution $l$ has $D = 2^{dl}$ cells
- Compute the histograms $H_X^l$ and $H_Y^l$ of $X$ and $Y$ at resolution $l$; $H_X^l(i)$ and $H_Y^l(i)$ are the numbers of points of $X$ and $Y$ that fall in the $i$-th cell
- Compute the number of matches at each resolution using the histogram intersection:
$$I(H_X^l, H_Y^l) = \sum_{i=1}^{D} \min\big(H_X^l(i), H_Y^l(i)\big)$$
Pyramid Matching Kernels
- Sum the $I^l = I(H_X^l, H_Y^l)$, giving more weight to matches found at higher resolutions:
$$K(X, Y) = I^L + \sum_{l=0}^{L-1} \frac{1}{2^{L-l}} \left(I^l - I^{l+1}\right) = \frac{1}{2^L} I^0 + \sum_{l=1}^{L} \frac{1}{2^{L-l+1}} I^l$$
- Here $I^l - I^{l+1}$ is the number of new matches found at level $l$
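The grid histograms, the intersection, and both forms of the kernel can be checked against each other with a short sketch. It assumes the feature vectors have already been scaled into the unit cube $[0, 1)^d$, so that level $l$ splits each axis into $2^l$ intervals and the grid has $2^{dl}$ cells. All function names are hypothetical, and only occupied cells are stored to keep the example small.

```python
from collections import Counter

def histogram(points, level):
    """Count points per grid cell; the grid has 2**level cells per axis."""
    cells = 2 ** level
    return Counter(
        tuple(min(int(x * cells), cells - 1) for x in p) for p in points
    )

def intersection(hx, hy):
    """I(H_X, H_Y) = sum_i min(H_X(i), H_Y(i)); empty cells contribute 0."""
    return sum(min(count, hy[cell]) for cell, count in hx.items())

def level_matches(X, Y, L):
    """Return [I^0, ..., I^L], the match counts at each resolution."""
    return [intersection(histogram(X, l), histogram(Y, l))
            for l in range(L + 1)]

def pyramid_match(X, Y, L):
    """K = I^L + sum_{l=0}^{L-1} (I^l - I^{l+1}) / 2^(L-l)."""
    I = level_matches(X, Y, L)
    return I[L] + sum((I[l] - I[l + 1]) / 2 ** (L - l) for l in range(L))

def pyramid_match_expanded(X, Y, L):
    """Equivalent form: K = I^0 / 2^L + sum_{l=1}^{L} I^l / 2^(L-l+1)."""
    I = level_matches(X, Y, L)
    return I[0] / 2 ** L + sum(I[l] / 2 ** (L - l + 1) for l in range(1, L + 1))

# Toy 2-D point sets inside the unit square.
X = [(0.1, 0.1), (0.6, 0.7), (0.9, 0.2)]
Y = [(0.15, 0.12), (0.55, 0.9), (0.4, 0.4)]
print(level_matches(X, Y, 2))
print(pyramid_match(X, Y, 2), pyramid_match_expanded(X, Y, 2))
```

For these toy sets the match counts shrink as the grid gets finer, and the two expressions for $K(X, Y)$ agree. This is a useful sanity check when reimplementing the kernel.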