Interframe Coding of Global Image Signatures for Mobile Augmented Reality
David Chen (1), Mina Makar (1,2), Andre Araujo (1), Bernd Girod (1)
(1) Department of Electrical Engineering, Stanford University
(2) Qualcomm Inc.
Chen et al., Interframe Coding of Global Image Signatures for Mobile Augmented Reality
Samsung Galaxy S3 Smartphone
On-Device Image Matching System
[Diagram: Extract Local Features → Generate Global Signature → Match with Database Global Signatures → Perform Geometric Verification, all using a Local Database]
Hybrid Image Matching System
[Diagram]
Mobile device (Local Database):
• Extract Local Features → Generate Global Signature → Match with Database Global Signatures → Perform Geometric Verification
• Send compact interframe coded stream of global signatures in uplink
Wireless Network
Server (Remote Database):
• Match with Database Global Signatures (example ranked scores: 0.62, 0.51, 0.50, 0.49, …)
• Send labels, local features, and global signatures for top-ranked database candidates in downlink
Outline
• Related work on feature compression
• Interframe coding of global signatures
  – Selective codeword propagation
  – Selective frame propagation
  – Selective frame propagation + local search
  – Global signature embedding
• Coding and retrieval results
Related Work on Feature Compression
Local image features, intraframe coding:
• Random projections [Yeo, 2008]
• Transform coding [Chandrasekhar, 2009]
• Location histogram coding [Tsai, 2009]
• CHoG [Chandrasekhar, 2009]
• Bag of hash bits [He, 2011]
Local image features, interframe coding:
• Temporally coherent keypoint detector [Makar, 2012]
• Descriptor and location predictive coding [Makar, 2012] [Baroffio, 2014]
Global image signatures, intraframe coding:
• Trace transform [Brasnett, 2007]
• Tree histogram coding [Chen, 2009]
• Residual vectors [Perronnin, 2010] [Jegou, 2010] [Chen, 2011]
Global image signatures, interframe coding:
• Our Current Work
Temporally Coherent Keypoint Detection
Makar et al., 2012
[Figure: time axis t with a Detection (D) Frame and a Forward Propagation (FP) Frame]
Generating Feature Residuals
[Figure: example with k = 3]
Residual Enhanced Visual Vector (REVV)
Chen et al., 2011
[Diagram blocks, query side: Query Image; Extract Local Features; Reduce Dimensions by PCA; Quantize Feature Descriptors (k codewords); Normalize Feature Residuals; Regularize with Power Law; Perform Cell-Specific LDA; Binarize Components from Sign (-1, +1)]
[Diagram blocks, matching side: Database Signatures; Compute Weighted Correlations; Normalize Correlation Scores; Ranked List (example scores: 0.74, 0.73, 0.72, 0.70, 0.63, 0.62, …)]
[Plot: Probability vs. Hamming Distance (0 to 30), for Matching Images and Non-matching Images]
[Plot: Weights vs. Hamming Distance (0 to 30)]
Interframe Coding of REVV
[Diagram]
Mobile Device:
• Keypoints from a D-Frame, then FP-Frames, then the next D-Frame
• Extract REVV for each frame: R_1, R_2, …, R_n, R_{n+1}
• Predictively Encode using previously transmitted signatures (S_1, S_2, …, S_{n-1}) to produce S_1, S_2, …, S_n, S_{n+1}
• Continuous stream of REVV signatures sent over the Wireless Network
Server:
• Decode REVV Signatures using Previous REVV Signatures
• Compute Weighted Correlations against Database REVV Signatures
• Return recognition results + features for top candidates
Interframe Coding of REVV
[Diagram: mobile-side pipeline (D-Frame and FP-Frame keypoints → Extract REVV → Predictively Encode), with the signature of one frame t expanded]
Extracted REVV Signature:
• U_{t,1} = 1, U_{t,2} = 1, U_{t,3} = 0, U_{t,4} = 1, U_{t,5} = 0, U_{t,6} = 1, …, U_{t,k} = 1
• R_{t,1}, R_{t,2}, R_{t,4}, R_{t,6}, …, R_{t,k} (present where U_{t,j} = 1)
Transmitted REVV Signature:
• V_{t,1} = 1, V_{t,2} = 1, V_{t,3} = 0, V_{t,4} = 1, V_{t,5} = 0, V_{t,6} = 1, …, V_{t,k} = 1
• S_{t,1}, S_{t,2}, S_{t,4}, S_{t,6}, …, S_{t,k} (present where V_{t,j} = 1)
Selective Codeword Propagation (SCP)
Mobile Device, Frame t (D-Frame), extracted:
• U_t = [1, 1, 0, 1, 1, 1, 0, 1, …]
• R_{t,1}, R_{t,2}, R_{t,4}, R_{t,5}, R_{t,6}, R_{t,8}
Mobile Device, Frame t+1 (FP-Frame), extracted:
• U_{t+1} = [1, 1, 1, 1, 0, 1, 0, 1, …]
• R_{t+1,1}, R_{t+1,2}, R_{t+1,3}, R_{t+1,4}, R_{t+1,6}, R_{t+1,8}
Codeword-wise AND of the two extracted flag rows gives the flags of the sent FP-Frame.
Mobile Device, Frame t+1 (FP-Frame), sent:
• V_{t+1} = [1, 1, 0, 1, 0, 1, 0, 1, …]
• S_{t+1,1}, S_{t+1,2}, S_{t+1,4}, S_{t+1,6}, S_{t+1,8}
Wireless Network
Server, Frame t (D-Frame), received:
• V_t = [1, 1, 0, 1, 1, 1, 0, 1, …]
• S_{t,1}, S_{t,2}, S_{t,4}, S_{t,5}, S_{t,6}, S_{t,8}
Server, Frame t+1 (FP-Frame), received:
• V_{t+1} = [1, 1, 0, 1, 0, 1, 0, 1, …]
• S_{t+1,1}, S_{t+1,2}, S_{t+1,4}, S_{t+1,6}, S_{t+1,8}
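The AND step on the SCP slide can be sketched in a few lines of Python. This is only an illustration built from the flag values on the slide: the function names are hypothetical, the flags are modeled as lists of 0/1 integers, and only the first eight codewords are used. The slide shows the AND taken between a D-Frame and the following FP-Frame; how later FP-Frames chain is not stated, so the sketch covers just that one step.

```python
from typing import List


def scp_mask(prev_flags: List[int], cur_flags: List[int]) -> List[int]:
    """Codeword-wise AND of the previous frame's flags and the current frame's flags."""
    if len(prev_flags) != len(cur_flags):
        raise ValueError("flag vectors must have the same length k")
    return [a & b for a, b in zip(prev_flags, cur_flags)]


def sent_codewords(mask: List[int]) -> List[int]:
    """1-based indices j of codewords whose flag is set."""
    return [j for j, bit in enumerate(mask, start=1) if bit]


# Flag values for the first eight codewords, copied from the slide.
u_t = [1, 1, 0, 1, 1, 1, 0, 1]    # Frame t (D-Frame), extracted
u_t1 = [1, 1, 1, 1, 0, 1, 0, 1]   # Frame t+1 (FP-Frame), extracted

v_t1 = scp_mask(u_t, u_t1)
print(v_t1)                  # [1, 1, 0, 1, 0, 1, 0, 1]
print(sent_codewords(v_t1))  # [1, 2, 4, 6, 8]
```

The result matches the V_{t+1} row and the sent residuals S_{t+1,1}, S_{t+1,2}, S_{t+1,4}, S_{t+1,6}, S_{t+1,8} on the slide.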
Selective Frame Propagation (SFP)
Mobile Device, Frame t (D-Frame), extracted:
• U_t = [1, 1, 0, 1, 1, 1, 0, 1, …]
• R_{t,1}, R_{t,2}, R_{t,4}, R_{t,5}, R_{t,6}, R_{t,8}
Mobile Device, Frame t+1 (FP-Frame), extracted:
• U_{t+1} = [1, 1, 1, 1, 0, 1, 0, 1, …]
• R_{t+1,1}, R_{t+1,2}, R_{t+1,3}, R_{t+1,4}, R_{t+1,6}, R_{t+1,8}
Interframe Codeword Similarity:
$$ r_{t,t+1} = \frac{\sum_{j=1}^{k} \mathrm{AND}\left(U_{t,j},\, U_{t+1,j}\right)}{\sum_{j=1}^{k} U_{t+1,j}} $$
Decision: r_{t,t+1} > t_r ? (Yes / No)
SFP Encoding:
• V_{t+1} = [1, 1, 0, 1, 0, 1, 0, 1, …]
• S_{t+1,1}, S_{t+1,2}, S_{t+1,4}, S_{t+1,5}, S_{t+1,6}, S_{t+1,8}
SCP Encoding:
• V_{t+1} = [1, 1, 0, 1, 0, 1, 0, 1, …]
• S_{t+1,1}, S_{t+1,2}, S_{t+1,4}, S_{t+1,6}, S_{t+1,8}
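As a rough numerical check of the similarity test on the SFP slide, the sketch below evaluates it on the slide's own flag values. The slide's formula is damaged in extraction, so the ratio used here (codewords flagged in both frames, divided by codewords flagged in frame t+1) is a reconstruction and should be treated as an assumption. The function name is hypothetical, and t_r = 0.9 is the threshold listed in the experimental setup.

```python
from typing import List


def codeword_similarity(prev_flags: List[int], cur_flags: List[int]) -> float:
    """Assumed similarity: |flags set in both frames| / |flags set in current frame|."""
    if len(prev_flags) != len(cur_flags):
        raise ValueError("flag vectors must have the same length k")
    shared = sum(a & b for a, b in zip(prev_flags, cur_flags))
    current = sum(cur_flags)
    # Guard against an empty current frame; returning 0.0 here is an assumption.
    return shared / current if current else 0.0


# Flag values for the first eight codewords, copied from the slide.
u_t = [1, 1, 0, 1, 1, 1, 0, 1]
u_t1 = [1, 1, 1, 1, 0, 1, 0, 1]

T_R = 0.9  # interframe codeword similarity threshold from the experimental setup

r = codeword_similarity(u_t, u_t1)
print(round(r, 3))  # 0.833  (5 shared codewords out of 6 in frame t+1)
print(r > T_R)      # False
```

On these eight codewords alone the similarity falls below the threshold; a real signature uses all k codewords, so the outcome for an actual frame pair may differ.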
SFP + Local Search (SFP + LS)
Number of inliers in geometric verification: N_geo
• N_geo ≥ t_geo: terminate query locally on mobile device (Local Database)
• N_geo < t_geo: send REVV stream to server by SFP coding
Server (Remote Database): send local features and REVV signatures for top-ranked database candidates to mobile device over the Wireless Network
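The local-versus-remote decision on the SFP + LS slide amounts to one threshold comparison. The sketch below is a minimal illustration, not the authors' implementation: the `Route` enum and `route_query` function are hypothetical names, and the default t_geo = 25 is the RANSAC threshold given in the experimental setup.

```python
from enum import Enum


class Route(Enum):
    LOCAL = "terminate query locally on mobile device"
    SERVER = "send REVV stream to server by SFP coding"


def route_query(n_geo: int, t_geo: int = 25) -> Route:
    """Choose a route from the number of geometric-verification inliers N_geo."""
    return Route.LOCAL if n_geo >= t_geo else Route.SERVER


print(route_query(40).name)  # LOCAL
print(route_query(25).name)  # LOCAL
print(route_query(10).name)  # SERVER
```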
Embedded Global Signatures
[Diagram: Codewords 1 through 8 arranged under three embedding levels]
• Level 1: Highest Bitrate
• Level 2: Medium Bitrate
• Level 3: Lowest Bitrate
Outline
• Related work on feature compression
• Interframe coding of global signatures
  – Selective codeword propagation
  – Selective frame propagation
  – Selective frame propagation + local search
  – Global signature embedding
• Coding and retrieval results
Analysis of Retrieval Performance
Stanford Streaming MAR (Mobile Augmented Reality) Dataset
• 32 VGA-resolution query videos recorded with a camera phone
• Database of 23 labeled objects + 1M distractor images
[Makar et al., 2013]
Experimental Setup
• Interframe coding parameters
  – N_D-Frames = 1 and N_FP-Frames = 29 for frame rate of 30 fps
  – Interframe codeword similarity threshold: t_r = 0.9
  – RANSAC threshold: t_geo = 25 feature matches
• REVV signature parameters
  – 250 SIFT features extracted for every D-Frame
  – Dimensionality reduction to d_LDA = 32
  – Codebook of k = 190 codewords
• Retrieval accuracy vs. uplink bitrate comparison
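To make the frame structure in this setup concrete, the sketch below labels frame indices as D-Frames or FP-Frames using N_D-Frames = 1 and N_FP-Frames = 29. It assumes the D-Frame comes first in each group of 30 frames, which the slide does not state explicitly, and the function name is hypothetical.

```python
def frame_type(index: int, n_d: int = 1, n_fp: int = 29) -> str:
    """Return "D" or "FP" for a 0-based frame index, assuming D-Frames lead each group."""
    period = n_d + n_fp  # 30 frames, i.e. one second of video at 30 fps
    return "D" if index % period < n_d else "FP"


# Two seconds of video at 30 fps.
labels = [frame_type(i) for i in range(60)]
print(labels.count("D"))   # 2
print(labels.count("FP"))  # 58
print([i for i, label in enumerate(labels) if label == "D"])  # [0, 30]
```

Under this assumption, keypoint detection runs once per second and the remaining 29 frames reuse forward-propagated keypoints.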
Retrieval Accuracy vs. Uplink Bitrate
[Plot: retrieval accuracy vs. uplink bitrate. Annotations: 88x, 24x, 14x, and "< 2 kbps"; curves labeled Embedding: Level 1, Embedding: Level 2, Embedding: Level 3]