Search for a similar descriptor in a neighborhood in the next frame. [Figure: target and candidate regions]
Compute a descriptor for the new target. [Figure: target region]
Search for a similar descriptor in a neighborhood in the next frame. [Figure: target and candidate regions]
How do we model the target and candidate regions?
Modeling the target
M-dimensional target descriptor $q = \{q_1, \ldots, q_M\}$ (centered at the target center), a 'fancy' (confusing) way to write a weighted histogram:
$$q_m = C \sum_n k\left(\|x_n\|^2\right)\, \delta[b(x_n) - m]$$
This is a normalized color histogram (weighted by distance), where:
$C$: normalization factor
$\sum_n$: sum over all pixels
$k(\cdot)$: function of inverse distance (the weight)
$\delta[\cdot]$: Kronecker delta function
$b(x_n)$: quantization function, giving the bin ID of pixel $x_n$
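The weighted histogram above can be sketched in a few lines of plain Python. This is a minimal illustration, not the lecture's implementation: the Epanechnikov profile used for $k$ is an assumed choice (the slide leaves $k$ unspecified), and the names `epanechnikov_profile` and `target_histogram` and the toy pixel list are hypothetical. Pixel positions are offsets from the target center, already scaled to fit the unit disk.

```python
def epanechnikov_profile(r2):
    """Assumed kernel profile k(r2): weight falls off with squared distance r2."""
    return 1.0 - r2 if r2 < 1.0 else 0.0

def target_histogram(pixels, num_bins):
    """pixels: list of ((x0, x1), bin_id), offsets measured from the target center."""
    q = [0.0] * num_bins
    for (x0, x1), b in pixels:
        # The Kronecker delta selects exactly one bin, m = b(x_n), per pixel.
        q[b] += epanechnikov_profile(x0 * x0 + x1 * x1)
    total = sum(q)
    if total == 0.0:
        raise ValueError("no pixel falls inside the kernel support")
    return [v / total for v in q]  # C = 1 / total, so the histogram sums to 1

# Toy data (hypothetical): four pixels spread over three color bins.
pixels = [((0.0, 0.0), 0), ((0.5, 0.0), 1), ((0.0, 0.5), 1), ((0.9, 0.0), 2)]
q = target_histogram(pixels, 3)
```

The pixel at the center contributes the largest single weight, and the pixel near the edge of the region contributes almost nothing, which is the "weighted by distance" behavior the slide describes.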
Modeling the candidate
M-dimensional candidate descriptor $p(y) = \{p_1(y), \ldots, p_M(y)\}$ (centered at location $y$), a weighted histogram at $y$:
$$p_m(y) = C_h \sum_n k\left(\left\|\frac{y - x_n}{h}\right\|^2\right) \delta[b(x_n) - m]$$
where $h$ is the bandwidth.
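To see what the center $y$ and the bandwidth $h$ do, here is a small sketch of the candidate histogram evaluated at two locations. As before, the Epanechnikov profile is an assumption, and `profile`, `candidate_histogram`, and the three-pixel toy image are hypothetical names and data chosen only for illustration.

```python
def profile(r2):
    """Assumed Epanechnikov profile k; zero outside the unit disk."""
    return 1.0 - r2 if r2 < 1.0 else 0.0

def candidate_histogram(pixels, y, h, num_bins):
    """pixels: list of ((x0, x1), bin_id) in image coordinates; y: window center."""
    hist = [0.0] * num_bins
    for (x0, x1), b in pixels:
        # Squared distance from the candidate center, scaled by the bandwidth h.
        r2 = ((y[0] - x0) ** 2 + (y[1] - x1) ** 2) / (h * h)
        hist[b] += profile(r2)
    total = sum(hist)
    if total == 0.0:
        return hist  # nothing inside the window
    return [v / total for v in hist]  # C_h = 1 / total

# Toy data (hypothetical): three pixels along a row, two color bins.
pixels = [((10.0, 10.0), 0), ((12.0, 10.0), 1), ((14.0, 10.0), 1)]
p_a = candidate_histogram(pixels, (10.0, 10.0), 4.0, 2)
p_b = candidate_histogram(pixels, (14.0, 10.0), 4.0, 2)
```

Moving the center from the first pixel to the last changes which pixels fall inside the bandwidth, so `p_a` and `p_b` differ even though the image is the same.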
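Similarity between the target and candidate
Distance function: $d(y) = \sqrt{1 - \rho[p(y), q]}$
Bhattacharyya coefficient: $\rho(y) \equiv \rho[p(y), q] = \sum_m \sqrt{p_m(y)\, q_m}$
This is just the cosine of the angle between two unit vectors, $\sqrt{p(y)}$ and $\sqrt{q}$ (element-wise square roots of the histograms):
$$\rho(y) = \cos\theta_y = \frac{\sqrt{p(y)}^\top \sqrt{q}}{\|\sqrt{p(y)}\|\, \|\sqrt{q}\|} = \sum_m \sqrt{p_m(y)\, q_m}$$
[Figure: vectors $\sqrt{p(y)}$ and $\sqrt{q}$ separated by angle $\theta_y$]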
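A short sketch of the Bhattacharyya coefficient and the derived distance, assuming both histograms are already normalized to sum to 1. The function names and the example histograms are hypothetical.

```python
import math

def bhattacharyya(p, q):
    """Bhattacharyya coefficient: sum over bins of sqrt(p_m * q_m)."""
    return sum(math.sqrt(pm * qm) for pm, qm in zip(p, q))

def distance(p, q):
    """d = sqrt(1 - rho); clamped at 0 to absorb floating-point round-off."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya(p, q)))

rho_same = bhattacharyya([0.5, 0.5, 0.0], [0.5, 0.5, 0.0])   # identical histograms
rho_disjoint = bhattacharyya([1.0, 0.0], [0.0, 1.0])         # no shared bins
rho_partial = bhattacharyya([0.5, 0.5], [1.0, 0.0])          # partial overlap
```

Identical histograms give the largest coefficient and zero distance, while histograms with no bins in common give a coefficient of zero and the largest distance.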
Now we can compute the similarity between a target and multiple candidate regions
[Figure: target $q$; candidate $p(y)$ at each location in the image; similarity $\rho[p(y), q]$ over the image]
We want to find the peak of this similarity surface.
Objective function
$$\max_y \rho[p(y), q] \quad \text{same as} \quad \min_y d(y)$$
Assuming a good initial guess: $\rho[p(y_0 + y), q]$
Linearize around the initial guess (Taylor series expansion):
$$\rho[p(y), q] \approx \frac{1}{2} \sum_m \sqrt{p_m(y_0)\, q_m} + \frac{1}{2} \sum_m p_m(y) \sqrt{\frac{q_m}{p_m(y_0)}}$$
The first term is the function at the specified value; the second term comes from the derivative.
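The linearization can be checked one bin at a time. The following worked step is a sketch that assumes $p_m(y_0) > 0$ for every bin; it expands $\sqrt{p_m(y)\, q_m}$ to first order in $p_m(y)$ around $p_m(y_0)$ and then collects terms.

```latex
\sqrt{p_m(y)\, q_m}
  \approx \sqrt{p_m(y_0)\, q_m}
    + \frac{1}{2} \sqrt{\frac{q_m}{p_m(y_0)}} \,\bigl( p_m(y) - p_m(y_0) \bigr)
  = \frac{1}{2} \sqrt{p_m(y_0)\, q_m}
    + \frac{1}{2}\, p_m(y) \sqrt{\frac{q_m}{p_m(y_0)}}
```

The second equality uses $\sqrt{q_m / p_m(y_0)} \cdot p_m(y_0) = \sqrt{p_m(y_0)\, q_m}$. Summing both sides over $m$ gives the linearized objective.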
Linearized objective
$$\rho[p(y), q] \approx \frac{1}{2} \sum_m \sqrt{p_m(y_0)\, q_m} + \frac{1}{2} \sum_m p_m(y) \sqrt{\frac{q_m}{p_m(y_0)}}$$
Remember the definition of $p_m(y)$?
$$p_m(y) = C_h \sum_n k\left(\left\|\frac{y - x_n}{h}\right\|^2\right) \delta[b(x_n) - m]$$
Fully expanded:
$$\rho[p(y), q] \approx \frac{1}{2} \sum_m \sqrt{p_m(y_0)\, q_m} + \frac{1}{2} \sum_m \sqrt{\frac{q_m}{p_m(y_0)}} \left\{ C_h \sum_n k\left(\left\|\frac{y - x_n}{h}\right\|^2\right) \delta[b(x_n) - m] \right\}$$
Fully expanded linearized objective
$$\rho[p(y), q] \approx \frac{1}{2} \sum_m \sqrt{p_m(y_0)\, q_m} + \frac{1}{2} \sum_m \sqrt{\frac{q_m}{p_m(y_0)}} \left\{ C_h \sum_n k\left(\left\|\frac{y - x_n}{h}\right\|^2\right) \delta[b(x_n) - m] \right\}$$
Moving terms around…
$$\rho[p(y), q] \approx \frac{1}{2} \sum_m \sqrt{p_m(y_0)\, q_m} + \frac{C_h}{2} \sum_n w_n\, k\left(\left\|\frac{y - x_n}{h}\right\|^2\right)$$
The first term does not depend on the unknown $y$. The second term is a weighted kernel density estimate, where
$$w_n = \sum_m \sqrt{\frac{q_m}{p_m(y_0)}}\, \delta[b(x_n) - m]$$
The weight is bigger when $q_m > p_m(y_0)$.
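Because the Kronecker delta keeps only the bin that pixel $x_n$ falls into, each weight reduces to a single square root. The sketch below illustrates this; `pixel_weights` and the example histograms are hypothetical, and returning a zero weight when $p_m(y_0) = 0$ is an assumed guard, not something the slides specify.

```python
import math

def pixel_weights(bin_ids, q, p0):
    """w_n = sqrt(q_m / p_m(y0)) with m = b(x_n), the bin of pixel n."""
    weights = []
    for b in bin_ids:
        # Assumed guard: an empty candidate bin contributes no weight.
        weights.append(math.sqrt(q[b] / p0[b]) if p0[b] > 0.0 else 0.0)
    return weights

# Toy data (hypothetical): target q, candidate p(y0), and the bins of four pixels.
q = [0.6, 0.3, 0.1]
p0 = [0.15, 0.3, 0.55]
w = pixel_weights([0, 1, 2, 0], q, p0)
```

Pixels in bin 0 are under-represented in the candidate relative to the target, so they get weights above 1; pixels in bin 2 are over-represented and get weights below 1.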
OK, why are we doing all this math?
We want to maximize this: $\max_y \rho[p(y), q]$
Fully expanded linearized objective:
$$\rho[p(y), q] \approx \frac{1}{2} \sum_m \sqrt{p_m(y_0)\, q_m} + \frac{C_h}{2} \sum_n w_n\, k\left(\left\|\frac{y - x_n}{h}\right\|^2\right)$$
where
$$w_n = \sum_m \sqrt{\frac{q_m}{p_m(y_0)}}\, \delta[b(x_n) - m]$$
The first term doesn't depend on the unknown $y$, so we only need to maximize the second term.
What can we use to solve this weighted KDE? The Mean Shift Algorithm!
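The weighted KDE to maximize:
$$\frac{C_h}{2} \sum_n w_n\, k\left(\left\|\frac{y - x_n}{h}\right\|^2\right)$$
The new sample mean of this KDE is (this was derived earlier)
$$y_1 = \frac{\sum_n x_n\, w_n\, g\left(\left\|\frac{y_0 - x_n}{h}\right\|^2\right)}{\sum_n w_n\, g\left(\left\|\frac{y_0 - x_n}{h}\right\|^2\right)}$$
which gives the new candidate location.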
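A single update of this form can be sketched as follows. The sketch assumes $g$ is the negative derivative of the profile $k$, as in the standard mean-shift derivation, and assumes an Epanechnikov profile, for which $g$ is constant inside the window and zero outside. The function name and toy values are hypothetical.

```python
def mean_shift_step(y0, positions, weights, h):
    """One update: weighted mean of the pixel positions that lie inside the window."""
    num0 = num1 = den = 0.0
    for (x0, x1), w in zip(positions, weights):
        r2 = ((y0[0] - x0) ** 2 + (y0[1] - x1) ** 2) / (h * h)
        g = 1.0 if r2 < 1.0 else 0.0  # assumed: g = -k' for the Epanechnikov profile
        num0 += x0 * w * g
        num1 += x1 * w * g
        den += w * g
    if den == 0.0:
        return y0  # no weighted pixel inside the window: stay put
    return (num0 / den, num1 / den)

# Toy data (hypothetical): three pixels inside the window, one far outside it.
positions = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
weights = [3.0, 1.0, 2.0, 9.0]
y1 = mean_shift_step((0.0, 0.0), positions, weights, 2.0)
```

The far pixel has the largest weight but lies outside the bandwidth, so it has no influence; the new location is pulled toward the heavily weighted pixels inside the window.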
Mean-Shift Object Tracking
For each frame:
1. Initialize location $y_0$. Compute $q$. Compute $p(y_0)$.
2. Derive weights $w_n$.
3. Shift to new candidate location (mean shift) $y_1$.
4. Compute $p(y_1)$.
5. If $\|y_0 - y_1\| < \epsilon$, return. Otherwise $y_0 \leftarrow y_1$ and go back to 2.
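The steps above can be put together in a compact sketch for a single frame. Everything here is illustrative rather than the lecture's code: the Epanechnikov profile is an assumed choice, the frame is a hypothetical dictionary mapping pixel coordinates to bin IDs, and the target model is built from a crop that contains only the object, so background pixels receive zero weight in this toy case.

```python
import math

def profile(r2):
    """Assumed Epanechnikov profile k; zero outside the unit disk."""
    return 1.0 - r2 if r2 < 1.0 else 0.0

def histogram(pixels, y, h, num_bins):
    """Kernel-weighted, normalized histogram of bin IDs around center y."""
    hist = [0.0] * num_bins
    for (r, c), b in pixels.items():
        hist[b] += profile(((r - y[0]) ** 2 + (c - y[1]) ** 2) / (h * h))
    total = sum(hist)
    return [v / total for v in hist] if total > 0.0 else hist

def track(pixels, q, y0, h, eps=0.01, max_iter=20):
    """Mean-shift iterations within one frame, starting from location y0."""
    for _ in range(max_iter):
        # p(y0); on later passes this is p(y1) from the previous pass (step 4).
        p0 = histogram(pixels, y0, h, len(q))
        num_r = num_c = den = 0.0
        for (r, c), b in pixels.items():
            r2 = ((r - y0[0]) ** 2 + (c - y0[1]) ** 2) / (h * h)
            if r2 >= 1.0 or p0[b] == 0.0:
                continue                      # g = 0 outside the window
            w = math.sqrt(q[b] / p0[b])       # step 2: weight w_n
            num_r += r * w                    # g = 1 inside the window
            num_c += c * w
            den += w
        if den == 0.0:
            return y0                         # no support: keep the current location
        y1 = (num_r / den, num_c / den)       # step 3: mean-shift update
        if math.hypot(y1[0] - y0[0], y1[1] - y0[1]) < eps:
            return y1                         # step 5: converged
        y0 = y1                               # otherwise repeat from step 2
    return y0

# Toy data (hypothetical): a 5x5 object of bin 1 on a 25x25 background of bin 0.
template = {(r, c): 1 for r in range(5) for c in range(5)}
q = histogram(template, (2.0, 2.0), 4.0, 2)   # target model from the object crop
frame = {(r, c): 0 for r in range(25) for c in range(25)}
for r in range(10, 15):
    for c in range(11, 16):
        frame[(r, c)] = 1
y = track(frame, q, (10.0, 10.0), 4.0)
```

Starting a few pixels away from the object, the returned location `y` ends up at the object's center, rows 10 to 14 and columns 11 to 15 in this toy frame. A real tracker would typically use richer color bins and vectorized array operations; the loop structure stays the same.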
Compute a descriptor for the target. [Figure: target $q$]
Search for a similar descriptor in a neighborhood in the next frame: $\max_y \rho[p(y), q]$. [Figure: target and candidate regions]