Bee-Identification Error Exponent with Absentee Bees
Anshoo Tandon
National University of Singapore
Joint work with: Vincent Y. F. Tan (NUS) and Lav R. Varshney (UIUC)
ISIT 2020
Anshoo Tandon (NUS) Bee-Identification with Absentee Bees ISIT 2020 1 / 18
Bee-Identification Problem
Goal: identify bees moving on a beehive
Each bee is labeled with a unique barcode
A camera takes pictures of the beehive
Applications:
  ◮ Understanding interactions among honeybees (T. Gernat et al., PNAS'18)
  ◮ Studying similarity with human social networks
  ◮ Modeling large-scale social networks for studying disease transmission
A finite-resolution image adds noise to the barcodes
  ◮ Noise may cause bee-identification errors
We pose this as an information-theoretic problem
  ◮ Represent each barcode as a binary vector of length $n$
  ◮ Let $m$ denote the total number of bees
Previous Work: All $m$ bees are present in the image
Ref: Tandon, Tan, and Varshney (TCOM'2019)
Model:
  ◮ Each barcode is represented as a binary codeword of length $n$
  ◮ Collect all $m$ codewords to form a codebook $C$
    ⋆ Codebook $C$ has size $m \times n$
    ⋆ Each barcode corresponds to a row vector (of length $n$) in $C$
  ◮ Given a beehive image, extract all the barcodes
    ⋆ Stack the barcodes in a single column
    ⋆ The effective channel is as follows:
      Effective channel: Codebook $C$ → Row-Permutation $\pi$ → $C_\pi$ → BSC($p$) → $\tilde{C}_\pi$
  ◮ The channel permutes the rows of $C$ and then adds noise
    ⋆ The $i$-th row of $C_\pi$ corresponds to the $\pi(i)$-th row of $C$
    ⋆ $\pi$ is uniformly distributed over the set of all $m$-letter permutations
Absentee Bees: $k$ out of $m$ bees are absent in the image
Effective channel: Codebook $C$ → Row-Permutation $\pi$ → $C_\pi$ → Delete $k$ rows → $C_{\pi_{(m-k)}}$ → BSC($p$) → $\tilde{C}_{\pi_{(m-k)}}$
The channel deletes $k$ rows of $C_\pi$, to model the scenario in which $k$ bees, out of a total of $m$ bees, are absent in the beehive image
Without loss of generality, we assume that the channel deletes the last $k$ rows of $C_\pi$ to produce $C_{\pi_{(m-k)}}$
  ◮ $\pi_{(m-k)}$: injective mapping from the $m-k$ rows of $C_{\pi_{(m-k)}}$ to the $m$ rows of $C$
Noise is modeled as BSC($p$), with $0 < p < 0.5$
Decoder: has knowledge of the codebook $C$
The decoder's task is to recover the channel-induced mapping $\pi_{(m-k)}$ using the channel output $\tilde{C}_{\pi_{(m-k)}}$
  ◮ $\pi_{(m-k)}$ directly ascertains the identity of the $m-k$ bees in the image
Channel Model . . .
Effective channel: Codebook $C$ → Row-Permutation $\pi$ → $C_\pi$ → Delete $k$ rows → $C_{\pi_{(m-k)}}$ → BSC($p$) → $\tilde{C}_{\pi_{(m-k)}}$
For $1 \le i \le m-k$, the $i$-th row of $\tilde{C}_{\pi_{(m-k)}}$, denoted $\tilde{\mathbf{c}}_i$, is a noisy version of $\mathbf{c}_{\pi(i)}$, and we have
\[ \Pr\{\tilde{\mathbf{c}}_i \mid \mathbf{c}_{\pi(i)}\} = p^{d_i} (1-p)^{n-d_i}, \quad 1 \le i \le m-k, \]
\[ \Pr\{\tilde{C}_{\pi_{(m-k)}} \mid C, \pi_{(m-k)}\} = \prod_{i=1}^{m-k} p^{d_i} (1-p)^{n-d_i}, \]
where $d_i \triangleq d_H(\tilde{\mathbf{c}}_i, \mathbf{c}_{\pi(i)})$ denotes the Hamming distance between the vectors $\tilde{\mathbf{c}}_i$ and $\mathbf{c}_{\pi(i)}$
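The channel model above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' code: the function names `simulate_channel`, `hamming`, and `log_likelihood` are hypothetical, rows are 0-indexed, and the codebook in the demo is an arbitrary toy example. It permutes the rows, deletes the last k rows, passes the rest through a BSC(p), and evaluates the logarithm of the product formula for a candidate injective map.

```python
import math
import random


def hamming(a, b):
    """Hamming distance d_H between two equal-length binary vectors."""
    return sum(x != y for x, y in zip(a, b))


def simulate_channel(codebook, k, p, rng):
    """Permute the rows of the codebook, delete the last k rows, apply BSC(p).

    Returns (kept, noisy), where kept[i] is the (0-indexed) row of the
    codebook that produced the i-th received row noisy[i].
    """
    m = len(codebook)
    perm = list(range(m))
    rng.shuffle(perm)        # uniformly random row permutation
    kept = perm[:m - k]      # deleting the last k rows leaves an injective map
    noisy = [[bit ^ int(rng.random() < p) for bit in codebook[j]] for j in kept]
    return kept, noisy


def log_likelihood(codebook, mapping, noisy, p):
    """Natural log of prod_i p^{d_i} (1-p)^{n-d_i} for a candidate mapping."""
    total = 0.0
    for i, row in enumerate(noisy):
        n = len(row)
        d = hamming(row, codebook[mapping[i]])
        total += d * math.log(p) + (n - d) * math.log(1 - p)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    toy_codebook = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 1, 0, 1]]  # m = 3, n = 4
    kept, noisy = simulate_channel(toy_codebook, k=1, p=0.1, rng=rng)
    print(kept, noisy, log_likelihood(toy_codebook, kept, noisy, 0.1))
```

Working in the log domain avoids numerical underflow when n or m - k is large.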
Bee-Identification: Problem Formulation
Effective channel: Codebook $C$ → Row-Permutation $\pi$ → $C_\pi$ → Delete $k$ rows → $C_{\pi_{(m-k)}}$ → BSC($p$) → $\tilde{C}_{\pi_{(m-k)}}$
The decoder corresponds to a function $\phi$ that takes $\tilde{C}_{\pi_{(m-k)}}$ as input and produces a map $\nu : \{1, \ldots, m-k\} \to \{1, \ldots, m\}$ as output
  ◮ $\nu(i)$ corresponds to the index of the transmitted codeword which produced the received word $\tilde{\mathbf{c}}_i$, where $1 \le i \le m-k$
The indicator variable for the bee-identification error is defined as
\[ D(\nu, \pi_{(m-k)}) \triangleq \begin{cases} 1, & \text{if } \nu \ne \pi_{(m-k)}, \\ 0, & \text{if } \nu = \pi_{(m-k)}. \end{cases} \]
The expected bee-identification error probability over the BSC($p$) is
\[ D(C, p, k, \phi) \triangleq \mathbb{E}_{\pi_{(m-k)}} \Big[ \mathbb{E} \big[ D(\nu, \pi_{(m-k)}) \big] \Big], \]
  ◮ the inner expectation is over the distribution of $\tilde{C}_{\pi_{(m-k)}}$ given $C$ and $\pi_{(m-k)}$
  ◮ the outer expectation is over the uniform distribution of $\pi_{(m-k)}$ over the set of all injective maps from $\{1, \ldots, m-k\}$ to $\{1, \ldots, m\}$
Bee-Identification Error Exponent
Let $\mathcal{C}(n, m)$ denote the set of all binary codebooks of size $m \times n$
Given $n$, $m$, and $k$, define the minimum expected bee-identification error probability as
\[ D(n, m, p, k) \triangleq \min_{C, \phi} D(C, p, k, \phi), \]
where the minimum is over all codebooks $C \in \mathcal{C}(n, m)$ and all decoding functions $\phi$
For a given $R > 0$ and $\alpha \in (0, 1)$, we consider the setting where $m = 2^{nR}$ and $k = \alpha m = \alpha 2^{nR}$, and study the error exponent
\[ E_D(R, p, \alpha) \triangleq \limsup_{n \to \infty} -\frac{1}{n} \log D(n, 2^{nR}, p, \alpha 2^{nR}) \]
Our Contributions
We provide an exact characterization of the error exponent
  ◮ We show that independent barcode decoding is optimal, i.e., joint decoding does not result in a better error exponent relative to the independent decoding of each noisy barcode
  ◮ This is in contrast to the result without absentee bees, where joint barcode decoding results in a significantly higher error exponent than independent barcode decoding
We characterize the bee-identification "capacity" (i.e., the supremum of rates for which the error probability can be made arbitrarily small)
  ◮ We prove the strong converse, showing that for rates greater than the capacity, the error probability tends to 1 as $n \to \infty$
We show that for low rates, there is a discontinuity in the error exponent function at $\alpha = 0$
Independent Decoding: Upper Bound on $D(n, m, p, k)$
Here, the decoder picks $\tilde{\mathbf{c}}_i$ and produces the index $\nu(i) = \arg\min_{1 \le j \le m} d_H(\tilde{\mathbf{c}}_i, \mathbf{c}_j)$, for $1 \le i \le m-k$
Using the union bound, we get
\[ D(C, p, k, \phi_I) \le \mathbb{E}_{\pi_{(m-k)}} \left[ \sum_{i=1}^{m-k} \Pr\{\nu(i) \ne \pi_{(m-k)}(i)\} \right] \]
Now, $D(n, m, p, k)$ can be upper bounded as
\[ D(n, m, p, k) \le \min\{1, (m-k) P_e(n, m, p)\}, \]
where $P_e(n, m, p)$ denotes the minimum achievable average error probability over BSC($p$), for transmission of a message, using a codebook with $m$ codewords, each having length $n$
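The independent decoding rule can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the names `decode_independently` and `estimate_error_probability` are hypothetical, rows are 0-indexed, ties in the minimum distance are broken in favor of the smallest index, and the Monte Carlo estimate is for one fixed codebook supplied by the caller (the slides minimize over all codebooks, which this sketch does not attempt).

```python
import random


def hamming(a, b):
    """Hamming distance d_H between two equal-length binary vectors."""
    return sum(x != y for x, y in zip(a, b))


def decode_independently(codebook, noisy):
    """Return nu, where nu[i] = argmin_j d_H(noisy[i], codebook[j]).

    Each received row is decoded on its own; ties go to the smallest j.
    """
    m = len(codebook)
    return [min(range(m), key=lambda j: hamming(row, codebook[j])) for row in noisy]


def estimate_error_probability(codebook, k, p, trials, seed=0):
    """Monte Carlo estimate of D(C, p, k, phi_I) for a fixed codebook C."""
    rng = random.Random(seed)
    m = len(codebook)
    errors = 0
    for _ in range(trials):
        perm = list(range(m))
        rng.shuffle(perm)
        kept = perm[:m - k]  # true injective map pi_(m-k), 0-indexed
        noisy = [[b ^ int(rng.random() < p) for b in codebook[j]] for j in kept]
        # A bee-identification error occurs if any single row is misdecoded.
        if decode_independently(codebook, noisy) != kept:
            errors += 1
    return errors / trials


if __name__ == "__main__":
    toy_codebook = [[0] * 8, [1] * 8, [0, 0, 0, 0, 1, 1, 1, 1]]
    print(estimate_error_probability(toy_codebook, k=1, p=0.05, trials=2000))
```

Because an error is declared whenever any of the m - k rows is misdecoded, the estimate is consistent with the union-bound form of the upper bound.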
Joint Decoding: Lower Bound on $D(n, m, p, k)$
$D(n, m, p, k)$ can be lower bounded, using joint ML decoding of bee barcodes, as follows
\[ D(n, m, p, k) > \frac{1 - 2\varepsilon}{2} \min\big\{1, (m-k)\, \varepsilon\, P_e(n, \lfloor k\varepsilon \rfloor, p)\big\}, \]
where $0 < \varepsilon < 1/2$ and $k > 1/\varepsilon$
$D(n, m, p, k)$ can alternatively be lower bounded as follows
\[ D(n, m, p, k) > (1 - 2\varepsilon) \Big(1 - \exp\big(-(m-k)\, \varepsilon\, P_e(n, \lfloor k\varepsilon \rfloor, p)\big)\Big) \]
  ◮ These lower bounds are obtained by considering only the error events in which a single bee barcode is incorrectly decoded
Bee-Identification Error Exponent
For $R > 0$, the reliability function is defined as
\[ E(R, p) \triangleq \limsup_{n \to \infty} -\frac{1}{n} \log P_e(n, 2^{nR}, p) \]
The bee-identification error exponent is defined as
\[ E_D(R, p, \alpha) \triangleq \limsup_{n \to \infty} -\frac{1}{n} \log D(n, 2^{nR}, p, \alpha 2^{nR}) \]
Bounds on $D(n, m, p, k)$ can be applied to obtain the following:
Theorem 1
For $0 < \alpha < 1$, we have
\[ E_D(R, p, \alpha) = |E(R, p) - R|^+, \]
where $|x|^+ \triangleq \max(0, x)$. Further, this exponent is achieved via independent decoding of each barcode.
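The achievability half of Theorem 1 follows from the independent-decoding upper bound by a short calculation. The sketch below assumes logarithms to base 2 (so that $\log 2^{nR} = nR$) and ignores integer rounding of $2^{nR}$ and $\alpha 2^{nR}$; it covers only the direction $E_D \ge |E - R|^+$, while the matching converse relies on the joint-decoding lower bounds and is not reproduced here.

```latex
\begin{align*}
D(n, 2^{nR}, p, \alpha 2^{nR})
  &\le (1-\alpha)\, 2^{nR}\, P_e(n, 2^{nR}, p)
  && \text{(upper bound with } m - k = (1-\alpha) 2^{nR}\text{)} \\
-\frac{1}{n} \log D(n, 2^{nR}, p, \alpha 2^{nR})
  &\ge -\frac{1}{n} \log(1-\alpha) - R - \frac{1}{n} \log P_e(n, 2^{nR}, p) \\
E_D(R, p, \alpha)
  &\ge E(R, p) - R
  && \text{(take } \limsup_{n \to \infty}\text{; the first term vanishes)}
\end{align*}
```

Since $D(n, m, p, k) \le 1$ also gives $E_D(R, p, \alpha) \ge 0$, the two bounds combine to $E_D(R, p, \alpha) \ge |E(R, p) - R|^+$.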
'Capacity' of the bee-identification problem
Given $0 \le \epsilon < 1$, define the bee-identification $\epsilon$-capacity as
\[ C_D(p, \alpha, \epsilon) \triangleq \sup\Big\{ R : \liminf_{n \to \infty} D(n, 2^{nR}, p, \alpha 2^{nR}) \le \epsilon \Big\} \]
The capacity is exactly characterized by the following theorem:
Theorem 2
For $0 < \alpha < 1$ and every $0 \le \epsilon < 1$, we have
\[ C_D(p, \alpha, \epsilon) = R^*_p, \]
where $R^*_p$ is the unique positive solution of the equation $E(R, p) = R$.
The strong converse property holds, i.e., if $R > R^*_p$, then the error probability $D(n, 2^{nR}, p, \alpha 2^{nR})$ tends to 1 as $n \to \infty$
Error Exponent when $\alpha \downarrow 0$
Let $E_D(R, p)$ be the bee-identification error exponent when all bees are present (with $k = 0$)
Theorem 3
For $0 < R < \min\{0.169, R_{\mathrm{ex}}/2\}$, we have the following strict inequality
\[ \lim_{\alpha \downarrow 0} E_D(R, p, \alpha) < E_D(R, p). \]
  ◮ The above theorem highlights a discontinuity in the bee-identification error exponent function at $\alpha = 0$
  ◮ Contrasting behaviors:
    ⋆ Independent decoding of bee barcodes is optimal for the absentee bee scenario, even when an arbitrarily small fraction of bees is absent
    ⋆ When all bees are present, a strictly better error exponent is achieved via joint ML decoding of barcodes
Numerical Results: Error Exponent
[Figure: Bounds on $E_D(R, p, \alpha)$ for $p = 0.05$.]