Uncertainty of Reconstructing Multiple Messages from Uniform-Tandem-Duplication Noise
Yonatan Yehezkeally, Moshe Schwartz
Ben-Gurion University of the Negev
Sections: Introduction and Motivation; List decoding
Yehezkeally and Schwartz, ISIT’2020, Reconstructing Multiple Messages

Uniform tandem-duplication noise

Tandem-duplication: A substring (the template) is duplicated, and the copy is inserted next to the template.
E.g., x = 1012121 → x′ = 1012012121.
Dfn.: The noise is uniform if the length of the duplication window is fixed.

Applications
In-vivo DNA storage: around 3% of the human genome consists of tandem repeats. Mundy and Helbig, Journal of Molecular Evolution, 2004.
Synchronization noise in magnetic media (sticky-insertions). Mahdavifar and Vardy, ISIT’17, 2017.
In these cases, uniform noise is easier to analyze.
Closely related to the permutation / multiset channel (applications to packet networks and in-vitro DNA storage). Kovačević and Tan, T-IT, 2018.
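
The duplication operation defined above can be sketched in a few lines of Python. This is only an illustration: the function name `tandem_duplicate` and its 0-based index convention are assumptions made here, not notation from the slides.

```python
def tandem_duplicate(x: str, i: int, k: int) -> str:
    """Duplicate the length-k window x[i:i+k] and insert the copy right after it.

    Hypothetical helper: i is a 0-based start index, k is the (fixed) window length.
    """
    if k <= 0 or i < 0 or i + k > len(x):
        raise ValueError("the duplication window must lie inside x")
    # prefix up to the end of the template, then the copy, then the rest
    return x[:i + k] + x[i:i + k] + x[i + k:]


# The slide's example: the template 012 (positions 1..3) is duplicated.
x = "1012121"
x_prime = tandem_duplicate(x, i=1, k=3)
print(x, "->", x_prime)
```

Each duplication lengthens the string by exactly k symbols, which is why uniform noise keeps the number of errors recoverable from the output length.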

Coding redundancy (state-of-the-art)

Unlimited number of errors, $t = \infty$
Rate loss, equivalent to that of an appropriate RLL system: $(0, k-1)_q$-RLL, for alphabet size $q$ and duplication window length $k$. Jain et al., T-IT, 2017.

Finite number of errors, $t < \infty$
ECC optimal redundancy (lower and upper bounds): $t \log_q(n) + O(1)$. Lenz et al., arXiv, 2018. Kovačević and Tan, IEEE Comm. Letters, 2018.
Efficient en/decoding: $t \log_q(n) + o(\log(n))$ (asymptotically optimal). Mahdavifar and Vardy, ISIT’17, 2017.
Multiple distinct reads of noisy data: $(t-1) \log_q(n) + O(1)$ (reconstruction with sublinear uncertainty). Yehezkeally and Schwartz, T-IT, 2020.

Question: At what cost may redundancy be further reduced?

Setting

Setting: Space of finite strings $\Sigma^*$ over an alphabet $\Sigma$ of size $q > 1$.
Noise: Strings are affected by uniform tandem-duplication noise (duplication window length $k$). E.g., for $x \in \Sigma^n$, a single duplication gives $y \in \Sigma^{n+k}$.
Error spheres: $D^t(x)$, with error cones $D^*(x) \triangleq \bigcup_{t=0}^{\infty} D^t(x)$.

Thm.: Error cones (for $y, z \in \Sigma^*$). Jain et al., T-IT, 2017.
$D^*(y) \cap D^*(z) \neq \emptyset \iff \exists x \in \Sigma^* : y, z \in D^*(x)$
Thus, the space is partitioned into disjoint descendant cones of irreducible strings.

Metric: Dfn.: For $x \in \Sigma^*$ and $y, z \in D^r(x)$,
$d(y, z) \triangleq \min \{ t \in \mathbb{N} : D^t(y) \cap D^t(z) \neq \emptyset \}$
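
To make the error spheres and the metric concrete, here is a brute-force sketch for very short strings. The helper names `descendants` and `distance` and the search cap `t_max` are assumptions introduced for illustration; the enumeration grows quickly and is not meant as an efficient method.

```python
def descendants(x: str, k: int, t: int) -> set:
    """Brute-force D^t(x): strings obtained from x by exactly t duplications of window length k."""
    level = {x}
    for _ in range(t):
        level = {s[:i + k] + s[i:i + k] + s[i + k:]
                 for s in level
                 for i in range(len(s) - k + 1)}
    return level


def distance(y: str, z: str, k: int, t_max: int = 6):
    """Smallest t in 0..t_max with D^t(y) and D^t(z) intersecting, or None if none is found.

    t_max is an arbitrary cap chosen for this sketch; None does not prove
    that no common descendant exists beyond the cap.
    """
    if len(y) != len(z):
        return None  # descendants of different-length strings never have equal length
    for t in range(t_max + 1):
        if descendants(y, k, t) & descendants(z, k, t):
            return t
    return None


# Toy case with k = 1: 001 and 011 both descend from 01 and share the descendant 0011.
print(distance("001", "011", k=1))
```

Strings in different descendant cones (for instance 01 and 10 with k = 1) never meet, so the search returns None for them, in line with the partition into disjoint cones.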

Associative memory

Principles: An item is retrieved by association with a set of other items.
Dfn.: The uncertainty $N(m)$ is the cardinality of the largest set whose members are associated with an $m$-subset of the memory code-book. Yaakobi and Bruck, T-IT, 2019.

Generalizes the reconstruction schema: multiple distinct noisy versions of the data are available to the decoder (e.g., cell replication in an in-vivo DNA channel).
If $\leq t$ errors occur in transmission, then $N + 1$ noisy outputs suffice to decode the transmitted code-word, where $N$ is the size of the largest intersection of two $t$-balls. Levenshtein, T-IT, 2001.

Reduction to $m = 2$ enables reconstruction of a unique ($m - 1 = 1$) input.
$m > 2 \implies N(m) + 1$ outputs yield a set of $l < m$ code-words, i.e., this enables list decoding.
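
The reconstruction idea can be sketched as a brute-force consistency check: keep every code-word whose error sphere contains all the distinct reads. The toy code-book, the helper names `descendants` and `candidates`, and the choice k = 1, t = 2 are assumptions for illustration only.

```python
def descendants(x: str, k: int, t: int) -> set:
    """Brute-force D^t(x): strings obtained from x by exactly t duplications of window length k."""
    level = {x}
    for _ in range(t):
        level = {s[:i + k] + s[i:i + k] + s[i + k:]
                 for s in level
                 for i in range(len(s) - k + 1)}
    return level


def candidates(code, reads, k: int, t: int) -> list:
    """Code-words x whose sphere D^t(x) contains every one of the distinct reads."""
    reads = set(reads)
    return [x for x in code if reads <= descendants(x, k, t)]


code = ["001", "011", "010"]  # hypothetical toy code-book
# Two reads are consistent with two code-words (a list of size 2) ...
print(candidates(code, ["00011", "00111"], k=1, t=2))
# ... while one more distinct read singles out the transmitted code-word.
print(candidates(code, ["00001", "00011", "00111"], k=1, t=2))
```

In this toy case the largest intersection of two spheres has size 2, so three distinct reads are enough for unique decoding, while fewer reads may leave a short list.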

Results

Aim: Find the trade-off, as the message length $n$ grows, between
$N$: Uncertainty (or required number of reads, minus one)
$t$: Number of uniform tandem-duplication errors
$m$: Maximal list size (plus one)
$d$: Designed minimum distance ($(d-1) \log_q(n) + O(1)$ redundancy)

Results (1)(2):
$\log_n N + \lceil \log_n m \rceil + d = t + \epsilon + o(1)$
(1) Coding is done in a typical subspace of $\Sigma^n$, asymptotically achieving full space size.
(2) $\epsilon \in \{0, 1\}$ is (generally) an implicit non-increasing function of $m$.
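
As a small worked reading of the trade-off (plain algebra on the stated relation, not an additional result from the slides): whenever $2 \le m \le n$, the ceiling term equals 1, and the relation can be solved for the uncertainty exponent.

```latex
\[
2 \le m \le n \;\Longrightarrow\; 0 < \log_n m \le 1 \;\Longrightarrow\; \lceil \log_n m \rceil = 1,
\]
\[
\log_n N + 1 + d = t + \epsilon + o(1)
\;\Longrightarrow\;
\log_n N = t - d - 1 + \epsilon + o(1).
\]
```

Read this way, in that range of $m$ each unit added to the designed distance $d$ lowers the exponent of the uncertainty by one, up to the value of $\epsilon$ and the $o(1)$ term.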

Associative memory: Definitions

Dfn.: Given $m, n, t \in \mathbb{N}$ and $x_1, \ldots, x_m \in \Sigma^n$:
$S_t(x_1, \ldots, x_m) \triangleq \bigcap_{i=1}^{m} D^t(x_i)$
The uncertainty of $C \subseteq \Sigma^n$:
$N_t(m, C) \triangleq \max_{x_1, \ldots, x_m \in C,\ x_i \neq x_j} |S_t(x_1, \ldots, x_m)|$
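
For tiny parameters, the uncertainty can be evaluated directly from its definition by exhaustive search over all m-subsets of distinct code-words. The sketch below does this; the names `descendants` and `uncertainty` and the toy code-book are assumptions for illustration, and the search is exponential, so it only serves to check the definition on small cases.

```python
from itertools import combinations


def descendants(x: str, k: int, t: int) -> set:
    """Brute-force D^t(x): strings obtained from x by exactly t duplications of window length k."""
    level = {x}
    for _ in range(t):
        level = {s[:i + k] + s[i:i + k] + s[i + k:]
                 for s in level
                 for i in range(len(s) - k + 1)}
    return level


def uncertainty(code, m: int, k: int, t: int) -> int:
    """N_t(m, C): largest |D^t(x_1) ∩ ... ∩ D^t(x_m)| over m distinct code-words of C."""
    spheres = {x: descendants(x, k, t) for x in set(code)}
    return max(
        (len(set.intersection(*(spheres[x] for x in subset)))
         for subset in combinations(sorted(spheres), m)),
        default=0,  # fewer than m code-words: no m-subset exists
    )


code = ["001", "011", "010"]  # hypothetical toy code-book, window length k = 1
print(uncertainty(code, m=2, k=1, t=1))
print(uncertainty(code, m=2, k=1, t=2))
```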