
Efficient algorithms for two extensions of LPF table: the power of suffix arrays


  1. Efficient algorithms for two extensions of LPF table: the power of suffix arrays. M. Crochemore, C. S. Iliopoulos, M. Kubica, W. Rytter, T. Waleń. SOFSEM 2010.

  2. Introduction: Preliminaries. Input: a string y[0..n−1]. Auxiliary structures: the suffix array (SUF), the longest common prefix array (LCP), and range minimum/maximum queries (RMQ) over SUF and LCP. All of these can be prepared in O(n) time.
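As an illustration of the two arrays named above, here is a minimal sketch. It builds SUF by plain sorting, which is not the O(n) construction the slide refers to, and fills LCP with the standard Kasai-style scan. The function names `suffix_array` and `lcp_array` are chosen here for illustration and do not come from the slides.

```python
def suffix_array(s):
    # Naive construction: sort suffix start positions by the suffix text.
    # This is far from linear time; it only illustrates what SUF contains.
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, suf):
    # lcp[r] = length of the longest common prefix of the suffixes
    # ranked r-1 and r in the suffix array (lcp[0] is 0 by convention).
    n = len(s)
    rank = [0] * n
    for r, p in enumerate(suf):
        rank[p] = r
    lcp = [0] * n
    h = 0
    for p in range(n):
        if rank[p] > 0:
            q = suf[rank[p] - 1]  # suffix just before p in sorted order
            while p + h < n and q + h < n and s[p + h] == s[q + h]:
                h += 1
            lcp[rank[p]] = h
            if h > 0:
                h -= 1  # the next suffix shares at least h-1 characters
        else:
            h = 0
    return lcp


suf = suffix_array("banana")
print(suf)
print(lcp_array("banana", suf))
```
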

  4. Introduction. We consider two variants of a classical problem, the Longest Previous Factor problem (LPF): LPF[i] = the largest k such that y[i..i+k−1] also occurs starting at some position before i (the occurrences may overlap). The problem is well studied, and LPF can be computed in O(n) time.

  6. Introduction. The Longest Previous Reversed Factor problem (LPrF): LPrF[i] = the largest k such that rev(y[i..i+k−1]) occurs before position i (without overlapping). It generalises a factorization of strings used to extract certain types of palindromes [Kolpakov, Kucherov, 2008], and has applications in compression of genetic sequences (in combination with LPF) [Grumbach, Tahi, 1993].

  8. Introduction. The Longest Previous Non-Overlapping Factor problem (LPnF): LPnF[i] = the largest k such that y[i..i+k−1] occurs before position i (without overlapping). It emerged from a version of Ziv-Lempel factorization, a decomposition of a string into already processed factors, and is applied in algorithms computing repetitions in strings [Crochemore, 1986], [Main, 1989], [Kolpakov, Kucherov, 1999].

  9. Introduction: Example.

     | position i | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
     |------------|---|---|---|---|---|---|---|---|---|
     | y[i]       | a | b | b | a | b | b | a | b | a |
     | LPF[i]     | 0 | 0 | 1 | 5 | 4 | 3 | 2 | 2 | 1 |
     | LPrF[i]    | 0 | 0 | 2 | 1 | 3 | 3 | 2 | 2 | 1 |
     | LPnF[i]    | 0 | 0 | 1 | 3 | 3 | 3 | 2 | 2 | 1 |

     For i = 4: LPF[4] = 4 (bbab also starts at position 1, overlapping), LPnF[4] = 3 (bba occupies positions 1..3), LPrF[4] = 3 (rev(bba) = abb occupies positions 0..2).
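The three definitions can be checked against the example string with a brute-force sketch. It takes k to be the length of the factor and is meant only to make the definitions concrete; it is nowhere near the O(n) algorithms of the talk. The name `brute_tables` is hypothetical.

```python
def brute_tables(y):
    # Returns (LPF, LPrF, LPnF) computed straight from the definitions.
    n = len(y)
    lpf, lprf, lpnf = [0] * n, [0] * n, [0] * n
    for i in range(n):
        for k in range(1, n - i + 1):
            f = y[i:i + k]
            # LPF: an earlier occurrence starting at j < i, overlap allowed.
            if any(y[j:j + k] == f for j in range(i)):
                lpf[i] = k
            # LPnF: an occurrence lying entirely inside y[0..i-1].
            if f in y[:i]:
                lpnf[i] = k
            # LPrF: the reversed factor lying entirely inside y[0..i-1].
            if f[::-1] in y[:i]:
                lprf[i] = k
    return lpf, lprf, lpnf


lpf, lprf, lpnf = brute_tables("abbabbaba")
print(lpf)   # [0, 0, 1, 5, 4, 3, 2, 2, 1]
print(lprf)  # [0, 0, 2, 1, 3, 3, 2, 2, 1]
print(lpnf)  # [0, 0, 1, 3, 3, 3, 2, 2, 1]
```

Each property is closed under shortening the factor, so the last k that succeeds is the maximum.
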

  10. The Alternating Search Technique: Assumptions. We assume that the following operations are given and that each takes O(1) time:
      - Val(k): a function that is non-increasing for i ≤ k ≤ j,
      - Candidate(k): a predicate,
      - FirstMin(i, j): the first position k ∈ [i..j] with the minimum value of Val(k),
      - NextCand(i, j): any candidate k ∈ [i..j).

  12. The Alternating Search Technique: Goal. For a given range [i..j], find a candidate k maximizing Val(k). [Diagram: Alternating-Search(i, j) on the range from i to k_0 = j, with alternating FirstMin and NextCand steps leading through k_1 to k_opt.] Running time: O(Val(k_opt) − Val(j) + 1).
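One way to read the diagram is sketched below. This is a reconstruction from the slide text, not the authors' pseudocode. `first_min` and `next_cand` are naive linear-scan stand-ins for the O(1) operations the slides assume (in the talk they are backed by RMQ), and the callables `val` and `is_candidate` are hypothetical parameters.

```python
def alternating_search(i, j, val, is_candidate):
    # Assumes val is non-increasing on [i..j] and integer-valued.
    # Returns a candidate in [i..j] with maximum val, or None if none exists.

    def first_min(lo, hi):
        # First position in [lo..hi] attaining the minimum of val.
        m = min(val(k) for k in range(lo, hi + 1))
        return next(k for k in range(lo, hi + 1) if val(k) == m)

    def next_cand(lo, hi):
        # Any candidate in [lo..hi); here simply the rightmost one.
        for k in range(hi - 1, lo - 1, -1):
            if is_candidate(k):
                return k
        return None

    k = j if is_candidate(j) else next_cand(i, j)
    best = None
    while k is not None:
        best = k
        # Everything left of first_min(i, k) has a strictly larger val,
        # so any candidate found there improves on the current one.
        j = first_min(i, k)
        k = next_cand(i, j)
    return best


vals = [5, 5, 4, 4, 3, 3, 1]
cands = {1, 3, 6}
print(alternating_search(0, 6, vals.__getitem__, cands.__contains__))  # 1
```

Every round raises the value of the current candidate by at least one, which is where a bound of the form O(Val(k_opt) − Val(j) + 1) comes from once both helper operations cost O(1).
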

  14. Computation of the LPrF table. Calculate SUF and LCP for x = y # rev(y). Then LPrF[i] = max { RMQ(LCP[i..j]) : j > 2n − i }. With 0-based positions, a suffix of x starting at a position j > 2n − i lies inside rev(y) and spells the reverse of a factor of y that ends before position i. [Figure: example showing y and x = y # rev(y), with position i, position 2n − i, and the parts LPrF<[i] and LPrF>[i] marked.]
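The reduction to x = y # rev(y) can be checked with a short sketch. It compares suffixes of x character by character in place of the SUF/LCP/RMQ machinery, so it shows only why positions j > 2n − i are the right ones, not how to reach O(n). It assumes 0-based positions and that `#` does not occur in y; the name `lprf_via_concat` is hypothetical.

```python
def lprf_via_concat(y):
    n = len(y)
    x = y + "#" + y[::-1]  # length 2n + 1, rev(y) occupies x[n+1..2n]

    def lcp(a, b):
        # Longest common prefix of the suffixes x[a..] and x[b..].
        k = 0
        while a + k < len(x) and b + k < len(x) and x[a + k] == x[b + k]:
            k += 1
        return k

    # A suffix starting at j > 2n - i is the reverse of a prefix of y
    # that ends at position 2n - j < i, so matches never overlap y[i..].
    return [
        max((lcp(i, j) for j in range(2 * n - i + 1, 2 * n + 1)), default=0)
        for i in range(n)
    ]


print(lprf_via_concat("abbabbaba"))  # [0, 0, 2, 1, 3, 3, 2, 2, 1]
```
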

  16. Computation of the LPrF table. Key property: LPrF[i+1] ≥ LPrF[i] − 1. [Figure: example string with position i and the factor of length LPrF[i] marked.] The computation is an instance of the alternating search (using SUF and LCP for x, and RMQ). O(n) running time.

  18. Computation of the LPnF table. Key property: LPnF[i+1] ≥ LPnF[i] − 1. [Figure: example string with position i and the factor of length LPnF[i] marked.] Boundary case (squares): handled using runs [Kolpakov, Kucherov, 1999]. General case: the alternating search (using SUF and LCP for y, and RMQ). O(n) running time.

  19. Summary: our results. The LPrF and LPnF tables can be computed in O(n) time. The optimal parsing of a text using factors and/or reversed factors can be computed in O(n) time.

  20. Thank you for your attention!
