Local Maximal Stack Scores with General Loop Penalty Function
EVA 2005, Gothenburg
Niels Richard Hansen

This talk is based on two papers:
Asymptotics for Local Maximal Stack Scores with General Loop Penalty Function. To be submitted shortly.
The Maximum of a Random Walk Reflected at a General Barrier. To appear in Ann. Appl. Probab.
RNA-structures

RNA molecules are sequences of nucleotides – some forming functionally important structures.
An RNA-molecule is represented as a sequence, $X_1 \ldots X_n$, of letters from the alphabet $\{A, C, G, U\}$.
Its (secondary) structure is a graph with vertex set $\{1, \ldots, n\}$.
The graph is a partial matching: a vertex can enter in at most one edge, and there are no loops.
Typically edges between near neighbours (sharp turns) are not allowed.
Typically pseudo-knots are not allowed: pairs of edges of the form $\{i_1, j_1\}$ and $\{i_2, j_2\}$ with $i_1 < i_2 < j_1 < j_2$ are not allowed.
An edge represents a hydrogen bond between nucleotides.
RNA-structures

    GUG UA AG C GC AU
    ACCG CCG CUGCAUACUUC UUACAU CCAUA CUAU  C
    |||| ||| ||||||||||| |||||| ||||| ||||  A
    UGGU GGC GAUGUAUGAAG AAUGUA GGUAU GGUA  U
    UGA GG AA A A A AA

An example RNA-molecule from the nematode C. elegans.
Xiong and Waterman (1997) show strong limit results for the maximum of (minus) the free energy score of RNA-structures.
The free energy score is an additive score of the hydrogen-bonded nucleotides (edges) plus linear penalties on the lengths of the loops (unpaired vertices). The score depends on a parameter vector $\alpha$.
Strong Limits

Let $X_1, \ldots, X_n$ be an iid RNA-sequence. Let $T_{i,j}$ denote the maximal structure score for $X_i, \ldots, X_j$ for $i < j$, and
\[ M_n = \max\Big\{ \max_{1 \le i < j \le n} T_{i,j},\ 0 \Big\}. \]
Relying on subadditive techniques, Xiong and Waterman show that
\[ \lim_{n \to \infty} \frac{1}{n} T_{1,n} = a(\alpha) \quad \text{a.s.} \]
If $a(\alpha) > 0$,
\[ \lim_{n \to \infty} \frac{1}{n} M_n = a(\alpha) \quad \text{a.s.} \]
and if $a(\alpha) < 0$,
\[ \lim_{n \to \infty} \frac{1}{\log n} M_n = b(\alpha) \quad \text{a.s.} \]
A Conjecture

In the logarithmic phase, $a(\alpha) < 0$, Xiong and Waterman conjecture that
\[ P(M_n > t) \simeq 1 - \exp\big(-K(\alpha)\, n \exp(-t/b(\alpha))\big) \tag{1} \]
for suitably large $n$ and $t$.
For a (quite restrictive) class of stack/hairpin-loop structures we show such a result. Our result contains situations corresponding to $a(\alpha) = 0$ but where (1) holds.
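As a small worked step, (1) can be rewritten as a Gumbel-type approximation. The shorthand $K = K(\alpha)$ and $b = b(\alpha)$, and the assumption $K > 0$, $b > 0$, are introduced here for illustration only.

```latex
\[
P(M_n \le t) \simeq \exp\!\big(-K n\, e^{-t/b}\big)
  = \exp\!\Big(-\exp\!\Big(-\frac{t - b\log(K n)}{b}\Big)\Big),
\]
% i.e. a Gumbel distribution function with location b*log(K n) and scale b.
```

Read this way, the location grows like $b \log n$, which matches the logarithmic growth of $M_n$ in the phase $a(\alpha) < 0$.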
Local scores

We proceed as follows: choose functions $f : \{A, C, G, U\}^2 \to \mathbb{R}$ (non-lattice) and $g : \mathbb{N}_0 \to (-\infty, 0]$.
For $1 \le i < j \le n$ define
\[ T_{i,j} = \max_{-2 \le 2\delta < j-i} \Big\{ \sum_{k=0}^{\delta} f(X_{i+k}, X_{j-k}) + g(j - i - 2\delta - 1) \Big\}. \]
\[ X_1 \ldots X_{i-1}\ \underbrace{X_i \ldots X_{i+\delta}}_{\text{stack},\ \delta+1}\ \underbrace{X_{i+\delta+1} \ldots X_{j-\delta-1}}_{\text{hairpin-loop},\ j-i-2\delta-1}\ \underbrace{X_{j-\delta} \ldots X_j}_{\text{stack},\ \delta+1}\ X_{j+1} \ldots X_n. \]
Let $M_n = \max_{1 \le i < j \le n} T_{i,j}$.
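As a minimal sketch of the definition above, the following Python function evaluates $T_{i,j}$ by trying every admissible stack depth $\delta$. The pair score `f` (+1 for Watson-Crick pairs, -1 otherwise) and the linear penalty `g` are illustrative assumptions, not the talk's choices; in particular this `f` is lattice, whereas the talk assumes a non-lattice `f`.

```python
import math

# Hypothetical pair score: +1 for Watson-Crick pairs, -1 otherwise (illustration only).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def f(a, b):
    return 1.0 if (a, b) in WATSON_CRICK else -1.0


def g(n):
    # Hypothetical linear loop penalty with g(0) = 0.
    return -0.25 * n


def stack_score(x, i, j, f, g):
    """T_{i,j} for 1-based positions i < j, directly from the definition."""
    best = -math.inf
    stack = 0.0   # running sum of f over the stack; empty for delta = -1
    delta = -1    # delta = -1: no stack, the whole stretch is a loop
    while 2 * delta < j - i:
        if delta >= 0:
            # Add the pair (X_{i+delta}, X_{j-delta}); x is a 0-based string.
            stack += f(x[i - 1 + delta], x[j - 1 - delta])
        best = max(best, stack + g(j - i - 2 * delta - 1))
        delta += 1
    return best


example = stack_score("GCAAAGC", 1, 7, f, g)
```

For the sequence `GCAAAGC` the best choice is $\delta = 1$: two stacked pairs (G-C and C-G) and a loop of length 3.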
The Recursion

The scores $T_{i,j}$ fulfill the recursion
\[ T_{i,j} = \max\{ T_{i+1,j-1} + f(X_i, X_j),\ g(j - i + 1) \}. \]

           X_1    X_2      X_3      X_4      X_5
    X_1    g(1)   T_{1,2}  T_{1,3}  T_{1,4}  T_{1,5}
    X_2    0      g(1)     T_{2,3}  T_{2,4}  T_{2,5}
    X_3           0        g(1)     T_{3,4}  T_{3,5}
    X_4                    0        g(1)     T_{4,5}
    X_5                             0        g(1)

Arrows (↗) point from $T_{i+1,j-1}$ to $T_{i,j}$: each entry is computed from its lower-left neighbour.
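The table can be filled diagonal by diagonal. The sketch below is one way to do that in Python, using the boundary values shown in the table ($g(1)$ on the diagonal, 0 just below it). The functions `f` and `g` are the same illustrative assumptions as before and are not taken from the talk.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def f(a, b):
    # Hypothetical pair score (illustration only).
    return 1.0 if (a, b) in WATSON_CRICK else -1.0


def g(n):
    # Hypothetical linear loop penalty.
    return -0.25 * n


def score_table(x, f, g):
    """Return T with T[i][j] = T_{i,j} for 1-based 1 <= i < j <= len(x)."""
    n = len(x)
    T = [[None] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        T[i][i] = g(1)       # diagonal of the table
        T[i + 1][i] = 0.0    # entries just below the diagonal
    for d in range(1, n):    # d = j - i, one diagonal at a time
        for i in range(1, n - d + 1):
            j = i + d
            T[i][j] = max(T[i + 1][j - 1] + f(x[i - 1], x[j - 1]), g(d + 1))
    return T


def local_max(T, n):
    """M_n = max over 1 <= i < j <= n of T_{i,j}."""
    return max(T[i][j] for i in range(1, n + 1) for j in range(i + 1, n + 1))


x = "GCAAAGC"
T = score_table(x, f, g)
M = local_max(T, len(x))
```

The boundary value 0 below the diagonal plays the role of $g(0)$, so this table agrees with the direct definition of $T_{i,j}$ when $g(0) = 0$.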
The Diagonals

Suppose $(X_k)_{k \in \mathbb{Z}}$ is a doubly infinite sequence of iid variables. Define recursively
\[ T^0_k = \max\{ T^0_{k-1} + f(X_{-k}, X_k),\ g(2k) \}, \qquad T^0_0 = 0, \]
and
\[ T^1_k = \max\{ T^1_{k-1} + f(X_{-k}, X_k),\ g(2k+1) \}, \qquad T^1_0 = g(1). \]
Then
\[ T_{i,j} \overset{\mathcal{D}}{=} \begin{cases} T^0_{(j-i+1)/2} & \text{if } j - i \text{ is odd,} \\ T^1_{(j-i)/2} & \text{if } j - i \text{ is even.} \end{cases} \]
Reflected Random Walks

The processes $(T^i_k)_{k \ge 0}$, $i = 0, 1$, are random walks reflected at $g$.
[Figure: three panels, one each for $g(n) = 0$, $g(n) = -15 \log(n)$ and $g(n) = -n$; horizontal axis from 0 to 200, vertical axis from −200 to 150.]
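A minimal sketch of how such paths could be simulated, assuming uniform iid letters and the same illustrative pair score as before (both are assumptions made here, not details from the talk). The function `reflected_walk` implements the recursion for $T^0_k$ (`parity=0`) and $T^1_k$ (`parity=1`); the three barriers are the ones named in the figure.

```python
import math
import random

WATSON_CRICK = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def f(a, b):
    # Hypothetical pair score (illustration only).
    return 1.0 if (a, b) in WATSON_CRICK else -1.0


def reflected_walk(increments, g, parity=0):
    """Path T^parity_0, ..., T^parity_K for the given increments f(X_{-k}, X_k)."""
    t = 0.0 if parity == 0 else g(1)   # T^0_0 = 0, T^1_0 = g(1)
    path = [t]
    for k, step in enumerate(increments, start=1):
        # Reflect at the barrier g(2k) (parity 0) or g(2k + 1) (parity 1).
        t = max(t + step, g(2 * k + parity))
        path.append(t)
    return path


def sample_increments(n, f, rng, alphabet="ACGU"):
    # Assumption: letters are iid uniform on the alphabet.
    return [f(rng.choice(alphabet), rng.choice(alphabet)) for _ in range(n)]


barriers = {
    "g(n) = 0": lambda n: 0.0,
    "g(n) = -15 log(n)": lambda n: -15.0 * math.log(n),
    "g(n) = -n": lambda n: -float(n),
}

rng = random.Random(0)
steps = sample_increments(200, f, rng)
paths = {name: reflected_walk(steps, g) for name, g in barriers.items()}
```

Using the same increments for all three barriers makes the effect of the barrier itself visible: the paths coincide until one of them is pushed up by its barrier.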
Reflected Random Walks

If $M^i := \sup_{k \ge 0} T^i_k < \infty$ a.s. and $\theta^* > 0$ solves $E \exp(\theta f(X_{-1}, X_1)) = 1$, then
\[ P(M^i > x) \sim K^*_i \exp(-\theta^* x) \quad \text{for } x \to \infty. \]
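The exponent $\theta^*$ can be found numerically once the distribution of the increment $f(X_{-1}, X_1)$ is known. The sketch below uses plain bisection and assumes a finite increment distribution with negative mean and a positive value of positive probability; the function name `tail_exponent` and the example distribution (score +1 with probability 4/16, score -1 otherwise, as for uniform letters and Watson-Crick pairing) are illustrative assumptions.

```python
import math


def tail_exponent(values, probs, tol=1e-12):
    """Positive root theta* of E exp(theta * Y) = 1 for a finite distribution of Y."""
    mean = sum(p * v for v, p in zip(values, probs))
    has_positive = any(v > 0 and p > 0 for v, p in zip(values, probs))
    if mean >= 0 or not has_positive:
        raise ValueError("need negative mean and a positive value with positive probability")

    def phi(theta):
        return sum(p * math.exp(theta * v) for v, p in zip(values, probs)) - 1.0

    # phi is convex with phi(0) = 0 and phi'(0) < 0: find lo with phi(lo) < 0 ...
    lo = 1.0
    while phi(lo) >= 0:
        lo /= 2
    # ... and hi with phi(hi) > 0, then bisect between them.
    hi = lo
    while phi(hi) <= 0:
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta_star = tail_exponent([1.0, -1.0], [0.25, 0.75])
```

In this example the equation reads $\tfrac14 e^{\theta} + \tfrac34 e^{-\theta} = 1$, whose positive solution is $\theta^* = \log 3$, so the numerical root can be checked against the closed form.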