
Rigorous Approximated Determinization of Weighted Automata

  1. Rigorous Approximated Determinization of Weighted Automata
     - Benjamin Aminof (Hebrew University), Orna Kupferman (Hebrew University), Robby Lampert (Weizmann Institute), Israel

  2. Outline
     - Weighted automata
     - Determinizability of weighted automata
     - Mohri’s determinization algorithm
     - Approximated-determinization algorithm
     - Correctness and termination
     - Summary
     - Future work

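  3. Weighted Automata (WFA)
     - Weight functions: $c : \text{transitions} \to \mathbb{R}_{\ge 0}$ and $f : \text{accepting states} \to \mathbb{R}_{\ge 0}$.
     - [Figure: WFA $A$ with transitions $q_0 \xrightarrow{a,1} q_1$, $q_0 \xrightarrow{a,1} q_2$, $q_1 \xrightarrow{b,2} q_1$, $q_2 \xrightarrow{b,1} q_2$, $q_1 \xrightarrow{c,1} q_3$, $q_2 \xrightarrow{d,1} q_3$; accepting states $q_1$, $q_2$, $q_3$, each with final weight 0.]
     - $w = abc$: $\mathrm{cost}(w) = (1+2+1)+0 = 4$
     - $w = abbd$: $\mathrm{cost}(w) = (1+1+1+1)+0 = 4$
     - $w = abb$: $\mathrm{cost}(w) = \min\{5, 3\} = 3$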

  4. Weighted Automata – language
     - A run of $A$ on a word $w = w_1 \dots w_n$ is a sequence $r = r_0 r_1 r_2 \dots r_n$ over $Q$ such that $r_0 \in Q_0$ and, for all $1 \le i \le n$, we have $r_{i-1} \xrightarrow{w_i} r_i$.
     - A run $r$ is accepting iff $r_n$ is accepting (the standard finite-word acceptance condition).
     - $L(A) = \{ w : A \text{ has an accepting run on } w \}$

  5. Weighted Automata – costs
     - The cost of a run $r = r_0 r_1 r_2 \dots r_n$ is $\mathrm{cost}(r) = \sum_{i=1}^{n} c(r_{i-1} \xrightarrow{w_i} r_i) + f(r_n)$. It is defined only for accepting runs.
     - The cost of a word $w = w_1 \dots w_n$ is $\mathrm{cost}(w) = \min \{ \mathrm{cost}(r) : r \text{ is an accepting run of } A \text{ on } w \}$.
     - If $w \notin L(A)$ then $\mathrm{cost}(w) = \infty$.
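
The cost definition can be evaluated letter by letter, keeping for each state only the cheapest run that ends there. The sketch below does this for the automaton of slide 3; the transition table is an assumption read off the figure residue, and the names `TRANSITIONS`, `INITIAL`, `FINAL`, and `word_cost` are illustrative rather than taken from the talk.

```python
import math

# WFA A from slide 3 (assumed reading of the figure):
# (state, letter) -> list of (next_state, transition_weight).
TRANSITIONS = {
    ("q0", "a"): [("q1", 1), ("q2", 1)],
    ("q1", "b"): [("q1", 2)],
    ("q2", "b"): [("q2", 1)],
    ("q1", "c"): [("q3", 1)],
    ("q2", "d"): [("q3", 1)],
}
INITIAL = {"q0"}
FINAL = {"q1": 0, "q2": 0, "q3": 0}  # f: accepting state -> final weight


def word_cost(word, transitions=TRANSITIONS, initial=INITIAL, final=FINAL):
    """Minimum cost over accepting runs on `word`, or math.inf if there is none."""
    # best[q] = cheapest cost of a run on the prefix read so far that ends in q
    best = {q: 0 for q in initial}
    for letter in word:
        nxt = {}
        for q, cost_so_far in best.items():
            for q2, weight in transitions.get((q, letter), []):
                candidate = cost_so_far + weight
                if candidate < nxt.get(q2, math.inf):
                    nxt[q2] = candidate
        best = nxt
    # Add the final weight f(r_n); runs ending in non-accepting states are dropped.
    return min((c + final[q] for q, c in best.items() if q in final),
               default=math.inf)
```

Under these assumptions, `word_cost("abc")`, `word_cost("abbd")`, and `word_cost("abb")` reproduce the three costs listed on slide 3, and a word with no accepting run gets `math.inf`.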

  6. Weighted Automata – more
     - A WFA $A$ is trim if each of its states is reachable from some initial state and can reach some accepting state.
     - A WFA $A$ is unambiguous (single-run) if it has at most one accepting run on every word.
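
Trimness is a plain graph property, so it can be checked with two reachability passes: forward from the initial states and backward from the accepting states. The following is a minimal sketch; the function name `is_trim` and the dictionary encoding of transitions are assumptions made for illustration.

```python
def is_trim(states, transitions, initial, accepting):
    """True iff every state is reachable from an initial state
    and can reach an accepting state.

    transitions: (state, letter) -> list of (next_state, weight).
    """
    succ = {q: set() for q in states}
    pred = {q: set() for q in states}
    for (q, _letter), targets in transitions.items():
        for q2, _weight in targets:
            succ[q].add(q2)
            pred[q2].add(q)

    def closure(sources, edges):
        # Standard graph search following `edges` from `sources`.
        seen = set(sources)
        stack = list(sources)
        while stack:
            q = stack.pop()
            for q2 in edges[q]:
                if q2 not in seen:
                    seen.add(q2)
                    stack.append(q2)
        return seen

    reachable = closure(initial, succ)        # forward pass
    co_reachable = closure(accepting, pred)   # backward pass
    return all(q in reachable and q in co_reachable for q in states)
```

Weights play no role here; only the underlying graph matters.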

  7. Applications of WFA
     - formal verification of quantitative properties
     - automatic speech recognition
     - image compression
     - pattern matching (widely used in computational biology)
     - …

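  8. $A_1$ is non-determinizable
     - [Figure: WFA $A_1$ with transitions $q_0 \xrightarrow{a,1} q_1$, $q_0 \xrightarrow{a,1} q_2$, $q_1 \xrightarrow{b,2} q_1$, $q_2 \xrightarrow{b,1} q_2$, $q_1 \xrightarrow{c,1} q_3$, $q_2 \xrightarrow{d,1} q_3$; accepting state $q_3$ with final weight 0.]
     - $\mathrm{cost}(ab^k c) = 2k+2$, $\mathrm{cost}(ab^k d) = k+2$
     - After reading the word $ab^k$, the difference between the costs of reading $c$ and $d$ is $k$.
     - For $i \ne j$, a deterministic WFA must be in different states after reading $ab^i$ and $ab^j$.
     - A deterministic WFA must therefore have infinitely many states.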
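
The step from "the difference is $k$" to "different states" can be spelled out as follows. The notation is an assumption introduced only for this sketch: $D$ is a hypothetical deterministic WFA equivalent to $A_1$, $s$ is the state $D$ reaches after reading $ab^k$, $p(ab^k)$ is the weight accumulated on that path, and $g_s(x)$ is the weight of the $x$-transition from $s$ plus the final weight of its target.

```latex
\begin{align*}
\mathrm{cost}_{A_1}(ab^k c) - \mathrm{cost}_{A_1}(ab^k d)
  &= (2k+2) - (k+2) = k, \\
\mathrm{cost}_{D}(ab^k c) - \mathrm{cost}_{D}(ab^k d)
  &= \bigl(p(ab^k) + g_s(c)\bigr) - \bigl(p(ab^k) + g_s(d)\bigr)
   = g_s(c) - g_s(d).
\end{align*}
```

The second difference depends only on the state $s$, while the first takes a different value for every $k$, so no two words $ab^i$, $ab^j$ with $i \ne j$ can lead $D$ to the same state.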

  9. Determinizability
     - Weighted automata are not necessarily determinizable.
     - Deciding whether a given weighted automaton is determinizable is an open problem.
     - There is a sufficient condition for determinizability, together with a determinization algorithm [Mohri ’97].

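  10. A sufficient condition [Mohri ’97]
     - The twins property: for every two states $q, q' \in Q$ and two words $u, v \in \Sigma^*$, if $q, q' \in \delta(Q_0, u)$, $q \in \delta(q, v)$, and $q' \in \delta(q', v)$, then $\mathrm{cost}(q, v, q) = \mathrm{cost}(q', v, q')$.
     - If the automaton is trim (no empty states) and unambiguous (single-run), the twins property is a characterization of determinizability.
     - [Figure: $q_0$ reaches both $q$ and $q'$ by reading $u$; each of $q$ and $q'$ has a cycle reading $v$.]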
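
As a worked instance, the automaton $A_1$ of slide 8 violates the twins property. The witness below uses the transitions as read from that slide's figure, which is an assumption about the garbled drawing, with $u = a$ and $v = b$.

```latex
\begin{align*}
& q_1, q_2 \in \delta(q_0, a), \qquad q_1 \in \delta(q_1, b), \qquad q_2 \in \delta(q_2, b), \\
& \mathrm{cost}(q_1, b, q_1) = 2 \;\neq\; 1 = \mathrm{cost}(q_2, b, q_2).
\end{align*}
```

The two $b$-cycles are reached by the same word but have different costs, so the premise of the property holds while its conclusion fails.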

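  11. Determinization algorithm [Mohri ’97] - example
     - Word costs in $A_2$: $ac$ costs 8, $bc$ costs 7, $ad$ costs 8, $bd$ costs 7.
     - [Figure: WFA $A_2$ with transitions $q_0 \xrightarrow{a,3} q_1$, $q_0 \xrightarrow{b,2} q_1$, $q_0 \xrightarrow{a,4} q_2$, $q_0 \xrightarrow{b,3} q_2$, $q_1 \xrightarrow{c,5} q_3$, $q_2 \xrightarrow{d,4} q_3$; accepting state $q_3$ with final weight 0.]
     - [Figure: deterministic $A_2'$. From $\{(q_0,0)\}$, reading $a$ costs $\min\{3,4\} = 3$ and reading $b$ costs $\min\{2,3\} = 2$; both lead to $\{(q_1,0),(q_2,1)\}$, where the residual 1 is $4-3$ and $3-2$ respectively. From there, $c$ costs $0+5 = 5$ and $d$ costs $1+4 = 5$, leading to $\{(q_3,0)\}$ with final weight 0.]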
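
The example follows the usual weighted subset construction over $(\min, +)$: each deterministic state is a set of (state, residual) pairs, the emitted weight is the minimum over all ways to read the letter, and what is left over is stored as the residual. The sketch below is a minimal version under that reading, not the authors' implementation; the names `determinize` and `A2`, and the `max_states` cap (added because the construction need not terminate), are assumptions.

```python
import math

# WFA A2 as read from slide 11: (state, letter) -> [(next_state, weight)].
A2 = {
    ("q0", "a"): [("q1", 3), ("q2", 4)],
    ("q0", "b"): [("q1", 2), ("q2", 3)],
    ("q1", "c"): [("q3", 5)],
    ("q2", "d"): [("q3", 4)],
}


def determinize(transitions, initial, final, max_states=1000):
    """Weighted subset construction over (min, +).

    Returns (start, det_transitions, det_final), where deterministic states
    are frozensets of (state, residual) pairs. Raises RuntimeError when more
    than `max_states` subsets would be created.
    """
    start = frozenset((q, 0) for q in initial)
    det_transitions = {}  # (subset, letter) -> (next_subset, weight)
    det_final = {}        # subset -> final weight
    seen = {start}
    todo = [start]
    alphabet = sorted({letter for (_, letter) in transitions})
    while todo:
        subset = todo.pop()
        accepting = [res + final[q] for q, res in subset if q in final]
        if accepting:
            det_final[subset] = min(accepting)
        for letter in alphabet:
            # Cheapest residual-plus-weight cost of reaching each successor.
            reach = {}
            for q, res in subset:
                for q2, w in transitions.get((q, letter), []):
                    reach[q2] = min(reach.get(q2, math.inf), res + w)
            if not reach:
                continue
            weight = min(reach.values())  # emitted on the new transition
            target = frozenset((q2, c - weight) for q2, c in reach.items())
            det_transitions[(subset, letter)] = (target, weight)
            if target not in seen:
                if len(seen) >= max_states:
                    raise RuntimeError("state limit reached; construction may not terminate")
                seen.add(target)
                todo.append(target)
    return start, det_transitions, det_final
```

Calling `determinize(A2, {"q0"}, {"q3": 0})` yields the three subsets shown on the slide. Running the same function on $A_1$ from slide 8 keeps producing new subsets until the cap raises, which matches the claim that $A_1$ is non-determinizable.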

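  12. Determinization algorithm - another example
     - Word costs in $A_3$: $ac^i$ costs $3+2i+1$; $bc^i$ costs $2+2i$; $ac^i d$ costs $3+2i$; $bc^i d$ costs $2+2i$.
     - [Figure: WFA $A_3$ with transitions $q_0 \xrightarrow{a,1} q_1$, $q_0 \xrightarrow{b,4} q_1$, $q_0 \xrightarrow{a,3} q_2$, $q_0 \xrightarrow{b,1} q_2$, $q_1 \xrightarrow{c,2} q_1$, $q_2 \xrightarrow{c,2} q_2$, $q_1 \xrightarrow{d,2} q_3$, $q_2 \xrightarrow{d,1} q_3$; accepting states $q_2$ (final weight 1) and $q_3$ (final weight 0).]
     - [Figure: deterministic $A_3'$. From $\{(q_0,0)\}$, $a,1$ leads to $\{(q_1,0),(q_2,2)\}$ (final weight 3), which has a $c,2$ self-loop and a $d,2$ transition to $\{(q_3,0)\}$ (final weight 0); $b,1$ leads to $\{(q_1,3),(q_2,0)\}$ (final weight 1), which has a $c,2$ self-loop and a $d,1$ transition to $\{(q_3,0)\}$ (final weight 0).]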

  13. Determinization algorithm - non-determinizable example
     - Word costs in $A_1$: $ab^i c$ costs $2+2i$; $ab^i d$ costs $2+i$.
     - [Figure: WFA $A_1$ as on slide 8.]
     - [Figure: $A_1'$ is an unbounded chain of subsets: $\{(q_0,0)\} \xrightarrow{a,1} \{(q_1,0),(q_2,0)\} \xrightarrow{b,1} \{(q_1,1),(q_2,0)\} \xrightarrow{b,1} \{(q_1,2),(q_2,0)\} \xrightarrow{b,1} \{(q_1,3),(q_2,0)\} \to \dots$]

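  14. Determinization algorithm - a bad determinizable example
     - Word costs: $ab^i c$ costs $2+2i$; $ab^i d$ costs $2+i$.
     - [Figure: a variant of $A_1$ carrying an additional label $d$ near $q_1$, together with the same unbounded chain of subsets: $\{(q_0,0)\} \xrightarrow{a,1} \{(q_1,0),(q_2,0)\} \xrightarrow{b,1} \{(q_1,1),(q_2,0)\} \xrightarrow{b,1} \{(q_1,2),(q_2,0)\} \xrightarrow{b,1} \{(q_1,3),(q_2,0)\} \to \dots$]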

  15. Mohri’s algorithm - remarks
     - Mohri’s algorithm terminates iff the original automaton has the twins property.
     - For trim and unambiguous WFAs, there is a polynomial algorithm for testing the twins property.
     - There are determinizable WFAs that do not satisfy the twins property.

  16. Approximated determinization
     - Given a WFA $A$ and an approximation factor $t \ge 1$, construct a deterministic WFA $A'$ such that for every word $w$ we have $\mathrm{cost}(A, w) \le \mathrm{cost}(A', w) \le t \cdot \mathrm{cost}(A, w)$.
     - Useful when exact determinization is impossible.
     - Useful when the result of exact determinization is too large.

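  17. Succinctness
     - $L_n = \Sigma^* \cdot a \cdot \Sigma^{n-1}$
     - [Figure: nondeterministic WFA $A_4$ with transition labels $\Sigma,0$ (including a chain of $n-1$ such transitions), $a,1$, and $\Sigma,t$.]
     - $L(A_4) = \Sigma^+$, and $\mathrm{cost}(w) = \infty$ if $w = \varepsilon$, $1$ if $w \in L_n$, and $t$ if $w \in \Sigma^+ \setminus L_n$.
     - An equivalent deterministic WFA requires $2^n$ states.
     - A $t$-approximating deterministic WFA? 2 states.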
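
The gap between $2^n$ states and 2 states can be checked by brute force on short words. The sketch below assumes the cost function reads as follows: $\infty$ on the empty word, 1 on words whose $n$-th letter from the end is $a$, and $t$ on all other nonempty words. It also assumes that the 2-state automaton simply charges $t$ on the first letter and 0 afterwards. Both readings, and all function names, are assumptions rather than content confirmed by the slide.

```python
import math
from itertools import product


def cost_a4(word, n, t):
    """Assumed cost function of A4 for a given n and t >= 1."""
    if not word:
        return math.inf                      # the empty word is not accepted
    if len(word) >= n and word[-n] == "a":   # word is in Sigma* a Sigma^(n-1)
        return 1
    return t


def cost_two_state(word, t):
    """Hypothetical 2-state deterministic WFA: t on the first letter, 0 after."""
    return math.inf if not word else t


def is_t_approximation(exact, approx, words, t):
    """Check exact(w) <= approx(w) <= t * exact(w) for every given word."""
    return all(exact(w) <= approx(w) <= t * exact(w) for w in words)


def words_up_to(alphabet, max_len):
    """All words over `alphabet` of length 0..max_len."""
    return ["".join(p) for k in range(max_len + 1)
            for p in product(alphabet, repeat=k)]
```

For example, `is_t_approximation(lambda w: cost_a4(w, 3, 2), lambda w: cost_two_state(w, 2), words_up_to("ab", 6), 2)` checks the bound on all words of length at most 6 over a two-letter alphabet. This is a finite sanity check, not a proof.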

  18. Approx. determinization algorithm [Buchsbaum-Giancarlo-Westbrook ’01]
     - Based on Mohri’s algorithm.
     - Relaxes the condition for unification of states: rather than requiring residuals of corresponding states to be identical, requires them to be close (within $1+\varepsilon$ of the smaller one).
     - No guarantees about the new costs.
     - No sufficient condition for termination.

  19. Our algorithm: t-determinization
     - Determinization up to a factor $t$: the new cost of any accepted word $w$ is between $\mathrm{cost}(w)$ and $t \cdot \mathrm{cost}(w)$.
     - Differences from Mohri’s algorithm:
       - Weights are multiplied by $t$.
       - For each state in a subset we maintain a range of residues rather than a single one.
       - The criterion for unification of states is relaxed (they may be non-identical).

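  20. 2-determinization of $A_1$
     - [Figure: WFA $A_1$ as on slide 8.]
     - [Figure: $A_1'$, whose states are sets of triples (state, lower bound, upper bound). $\{(q_0,0,0)\} \xrightarrow{a,2} \{(q_1,-1,0),(q_2,-1,0)\} \xrightarrow{b,2} \{(q_1,-1,1),(q_2,-2,0)\}$, with further $b,2$ transitions; $c,2$ and $d,2$ transitions lead to $\{(q_3,-2,0)\}$ with final weight 0. Annotations: the lower bound corresponds to $\mathrm{cost}(w)$ and the upper bound to $t \cdot \mathrm{cost}(w)$; states are unified when "residual ranges contain those of" another state.]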
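
As a sanity check of the factor-2 guarantee on this example, assume that every transition of $A_1'$ carries weight 2 and that its accepting states have final weight 0. That is one reading of the garbled figure, not something the slide states outright. Writing $\mathrm{cost}'$ for the cost in $A_1'$ and using the costs of $A_1$ from slide 8:

```latex
\begin{align*}
\mathrm{cost}'(ab^k c) &= 2 + 2k + 2 = 2k+4, &
2k+2 \;\le\; 2k+4 \;\le\; 2(2k+2) = 4k+4, \\
\mathrm{cost}'(ab^k d) &= 2 + 2k + 2 = 2k+4, &
k+2 \;\le\; 2k+4 \;\le\; 2(k+2) = 2k+4.
\end{align*}
```

Both chains hold for every $k \ge 0$, and the upper bound is tight on the words $ab^k d$.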

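  21. 2-determinization of $A_5$
     - [Figure: WFA $A_5$ over states $q_0$, $q_1$, $q_2$, each accepting with final weight 0, with transitions labelled $a,1$, $a,2$, $b,1$, $b,2$, $c,2$, and $c,4$.]
     - [Figure: its 2-determinization $A_5'$, whose states are sets of triples (state, lower bound, upper bound), including $\{(q_0,0,0),(q_1,0,0)\}$, $\{(q_0,-1,0),(q_2,0,2)\}$, $\{(q_0,-2,0),(q_1,0,4)\}$, $\{(q_1,-4,0)\}$, $\{(q_1,-2,0)\}$, $\{(q_0,-2,0)\}$, $\{(q_1,-6,0)\}$, $\{(q_2,-4,0)\}$, and $\{(q_0,-1,0)\}$, all with final weight 0; transitions carry weights 2 or 4.]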

  22. Correctness of the algorithm
     - Thm: If the algorithm terminates on a given WFA $A$ with the result $A'$, then for every word $w$ we have $\mathrm{cost}(A, w) \le \mathrm{cost}(A', w) \le t \cdot \mathrm{cost}(A, w)$.

  23. Termination of the algorithm
     - Thm: If a WFA has the t-twins property, then the algorithm terminates on it.
       - Assumption: the weights and the factor $t$ are rational.
     - Thm: A trim unambiguous WFA is t-determinizable iff it has the t-twins property.
     - Thm: Deciding the t-twins property for trim unambiguous WFAs can be done in polynomial time.

  24. Summary
     - Why approximate determinization?
       - The WFA is non-determinizable.
       - The equivalent deterministic WFA is large.
     - t-determinization algorithm
       - Weights are multiplied by $t$.
       - Uses ranges rather than single residues.
       - Collapses to a state whose ranges are contained in those of the current state.
     - A sufficient condition
       - The t-twins property.
       - For unambiguous WFAs, it characterizes t-determinizability.
       - Decidable in polynomial time.

  25. Future work
     - Generalize the termination proof to the case where the weights and the factor $t$ are real numbers ($\mathbb{R}_{\ge 0}$).
     - Find an algorithm to decide whether a WFA is determinizable, or prove that the problem is undecidable.
