Simpler & More General Minimization for Weighted Finite-State Automata
Jason Eisner, Johns Hopkins University
May 28, 2003, HLT-NAACL

First half of the talk is setup: it reviews past work.
Second half gives an outline of the new results.

The Minimization Problem
Input: A DFA (deterministic finite-state automaton)
Output: An equivalent DFA with as few states as possible
Complexity: O(|arcs| log |states|) (Hopcroft 1971)
[Figure: a DFA with arcs labeled a and b. It represents the language {aab, abb, bab, bbb}. Later slides repeat the figure with fewer arcs as states are merged.]
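The problem statement above can be made concrete with a small sketch. It uses Moore-style partition refinement, which is simpler but slower than the Hopcroft (1971) algorithm cited on the slide. The trie-shaped input DFA, the `DEAD` sink state, and the function names are assumptions made for illustration; the slide's actual figure is not recoverable from the text.

```python
def trie_dfa(words, alphabet):
    """Build a complete trie-shaped DFA (hypothetical stand-in for the slide's figure)."""
    states = {"DEAD"}                      # sink state for strings outside the language
    for w in words:
        for i in range(len(w) + 1):
            states.add(w[:i])              # one state per prefix
    delta = {}
    for q in states:
        for a in alphabet:
            delta[q, a] = q + a if q + a in states else "DEAD"
    return states, delta, set(words)


def minimize(states, alphabet, delta, finals):
    """Moore-style refinement; returns a dict state -> equivalence-class id."""
    # Start by separating final states from non-final states.
    block = {q: int(q in finals) for q in states}
    while True:
        # A state's signature: its own block plus the blocks its arcs lead to.
        sig = {q: (block[q],) + tuple(block[delta[q, a]] for a in alphabet)
               for q in states}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {q: ids[sig[q]] for q in states}
        # Each round only splits blocks, so an unchanged count means we are done.
        if len(set(refined.values())) == len(set(block.values())):
            return refined
        block = refined


states, delta, finals = trie_dfa(["aab", "abb", "bab", "bbb"], "ab")
classes = minimize(states, "ab", delta, finals)
print(len(states), "states before,", len(set(classes.values())), "classes after")
```

States that land in the same class have the same suffix language and can be merged into one state.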
The Minimization Problem
[Figure: the same DFA, with two groups of states marked as mergeable.]
• Mergeable because they have the same suffix language: {ab, bb}
• Mergeable because they have the same suffix language: {b}
• Can’t always work backward from the final state like this. A bit more complicated because of cycles. Don’t worry about it for this talk.
• Here’s what you should worry about: an equivalence relation on states … merge the equivalence classes

The Minimization Problem
Q: Why minimize # states, rather than # arcs?
A: Minimizing # states also minimizes # arcs!
Q: What if the input is an NDFA (nondeterministic)?
A: Determinize it first. (This could yield exponential blowup.)
Q: How about minimizing an NDFA to an NDFA?
A: Yes, it could be exponentially smaller, but the problem is PSPACE-complete, so we don’t try.

Real-World NLP: Automata With Weights or Outputs
• Finite-state computation of functions
• Concatenate strings: arcs a:w, b:wx, c:wz, d:ε; abd → wwx, acd → wwz
• Add scores: arcs a:2, b:3, c:7, d:0; abd → 5, acd → 9
• Multiply probabilities: arcs a:0.2, b:0.3, c:0.7, d:1; abd → 0.06, acd → 0.14

Real-World NLP: Automata With Weights or Outputs
• Want to compute functions on strings: Σ* → K
  • After all, we’re doing language and speech!
• Finite-state machines can often do the job
  • Easy to build, easy to combine, run fast
• Build them with weighted regular expressions
• To clean up the resulting DFA, minimize it to merge redundant portions
  • This smaller machine is faster to intersect/compose
  • More likely to fit on a hand-held device
  • More likely to fit into cache memory

How do we minimize such DFAs?
• Didn’t Mohri already answer this question?
  • Only for special cases of the output set K!
• Is there a general recipe?
• What new algorithms can we cook with it?
Weight Algebras
• Specify a weight algebra (K, ⊗)
• Define DFAs over (K, ⊗)
  • Arcs have weights in set K
  • A path’s weight is also in K: multiply its arc weights with ⊗
• Examples:
  • (strings, concatenation)
  • (scores, addition)
  • (probabilities, multiplication)
  • (score vectors, addition): OT phonology
  • (real weights, multiplication): conditional random fields, rational kernels
  • (objective func & gradient, product-rule multiplication): training the parameters of a model
  • (bit vectors, conjunction): membership in multiple languages at once

Q: Semiring is (K, ⊕, ⊗). Why aren’t you talking about ⊕ too?
A: Minimization is about DFAs. At most one path per input. So no need to ⊕ the weights of multiple accepting paths.

Shifting Outputs Along Paths
• Doesn’t change the function computed: abd → wwx, acd → wwz
[Figure: three labelings of the same machine.
  Original: a:w, b:wx, c:wz, d:ε
  Prefix w shifted back onto a: a:ww, b:x, c:z, d:ε
  Output of a shifted forward onto b and c: a:ε, b:wwx, c:wwz, d:ε]
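A minimal sketch of how a weight algebra (K, ⊗) drives path evaluation, using the three example algebras and the arc labels from the slides. The state numbering (0 to 3), the assumption that b and c lead to the same state, and the names `path_weight` and `machine` are illustrative choices, not part of the talk.

```python
from operator import add, mul


def path_weight(arcs, start, word, times, one):
    """Weight of the unique path for `word`; arcs maps (state, symbol) -> (weight, next)."""
    total, q = one, start
    for sym in word:
        w, q = arcs[q, sym]
        total = times(total, w)            # combine arc weights with the algebra's ⊗
    return total


def machine(a, b, c, d):
    # Assumed topology: 0 -a-> 1, 1 -b-> 2, 1 -c-> 2, 2 -d-> 3.
    return {(0, "a"): (a, 1), (1, "b"): (b, 2), (1, "c"): (c, 2), (2, "d"): (d, 3)}


# (strings, concatenation): operator.add concatenates Python strings.
strings = machine("w", "wx", "wz", "")
print(path_weight(strings, 0, "abd", add, ""), path_weight(strings, 0, "acd", add, ""))

# (scores, addition)
scores = machine(2, 3, 7, 0)
print(path_weight(scores, 0, "abd", add, 0), path_weight(scores, 0, "acd", add, 0))

# (probabilities, multiplication)
probs = machine(0.2, 0.3, 0.7, 1.0)
print(path_weight(probs, 0, "abd", mul, 1.0), path_weight(probs, 0, "acd", mul, 1.0))

# Shifted labelings from the slides compute the same string function.
for shifted in (machine("ww", "x", "z", ""), machine("", "wwx", "wwz", "")):
    assert path_weight(shifted, 0, "abd", add, "") == "wwx"
    assert path_weight(shifted, 0, "acd", add, "") == "wwz"
```

Only ⊗ and its identity are passed in, which mirrors the slide's point that ⊕ is not needed when each input has at most one path.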
Shifting Outputs Along Paths
• Doesn’t change the function computed: abd → 5, acd → 9
[Figure: the score machine a:2, b:3, c:7, d:0, with an amount k shifted from b and c back onto a.
  k = 1: a:2+1 = 3, b:3-1 = 2, c:7-1 = 6
  k = 2: a:2+2 = 4, b:3-2 = 1, c:7-2 = 5
  k = 3: a:2+3 = 5, b:3-3 = 0, c:7-3 = 4]

Shifting Outputs Along Paths
• Doesn’t change the function computed: abd → wwx, acd → wwz, …ebd → uwx, …ecd → uwz
[Figure: the string machine a:w, b:wx, c:wz, d:ε with a second in-arc e:u.]
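The score-shifting slides can be checked with a short sketch in the (scores, addition) algebra. The state numbering and the names `shift` and `score` are assumptions for illustration; the arc weights are the ones on the slides.

```python
def shift(arcs, state, k):
    """Move k from the out-arcs of `state` onto its in-arcs (additive weights).

    arcs maps (source, symbol) -> (weight, target).
    """
    shifted = {}
    for (src, sym), (w, dst) in arcs.items():
        if src == state:
            w -= k                         # out-arc gives up k
        if dst == state:
            w += k                         # in-arc absorbs k
        shifted[src, sym] = (w, dst)
    return shifted


def score(arcs, word, start=0):
    """Sum the arc weights along the path that reads `word`."""
    total, q = 0, start
    for sym in word:
        w, q = arcs[q, sym]
        total += w
    return total


# Assumed topology: 0 -a-> 1, 1 -b-> 2, 1 -c-> 2, 2 -d-> 3.
original = {(0, "a"): (2, 1), (1, "b"): (3, 2), (1, "c"): (7, 2), (2, "d"): (0, 3)}

for k in (1, 2, 3):
    pushed = shift(original, 1, k)
    # Every path through state 1 gains k on the way in and loses k on the way out.
    assert score(pushed, "abd") == 5 and score(pushed, "acd") == 9
```

With k = 3 the smallest out-arc weight reaches 0, so nothing more can be shifted without making a weight negative.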
Shifting Outputs Along Paths
• State sucks back a prefix from its out-arcs and deposits it at the end of its in-arcs.
[Figure: before: a:w, e:u, b:wx, c:wz, d:ε. After: a:ww, e:uw, b:x, c:z, d:ε.]
• Function unchanged: abd → wwx, acd → wwz, …ebd → uwx, …ecd → uwz

Shifting Outputs Along Paths
[Figure: the same machine with a self-loop b:wx on the state. After pushback the self-loop becomes b:xw, since the loop is both an out-arc and an in-arc of the state.]
• Before: …ab^n bd → u(wx)^n wx, …ab^n cd → u(wx)^n wz
• After: …ab^n bd → uw(xw)^n x, …ab^n cd → uw(xw)^n z
• These are the same strings, because (wx)^n w = w(xw)^n.
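The pushback step with a self-loop can be sketched as follows for the (strings, concatenation) algebra. Arcs are keyed by (source, symbol, target) so that the loop and the forward arc can both carry the symbol b as on the slide; that keying, the state numbers, and the name `push_back` are assumptions for illustration. The check follows the path that enters through e:u.

```python
from os.path import commonprefix  # character-wise longest common prefix of strings


def push_back(arcs, state):
    """Suck the common prefix off the out-arcs of `state`, deposit it on its in-arcs.

    arcs maps (source, symbol, target) -> output string.
    """
    prefix = commonprefix([out for (src, _, _), out in arcs.items() if src == state])
    pushed = {}
    for (src, sym, dst), out in arcs.items():
        if src == state:
            out = out[len(prefix):]        # strip the prefix from the front
        if dst == state:
            out = out + prefix             # append the prefix at the end
        pushed[src, sym, dst] = out
    return pushed


def output(arcs, path):
    """Concatenate the outputs along a path given as a list of arc keys."""
    return "".join(arcs[key] for key in path)


before = {
    (0, "a", 1): "w", (0, "e", 1): "u",
    (1, "b", 1): "wx",                     # self-loop
    (1, "b", 2): "wx", (1, "c", 2): "wz",
    (2, "d", 3): "",
}
after = push_back(before, 1)

for n in range(4):
    path = [(0, "e", 1)] + [(1, "b", 1)] * n + [(1, "b", 2), (2, "d", 3)]
    assert output(before, path) == output(after, path) == "u" + "wx" * n + "wx"
```

The self-loop is handled by the same two rules as every other arc: it loses w at the front and gains w at the end, giving xw.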
Shifting Outputs Along Paths (Mohri)
• Here, not all the out-arcs start with w
• But all the out-paths start with w
• Do pushback at later states first: now we’re ok!
[Figure: arc labels at three stages.
  Initial: a:w, e:u, b:wx, d:ε, c:ε, ε:ε, d:ε, b:wz
  After pushback at the later state: a:w, e:u, b:wx, d:ε, c:w, ε:ε, d:w, b:zw
  After pushback at the earlier state: a:ww, e:uw, b:x, d:ε, c:ε, ε:ε, d:ε, b:zw]