Easy and Hard Constraint Ranking in OT
Jason Eisner
U. of Rochester
August 6, 2000 – SIGPHON - Luxembourg

Outline
• The Constraint Ranking problem
• Making fast ranking faster
• Extension: Considering all competitors
• How hard is OT generation?
• Making slow ranking slower

The Ranking Problem
[Diagram: finite positive data (m items) and n constraints (C1 … C5) enter the Constraint Ranker, which outputs a ranking such as <C3, C1, C2, C5, C4> or “fail”]
• Find grammar consistent with data (or just determine whether one exists)
• How efficient can this be?
• Different from Gold learnability
• Proposed by Tesar & Smolensky

What Is Each Input Datum?
Possibilities from Tesar & Smolensky:
• A pairwise ranking g > h
• An attested form g
• An attested set G
  • 1 grammatical element - learner doesn’t know which!
  • Captures uncertainty about the representation or underlying form of the speaker’s utterance
  • Today we’ll assume learner does know underlying form
  • Example: gazebo { ga(zé.bo), (ga.zé)bo }

Key Results
• A pairwise ranking g > h: linear time in n
• An attested form g: coNP-hard
• An attested set G: Σ2-complete
• The hardness results hold even with m=1
Pairwise Rankings: g > h
[Tableau: violation marks of g and h under C1 … C5, constraints not ranked yet. C1 and C2 favor h; C4 and C5 favor g.]
• Must eliminate h before C1 or C2 makes it win:
    C4 or C5 » C1
    C4 or C5 » C2

More Pairwise Rankings …
• Evidence from more pairs:
    g > h:      C4 or C5 » C1;  C4 or C5 » C2
    g’ > h’:    C2 » C1;  C2 » C3;  C2 » C4
    g’’ > h’’:  C1 or C3 or C5 » C2;  C1 or C3 or C5 » C4
• Satisfying these is necessary and sufficient
• We’ll now use Recursive Constraint Demotion (RCD) (Tesar & Smolensky: easy greedy algorithm)

[Figure sequence: the conditions above drawn over nodes 5, 2, 4, 3, 1]
• C5 needn’t be dominated by anyone, so it is ranked first: 5 » …
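The conditions listed for each pair can be read off mechanically from the two violation profiles: some constraint favoring the winner must outrank every constraint favoring the loser. The sketch below illustrates this under stated assumptions. The function name `conditions_for_pair`, the `Condition` type, and the violation counts for `g` and `h` are all hypothetical (the original tableau marks are not recoverable); the counts are chosen only so that C1 and C2 favor h while C4 and C5 favor g.

```python
from typing import Dict, FrozenSet, List, Tuple

# A condition (lhs, rhs) reads: "some constraint in lhs must outrank rhs".
Condition = Tuple[FrozenSet[str], str]


def conditions_for_pair(winner: Dict[str, int], loser: Dict[str, int]) -> List[Condition]:
    """Derive the ranking conditions implied by one pair winner > loser."""
    names = sorted(set(winner) | set(loser))
    # Constraints that the winner violates less often than the loser.
    favor_winner = frozenset(c for c in names if winner.get(c, 0) < loser.get(c, 0))
    # Constraints that the winner violates more often than the loser.
    favor_loser = [c for c in names if winner.get(c, 0) > loser.get(c, 0)]
    # Each loser-favoring constraint must be dominated by a winner-favoring one.
    return [(favor_winner, c) for c in favor_loser]


# Hypothetical violation counts (assumption, not taken from the slides).
g = {"C1": 1, "C2": 2, "C3": 1, "C4": 0, "C5": 0}
h = {"C1": 0, "C2": 0, "C3": 1, "C4": 1, "C5": 1}
conds = conditions_for_pair(g, h)  # encodes "C4 or C5 » C1" and "C4 or C5 » C2"
```

A pair in which no constraint favors the winner yields conditions with an empty left-hand side, which no ranking can satisfy.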
Recursive Constraint Demotion
[Figure: RCD run on the conditions from the pairs g > h, g’ > h’, g’’ > h’’; ranking built so far 5 » 2 » 4, with 3 and 1 remaining]
• How to find undominated constraint at each step?
• T&S simply search: O(mn) per search ⇒ O(mn²)
• But we can do better:
  • Abstraction: Topological sort of a hypergraph
  • Ordinary topological sort is linear-time; same here!

[Figure: representation of hypergraph; n = nodes, M = edges ≤ mn; labels: maintain count of parents, shrink]
• Delete that structure in time proportional to its size
• Maintain list of red nodes: find next in time O(1)
• Total time: O(M+n), down from O(Mn)

Comparison: Constraint Demotion
• Tesar & Smolensky 1996
• Formerly same speed, but now RCD is faster
• Advantage: CD maintains a full ranking at all times
  • Can be run online (memoryless)
  • This eventually converges; but not a conservative strategy
  • Current grammar is often inconsistent with past data
• To make it conservative:
  • On each new datum, rerank from scratch using all data (memorized)
  • Might as well use faster RCD for this
  • Modifying the previous ranking is no faster, in worst case
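The hypergraph topological sort described on the RCD slides can be sketched as follows. This is a minimal illustration, not the author's implementation: the names `rank_constraints`, `pending`, and `triggers` are assumptions, and a condition is encoded as a pair (set of constraints, one of which must dominate; dominated constraint). Each constraint keeps a count of its unsatisfied conditions, a queue holds the constraints whose count has reached zero, and each condition is deleted once, so the work is proportional to the total size of the conditions plus n.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Sequence, Tuple

# (lhs, rhs) reads: "some constraint in lhs must outrank rhs".
Condition = Tuple[FrozenSet[str], str]


def rank_constraints(constraints: Sequence[str],
                     conditions: Sequence[Condition]) -> Optional[List[str]]:
    """Return a ranking satisfying all conditions, or None if none exists."""
    pending: Dict[str, int] = {c: 0 for c in constraints}        # unsatisfied conditions on c
    triggers: Dict[str, List[int]] = {c: [] for c in constraints}  # conditions c would satisfy
    satisfied = [False] * len(conditions)
    for i, (lhs, rhs) in enumerate(conditions):
        pending[rhs] += 1
        for c in lhs:
            triggers[c].append(i)

    # Constraints with no unsatisfied condition are undominated and can be ranked next.
    ready = deque(c for c in constraints if pending[c] == 0)
    ranking: List[str] = []
    while ready:
        c = ready.popleft()            # O(1) to find the next constraint
        ranking.append(c)
        for i in triggers[c]:          # delete every condition that c satisfies
            if not satisfied[i]:
                satisfied[i] = True
                rhs = conditions[i][1]
                pending[rhs] -= 1
                if pending[rhs] == 0:
                    ready.append(rhs)
    return ranking if len(ranking) == len(constraints) else None


fs = frozenset
slide_conditions = [
    (fs({"C4", "C5"}), "C1"), (fs({"C4", "C5"}), "C2"),
    (fs({"C2"}), "C1"), (fs({"C2"}), "C3"), (fs({"C2"}), "C4"),
    (fs({"C1", "C3", "C5"}), "C2"), (fs({"C1", "C3", "C5"}), "C4"),
]
ranking = rank_constraints(["C1", "C2", "C3", "C4", "C5"], slide_conditions)
```

On the conditions from the slides, C5 is ranked first and C2 second; the order of the remaining three constraints is arbitrary, so it may differ from the order drawn in the figures.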
New Problem
• Observed data: g, g’, …
• Must beat or tie all competitors
  • (Not enough to ensure g > h, g’ > h’ …)
• Just use RCD?
  • Try to divide g’s competitors h into equiv. classes
  • But can get exponentially many classes
  • Hence exponentially many blue nodes

But Greedy Algorithm Still Works
• Preserves spirit of RCD
• Greedily extend grammar 1 constraint at a time
• No compilation into hypergraph
[Figure: chosen so far 5 » 2; remaining 1, 3, 4. Check these partial grammars: 5 » 2 » 1, 5 » 2 » 3, 5 » 2 » 4; pick one making g, g’, … optimal (maybe with ties to be broken later)]
• But must run OT generation mn² times
  • To pick each of n constraints, check m forms under n grammars
  • We’ll see that this is hard …
• T&S’s solution also runs OT generation mn² times
  • Error-Driven Constraint Demotion
  • For n² CD passes, for m forms, find (profile of) optimal competitor
  • That requires more info from generation - we’ll return to this!

Continuous Algorithms
• Simulated annealing
  • Boersma 1997: Gradual Learning Algorithm
  • Constraint ranking is stochastic, with real-valued bias & variance
• Maximum likelihood
  • Johnson 2000: Generalized Iterative Scaling (maxent)
  • Constraint weights instead of strict ranking
• Deal with noise and free variation!
• How many iterations to convergence?

Complexity Classes: Boolean
• P
• NP: ∃x Ψ(x)
• coNP: ∀x Ψ(x)
• D^p
• Δ2 = P^NP: polytime w/ oracle (NP subroutines run in unit time)
• Σ2 = NP^NP: ∃x ∀y Ψ(x,y)
• …
• X-hard ≥ X-complete = hardest in X
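The greedy procedure from the "But Greedy Algorithm Still Works" slides can be sketched as below. This is only an illustration under a strong simplifying assumption: each datum comes with a small, explicitly listed candidate set, so "OT generation" is a brute-force minimum over that set, whereas the talk concerns generation over candidate sets defined by finite-state constraints. The names `greedy_rank` and `is_optimal` and the candidates x, y, z with their violation counts are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Profile = Dict[str, int]                    # constraint name -> number of violations
Datum = Tuple[str, Dict[str, Profile]]      # (attested form, all candidates incl. that form)


def is_optimal(form: str, candidates: Dict[str, Profile], grammar: Sequence[str]) -> bool:
    """Brute-force generation: does `form` beat or tie every competitor?"""
    def key(cand: str) -> List[int]:
        return [candidates[cand][c] for c in grammar]   # compared lexicographically
    return key(form) == min(key(cand) for cand in candidates)


def greedy_rank(constraints: Sequence[str], data: Sequence[Datum]) -> Optional[List[str]]:
    """Extend the grammar one constraint at a time, keeping every attested form optimal."""
    chosen: List[str] = []
    remaining = list(constraints)
    while remaining:
        for c in remaining:
            trial = chosen + [c]
            # One generation call per form per trial grammar: at most m * n * n calls overall.
            if all(is_optimal(form, cands, trial) for form, cands in data):
                chosen = trial
                remaining.remove(c)
                break
        else:
            return None          # no remaining constraint can be ranked next
    return chosen


# Hypothetical candidate set (assumption, not from the slides).
cands = {
    "x": {"A": 0, "B": 1, "C": 0},
    "y": {"A": 1, "B": 0, "C": 0},
    "z": {"A": 0, "B": 1, "C": 1},
}
result = greedy_rank(["B", "A", "C"], [("x", cands)])
```

Greedy choice is safe here because ranking a constraint only shrinks the set of competitors still tied with the attested form, so a constraint that could be ranked next stays rankable later.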
Complexity Classes: Integer
• Integer-valued functions have classes too
  • FP (like P): Turing-machine polytime
  • OptP (like NP, ∃x Ψ(x)): min f(x)
  • FP^NP (like P^NP = Δ2)
• Note: OptP-complete ⇒ FP^NP-complete
• Can ask Boolean questions about output of an OptP-complete function; often yields complete decision problems

OptP-complete Functions
• Traveling Salesperson
  • Minimum cost for touring a graph?
• Minimum Satisfying Assignment
  • Minimum bitstring b1 b2 … bn satisfying φ(b1, b2, … bn), a Boolean formula?
• Optimal violation profile in OT!
  • Given underlying form
  • Given grammar of bounded finite-state constraints
  • Clearly in OptP: min f(x) where f computes violation profile
  • As hard as Minimum Satisfying Assignment

Hardness Proof
• Given formula φ(b1, b2, … bn)
• Need minimum satisfier b1 b2 … bn (or 11…1 if unsat)
• Reduce to finding minimum violation profile
• Let OT candidates be bitstrings b1 b2 … bn
• Let constraint C(φ) be satisfied if φ(b1, b2, … bn)

         C(φ)   C(¬b1)  C(¬b2)  C(¬b3)
  000             0       0       0
  001             0       0       1
  010             0       1       0
  …               …       …       …
  (only satisfiers survive past C(φ))

Subtlety in the Proof
• Turning φ into a DFA for C(φ) might blow it up exponentially - so not poly reduction!
• Luckily, we’re allowed to assume φ is in CNF: φ = D1 ^ D2 ^ … Dm
• In the tableau, C(D1) … C(Dm) are equivalent to C(φ); only satisfiers survive past them

Another Subtlety
• Must ensure that if there is no satisfying assignment, 11…1 wins
• Modify each C(Di) so that 11…1 satisfies it
• At worst, this doubles the size of the DFA

Associated Decision Problems
  OptVal                     FP^NP-complete    (EDCD)
  OptVal < k?                NP-complete
  OptVal = k?                D^p-complete
  Last bit of OptVal?        Δ2-complete
  Is g optimal?              coNP-complete     (RCD with mult. competitors)
  Is some g ∈ G optimal?     Δ2-complete
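The reduction in the hardness proof can be checked on small formulas with a brute-force sketch. The assumptions here are loud ones: candidates are enumerated explicitly (exponential in n, for illustration only, whereas the proof uses finite-state constraints), clauses are encoded as lists of signed 1-based integers, and the names `clause_violation`, `violation_profile`, and `optimal_candidate` are hypothetical. The ranking used is C(D1) … C(Dm) » C(¬b1) » … » C(¬bn), with each C(Di) modified so that 11…1 satisfies it.

```python
from itertools import product
from typing import Sequence, Tuple

Clause = Sequence[int]   # literal k > 0 means b_k, k < 0 means (not b_k); 1-based


def clause_violation(clause: Clause, bits: Tuple[int, ...]) -> int:
    """Modified C(D_i): 1 if the clause is false, except that 11...1 always satisfies it."""
    if all(bits):
        return 0
    satisfied = any(bits[abs(lit) - 1] == (1 if lit > 0 else 0) for lit in clause)
    return 0 if satisfied else 1


def violation_profile(cnf: Sequence[Clause], bits: Tuple[int, ...]) -> Tuple[int, ...]:
    """Violations under C(D_1) ... C(D_m) >> C(not b_1) >> ... >> C(not b_n)."""
    # C(not b_i) is violated exactly when b_i = 1, so the bits are their own marks.
    return tuple(clause_violation(d, bits) for d in cnf) + tuple(bits)


def optimal_candidate(cnf: Sequence[Clause], n: int) -> Tuple[int, ...]:
    """Brute-force OT generation over all bitstrings of length n."""
    return min(product((0, 1), repeat=n), key=lambda bits: violation_profile(cnf, bits))


# (b1 or b2) and (not b1 or b3): satisfiers are 010, 011, 101, 111.
winner = optimal_candidate([[1, 2], [-1, 3]], 3)
# b1 and (not b1): unsatisfiable, so 11 should win.
fallback = optimal_candidate([[1], [-1]], 2)
```

Because profiles are compared lexicographically, the winning candidate is the minimum satisfying bitstring when one exists and 11…1 otherwise, which is the quantity the proof needs.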