Semantic Parsing via Paraphrasing
Mateusz Malinowski
Based on: J. Berant and P. Liang, “Semantic Parsing via Paraphrasing”, ACL 2014
Outline
• Abstract view on the semantic parser
• Semantic Parsing via Paraphrasing
  ‣ Input: “What party did Clay establish?”
  ‣ Paraphrase model scores canonical utterances paired with logical forms:
    “What political party founded by Henry Clay?” with Type.PoliticalParty ⊓ Founder.HenryClay
    ...
    “What event involved the people Henry Clay?” with Type.Event ⊓ Involved.HenryClay
  ‣ Answer: Whig Party
• Grounding and question-answering based on real-world images
  ‣ “monitor to the left of the mugs”: λx. ∃y. monitor(x) ∧ left-rel(x, y) ∧ mug(y)
  ‣ “mug to the left of the other mug”: λx. ∃y. mug(x) ∧ left-rel(x, y) ∧ mug(y)
  ‣ “objects on the table”: λx. ∃y. object(x) ∧ on-rel(x, y) ∧ table(y)
  ‣ “two blue cups are placed near to the computer screen”: λx. blue(x) ∧ cup(x) ∧ comp.(x) ∧ screen(x)
M. Malinowski | NLP Reading Group
Natural Language Understanding
• Transform the textual input into a logical representation
• The logical representation can be executed to return the answer from the database
• Three major components of the semantic parser
  ‣ Over-approximate the meaning (set of logical forms)
  ‣ Learning-based approach to steer away from bad derivations
  ‣ Compositionality principle to learn ‘more from less’
• Example: “What are the objects that surround the sofa?”
  answer(X, ( object(X), close(X,Y), sofa(Y) )).
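To make "executing a logical representation against a database" concrete, here is a minimal Python sketch that evaluates the example query above over a toy scene. The facts and the names `FACTS` and `answer_objects_close_to_sofa` are assumptions made for illustration; they do not come from the slides.

```python
# Toy scene database (hypothetical facts, invented for this illustration).
FACTS = {
    "object": {("table1",), ("lamp1",), ("sofa1",)},
    "sofa": {("sofa1",)},
    "close": {("table1", "sofa1"), ("lamp1", "sofa1")},
}


def answer_objects_close_to_sofa(facts):
    """Evaluate answer(X, (object(X), close(X, Y), sofa(Y))) by brute force."""
    results = set()
    for (x,) in facts["object"]:
        for (y,) in facts["sofa"]:
            # X is returned if some sofa Y satisfies close(X, Y).
            if (x, y) in facts["close"]:
                results.add(x)
    return sorted(results)


result = answer_objects_close_to_sofa(FACTS)
print(result)
```

A real system would hand the logical form to a query engine; the nested loops only show what the conjunction with an existentially bound Y means.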
Sempre
• Freebase as a graph: 41M entities (nodes), 19K properties (edge labels), 596M assertions (edges)
[Figure: Freebase fragment around BarackObama, with nodes such as MichelleObama, Honolulu, Hawaii, Chicago, UnitedStates, Event3, Event8, Event21 and edge labels such as Type, Gender, Spouse, StartDate, PlacesLived, Location, ContainedBy, PlaceOfBirth, DateOfBirth, Profession, Marriage]
• “Which college did Obama go to?”
  ‣ Derivation steps: alignment, alignment, bridging
  ‣ Result: Type.University ⊓ Education.Institution.BarackObama
  ‣ Bridging form: z_1 ⊓ b.z_2, where z_1 ∈ t_1, z_2 ∈ t_2, b ∈ (t_1, t_2)
• “Who did Madonna marry in 2000”
  ‣ Derivation steps: alignment, join, bridging
  ‣ Join gives Marriage.Spouse.Madonna; bridging with Marriage.StartDate gives Marriage.(Spouse.Madonna ⊓ StartDate.2000)
  ‣ Bridging form: p_1.(p_2.z′ ⊓ b.z), where p_2 ∈ (t_1, ∗), z ∈ t, b ∈ (t_1, t)
J. Berant et al., “Semantic Parsing on Freebase from Question-Answer Pairs”, EMNLP 2013
One derivation
• Utterance: “what city was Obama born?”
• Alignment: city → Type.CityTown, Obama → BarackObama, born → PeopleBornHere
• Join: PeopleBornHere.BarackObama
• Intersect: Type.City ⊓ PeopleBornHere.BarackObama
[Figure: the same Freebase fragment around BarackObama; 41M entities (nodes), 19K properties (edge labels), 596M assertions (edges)]
J. Berant et al., “Semantic Parsing on Freebase from Question-Answer Pairs”, EMNLP 2013
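The join and intersect steps of this derivation can be sketched over a toy triple store. The triples below are illustrative assumptions (not real Freebase data), and `join` and `intersect` are hypothetical helper names, not Sempre's API.

```python
# Toy knowledge base as (subject, property, object) triples.
# The facts are illustrative only; they are not taken from Freebase.
TRIPLES = {
    ("Honolulu", "Type", "City"),
    ("Chicago", "Type", "City"),
    ("Hawaii", "Type", "USState"),
    ("Honolulu", "PeopleBornHere", "BarackObama"),
    ("Hawaii", "PeopleBornHere", "BarackObama"),
}


def join(prop, objects, triples=TRIPLES):
    """Denotation of p.z: subjects linked by `prop` to some member of `objects`."""
    return {s for (s, p, o) in triples if p == prop and o in objects}


def intersect(left, right):
    """Denotation of z1 ⊓ z2: entities in both sets."""
    return left & right


cities = join("Type", {"City"})                  # Type.City
born = join("PeopleBornHere", {"BarackObama"})   # PeopleBornHere.BarackObama
denotation = intersect(cities, born)             # Type.City ⊓ PeopleBornHere.BarackObama
print(denotation)
```

In this toy store the type restriction removes Hawaii from the join result, which is the role the intersect step plays in the derivation.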
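Main components
[Diagram: Grammar (productions, lexicon) and Semantics (logical forms, e.g. Prolog, SparQL, Sql) give [(syntax_i, semantics_i)]_i; Learning (program induction); Ontology (denotations, database)]

Training on utterance and logical form pairs (x, y):

$$\max_{\theta \in \mathbb{R}^d} \prod_{(x,y) \in D} \sum_{y' \in \mathrm{GEN}(x)} p(y \mid y')\, p(y' \mid x; \theta)$$

Training on utterance and denotation pairs (x, d):

$$\max_{\theta \in \mathbb{R}^d} \prod_{(x,d) \in D} p(d \mid x; \theta) = \max_{\theta \in \mathbb{R}^d} \prod_{(x,d) \in D} \sum_{y' \in \mathrm{GEN}(x)} p(d \mid [\![ y' ]\!])\, p(y' \mid x; \theta)$$

where $\mathrm{GEN}(x) \subseteq Y$ and

$$p(y' \mid x; \theta) = \frac{\exp\{\phi(y', x)^{\top} \theta\}}{\sum_{y \in \mathrm{GEN}(x)} \exp\{\phi(y, x)^{\top} \theta\}}$$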
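The log-linear model and the marginal likelihood over candidate logical forms can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the feature vectors and denotations are made up, and p(d | ⟦y'⟧) is taken to be 1 when the candidate's denotation equals d and 0 otherwise.

```python
import math


def log_linear_probs(features, theta):
    """p(y' | x; theta) = exp(phi(y', x) . theta) / sum over GEN(x)."""
    scores = [sum(f * t for f, t in zip(phi, theta)) for phi in features]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def marginal_likelihood(probs, denotations, gold):
    """p(d | x; theta), assuming p(d | [[y']]) is an exact-match indicator."""
    return sum(p for p, d in zip(probs, denotations) if d == gold)


# Three hypothetical candidates in GEN(x) with made-up feature vectors.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
denotations = ["Honolulu", "Chicago", "Honolulu"]
theta = [0.0, 0.0]

probs = log_linear_probs(features, theta)
likelihood = marginal_likelihood(probs, denotations, "Honolulu")
print(probs, likelihood)
```

With theta at zero every candidate is equally likely, so the likelihood is just the fraction of candidates whose denotation is correct; training moves theta to put more mass on those candidates.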
From grammar to program induction
• Two views of the same parser: mental construct and implementation
• Lexicon: words -> (syntax, semantics)
  ‣ one: [(N,1)]
  ‣ two: [(N,2)]
  ‣ plus: [(R,+)]
  ‣ minus: [(R,-), (U,~)]
• Lexicon can be strong or crude
  ‣ one: [(N,1), (N,2), …]
  ‣ plus: [(R,+), (R,-), (U,!)]
  ‣ minus: [(R,-), (U,~)]
• Also semantic combinators, such as backward and forward application
  ‣ N -> B N : forward
  ‣ N -> U N : forward
  ‣ B -> N R : backward
• Productions and semantic application

| Syntax | Semantics |
| N → one | 1 |
| N → two | 2 |
| R → plus | + |
| R → minus | − |
| R → times | × |
| S → minus | ¬ |
| N → S N | ⌜S⌝⌜N⌝ |
| N → N_L R N_R | (⌜R⌝ ⌜N_L⌝ ⌜N_R⌝) |

• grammar + semantics: [(syntax_i, semantics_i)]_i, leading to program induction
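The arithmetic grammar on this slide is small enough to run. The sketch below is an illustrative assumption about how the lexicon and the two composition rules could be implemented; it uses the category U for the unary operator (the slide's table writes S), and `derive` and `parse` are hypothetical names.

```python
import operator

# Lexicon: word -> list of (syntactic category, semantics).
LEXICON = {
    "one": [("N", 1)],
    "two": [("N", 2)],
    "plus": [("R", operator.add)],
    "minus": [("R", operator.sub), ("U", operator.neg)],
    "times": [("R", operator.mul)],
}


def derive(tokens):
    """Return every (category, semantics) pair derivable for the token span."""
    results = []
    if len(tokens) == 1:
        results.extend(LEXICON.get(tokens[0], []))
    # N -> U N: apply the unary operator to the value of the rest.
    if len(tokens) >= 2:
        for cat_u, sem_u in LEXICON.get(tokens[0], []):
            if cat_u != "U":
                continue
            for cat_n, sem_n in derive(tokens[1:]):
                if cat_n == "N":
                    results.append(("N", sem_u(sem_n)))
    # N -> N_L R N_R: apply the binary operator to both sides.
    for i in range(1, len(tokens) - 1):
        for cat_r, sem_r in LEXICON.get(tokens[i], []):
            if cat_r != "R":
                continue
            for cat_l, sem_l in derive(tokens[:i]):
                if cat_l != "N":
                    continue
                for cat_rt, sem_rt in derive(tokens[i + 1:]):
                    if cat_rt == "N":
                        results.append(("N", sem_r(sem_l, sem_rt)))
    return results


def parse(tokens):
    """Values of all derivations of the full utterance as category N."""
    return [sem for cat, sem in derive(tuple(tokens)) if cat == "N"]


unambiguous = parse("two minus one".split())
ambiguous = parse("minus one plus two".split())
print(unambiguous, ambiguous)
```

The second utterance has two derivations, -(1 + 2) and (-1) + 2. That over-generation is what the learning component is meant to resolve, and a cruder lexicon would widen the candidate set further.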
Outline
• Abstract view on the semantic parser
• Semantic Parsing via Paraphrasing
• Grounding and question-answering based on real-world images
Challenges
• “Myriad ways in which knowledge base predicates can be expressed” [1]
  ‣ “What does X do for a living?”
  ‣ “What is X’s profession?”
• Ontological mismatch problem
  ‣ “The choice of ontology significantly impacts learning” [2]
  ‣ Example:
    Q1: What is the population of Seattle?
    Q2: How many people live in Seattle?
    MR1: λx.population(Seattle, x)
    MR2: count(λx.person(x) ∧ live(x, Seattle))
• Missing coverage
  ‣ “out of 500,000 relations extracted by the ReVerb Open IE system … only about 10,000 can be aligned to Freebase” [1]
[1] J. Berant et al., “Semantic Parsing via Paraphrasing”, ACL 2014
[2] T. Kwiatkowski et al., “Scaling Semantic Parsers with On-the-fly Ontology Matching”, EMNLP 2013
Overview of the model
• Direct (traditional): utterance → logical form
• Ontology matching (Kwiatkowski et al. 2013): utterance → underspecified logical form → logical form
• Paraphrase (this work): utterance → canonical utterance → logical form
• Handling mismatch via paraphrase model
  ‣ Association
  ‣ Vector space
Canonical utterance construction
Pipeline: utterance x → [1] manageable set of logical forms Z_x → [2] set of natural language canonical utterances C_z for every z ∈ Z_x → [3] paraphrase model

[1] Already shown on page 4 (the Sempre slide). Logical form templates:

| # | Template | Example | Question |
| 1 | p.e | Directed.TopGun | Who directed Top Gun? |
| 2 | p_1.p_2.e | Employment.EmployerOf.SteveBalmer | Where does Steve Balmer work? |
| 3 | p.(p_1.e_1 ⊓ p_2.e_2) | Character.(Actor.BradPitt ⊓ Film.Troy) | Who did Brad Pitt play in Troy? |
| 4 | Type.t ⊓ z | Type.Composer ⊓ SpeakerOf.French | What composers spoke French? |
| 5 | count(z) | count(BoatDesigner.NatHerreshoff) | How many ships were designed by Nat Herreshoff? |

Assumption about limited compositionality seems to be crucial.

[2] Mapping utterances to logical forms is hard, but generating natural language canonical utterances is not.

| Template | d(p) Categ. | Rule | Example |
| p.e | NP | WH d(t) has d(e) as NP ? | What election contest has George Bush as winner? |
| p.e | VP | WH d(t) (AUX) VP d(e) ? | What radio station serves area New-York? |
| p.e | PP | WH d(t) PP d(e) ? | What beer from region Argentina? |
| p.e | NP VP | WH d(t) VP the NP d(e) ? | What mass transportation system served the area Berlin? |
| R(p).e | NP | WH d(t) is the NP of d(e) ? | What location is the place of birth of Elvis Presley? |
| R(p).e | VP | WH d(t) AUX d(e) VP ? | What film is Brazil featured in? |
| R(p).e | PP | WH d(t) d(e) PP ? | What destination Spanish steps near travel destination? |
| R(p).e | NP VP | WH NP is VP by d(e) ? | What structure is designed by Herod? |

d(t), d(e) and d(p) are Freebase descriptions for ‘type’, ‘entity’ and ‘property’. The rules for the remaining templates are omitted.

[3] The problem of mapping to the ontology is reduced to scoring pairs (c, z) based on the paraphrase model.

Without canonical utterances:

$$\max_{\theta \in \mathbb{R}^d} \prod_{(x,d) \in D} p(d \mid x; \theta) = \max_{\theta \in \mathbb{R}^d} \prod_{(x,d) \in D} \sum_{z \in Z_x} p(d \mid [\![ z ]\!])\, p(z \mid x; \theta), \qquad p(z \mid x; \theta) = \frac{\exp\{\phi(z, x)^{\top} \theta\}}{\sum_{z' \in Z_x} \exp\{\phi(z', x)^{\top} \theta\}}$$

With canonical utterances:

$$\max_{\theta \in \mathbb{R}^d} \prod_{(x,d) \in D} \sum_{z \in Z_x} \sum_{c \in C_z} p(d \mid [\![ z ]\!])\, p(c, z \mid x; \theta), \qquad p_{\theta}(c, z \mid x) = \frac{\exp\{\phi(x, c, z)^{\top} \theta\}}{\sum_{z' \in Z_x,\, c' \in C_{z'}} \exp\{\phi(x, c', z')^{\top} \theta\}}$$
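The two steps on this slide, generating canonical utterances from Freebase descriptions and scoring (c, z) pairs, can be sketched as follows. The generation function covers only the two NP rules from the table. The paraphrase feature is a toy word-overlap score chosen for illustration; it is an assumption, not the association or vector space model of the paper. The identifiers `z_birth` and `z_winner` are hypothetical stand-ins for logical forms.

```python
import math
import re


def np_rule(d_t, d_p, d_e, reversed_property=False):
    """Canonical utterance for a property whose description d(p) is a noun phrase."""
    if reversed_property:
        # R(p).e : WH d(t) is the NP of d(e) ?
        return f"What {d_t} is the {d_p} of {d_e}?"
    # p.e : WH d(t) has d(e) as NP ?
    return f"What {d_t} has {d_e} as {d_p}?"


def tokens(text):
    """Lowercased word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap(x, c):
    """Toy paraphrase feature (an assumption): Jaccard overlap of word sets."""
    xs, cs = set(tokens(x)), set(tokens(c))
    return len(xs & cs) / len(xs | cs)


def pair_distribution(x, pairs, theta=5.0):
    """p(c, z | x) proportional to exp(theta * phi(x, c, z)) over all pairs."""
    scores = [theta * overlap(x, c) for c, _ in pairs]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Candidate (canonical utterance, logical form) pairs for one input utterance.
pairs = [
    (np_rule("location", "place of birth", "Elvis Presley", reversed_property=True), "z_birth"),
    (np_rule("election contest", "winner", "George Bush"), "z_winner"),
]
x = "Where was Elvis Presley born?"
probs = pair_distribution(x, pairs)
best_c, best_z = pairs[probs.index(max(probs))]
print(best_c, best_z)
```

The normalization runs over every pair of every candidate logical form, which is why the denominator above sums over both z' and c'.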
Overview of the model
• Direct (traditional): utterance → logical form
• Ontology matching (Kwiatkowski et al. 2013): utterance → underspecified logical form → logical form
• Paraphrase (this work): utterance → canonical utterance → logical form
• Handling mismatch via paraphrase model
  ‣ Association
  ‣ Vector space