Semantic Knowledge for Textual Entailment
Bernardo Magnini
Joint work with: Elena Cabrio, Milen Kouylekov, Matteo Negri
FBK-Irst, Trento, Italy
NSF Symposium on Semantic Knowledge Discovery, Organization and Use
November 14-15, 2008, New York University

Outline
• Textual Entailment
• Applied TE: a TE engine for Question Answering
• Open issue: interactions and dependencies of the linguistic phenomena with respect to entailment
• Proposal: a general framework, flexible enough to allow the combination of specialized entailment engines
Typical Application Inference: Entailment
Question: Who bought Overture?
Expected answer form: X bought Overture
Text: Overture's acquisition by Yahoo
Hypothesized answer: Yahoo bought Overture
The text entails the hypothesized answer.
• Similar for IE: X acquire Y
• Similar for "semantic" IR: t: Overture was bought for ...
• Summarization (multi-document): identify redundant info
• MT evaluation (and recent ideas for MT)
• Educational applications
TE tutorial at ACL 2007, Dagan, Roth, Zanzotto

"Almost certain" Entailments
t: The technological triumph known as GPS ... was incubated in the mind of Ivan Getting.
h: Ivan Getting invented the GPS.
TE tutorial at ACL 2007, Dagan, Roth, Zanzotto
Applied Textual Entailment
• A directional relation between two text fragments, Text (t) and Hypothesis (h):
  t entails h (t ⇒ h) if humans reading t will infer that h is most likely true
• Operational (applied) definition:
  – Human gold standard, as in NLP applications
  – Assuming common background knowledge, which is indeed expected from applications
TE tutorial at ACL 2007, Dagan, Roth, Zanzotto

Probabilistic Interpretation
Definition:
• t probabilistically entails h if P(h is true | t) > P(h is true)
  – t increases the likelihood of h being true
  – equivalently, positive PMI: t provides information on h's truth
• P(h is true | t) is the entailment confidence
  – the relevant entailment score for applications
  – in practice, "most likely" entailment is expected
TE tutorial at ACL 2007, Dagan, Roth, Zanzotto
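The probabilistic criterion can be made concrete with a small numeric check. The sketch below (not part of the original slides) tests P(h is true | t) > P(h is true) and the corresponding PMI; all counts and the corpus size are invented purely for illustration.

# Minimal sketch of the probabilistic criterion P(h|t) > P(h), which is
# equivalent to a positive pointwise mutual information between t and h.
# All counts below are hypothetical and only serve the illustration.
import math

def probabilistically_entails(count_t, count_h, count_t_and_h, n_docs):
    p_h = count_h / n_docs                 # prior probability that h is true
    p_h_given_t = count_t_and_h / count_t  # probability of h once t is observed
    pmi = math.log2(p_h_given_t / p_h) if count_t_and_h else float("-inf")
    return p_h_given_t > p_h, p_h_given_t, pmi

decision, confidence, pmi = probabilistically_entails(
    count_t=120, count_h=300, count_t_and_h=90, n_docs=10_000)
print(decision, round(confidence, 3), round(pmi, 2))   # True 0.75 4.64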
The Role of Knowledge
• For textual entailment to hold we require:
  – text AND knowledge ⇒ h
  – but knowledge alone should not entail h
• Systems are not supposed to validate h's truth regardless of t (e.g. by searching for h on the web)
t: The technological triumph known as GPS ... was incubated in the mind of Ivan Getting.
h: Ivan Getting invented the GPS.
TE tutorial at ACL 2007, Dagan, Roth, Zanzotto

Entailment-Based Approach in Qallme
Ontology:
HasDirector[MOVIE, PERSON]
HasAddress[CINEMA, ADDRESS]
Entailment Engine for QA
Ontology:
HasDirector[MOVIE, PERSON]
HasAddress[CINEMA, ADDRESS]
Relational Textual Patterns:
P1: What is the telephone number of Cinema:X?
P2: Who is the director of Movie:X?
P3: What is the ticket price of Cinema:X?
P4: Give me the address of Cinema:X.
...

Entailment Engine for QA
Ontology: HasDirector[MOVIE, PERSON], HasAddress[CINEMA, ADDRESS]
Q: "Where is cinema Astra located?"
Relational Textual Patterns:
P1: What is the telephone number of Cinema:X?
P2: Who is the director of Movie:X?
P3: What is the ticket price of Cinema:X?
P4: Give me the address of Cinema:X.
...
Entailment Engine for QA
Ontology: HasDirector[MOVIE, PERSON], HasAddress[CINEMA, ADDRESS]
Q: "Where is cinema Astra located?"
Relational Textual Patterns: P1-P4 (as above)
Entailment engine: Q ⇒ {P?, P?, ...}

Entailment Engine for QA
Ontology: HasDirector[MOVIE, PERSON], HasAddress[CINEMA, ADDRESS]
Q: "Where is cinema Astra located?"
Relational Textual Patterns: P1-P4 (as above)
Entailment engine: Q ⇒ P4
P4: Give me the address of Cinema:X.
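To make the pattern-selection step concrete, here is a small sketch (not the original system): each relational textual pattern is scored against the question and the best one is returned. The scoring function is a hypothetical placeholder (content-word overlap plus one toy lexical rule) standing in for the edit-distance-based entailment engine described later; the pattern strings, stopword list, and lexical rule are simplified for the example.

# Sketch of the pattern-selection step: score each relational textual
# pattern against the question and return the best one.  The scoring is a
# hypothetical stand-in for the real entailment engine.

PATTERNS = {
    "P1": "what is the telephone number of cinema",
    "P2": "who is the director of movie",
    "P3": "what is the ticket price of cinema",
    "P4": "give me the address of cinema",
}

STOPWORDS = {"what", "who", "is", "the", "of", "give", "me", "where", "?", "."}
# Hypothetical lexical entailment rule: "located" is mapped to "address".
LEXICAL_RULES = {"located": "address"}

def content_words(text: str) -> set[str]:
    tokens = (LEXICAL_RULES.get(t, t) for t in text.lower().split())
    return {t for t in tokens if t not in STOPWORDS}

def entailment_score(question: str, pattern: str) -> float:
    q, p = content_words(question), content_words(pattern)
    return len(q & p) / len(q | p) if q | p else 0.0

def select_pattern(question: str) -> str:
    return max(PATTERNS, key=lambda pid: entailment_score(question, PATTERNS[pid]))

print(select_pattern("Where is cinema Astra located ?"))   # -> P4 under these toy rules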
Entailment Engine for QA
Ontology: HasDirector[MOVIE, PERSON], HasAddress[CINEMA, ADDRESS]
Q: "Where is cinema Astra located?"
Relational Textual Patterns: P1-P4 (as above)
Entailment engine: Q ⇒ P4
P4: Give me the address of Cinema:X.
SPARQL query associated with P4:
SELECT ?street
WHERE { ?cinema tourism:name "X" .
        ?cinema tourism:hasPostalAddress ?address .
        ?address tourism:street ?street }
A: Corso Buonarroti, 16 - Trento

Semantic Dependencies
Multiple linguistic aspects are relevant for entailment:
<pair id="400" entailment="ENTAILMENT" task="QA">
<t>The polygraph came along in 1921, invented by John A. Larson, a University of California medical student working with help from a police official. The device ostensibly detects when a person is lying by monitoring and recording certain body changes affected by a person's emotional condition.</t>
<h>The polygraph is a device that ostensibly detects when a person is not telling the truth</h>
</pair>
Phenomena involved: NEGATION, ANTONYMS
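As an end-to-end illustration of the query step, the sketch below runs an equivalent SPARQL query with rdflib over a toy graph. The namespace URI, resource URIs, and data triples are invented for this example; the slides do not specify the implementation, and the real tourism ontology and data are not reproduced here.

# Toy check of the SPARQL template behind pattern P4, using rdflib.
from rdflib import Graph, Literal, Namespace, URIRef

TOURISM = Namespace("http://example.org/tourism#")   # hypothetical namespace
g = Graph()
cinema = URIRef("http://example.org/data/cinemaAstra")
address = URIRef("http://example.org/data/addr1")
g.add((cinema, TOURISM["name"], Literal("Astra")))
g.add((cinema, TOURISM["hasPostalAddress"], address))
g.add((address, TOURISM["street"], Literal("Corso Buonarroti, 16 - Trento")))

query = """
PREFIX tourism: <http://example.org/tourism#>
SELECT ?street
WHERE { ?cinema tourism:name "Astra" .
        ?cinema tourism:hasPostalAddress ?address .
        ?address tourism:street ?street }
"""
for row in g.query(query):
    print(row.street)   # Corso Buonarroti, 16 - Trento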
Semantic Dependencies
• Provide a modular approach through which progress on single aspects of entailment can be evaluated, using specialized training and test datasets.
• Devise a general framework, based on the distance between T and H, flexible enough to allow the combination of single entailment engines.
• Investigate the interactions and the dependencies of the different linguistic phenomena with respect to entailment.

TE Engine Combinations
• Different independent entailment engines, each able to deal with one aspect of language variability (e.g. negation, modals).
• Output of the whole system: the sum of the edit distances produced by each module (although the different linguistic phenomena can depend on one another in different and complex ways); see the sketch below.
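A minimal sketch of this combination scheme, assuming each specialized engine exposes a partial edit distance that is simply summed. The module stubs, their scores, and the acceptance threshold are hypothetical and only illustrate the interface.

# Sketch of the combination scheme: each specialized engine returns a partial
# edit distance and the overall distance is their sum.
from typing import Callable, Dict

def combine(modules: Dict[str, Callable[[str, str], float]],
            text: str, hypothesis: str, threshold: float = 1.0) -> bool:
    total = sum(score(text, hypothesis) for score in modules.values())
    return total <= threshold    # small overall distance -> entailment

# Hypothetical specialized engines (stubs): each returns 0 when the
# phenomenon it covers is absent or irrelevant, a positive cost otherwise.
modules = {
    "negation": lambda t, h: 0.0,
    "modality": lambda t, h: 0.0,
    "lexical":  lambda t, h: 0.4,
}
print(combine(modules, "T ...", "H ..."))   # True: total distance 0.4 <= 1.0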
Distance-Based TE Engine
Expected behavior of each single TE engine (TE Engine-x):
• The linguistic phenomenon is present in the pair:
  – it is relevant: D = score
  – it is NOT relevant: D = 0
• The linguistic phenomenon is not present in the pair: D = 0

Distance-Based TE Engine
• Determines the best (least costly) sequence of edit operations that transforms T into H:
  – Linear distance
  – Tree Edit Distance
• Determines the cost of the three edit operations (insertion, deletion, substitution).
• Each rule has a probability representing the degree of confidence in the rule. Rules can be at different levels (e.g. lexical, syntactic).
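The linear distance mentioned above is a token-level Levenshtein distance; a minimal sketch follows. The cost values are hypothetical (cf. the cost schema on the next slide), and this is a simplified illustration rather than the actual EDITS implementation.

# Sketch of the linear distance: Levenshtein distance over tokens, where
# insertion, deletion, and substitution can be given rule-specific costs.

def token_edit_distance(t_tokens, h_tokens,
                        ins_cost=1.0, del_cost=1.0, sub_cost=1.0):
    n, m = len(t_tokens), len(h_tokens)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + del_cost
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = t_tokens[i - 1] == h_tokens[j - 1]
            d[i][j] = min(d[i - 1][j] + del_cost,              # delete a token of T
                          d[i][j - 1] + ins_cost,              # insert a token of H
                          d[i - 1][j - 1] + (0 if same else sub_cost))
    return d[n][m]

t = "he is not the leader of the Tory MEPs".split()
h = "Giles Chichester is the leader of the Tories MEPs".split()
print(token_edit_distance(t, h))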
A TE Engine for Negation
• NEGATION, processed focusing on direct licensors of negation such as overt negative markers (not, n't), negative quantifiers (no, nothing), strong negative adverbs (never);
• ANTONYMS.

Linear Distance Algorithm (Levenshtein distance calculated on tokens)
<pair id="107" entailment="CONTRADICTION" task="IR">
<t>Giles Chichester's position was viewed as untenable partly because he had been given the job of a sleazebuster by Mr Cameron to ensure the integrity of Tory MEP expenses. He is not the leader of the Tory MEPs.</t>
<h>Giles Chichester is the leader of the Tories MEPs.</h>
</pair>
Cost schema (d_neg), e.g.:
<rule name="deletion_not">
  <left><syntax><token><text>not</text></token></syntax></left>
  <score>20</score>
</rule>

A TE Engine for Negation at RTE4 (EDITSneg, 1st run): Accuracy 0.54, Average Precision 0.4946
Of 1000 pairs: 836 pairs with no NPIs present receive D=0 (438 TP + 398 FP); 164 pairs with NPIs present (46 where negation is relevant, 116 where it is not) all receive D=max (102 TN + 62 FN).
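A minimal sketch of the negation module's behavior: detect direct licensors of negation in T and H and, in the spirit of the cost schema above, assign a high distance when the two sides disagree on negation. The cost value 20 mirrors the deletion_not rule shown above; everything else is a simplified illustration and not the EDITS system itself.

# Sketch of the negation module: look for direct licensors of negation
# (overt markers, negative quantifiers, strong negative adverbs) and return
# a high distance when T and H disagree on negation.
import re

NEGATION_LICENSORS = {"not", "no", "nothing", "never"}
NEGATION_COST = 20.0    # cf. <score>20</score> in the deletion_not rule

def has_negation(sentence: str) -> bool:
    # Normalize the contraction "n't" to "not" before tokenizing.
    s = sentence.lower().replace("n't", " not")
    return any(tok in NEGATION_LICENSORS for tok in re.findall(r"\w+", s))

def negation_distance(text: str, hypothesis: str) -> float:
    # Distance only when the phenomenon is present and T and H disagree.
    return NEGATION_COST if has_negation(text) != has_negation(hypothesis) else 0.0

t = "He is not the leader of the Tory MEPs."
h = "Giles Chichester is the leader of the Tories MEPs."
print(negation_distance(t, h))   # 20.0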