Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample

Marti A. Hearst, UC Berkeley


  1. Tricks for Statistical Semantic Knowledge Discovery: A Selectionally Restricted Sample
     Marti A. Hearst, UC Berkeley

     Statistical Approaches
     ► An alternative to hand-coded meaning.
     ► Solve sub-problems first, e.g., acquiring semantic relations.

     Marti Hearst, NYU Semantics ’08

  2. Tricks I Like
     ► Unambiguous Cues
     ► Lots o’ Text
     ► Rewrite and Verify

     Trick: Lots o’ Text
     ► Idea: words in the same syntactic context are semantically related.
       ▪ Hindle, ACL ’90, “Noun classification from predicate-argument structure.”
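The idea on this slide can be illustrated with a toy sketch. Each noun is represented by counts of the verbs it appears with as object, and nouns are compared by cosine similarity. The verb-noun pairs are hypothetical stand-ins for parser output, and cosine is a swapped-in similarity measure chosen for brevity; it is not the measure used in the cited paper.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical (verb, object-noun) pairs, standing in for parsed text.
PAIRS = [
    ("drink", "wine"), ("drink", "beer"), ("pour", "wine"), ("pour", "beer"),
    ("drive", "car"), ("park", "car"), ("drive", "truck"), ("park", "truck"),
    ("buy", "wine"), ("buy", "car"),
]

def context_vectors(pairs):
    """Map each noun to a Counter of the verbs it occurs with."""
    vectors = defaultdict(Counter)
    for verb, noun in pairs:
        vectors[noun][verb] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

VECTORS = context_vectors(PAIRS)
```

With these toy counts, `cosine(VECTORS["wine"], VECTORS["beer"])` is higher than `cosine(VECTORS["wine"], VECTORS["car"])`, because wine and beer share more verb contexts.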

  3. Trick: Lots o’ Text
     ► Idea: words in the same syntactic context are semantically related.
       ▪ Nakov & Hearst, ACL/HLT ’08, “Solving Relational Similarity Problems Using the Web as a Corpus”

     Trick: Lots o’ Text
     ► Idea: bigger is better than smarter!
       ▪ Banko & Brill, ACL ’01, “Scaling to Very, Very Large Corpora for Natural Language Disambiguation”

  4. Trick: Lots o’ Text
     ► Idea: apply web-scale n-grams to every problem imaginable.
       ▪ Lapata & Keller, HLT/NAACL ’04, “Web as a Baseline: Evaluating the Performance of Unsupervised Web-Based Models for a Range of NLP Tasks”
       ▪ Results table (columns: = supervised, > supervised); tasks: MT candidate selection, noun compound bracketing, article suggestion, adjective ordering, noun compound interpretation.

     Limitation
     ► Sometimes counts alone are too ambiguous.
     Solution
     ► Bootstrap from unambiguous contexts.

  5. Trick: Use Unambiguous Context
     ► … to build statistics for ambiguous contexts.
       ▪ Hindle & Rooth, ACL ’91, “Structural Ambiguity and Lexical Relations”
     Example: PP attachment
       I eat spaghetti with sauce.
     Bootstrap from unambiguous contexts:
       Spaghetti with sauce is delicious.
       I eat with a fork.

     Trick: Use Unambiguous Context
     ► … to identify semantic relations (lexico-syntactic contexts)
       ▪ Hearst, COLING ’92, “Automatic Acquisition of Hyponyms from Large Text Corpora”
     Example: Hyponym Identification
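The PP-attachment example in item 5 can be sketched as follows: counts gathered from unambiguous contexts (a noun followed by a preposition with no verb to attach to, or a verb followed by a preposition with no object noun) are used to decide the ambiguous case. All counts below are invented for illustration, and the add-0.5 smoothed relative-frequency comparison is a simplification assumed here, not the scoring used in the cited paper.

```python
from collections import Counter

# Hypothetical counts from unambiguous contexts:
#   noun + prep, no competing verb:  "Spaghetti with sauce is delicious."
#   verb + prep, no object noun:     "I eat with a fork."
noun_prep = Counter({("spaghetti", "with"): 8, ("spaghetti", "on"): 1})
verb_prep = Counter({("eat", "with"): 3, ("eat", "on"): 10})
noun_total = Counter({"spaghetti": 20})   # all occurrences of the noun
verb_total = Counter({"eat": 60})         # all occurrences of the verb

def attach(verb, noun, prep):
    """Return 'noun' or 'verb', whichever is more strongly associated with prep."""
    p_noun = (noun_prep[(noun, prep)] + 0.5) / (noun_total[noun] + 1.0)
    p_verb = (verb_prep[(verb, prep)] + 0.5) / (verb_total[verb] + 1.0)
    return "noun" if p_noun >= p_verb else "verb"
```

With these toy counts, `attach("eat", "spaghetti", "with")` prefers noun attachment and `attach("eat", "spaghetti", "on")` prefers verb attachment. Note that the sketch ignores the object of the preposition (sauce vs. fork).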

  6. Combine Tricks 1 and 2

     Trick: Use Unambiguous Contexts + Lots o’ Text
     ► Combine lexico-syntactic patterns with occurrence counts.
       ▪ Kozareva, Riloff, Hovy, HLT-ACL ’08, “Semantic Class Learning from the Web with Hyponym Pattern Linkage Graphs”.
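A minimal sketch of combining a lexico-syntactic pattern with occurrence counts: a single "X such as Y" pattern proposes (hyponym, hypernym) pairs, and only pairs seen at least `min_count` times are kept. The corpus, the single-word pattern, and the threshold are assumptions made for illustration; this is not the graph-based method of the cited paper.

```python
import re
from collections import Counter

# Hypothetical toy corpus.
CORPUS = [
    "She studies fruits such as apples every summer.",
    "Markets sell fruits such as apples at low prices.",
    "They import fruits such as mangoes from abroad.",
    "He has problems such as Monday mornings.",
]

# Single-word hypernym and hyponym only; real noun phrases need a chunker.
PATTERN = re.compile(r"(\w+) such as (\w+)")

def hyponym_counts(sentences):
    """Count (hyponym, hypernym) pairs proposed by the pattern."""
    counts = Counter()
    for sentence in sentences:
        for hypernym, hyponym in PATTERN.findall(sentence.lower()):
            counts[(hyponym, hypernym)] += 1
    return counts

def reliable_pairs(counts, min_count=2):
    """Keep only pairs that the pattern matched at least min_count times."""
    return {pair for pair, n in counts.items() if n >= min_count}
```

On this toy corpus, the one-off match ("monday", "problems") is filtered out by the count threshold, while ("apples", "fruits") survives.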

  7. Trick: Use Unambiguous Contexts + Lots o’ Text
     ► Combine (usually) unambiguous surface patterns with occurrence counts.
       ▪ Nakov & Hearst, HLT/EMNLP ’05, “Using the Web as an Implicit Training Set: Application to Structural Ambiguity Resolution”.
     Left dash:         cell-cycle analysis → left
     Possessive marker: brain’s stem cell → right
     Parentheses:       growth factor (beta) → left
     Punctuation:       health care, provider → left
     Abbreviation:      tum. necr.(TN) factor → right
     Concatenation:     healthcare reform → left

     Trick: Use Unambiguous Contexts + Lots o’ Text
     ► Identify a “protagonist” in each text to learn narrative structure
       ▪ Chambers & Jurafsky, ACL ’08, “Unsupervised Learning of Narrative Event Chains”.
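The surface-pattern cues in item 7 can be sketched as a simple vote for bracketing a three-word noun compound. The sketch uses three of the listed cues (left dash, concatenation, possessive marker) and counts their occurrences in a text; the sample text, the function name, and the plain majority vote are assumptions for illustration, not the cited paper's model.

```python
# Hypothetical sample text standing in for web-scale counts.
TEXT = (
    "Cell-cycle analysis was performed. We repeated the cell-cycle analysis "
    "twice. The brain's stem cell niche is small."
)

def bracket(w1, w2, w3, text):
    """Vote 'left' ((w1 w2) w3) or 'right' (w1 (w2 w3)); None when tied."""
    text = text.lower()
    cues = {
        # left dash and concatenation both glue w1 to w2
        "left": [f"{w1}-{w2} {w3}", f"{w1}{w2} {w3}"],
        # possessive marker sets w1 apart from (w2 w3)
        "right": [f"{w1}'s {w2} {w3}"],
    }
    votes = {side: sum(text.count(p) for p in pats) for side, pats in cues.items()}
    if votes["left"] == votes["right"]:
        return None
    return max(votes, key=votes.get)
```

Here `bracket("cell", "cycle", "analysis", TEXT)` votes left and `bracket("brain", "stem", "cell", TEXT)` votes right; a compound with no cue evidence in the text returns `None`.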

  8. Trick 3: Rewrite & Verify

     Trick: Rewrite & Verify
     ► Check if alternatives exist in text
       ▪ Nakov & Hearst, HLT/EMNLP ’05, “Using the Web as an Implicit Training Set: Application to Structural Ambiguity Resolution”.
       ▪ Example: NP bracketing
         ▪ Prepositional
           ► stem cells in the brain → right
           ► stem cells from the brain → right
           ► cells from the brain stem → left
         ▪ Verbal
           ► virus causing human immunodeficiency → left
           ► pain associated with arthritis migraine → left
         ▪ Copula
           ► office building that is a skyscraper → right
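The prepositional rewrites in item 8 can be sketched as: generate paraphrases of a three-word compound, then check which ones actually occur in text. Only the prepositions "in" and "from" from the slide are used; the sample texts, function names, and the count-based vote are assumptions for illustration.

```python
PREPS = ("in", "from")  # the two prepositions shown in the slide examples

def paraphrases(w1, w2, w3):
    """Yield (bracketing, phrase) rewrites of the compound 'w1 w2 w3'."""
    for prep in PREPS:
        yield "right", f"{w2} {w3} {prep} the {w1}"   # e.g. stem cells in the brain
        yield "left", f"{w3} {prep} the {w1} {w2}"    # e.g. cells from the brain stem

def rewrite_and_verify(w1, w2, w3, text):
    """Return the bracketing whose rewrites occur more often; None when tied."""
    text = text.lower()
    votes = {"left": 0, "right": 0}
    for side, phrase in paraphrases(w1, w2, w3):
        votes[side] += text.count(phrase)
    if votes["left"] == votes["right"]:
        return None
    return max(votes, key=votes.get)

# Hypothetical sample texts.
TEXT_A = ("Stem cells in the brain can divide. "
          "Researchers isolated stem cells from the brain of adult mice.")
TEXT_B = "Cells from the brain stem were stained."
```

For "brain stem cells", `TEXT_A` supports right bracketing and `TEXT_B` supports left bracketing, mirroring the slide's examples.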

  9. Towards New Approaches to Semantic Analysis

     Ideas
     ► Inducing Semantic Grammars
       ▪ Boggess, Agarwal, & Davis, AAAI ’91, “Disambiguation of Prepositional Phrases in Automatically Labelled Technical Text”

  10. Ideas
     ► Use Cognitive Linguistics
       ▪ Hearst, ’90, ’92, “Direction-Based Text Interpretation”.
       ▪ Talmy’s Force Dynamics + Reddy’s Conduit Metaphor
       ▪ Path Model
       ▪ Solves: Was the person in favor of or opposed to the idea?

     Using Cognitive Linguistics
     ► Talmy’s Theory of Force Dynamics
       ▪ Talmy, “Force Dynamics in Language and Thought,” in Parasession on Causatives and Agentivity, Chicago Linguistic Society 1985.
       ▪ Describes how the interaction of agents with respect to force is lexically and grammatically expressed.
       ▪ Posits two opposing entities: Agonist and Antagonist.
       ▪ Each entity expresses an intrinsic force: towards rest or motion.
       ▪ The balance of the strengths of the entities determines the outcome of the event.
     ► Grammatical expression includes using a clause headed by “despite” to express a weaker antagonist.
