Compositional and Distributional Models of Meaning for Natural Language
Stephen Clark
Natural Language and Information Processing Research Group
University of Cambridge Computer Laboratory
Oxford, October 2010

C & C tools Intro Parsing CCG
Natural Language Processing (NLP)
• The branch of AI concerned with the automatic analysis, generation and understanding of natural language text
• Paradigm shift in NLP in the early 1990s
  • move from knowledge-heavy to data-driven approaches
• We now have usable language technology, e.g. Google Translate
Stephen Clark Models of Meaning for Natural Language Oxford, October 2010

Two Success Stories
• Practical natural language parsing
  • robust, efficient, accurate parsers based on machine learning (ML) from corpora
• Distributional lexical semantics
  • word meanings based on data-driven linguistics and lots of text

Today’s Talk
• Syntactic parsing (leading to compositional semantics)
• Distributional lexical semantics
• Combining the two approaches
  • theoretical advances in semantics leading to better language technology (LT)
• The talk takes a practical language technology perspective
  • but will introduce a fundamental theoretical problem relevant to practice (and this workshop)
  • and will serve as an introduction to some of the linguistics talks

Phrase Structure
[Figure: phrase-structure tree, with Penn Treebank-style node labels, for the sentence "the proposed changes also would allow executives to report exercises of options early and often"]

Dependency Structure
[Figure: dependency graph for the sentence "John hit the ball with the bat", with arcs labelled SUBJ, DOBJ, DET (twice), PREP and POBJ]

Logical Form
From 1953 to 1955, 9.8 billion Kent cigarettes with the filters were sold, the company said.

  _____________   _______________________________________________________________
 | x1          | | x2 x3                                                         |
 |-------------| |---------------------------------------------------------------|
(| company(x1) |A| say(x2)                                                       |)
 | single(x1)  | | agent(x2,x1)                                                  |
 |_____________| | theme(x2,x3)                                                  |
                 | proposition(x3)                                               |
                 |      __________________    ____________   ________________    |
                 |     | x4               |  | x5         | | x6 x7 x8       |   |
                 | x3: |------------------|  |------------| |----------------|   |
                 |    (| card(x4)=billion |;(| filter(x5) |A| with(x4,x5)    |)) |
                 |     | 9.8(x4)          |  | plural(x5) | | sell(x6)       |   |
                 |     | kent(x4)         |  |____________| | patient(x6,x4) |   |
                 |     | cigarette(x4)    |                 | 1953(x7)       |   |
                 |     | plural(x4)       |                 | single(x7)     |   |
                 |     |__________________|                 | 1955(x8)       |   |
                 |                                          | single(x8)     |   |
                 |                                          | to(x7,x8)      |   |
                 |                                          | from(x6,x7)    |   |
                 |                                          | event(x6)      |   |
                 |                                          |________________|   |
                 | event(x2)                                                     |
                 |_______________________________________________________________|

Why Build these Structures?
• We want to know the meaning of the sentence
• Structured representations allow us to access the semantics
• (Arguably) useful for a variety of NLP applications, e.g. Machine Translation, Question Answering

Why is Parsing Difficult?
• Obtaining a wide-coverage grammar which can handle arbitrary real text is challenging
• Natural language is surprisingly ambiguous

Syntactic Ambiguity
NP attachment (the PP modifies the noun phrase):
(S (NP John) (VP (V saw) (NP (NP (DT the) (N man)) (PP (P with) (NP (DT the) (N telescope))))))
VP attachment (the PP modifies the verb phrase):
(S (NP John) (VP (V saw) (NP (DT the) (N man)) (PP (P with) (NP (DT the) (N telescope)))))

Ambiguity: the problem is worse than you think
NP attachment:
(S (NP John) (VP (V ate) (NP (NP (DT the) (N pizza)) (PP (P with) (NP (DT a) (N fork))))))
VP attachment:
(S (NP John) (VP (V ate) (NP (DT the) (N pizza)) (PP (P with) (NP (DT a) (N fork)))))

Ambiguity: the problem is worse than you think
NP attachment:
(S (NP John) (VP (V ate) (NP (NP (DT the) (N pizza)) (PP (P with) (NP (DT the) (N anchovies))))))
VP attachment:
(S (NP John) (VP (V ate) (NP (DT the) (N pizza)) (PP (P with) (NP (DT the) (N anchovies)))))
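
The attachment ambiguity above can be reproduced mechanically. The following is a minimal sketch, assuming a hypothetical toy grammar read off the trees on these slides (the names `RULES`, `LEXICON` and `parses` are illustrative, not part of any real parser). It enumerates every tree the grammar licenses and finds two for both the "fork" and the "anchovies" sentence, even though a human reader accepts only one reading of each.

```python
from functools import lru_cache

# Hypothetical toy grammar read off the trees above; it is an illustration,
# not the grammar used by any real parser.
RULES = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP"), ("V", "NP", "PP")],
    "NP": [("DT", "N"), ("NP", "PP")],
    "PP": [("P", "NP")],
}
LEXICON = {"John": "NP", "ate": "V", "the": "DT", "a": "DT",
           "pizza": "N", "fork": "N", "anchovies": "N", "with": "P"}


def parses(sentence):
    """Return every bracketed tree the toy grammar assigns to `sentence`."""
    words = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def trees(label, i, j):
        # All trees rooted in `label` that cover words[i:j].
        found = []
        if j - i == 1 and LEXICON.get(words[i]) == label:
            found.append(f"({label} {words[i]})")
        for rhs in RULES.get(label, []):
            for children in expansions(rhs, i, j):
                found.append(f"({label} {' '.join(children)})")
        return tuple(found)

    def expansions(rhs, i, j):
        # Every way of splitting words[i:j] among the symbols of `rhs`.
        if len(rhs) == 1:
            for tree in trees(rhs[0], i, j):
                yield (tree,)
            return
        for k in range(i + 1, j - len(rhs) + 2):
            for first in trees(rhs[0], i, k):
                for rest in expansions(rhs[1:], k, j):
                    yield (first,) + rest

    return list(trees("S", 0, len(words)))


for s in ("John ate the pizza with a fork",
          "John ate the pizza with the anchovies"):
    print(len(parses(s)))  # 2 in both cases
```

The grammar alone cannot choose between the two trees; that choice has to come from elsewhere, for example from statistics over corpora.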

Grammars for Natural Language Parsing
• Standard approach is to use a Context Free Grammar
  S → NP VP
  VP → V NP | V NP PP
  PP → P NP
  NP → DT N
  DT → the | a
  N → cat | dog
  V → chased | jumped
  P → over
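
To make the toy grammar concrete, the sketch below encodes its rules as a Python dictionary and expands the start symbol exhaustively. The names `GRAMMAR` and `expand` are hypothetical. Exhaustive expansion only terminates because this particular grammar has no recursive rules.

```python
from itertools import product

# The toy context-free grammar from the slide; any symbol that is not a key
# is treated as a terminal (a word).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
    "NP": [["DT", "N"]],
    "DT": [["the"], ["a"]],
    "N": [["cat"], ["dog"]],
    "V": [["chased"], ["jumped"]],
    "P": [["over"]],
}


def expand(symbol):
    """Return every word sequence derivable from `symbol`."""
    if symbol not in GRAMMAR:  # terminal
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine every expansion of each right-hand-side symbol.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
print(len(sentences))  # 160
print(sentences[0])    # the cat chased the cat
```

The 160 strings include odd ones such as "a dog jumped the cat over a dog", which is a reminder that even a tiny hand-written grammar overgenerates.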

Combinatory Categorial Grammar (CCG)
• CCG (Steedman) is a type-driven lexicalised grammar
• An elementary syntactic structure (in CCG, a lexical category) is assigned to each word in a sentence
  walked : S\NP   ‘give me an NP to my left and I return a sentence’
• A small number of rules define how categories can combine

CCG Lexical Categories
• Atomic categories: S, N, NP, PP, ... (not many more)
• Complex categories are built recursively from atomic categories and slashes, which indicate the directions of arguments
• Example complex categories for verbs
  • intransitive verb: S\NP (walked)
  • transitive verb: (S\NP)/NP (respected)
  • ditransitive verb: ((S\NP)/NP)/NP (gave)
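
The recursive definition of categories maps naturally onto a recursive data type. The sketch below is one possible encoding, not the representation used by any particular CCG parser; `Atom`, `Functor`, `parse_category` and `arity` are hypothetical names. It reads category strings such as `(S\NP)/NP` into nested objects and counts how many arguments a category expects.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    name: str

    def __str__(self):
        return self.name


@dataclass(frozen=True)
class Functor:
    result: "Category"
    slash: str  # "/" looks right for its argument, "\\" looks left
    arg: "Category"

    def __str__(self):
        return f"{wrap(self.result)}{self.slash}{wrap(self.arg)}"


Category = Union[Atom, Functor]


def wrap(cat):
    """Parenthesise complex categories when they are nested."""
    return f"({cat})" if isinstance(cat, Functor) else str(cat)


def parse_category(text):
    """Parse e.g. '(S\\NP)/NP'; unbracketed slashes associate to the left."""
    tokens = re.findall(r"[A-Za-z]+|[()/\\]", text)
    pos = 0

    def primary():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            cat = expression()
            pos += 1  # skip the closing ")"
            return cat
        return Atom(tok)

    def expression():
        nonlocal pos
        cat = primary()
        while pos < len(tokens) and tokens[pos] in ("/", "\\"):
            slash = tokens[pos]
            pos += 1
            cat = Functor(cat, slash, primary())
        return cat

    return expression()


def arity(cat):
    """Number of arguments a category takes before yielding an atom."""
    count = 0
    while isinstance(cat, Functor):
        count += 1
        cat = cat.result
    return count


for text in ("S\\NP", "(S\\NP)/NP", "((S\\NP)/NP)/NP"):
    print(parse_category(text), arity(parse_category(text)))
```

The three verb categories come out with arity 1, 2 and 3, matching the intransitive, transitive and ditransitive examples.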

A Simple CCG Derivation

interleukin-10    inhibits    production
      NP         (S\NP)/NP        NP
                 ----------------------- >
                          S\NP
---------------------------------------- <
                    S

>  forward application
<  backward application
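
The two application rules used in this derivation are small enough to write out in full. This is a minimal sketch, assuming categories are encoded as plain strings (atoms) or `(result, slash, argument)` triples; that encoding and the function names are choices made for the example.

```python
def forward_apply(left, right):
    """X/Y  Y  =>  X  (the '>' rule); None if the rule does not apply."""
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    return None


def backward_apply(left, right):
    """Y  X\\Y  =>  X  (the '<' rule); None if the rule does not apply."""
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None


NP = "NP"
TRANSITIVE = (("S", "\\", "NP"), "/", "NP")  # (S\NP)/NP

vp = forward_apply(TRANSITIVE, NP)  # inhibits + production
s = backward_apply(NP, vp)          # interleukin-10 + [inhibits production]
print(vp)  # ('S', '\\', 'NP')
print(s)   # S
```

Applying the rules in the wrong direction fails: `forward_apply(NP, TRANSITIVE)` returns `None`, which is how the slashes enforce word order.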

A Simple CCG Derivation with Semantics

interleukin-10              inhibits                 production
  NP : inter′    (S\NP)/NP : λx.λy.inhibit′(x, y)    NP : prod′
                 ---------------------------------------------- >
                        S\NP : λy.inhibit′(prod′, y)
--------------------------------------------------------------- <
                  S : inhibit′(prod′, inter′)

>  forward application
<  backward application
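
Because each syntactic rule application is paired with function application on the semantic side, the derivation can be replayed with ordinary Python closures. The sketch below assumes a sign is a `(category, meaning)` pair and that meanings are strings or curried functions. These encodings, and the names `forward` and `backward`, are illustrative only.

```python
# Categories are atoms (strings) or (result, slash, argument) triples; this
# encoding is an assumption made for the sketch.
S_BACK_NP = ("S", "\\", "NP")  # S\NP

interleukin = ("NP", "inter'")
production = ("NP", "prod'")
inhibits = ((S_BACK_NP, "/", "NP"),  # (S\NP)/NP
            lambda x: lambda y: f"inhibit'({x}, {y})")


def forward(functor, argument):
    """X/Y : f   Y : a   =>   X : f(a)"""
    (cat, sem), (arg_cat, arg_sem) = functor, argument
    if not (isinstance(cat, tuple) and cat[1] == "/" and cat[2] == arg_cat):
        raise ValueError("forward application does not apply")
    return cat[0], sem(arg_sem)


def backward(argument, functor):
    """Y : a   X\\Y : f   =>   X : f(a)"""
    (arg_cat, arg_sem), (cat, sem) = argument, functor
    if not (isinstance(cat, tuple) and cat[1] == "\\" and cat[2] == arg_cat):
        raise ValueError("backward application does not apply")
    return cat[0], sem(arg_sem)


verb_phrase = forward(inhibits, production)     # S\NP : λy.inhibit'(prod', y)
sentence = backward(interleukin, verb_phrase)   # S : inhibit'(prod', inter')
print(sentence)  # ('S', "inhibit'(prod', inter')")
```

The argument order in the result follows the slide: the object is consumed first, so it fills the first slot of `inhibit'`.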

Classical Categorial Grammar
• ‘Classical’ Categorial Grammar only has application rules
• Classical Categorial Grammar is context free

Categorial derivation tree:
(S (NP interleukin-10) (S\NP ((S\NP)/NP inhibits) (NP production)))
Corresponding phrase-structure tree:
(S (NP interleukin-10) (VP (V inhibits) (NP production)))
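
One way to see why an application-only grammar is context free: every category that can appear in a derivation is obtained by stripping arguments from some lexical category, so a finite lexicon yields a finite set of binary rules. The sketch below illustrates that standard construction under the assumption that categories are strings or `(result, slash, argument)` triples; `show` and `cfg_rules` are hypothetical helper names.

```python
def show(cat):
    """Render a category, parenthesising nested complex categories."""
    if isinstance(cat, str):
        return cat
    result, slash, arg = cat

    def wrap(c):
        return f"({show(c)})" if isinstance(c, tuple) else c

    return f"{wrap(result)}{slash}{wrap(arg)}"


def cfg_rules(lexicon):
    """Turn an application-only categorial lexicon into context-free rules."""
    rules = set()
    for word, cat in lexicon.items():
        rules.add((show(cat), (word,)))  # lexical rule: C -> word
        while isinstance(cat, tuple):
            result, slash, arg = cat
            if slash == "/":
                # forward application: X -> X/Y Y
                rules.add((show(result), (show(cat), show(arg))))
            else:
                # backward application: X -> Y X\Y
                rules.add((show(result), (show(arg), show(cat))))
            cat = result
    return rules


LEXICON = {
    "interleukin-10": "NP",
    "inhibits": (("S", "\\", "NP"), "/", "NP"),
    "production": "NP",
}

for lhs, rhs in sorted(cfg_rules(LEXICON)):
    print(lhs, "->", " ".join(rhs))
```

Renaming `S\NP` to VP and `(S\NP)/NP` to V in the output gives the familiar rules S → NP VP and VP → V NP.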