November 14, 2017
Administrative notes
• Reminder: In the news call #3 individual component due November 22
• Reminder: In the news call #3 group sign up due November 24
• Reminder: project deadlines coming up starting November 27
• Reminder: In the news call #3 group component due November 28
• Reminder: Final exam: Tuesday, December 5 @noon in CIRS 1250
Computational Thinking www.ugrad.cs.ubc.ca/~cs100

Okay, so that’s what AI is. But how did they do that?
• There are LOTS of different parts involved
• We’ll look at a few
• Note that we’ll cover the general idea of how things work, but not the specific details
• We’ll start by looking at how Watson understands language
• The field that studies how computers process language is called Natural Language Processing (NLP)

How does Watson process language?
• See: Building Watson: An Overview of the DeepQA Project, by David Ferrucci et al., AI Magazine, 2010 (http://www.aaai.org/ojs/index.php/aimagazine/article/view/2303)
• One thing we won’t cover: they use classification (remember decision trees?) to help group types of questions.
• We’ll start by looking at traditional Natural Language Processing (NLP) techniques

Natural Language Processing (NLP)
• Natural Language Processing (NLP): automatic processing of language, e.g., by computers
• Using NLP to infer meaning from natural languages is challenging!
• NLP draws on many disciplines: linguistics, cognitive science, psychology, logic, computer science, philosophy, engineering, …

Group exercise
NLP is needed for many different things that computers do these days. List applications that you have used that need NLP and what they used it for.
• Siri - set a 5 minute timer
• Google Translate
• Ask Alexa maps questions
• Call systems - "please say yes or no"
• Voice to text
• Grammar check

Typical NLP steps
1. Recognize speech (Watson skipped this)
2. Syntax analysis, or parsing: inferring parts of speech and sentence structure, using a lexicon and grammar
3. Semantic analysis: inferring meaning using syntax and semantic rules
4. Pragmatics: inferring meaning from contextual information

Parsing: identifying parts of speech and sentence structure using lexicon and grammar
Input:
Lexicon
Word      Category
Cat       Noun
Cheese    Noun
Ate       Verb
the       Article
Grammar
Sentence → NounPhrase, VerbPhrase
VerbPhrase → Verb, NounPhrase
NounPhrase → Article, Noun
NounPhrase → Noun
Output: a parse tree

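To make the idea concrete, here is a minimal sketch of a parser that tries each grammar rule in turn. The lexicon and grammar are the ones on the slide; the function names and the test sentence "the cat ate cheese" are assumptions chosen for illustration, and this is only one simple way to do it, not how Watson parses.

```python
# Lexicon and grammar copied from the slide (words stored in lowercase).
LEXICON = {"cat": "Noun", "cheese": "Noun", "ate": "Verb", "the": "Article"}
GRAMMAR = {
    "Sentence": [["NounPhrase", "VerbPhrase"]],
    "VerbPhrase": [["Verb", "NounPhrase"]],
    "NounPhrase": [["Article", "Noun"], ["Noun"]],
}

def parse(symbol, words, start):
    """Yield (tree, next_index) for every way `symbol` matches words[start:]."""
    if symbol not in GRAMMAR:
        # A word category such as Noun: match exactly one word using the lexicon.
        if start < len(words) and LEXICON.get(words[start].lower()) == symbol:
            yield (symbol, words[start]), start + 1
        return
    for rule in GRAMMAR[symbol]:
        yield from expand(symbol, rule, words, start, [])

def expand(symbol, rule, words, pos, children):
    """Match the remaining parts of `rule` one after another, left to right."""
    if not rule:
        yield (symbol, *children), pos
        return
    for child, next_pos in parse(rule[0], words, pos):
        yield from expand(symbol, rule[1:], words, next_pos, children + [child])

def parse_sentence(text):
    """Return every parse tree that covers the whole sentence."""
    words = text.split()
    return [tree for tree, end in parse("Sentence", words, 0) if end == len(words)]

if __name__ == "__main__":
    for tree in parse_sentence("the cat ate cheese"):
        print(tree)
```

Each tree is a nested tuple whose first item is the rule name, so the output mirrors the parse tree you would draw by hand. A sentence that the grammar cannot explain returns an empty list.
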
How parsing helped Watson
The structure of some clues and certain keywords tell Watson what the form of the answer will be – without considering semantics.
Consider the following clue that Watson can answer:
Category: Oooh....Chess
Clue: Invented in the 1500s to speed up the game, this maneuver involves two pieces of the same color.
Answer: Castling
Parsing is key to Watson’s ability to answer this question

How parsing helped Watson
Parsing takes the sentence and shows how the words are assigned parts of speech and build up to form a sentence:
Data mining showed that given this structure, the noun between the two verb phrases was the type of thing the answer is. In this case, the answer was a “maneuver.”

Group exercise: create a parse tree
Lexicon
Word      Category
Cat       Noun
Rat       Noun
Chased    Verb
Large     Adjective
the       Article
Grammar
Sentence → NounPhrase, VerbPhrase
VerbPhrase → Verb, NounPhrase
NounPhrase → Article, Noun
NounPhrase → Article, Adjective, Noun
Using the above lexicon and grammar, parse the sentence: “the large cat chased the rat”
If you have a choice of rules, pick the one that works best. You don’t have to use all the rules.

Parsing: identifying parts of speech and sentence structure using lexicon and grammar
Lexicon
Word      Category
Cat       Noun
Rat       Noun
Chased    Verb
Large     Adjective
the       Article
Grammar
Sentence → NounPhrase, VerbPhrase
VerbPhrase → Verb, NounPhrase
NounPhrase → Article, Noun
NounPhrase → Article, Adjective, Noun

Parse “time flies like an arrow”
Group exercise
Write down your tree structure and your algorithm. Note: you don’t have to use all the rules!
Lexicon
Word      Category
an        article
arrow     noun
flies     noun
flies     verb
time      noun
time      verb
like      adverb
like      verb
Grammar
Sentence → NounPhrase, VerbPhrase
NounPhrase → Article, Noun
NounPhrase → Article, Adjective, Noun
NounPhrase → Noun
NounPhrase → Noun, Noun
VerbPhrase → Verb, Adverb, NounPhrase
VerbPhrase → Verb, NounPhrase

Use your algorithm to parse “fruit flies like a banana”
Group exercise
Did the algorithm work?
A. Yes
B. No
C. Kind of… but “flies” wasn’t quite right.

The point: Parsing is hard!
• Those were short, yet tricky examples – natural languages are ambiguous!
• Imagine trying to write a parsing algorithm that works for a natural language… sentences of 20-30 words may have 10,000 possible syntactic structures!
• Jeopardy makes the problem much easier, because the structure of Jeopardy clues is relatively simple

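The ambiguity can be seen directly by asking a program for every parse instead of just one. The sketch below uses the lexicon and grammar from the “time flies like an arrow” exercise, where a word may belong to more than one category; the function names are hypothetical, and the exhaustive search is an assumption chosen for simplicity, not an efficient parsing algorithm.

```python
# Lexicon and grammar from the "time flies like an arrow" exercise.
# Each word maps to the SET of categories it can take.
LEXICON = {
    "an": {"Article"},
    "arrow": {"Noun"},
    "flies": {"Noun", "Verb"},
    "time": {"Noun", "Verb"},
    "like": {"Adverb", "Verb"},
}
GRAMMAR = {
    "Sentence": [["NounPhrase", "VerbPhrase"]],
    "NounPhrase": [["Article", "Noun"], ["Article", "Adjective", "Noun"],
                   ["Noun"], ["Noun", "Noun"]],
    "VerbPhrase": [["Verb", "Adverb", "NounPhrase"], ["Verb", "NounPhrase"]],
}

def matches(symbol, words, start):
    """Yield (tree_text, end_index) for every way `symbol` matches words[start:]."""
    if symbol not in GRAMMAR:
        # Word category: the next word must be allowed to play this role.
        if start < len(words) and symbol in LEXICON.get(words[start], set()):
            yield f"{symbol}({words[start]})", start + 1
        return
    for rule in GRAMMAR[symbol]:
        # Keep every partial match alive while walking through the rule.
        partial = [([], start)]
        for part in rule:
            partial = [(kids + [tree], end)
                       for kids, pos in partial
                       for tree, end in matches(part, words, pos)]
        for kids, end in partial:
            yield f"{symbol}[{' '.join(kids)}]", end

def all_parses(sentence):
    """Return every parse that uses all the words in the sentence."""
    words = sentence.lower().split()
    return [tree for tree, end in matches("Sentence", words, 0) if end == len(words)]

if __name__ == "__main__":
    for tree in all_parses("time flies like an arrow"):
        print(tree)
```

With this small grammar the sentence already has two parses: one where “time” is the subject and “flies” is the verb, and one where “time flies” is a noun phrase and “like” is the verb. A program has no way to pick between them from syntax alone.
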
How good are computers at parsing?
• A recent Google parser – Parsey McParseface – claims to have a record-setting 94% accuracy for a newspaper dataset… but only 90% for web content
• This sounds pretty good, but assuming accuracy is measured per word, you’d expect ~5 words on this slide to be parsed incorrectly.
https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html

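The arithmetic behind that estimate is just the number of words times the error rate. A small sketch, assuming accuracy is measured per word and using a hypothetical 80-word slide (the word count is an assumption for illustration, not a measurement):

```python
def expected_errors(word_count, accuracy):
    """Expected number of wrongly parsed words, if each word is wrong
    with probability (1 - accuracy)."""
    return word_count * (1 - accuracy)

if __name__ == "__main__":
    words_on_slide = 80  # hypothetical word count
    for label, accuracy in [("newspaper", 0.94), ("web", 0.90)]:
        errors = expected_errors(words_on_slide, accuracy)
        print(f"{label}: about {errors:.1f} words parsed incorrectly")
```

At 94% accuracy that is roughly 5 words out of 80, and at 90% it rises to about 8.
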
Final note on parsing: it’s the basis for computer programming
• A computer has to "understand" programs in order to execute them
• Programming languages are designed so that they can be parsed unambiguously
• A grammar specifies all the possible programs that can be written in a language
• Designing programming languages (and their grammars) is a fun and important part of computer science

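You can watch a programming language being parsed unambiguously with Python's built-in `ast` module, which exposes the parse tree Python builds for its own code. This is a small illustration; the helper names `root_operator` and `is_valid_program` are hypothetical.

```python
import ast

def root_operator(expression):
    """Name of the operator at the top of the parse tree.

    Assumes `expression` is arithmetic such as "1 + 2 * 3", so the root
    of the tree is a binary operation.
    """
    tree = ast.parse(expression, mode="eval")  # raises SyntaxError if unparsable
    return type(tree.body.op).__name__

def is_valid_program(source):
    """True if the text is a program the Python grammar allows."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

if __name__ == "__main__":
    print(root_operator("1 + 2 * 3"))     # multiplication is grouped first
    print(root_operator("(1 + 2) * 3"))   # parentheses change the tree
    print(is_valid_program("x = 1 + 2"))
    print(is_valid_program("x = 1 +"))
```

Unlike “time flies like an arrow”, the expression `1 + 2 * 3` has exactly one tree: the grammar says the addition is at the root and the multiplication sits underneath it.
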
Recall: Typical NLP steps
1. Recognize speech (Watson skipped this)
2. Syntax analysis, or parsing: inferring parts of speech and sentence structure, using a lexicon and grammar
3. Semantic analysis: inferring meaning using syntax and semantic rules
4. Pragmatics: inferring meaning from contextual information

Semantic analysis: inferring meaning using syntax and semantic rules
Syntax analysis/parsing can sometimes help determine semantics, or meaning
Examples:
• Knowing whether “flies” is a noun or a verb (the syntax) tells us something about its meaning (the semantics)
• Semantic rules provide additional information:
  • Word categories: e.g., a cat is a feline
  • Relationships between words, e.g., a semantic rule for the word “like” can help us interpret “the boy likes the cat”

Semantic analysis: inferring meaning using syntax and semantic rules
Syntax describes a sentence’s structure. Semantics adds (limited) meaning that can be figured out using simple rules that don’t require much context.
Examples:
• Word categories: e.g., a cat is a feline
• “gave” is the past tense of “give”

Recall: Typical NLP steps
1. Recognize speech (Watson skipped this)
2. Syntax analysis, or parsing: inferring parts of speech and sentence structure, using a lexicon and grammar
3. Semantic analysis: inferring meaning using syntax and semantic rules
4. Pragmatics: inferring meaning from contextual information

Pragmatics: inferring meaning from contextual information
• Most techniques for finding the meaning of a word look for clues in the surrounding text to disambiguate it. For example, the real estate meaning of “lot” might have the words “vacant” or “square foot” nearby.
• Pragmatics also becomes important when sentences contain pronouns

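The “lot” example can be sketched as a simple clue-counting rule: pick the meaning whose clue words appear most often in the surrounding text. The clue words “vacant”, “square” and “foot” come from the slide; the second sense and every other clue word are hypothetical, and real systems use far richer context than this.

```python
# Clue words for each meaning of "lot". Only "vacant", "square" and "foot"
# come from the slide; the rest are hypothetical examples.
SENSE_CLUES = {
    "real estate": {"vacant", "square", "foot", "parking"},
    "large amount": {"of", "whole", "awful"},
}

def guess_sense(sentence, clues=SENSE_CLUES):
    """Pick the sense whose clue words overlap most with the sentence.

    Returns None when no clue word appears at all. Punctuation is not
    stripped, so this only works on plain space-separated text.
    """
    words = set(sentence.lower().split())
    scores = {sense: len(words & clue_words) for sense, clue_words in clues.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

if __name__ == "__main__":
    print(guess_sense("the vacant lot is 5000 square foot"))
    print(guess_sense("she has a lot of books"))
    print(guess_sense("a lot"))
```

The first sentence matches three real estate clues, the second matches one “large amount” clue, and the third has no context to go on, which is exactly when disambiguation fails.
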
Pragmatics and Watson
An example Watson can solve
Category: Decorating
Clue: Though it sounds “harsh,” it’s just embroidery, often in a floral pattern, done with yarn on cotton cloth.
Answer: crewel
• Syntax analysis parses the sentence and determines the parts of speech and the parse tree. It shows that the answer is what “it’s” refers to
• Semantics provides definitions of terms such as “harsh” and “crewel”
• Pragmatics determines what “it’s” refers to and distinguishes between the different definitions of “harsh” and “crewel/cruel”