Building a chatbot: NLP pipeline and dependency parsing By: Andrei Şuiu meetup.com/IASI-AI/ facebook.com/AI.in.Iasi/
What Is a Chatbot?
Chatbots are computer programs, powered by rules and sometimes by artificial intelligence, that mimic conversation with people via a chat interface.
Applications:
● Legal consultancy
● HR services
● Customer services
● Call centres
● Banks
● Restaurants
● Travel services & hotels
● Medical services
History: first chatbot
ELIZA was created from 1964 to 1966 at the MIT AI Laboratory by Joseph Weizenbaum.
Applications: virtual lawyer
https://donotpay-search-master.herokuapp.com
DoNotPay is a chatbot that uses AI to provide free legal advice, created by British entrepreneur Joshua Browder. It can assist with writing letters and filling out forms. By June 2016, DoNotPay had successfully contested 160,000 parking tickets, a 64% success rate, and earlier this year Browder added capabilities to assist asylum seekers in the US, UK and Canada. The bot can now assist with over 1,000 different legal issues in all 50 US states and across the UK.
Applications: Human Resources
Help new employees learn about and find:
● Kitchen, coffee
● Main internal company policies
● Printer/xerox
● Company structure
● Main business processes
● etc.
Perception of chatbots
● You can think of a bot as just another user
● A bot can be invited to a group and post messages with the help of keywords
● Bots can have many of the same qualities as their human counterparts:
  ○ names
  ○ profile photos
  ○ can be direct messaged or mentioned
  ○ can post messages or initiate conversations
  ○ can upload files, etc.
Perception of chatbots: responsibility
Why use chatbots?
● Human language is a natural way to give commands and ask questions
● A single point of navigation that offers contextual and personalized information
● Chatbots give you the opportunity to serve more clients with fewer human resources
● Chatbots are often more cost-effective and faster than their human counterparts
Messaging platforms are opening their APIs
Applications
● Legal consultancy
● HR services
● Customer services (Emag)
  ○ Cross-selling/up-selling, helping customers make purchase decisions
  ○ Handle objections personally, get customer feedback
  ○ Offer discount codes
  ○ Deliver shipping notifications, out-of-stock notifications
● Call centres
● Banks (Livia from BT)
● Restaurants
● Travel services & hotels (Uber chatbot)
● Medical services
Building a chatbot
For a bot to communicate efficiently with humans, it must understand the user's intention, extract the relevant information from it, and then take the appropriate action on that information. One of the main concerns of NLP is extracting intentions and other relevant information from text.
Intention identification
Below I propose a simple method for identifying some types of intentions. Generally, you get a Unicode string from the user input, whether it was typed on a keyboard or generated by a speech recognition engine from the audio stream of a phone line. We'll use a technique called semantic role labeling. Semantic role labeling is an NLP task that consists of detecting the semantic arguments associated with the predicate (verb) of a sentence and classifying them into their specific roles. This is an important step towards making sense of the meaning of a sentence.
Semantic Role Labeling
Can we figure out that these sentences have the same meaning?
● Gates sold Microsoft stock to Google.
● Google bought Microsoft stock from Gates.
● The Microsoft stock was sold to Google by Gates.
● The Microsoft stock was purchased by Google from Gates.
The predicates sold, bought and purchased represent an event. Semantic roles express the abstract role that the arguments of a predicate can take in the event.
Gates - the agent that sells
Google - the agent that buys
Microsoft stock - the object being transacted
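To make the goal concrete, here is a minimal sketch of what "same meaning" looks like as data: all four sentences should end up as one identical frame. It is not a real semantic role labeler. The regular expressions, the `label_roles` function and the role names `seller`, `buyer` and `goods` are hypothetical and hand-written for these four sentences only; a real system would derive the roles from a dependency parse.

```python
import re

# Hand-written surface patterns, one per sentence shape on the slide.
# Passive patterns come first so "was sold" is never read as an active "sold".
PATTERNS = [
    re.compile(r"(?P<goods>.+) was sold to (?P<buyer>.+) by (?P<seller>.+)\."),
    re.compile(r"(?P<goods>.+) was purchased by (?P<buyer>.+) from (?P<seller>.+)\."),
    re.compile(r"(?P<seller>.+) sold (?P<goods>.+) to (?P<buyer>.+)\."),
    re.compile(r"(?P<buyer>.+) bought (?P<goods>.+) from (?P<seller>.+)\."),
]


def label_roles(sentence):
    """Return a seller/buyer/goods frame, or None if no pattern matches."""
    for pattern in PATTERNS:
        match = pattern.fullmatch(sentence)
        if match:
            frame = match.groupdict()
            # Drop a leading article so "The Microsoft stock" equals "Microsoft stock".
            frame["goods"] = re.sub(
                r"^(the|a|an) ", "", frame["goods"], flags=re.IGNORECASE
            )
            return frame
    return None


sentences = [
    "Gates sold Microsoft stock to Google.",
    "Google bought Microsoft stock from Gates.",
    "The Microsoft stock was sold to Google by Gates.",
    "The Microsoft stock was purchased by Google from Gates.",
]
frames = [label_roles(s) for s in sentences]
print(frames[0])
print(all(frame == frames[0] for frame in frames))  # True
```

Surface patterns like these break as soon as the wording changes, which is why the rest of the pipeline builds up to a dependency parse before labeling roles.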
NLP Pipeline
An example of an NLP pipeline for role labeling:
raw text → sentence tokenization → tokenization → PoS-tagging → lemmatization → dependency parsing → role labeling
Sentence tokenization
How would you split sentences in a text? We know that the periods in Mr. Smith and Google Inc. do not mark sentence boundaries.
● A period may denote an abbreviation, a decimal point, an ellipsis (...), or part of an email address, not the end of a sentence.
● About 47% of the periods in the Wall Street Journal corpus denote abbreviations.
And sometimes sentences can start with non-capitalized words. i is a good variable name. And some sentences are not separated by periods!
Sentence Boundary Disambiguation: you can use PTBTokenizer from Stanford CoreNLP for Java, or the Punkt Sentence Tokenizer from NLTK for Python.
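The pitfalls above can be illustrated with a small standard-library sketch. This is not how Punkt or CoreNLP work internally; it is a naive heuristic, and the `ABBREVIATIONS` set and the `split_sentences` function are hypothetical names chosen for the illustration. It splits on `.`, `!` or `?` followed by whitespace, unless the word before the split is a known abbreviation.

```python
import re

# Tiny hand-picked abbreviation list (an assumption for this sketch);
# a real sentence splitter would learn such a list from a corpus.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "inc.", "e.g.", "i.e."}


def split_sentences(text):
    """Split on ., ! or ? followed by whitespace, skipping known abbreviations."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]+\s+", text):
        candidate = text[start:match.end()].strip()
        last_word = candidate.split()[-1].lower()
        if last_word in ABBREVIATIONS:
            continue  # the period belongs to an abbreviation, keep scanning
        sentences.append(candidate)
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


text = "Mr. Smith works at Google Inc. in Iasi. He paid 3.5 euros! i is a good variable name."
for sentence in split_sentences(text):
    print(sentence)
# Mr. Smith works at Google Inc. in Iasi.
# He paid 3.5 euros!
# i is a good variable name.
```

The decimal point in 3.5 is skipped because no whitespace follows it, and the lowercase "i" is handled because the heuristic never looks at capitalization. An abbreviation that really does end a sentence would be missed, which is one reason to prefer a trained splitter.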
Word tokenization
How would you split words in a sentence? We don't want to lose the negative particle. And usually, punctuation marks are not part of the words!
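Both points can be sketched with a single regular expression. This is a simplified illustration, not the tokenizer of any particular library; the `TOKEN` pattern and the `tokenize` function are hypothetical. It keeps the negative particle as its own token (`do` + `n't`) and separates punctuation marks from words.

```python
import re

# Alternatives are tried left to right:
#   \w+(?=n't)  the word part before a negative clitic ("do" in "don't")
#   n't         the negative particle itself
#   \w+         any other word
#   [^\w\s]     a single punctuation mark
TOKEN = re.compile(r"\w+(?=n't)|n't|\w+|[^\w\s]")


def tokenize(sentence):
    """Split a sentence into word, clitic and punctuation tokens."""
    return TOKEN.findall(sentence)


print(tokenize("We don't want to lose the negative particle!"))
# ['We', 'do', "n't", 'want', 'to', 'lose', 'the', 'negative', 'particle', '!']
```

Other contractions and possessives (for example `car's`) are not treated specially here and would be split around the apostrophe.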
PoS Tagging
A part-of-speech (PoS) tagger is a piece of software that reads text in some language and assigns a part of speech, such as noun, verb or adjective, to each word (and other token). Computational applications generally use more fine-grained PoS tags, like 'proper-noun-plural' or 'verb-past-gerund'. Taggers usually use PoS abbreviations like:
● NN - noun, singular
● NNPS - proper noun, plural
● VBZ - verb, 3rd person singular present
● JJR - adjective, comparative
● RBS - adverb, superlative
PoS taggers usually perform tokenization and lemmatization at the same time.
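As a rough sketch of what a tagger's output looks like, the snippet below tags a sentence by dictionary lookup. Real taggers are statistical and use the surrounding context; the `LEXICON` table and the `tag` function here are hypothetical and cover only this one sentence. The tags are Penn Treebank abbreviations like those listed above.

```python
# Descriptions of the Penn Treebank tags used in this sketch.
TAG_NAMES = {
    "DT": "determiner",
    "JJ": "adjective",
    "NN": "noun, singular",
    "VBZ": "verb, 3rd person singular present",
    "IN": "preposition or subordinating conjunction",
}

# Hypothetical hand-made lexicon; a real tagger learns this from annotated text.
LEXICON = {
    "the": "DT", "quick": "JJ", "brown": "JJ", "lazy": "JJ",
    "fox": "NN", "dog": "NN", "jumps": "VBZ", "over": "IN",
}


def tag(tokens):
    """Pair each token with a tag, falling back to NN for unknown words."""
    return [(token, LEXICON.get(token.lower(), "NN")) for token in tokens]


tokens = "The quick brown fox jumps over the lazy dog".split()
for token, pos in tag(tokens):
    print(f"{token:6} {pos:4} {TAG_NAMES[pos]}")
```

A lookup table cannot resolve ambiguous words (for example "jumps" as a plural noun), which is where context-aware taggers come in.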
Lemmatization & Stemming
The goal of both stemming and lemmatization is to reduce inflectional forms and sometimes derivationally related forms of a word to a common base form. For instance:
● am, are, is → be
● car, cars, car's, cars' → car
Stemming usually refers to a crude heuristic process that chops off the ends of words in the hope of achieving this goal correctly most of the time, and often includes the removal of derivational affixes.
Lemmatization usually refers to doing things properly with the use of a vocabulary and morphological analysis of words, normally aiming to remove inflectional endings only and to return the base or dictionary form of a word, which is known as the lemma.
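The difference can be sketched in a few lines. Both functions below are toy illustrations, not the algorithm of any real stemmer or lemmatizer: `crude_stem` only chops a few endings, and `lemmatize` looks words up in a tiny hand-made vocabulary (`LEMMAS`, a hypothetical table limited to the examples above).

```python
def crude_stem(word):
    """Chop a plural or possessive ending off the word, if there is one."""
    for suffix in ("s'", "'s", "s"):
        # Keep at least two characters so very short words stay intact.
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)]
    return word


# Hypothetical vocabulary; a real lemmatizer uses a full dictionary
# plus morphological analysis.
LEMMAS = {
    "am": "be", "are": "be", "is": "be",
    "cars": "car", "car's": "car", "cars'": "car",
}


def lemmatize(word):
    """Return the dictionary form of a known word, else the word unchanged."""
    return LEMMAS.get(word.lower(), word)


print([crude_stem(w) for w in ["car", "cars", "car's", "cars'"]])
# ['car', 'car', 'car', 'car']
print([crude_stem(w) for w in ["am", "are", "is"]])
# ['am', 'are', 'is']
print([lemmatize(w) for w in ["am", "are", "is"]])
# ['be', 'be', 'be']
print(crude_stem("bus"))
# bu
```

The stemmer handles the regular endings but cannot relate "am", "are" and "is" to "be", and it wrongly shortens "bus"; the vocabulary lookup gets the irregular forms right but only knows the words it was given.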
Dependency parsing
A dependency parse connects words according to their relationships. It generates a directed acyclic graph in which the nodes are words, each dependent on its parent, and the edges are labeled with the relationship.
Figure: an example of a graph generated by the Stanford CoreNLP parser.
Dependency parsing
Another way to represent dependencies. Note the root relation.
The quick brown fox jumps over the lazy dog.
● root (ROOT-0, jumps-5)
● det (fox-4, The-1)
● det (dog-9, the-7)
● amod (fox-4, brown-3)
● amod (dog-9, lazy-8)
● nsubj (jumps-5, fox-4)
● case (dog-9, over-6)
● amod (fox-4, quick-2)
● nmod (jumps-5, dog-9)
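The triples above are easy to load into a small data structure and query. The sketch below assumes the `relation (head-index, dependent-index)` text format shown on the slide; the helper names `parse_triples` and `children_by_head` are hypothetical. It finds the root predicate and its subject, which is the starting point for role labeling.

```python
import re
from collections import defaultdict

LINES = [
    "root (ROOT-0, jumps-5)",
    "det (fox-4, The-1)",
    "det (dog-9, the-7)",
    "amod (fox-4, brown-3)",
    "amod (dog-9, lazy-8)",
    "nsubj (jumps-5, fox-4)",
    "case (dog-9, over-6)",
    "amod (fox-4, quick-2)",
    "nmod (jumps-5, dog-9)",
]

TRIPLE = re.compile(r"(\w+) \((\w+)-(\d+), (\w+)-(\d+)\)")


def parse_triples(lines):
    """Turn 'rel (head-i, dep-j)' strings into (rel, (head, i), (dep, j)) tuples."""
    triples = []
    for line in lines:
        rel, head, head_idx, dep, dep_idx = TRIPLE.fullmatch(line).groups()
        triples.append((rel, (head, int(head_idx)), (dep, int(dep_idx))))
    return triples


def children_by_head(triples):
    """Index the graph: head node -> list of (relation, dependent node)."""
    children = defaultdict(list)
    for rel, head, dep in triples:
        children[head].append((rel, dep))
    return children


children = children_by_head(parse_triples(LINES))
root = children[("ROOT", 0)][0][1]
subjects = [dep for rel, dep in children[root] if rel == "nsubj"]
modifiers = sorted(dep for rel, dep in children[subjects[0]] if rel == "amod")

print(root)       # ('jumps', 5)
print(subjects)   # [('fox', 4)]
print(modifiers)  # [('brown', 3), ('quick', 2)]
```

Keeping the word index in each node matters because the same word can occur more than once in a sentence, as "the" does here.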