
S9276: Towards Open-Domain Conversational AI



  1. S9276: Towards Open-Domain Conversational AI. Yun-Nung (Vivian) Chen 陳縕儂, http://vivianchen.idv.tw

  2. Iron Man (2008): What can machines achieve now or in the future?

  3. Language Empowering Intelligent Assistant
     • Apple Siri (2011)
     • Google Now (2012)
     • Microsoft Cortana (2014)
     • Amazon Alexa/Echo (2014)
     • Facebook M & Bot (2015)
     • Google Assistant (2016)
     • Google Home (2016)
     • Apple HomePod (2017)
     • Lab watermark on the slides: MIULAB, NTU

  4. Why Natural Language?
     • Global Digital Statistics (2018 January)
       Total Population: 7.59B
       Internet Users: 4.02B
       Active Social Media Users: 3.20B
       Unique Mobile Users: 5.14B
       Active Mobile Social Users: 2.96B
       Percentages shown on the slide: 4%, 14%, 7%, 13%
     • Device input is evolving towards speech, the more natural and convenient modality.

  5. Why and When We Need?
     • “I want to chat”: Turing Test (talk like a human) → Social Chit-Chat
     • “I have a question”: Information consumption → Task-Oriented Dialogues
     • “I need to get this done”: Task completion → Task-Oriented Dialogues
     • “What should I do?”: Decision support → Task-Oriented Dialogues
     Example requests:
     • What is today’s agenda?
     • What does GTC stand for?
     • Book me the flight ticket from Taipei to San Francisco
     • Reserve a table at Din Tai Fung for 5 people, 7PM tonight
     • Is GTC good to attend?

  6. Intelligent Assistants: Task-Oriented

  7. Conversational Agents: Chit-Chat and Task-Oriented

  8. Task-Oriented Dialogue Systems
     • JARVIS – Iron Man’s Personal Assistant
     • Baymax – Personal Healthcare Companion

  9. Task-Oriented Dialogue System (Young, 2000) http://rsta.royalsocietypublishing.org/content/358/1769/1389.short
     • Speech Recognition: Speech Signal → Hypothesis “are there any action movies to see this weekend” (or Text Input “Are there any action movies to see this weekend?”)
     • Language Understanding (LU): Domain Identification, User Intent Detection, Slot Filling → Semantic Frame: request_movie (genre=action, date=this weekend)
     • Dialogue Management (DM): Dialogue State Tracking (DST), Dialogue Policy, connected to Backend Action / Knowledge Providers → System Action/Policy: request_location
     • Natural Language Generation (NLG) → Text response: “Where are you located?”

  10. Task-Oriented Dialogue System (Young, 2000): same pipeline as slide 9 (Speech Recognition → Language Understanding → Dialogue Management → Natural Language Generation)

  11. Semantic Frame Representation
      • Requires a domain ontology: early connection to backend
      • Contains core content (intent, a set of slots with fillers)
      • Restaurant Domain (restaurant: price, type, location)
        “find me a cheap taiwanese restaurant in oakland”
        → find_restaurant (price=“cheap”, type=“taiwanese”, location=“oakland”)
      • Movie Domain (movie: genre, year, director)
        “show me action movies directed by james cameron”
        → find_movie (genre=“action”, director=“james cameron”)
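
The frame on this slide can be sketched as a small data structure: an intent plus a set of slots with fillers, checked against a domain ontology. This is only an illustration; the `SemanticFrame` class, the `ONTOLOGY` dictionary and the `validate` helper are hypothetical names, not part of the presentation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SemanticFrame:
    """An intent plus a set of slots with fillers (hypothetical helper class)."""
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)

    def __str__(self) -> str:
        args = ", ".join(f'{name}="{value}"' for name, value in self.slots.items())
        return f"{self.intent}({args})"


# Assumed ontology: the slot names listed for each domain on the slide.
ONTOLOGY = {
    "restaurant": {"price", "type", "location"},
    "movie": {"genre", "year", "director"},
}


def validate(frame: SemanticFrame, domain: str) -> bool:
    """Return True if every slot in the frame is defined for the domain."""
    return set(frame.slots) <= ONTOLOGY[domain]


restaurant_frame = SemanticFrame(
    "find_restaurant", {"price": "cheap", "type": "taiwanese", "location": "oakland"}
)
movie_frame = SemanticFrame(
    "find_movie", {"genre": "action", "director": "james cameron"}
)
print(restaurant_frame)
print(validate(movie_frame, "movie"))
```

Keeping the ontology separate from the frame mirrors the slide's point that the frame needs an early connection to the backend: a slot the ontology does not define cannot be looked up later.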

  12. Backend Database / Ontology
      • Domain-specific table
      • Target and attributes (target: movie name; attributes: theater, rating, date, time)
      • Functionality
        Information access: find specific entries
        Task completion: find the row that satisfies the constraints

      Movie Name     Theater    Rating  Date        Time
      Iron Man Last  Taipei A1  8.5     2018/10/31  09:00
      Iron Man Last  Taipei A1  8.5     2018/10/31  09:25
      Iron Man Last  Taipei A1  8.5     2018/10/31  10:15
      Iron Man Last  Taipei A1  8.5     2018/10/31  10:40
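
Both functions named on this slide amount to filtering the domain table by slot constraints. A minimal sketch, assuming the table is held in memory as a list of dictionaries; `MOVIE_TABLE` and `find_rows` are illustrative names, and the rows are the four shown on the slide.

```python
# Rows taken from the slide's example table; only the show time differs.
MOVIE_TABLE = [
    {"movie_name": "Iron Man Last", "theater": "Taipei A1", "rating": 8.5,
     "date": "2018/10/31", "time": show_time}
    for show_time in ("09:00", "09:25", "10:15", "10:40")
]


def find_rows(table, **constraints):
    """Return every row whose attributes equal all of the given constraints."""
    return [
        row for row in table
        if all(row.get(name) == value for name, value in constraints.items())
    ]


# Information access: list all entries for a movie.
showings = find_rows(MOVIE_TABLE, movie_name="Iron Man Last")

# Task completion: find the row that satisfies every constraint.
booking = find_rows(MOVIE_TABLE, movie_name="Iron Man Last", time="10:15")
print(len(showings), booking[0]["theater"])
```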

  13. Task-Oriented Dialogue System (Young, 2000): same pipeline as slide 9 (Speech Recognition → Language Understanding → Dialogue Management → Natural Language Generation)

  14. Language Understanding (LU)
      • Pipelined: 1. Domain Classification → 2. Intent Classification → 3. Slot Filling

  15. 1. Domain Identification: Requires Predefined Domain Ontology
      • User: “find a good eating place for taiwanese food”
      • Intelligent Agent chooses among Organized Domain Knowledge (Database): Movie DB, Restaurant DB, Taxi DB
      • Classification!

  16. 2. Intent Detection: Requires Predefined Schema
      • User: “find a good eating place for taiwanese food”
      • Intelligent Agent chooses among the Restaurant DB intents: FIND_RESTAURANT, FIND_PRICE, FIND_TYPE, ...
      • Classification!

  17. 3. Slot Filling: Requires Predefined Schema
      • User: “find a good eating place for taiwanese food”
      • Tags: find/O a/O good/B-rating eating/O place/O for/O taiwanese/B-type food/O
      • Semantic Frame: FIND_RESTAURANT rating=“good” type=“taiwanese”
      • Query: SELECT restaurant { rest.rating=“good” rest.type=“taiwanese” }
      • Sequence Labeling

      Restaurant DB
      Restaurant  Rating  Type
      Rest 1      good    Taiwanese
      Rest 2      bad     Thai
      :           :       :
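
To make the step from tag sequence to semantic frame concrete, here is a small sketch that collects B-/I- tagged words into slot values. The function name `bio_to_slots` is an assumption for illustration; the words and tags are the ones on the slide.

```python
def bio_to_slots(words, tags):
    """Group B-/I- tagged words into a slot-name -> value dictionary."""
    slots = {}
    current = None  # name of the slot currently being extended, if any
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            slots[current] = word
        elif tag.startswith("I-") and current == tag[2:]:
            slots[current] += " " + word
        else:
            # "O" (or an I- tag that does not continue the open slot) ends it.
            current = None
    return slots


words = "find a good eating place for taiwanese food".split()
tags = ["O", "O", "B-rating", "O", "O", "O", "B-type", "O"]
print(bio_to_slots(words, tags))  # {'rating': 'good', 'type': 'taiwanese'}
```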

  18. Slot Tagging (Yao+, 2013; Mesnil+, 2015)
      http://131.107.65.14/en-us/um/people/gzweig/Pubs/Interspeech2013RNNLU.pdf; http://dl.acm.org/citation.cfm?id=2876380
      • Variations:
        a. RNNs with LSTM cells
        b. Input, sliding window of n-grams
        c. Bi-directional LSTMs
      [Figure panels: (a) LSTM, (b) LSTM-LA, (c) bLSTM; each maps input words w_0 … w_n through hidden states h_0 … h_n to slot tags y_0 … y_n]

  19. Slot Tagging (Kurata+, 2016; Simonnet+, 2015)
      http://www.aclweb.org/anthology/D16-1223
      • Encoder-decoder networks
        Leverages sentence level information
      • Attention-based encoder-decoder
        Use of attention (as in MT) in the encoder-decoder network
        Attention is estimated using a feed-forward network with input: $h_t$ and $s_t$ at time t
      [Figures: encoder-decoder tagger and attention-based encoder-decoder tagger, mapping words w_0 … w_n to slot tags y_0 … y_n; the attention variant adds decoder states s_0 … s_n and a context vector c_i over h_0 … h_n]
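
The attention step can be sketched in plain Python. The slide only says that a feed-forward network takes the encoder state and the decoder state as input; the single tanh layer used below, and the parameter names `w_h`, `w_s` and `v`, are assumptions made for illustration rather than the exact network of the cited papers.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(encoder_states, decoder_state, w_h, w_s, v):
    """Score each encoder state against the decoder state, then average.

    Assumed scoring network: score_j = v . tanh(w_h h_j + w_s s_t).
    Returns the attention weights and the context vector sum_j a_j h_j.
    """
    projected_s = matvec(w_s, decoder_state)
    scores = []
    for h in encoder_states:
        hidden = [math.tanh(a + b) for a, b in zip(matvec(w_h, h), projected_s)]
        scores.append(sum(vi * x for vi, x in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, encoder_states)) for d in range(dim)
    ]
    return weights, context
```

The context vector is a weighted average of the encoder states, which is how the decoder gets access to sentence-level information at each output position.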

  20. Joint Semantic Frame Parsing
      • Sequence-based (Hakkani-Tur et al., 2016): slot filling and intent prediction in the same output sequence
      • Parallel (Liu and Lane, 2016): intent prediction and slot filling are performed in two branches
      • Example (RNN with weights U, W, V and hidden states h_{t-1}, h_t, h_{t+1}, h_{T+1}):
        input:  taiwanese  food  please  EOS
        output: B-type     O     O       FIND_REST
        (Slot Filling on the words, Intent Prediction at EOS)
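
In the sequence-based formulation, one output sequence carries both results: slot tags for the words and the intent at the final EOS position. A minimal sketch of reading such an output apart; `split_joint_output` is a hypothetical helper, and the tokens and labels are the slide's example.

```python
def split_joint_output(tokens, labels):
    """Separate a joint output sequence into (intent, [(word, slot_tag), ...]).

    Assumes the input ends with an "EOS" token whose label is the intent.
    """
    if not tokens or len(tokens) != len(labels) or tokens[-1] != "EOS":
        raise ValueError("expected equal-length sequences ending in EOS")
    intent = labels[-1]
    tagged_words = list(zip(tokens[:-1], labels[:-1]))
    return intent, tagged_words


tokens = ["taiwanese", "food", "please", "EOS"]
labels = ["B-type", "O", "O", "FIND_REST"]
intent, tagged_words = split_joint_output(tokens, labels)
print(intent, tagged_words)
```

A parallel model would instead return the intent and the tag sequence from two separate output branches, so no such splitting is needed.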

  21. Joint Model Comparison

      Model                        Attention Mechanism  Intent-Slot Relationship
      Joint bi-LSTM                X                    Δ (Implicit)
      Attentional Encoder-Decoder  √                    Δ (Implicit)
      Slot Gate Joint Model        √                    √ (Explicit)

  22. Slot-Gated Joint SLU (Goo+, 2018)
      • Architecture: Word Sequence $x_1, x_2, x_3, x_4$ → BLSTM → Slot Attention and Intent Attention → Slot Gate → Slot Sequence $y_1^S, y_2^S, y_3^S, y_4^S$ and Intent $y^I$
      • Slot Gate: $g = \sum v \cdot \tanh\left(c_i^S + W \cdot c^I\right)$
      • Slot Prediction: $y_i^S = \mathrm{softmax}\left(W^S \left(h_i + g \cdot c_i^S\right) + b^S\right)$
      • $g$ will be larger if slot and intent are better related
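
A small numeric sketch of the slot gate may help in reading the two formulas. It uses plain Python lists; the function names and the toy parameter values (identity `W`, all-ones `v`) are assumptions chosen only to make the behaviour visible, not trained values from the paper.

```python
import math


def slot_gate(slot_context, intent_context, W, v):
    """Compute g = sum(v * tanh(c_i^S + W c^I)) for one word position."""
    projected = [sum(w * c for w, c in zip(row, intent_context)) for row in W]
    return sum(
        vk * math.tanh(s + p) for vk, s, p in zip(v, slot_context, projected)
    )


def gated_slot_input(hidden, slot_context, gate):
    """Compute h_i + g * c_i^S, the vector fed to the slot softmax layer."""
    return [h + gate * c for h, c in zip(hidden, slot_context)]


# Toy parameters (assumed): identity projection and unit weights.
W = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]

aligned = slot_gate([1.0, 1.0], [1.0, 1.0], W, v)    # contexts point the same way
opposed = slot_gate([1.0, 1.0], [-1.0, -1.0], W, v)  # contexts cancel out
print(aligned > opposed)
```

With these toy values, the gate is larger when the slot and intent context vectors agree, so the slot context contributes more to the prediction, which matches the slide's remark about $g$.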

  23. Contextual LU
      Domain Identification → Intent Prediction → Slot Filling
      • Single-turn example
        D: communication
        I: send_email
        U: just sent email to bob about fishing this weekend
        S: just/O sent/O email/O to/O bob/B-contact_name about/O fishing/B-subject this/I-subject weekend/I-subject
        → send_email(contact_name=“bob”, subject=“fishing this weekend”)
      • Multi-turn example
        U1: send email to bob
        S1: bob/B-contact_name
        → send_email(contact_name=“bob”)
        U2: are we going to fish this weekend
        S2: are/B-message we/I-message going/I-message to/I-message fish/I-message this/I-message weekend/I-message
        → send_email(message=“are we going to fish this weekend”)

  24. Contextual LU
      • User utterances are highly ambiguous in isolation
      • Domain: Restaurant Booking
        User: “Book a table for 10 people tonight.”
        System: “Which restaurant would you like to book a table for?”
        User: “Cascal, for 6.” → is “6” the #people or the time?
