Deep Learning for Dialogue Systems GTC 2018

Deep Learning for Dialogue Systems GTC 2018
Prof. Yun-Nung (Vivian) Chen 陳縕儂
Mar 28th, 2018
HTTP://VIVIANCHEN.IDV.TW

Best Poster Award @ GTC 2017 Thanks NVIDIA!!!

Future Life – Intelligent Assistant

Introduction & Background

Language Empowering Intelligent Assistant
• Microsoft Cortana (2014)
• Google Now (2012)
• Apple Siri (2011)
• Google Assistant (2016)
• Apple HomePod (2017)
• Amazon Alexa/Echo (2014)
• Facebook M & Bot (2015)
• Google Home (2016)

Why Do We Need Intelligent Assistants?
• Get things done
  E.g. set up an alarm/reminder, take a note
• Easy access to structured data, services and apps
  E.g. find docs/photos/restaurants
• Assist your daily schedule and routine
  E.g. commute alerts to/from work
• Be more productive in managing your work and personal life
“Hey Assistant”

Why Natural Language?
Global Digital Statistics (2017 January)
Total Population: 7.48B
Internet Users: 3.77B
Active Social Media Users: 2.79B
Unique Mobile Users: 4.92B
Active Mobile Social Users: 2.55B
Device input is evolving toward speech, which is more natural and convenient.

Dialogue System
• Spoken dialogue systems are intelligent agents that help users finish tasks more efficiently via spoken interactions.
• Spoken dialogue systems are being incorporated into various devices (smartphones, smart TVs, in-car navigation systems, etc.).
JARVIS – Iron Man’s Personal Assistant
Baymax – Personal Healthcare Companion
Good dialogue systems help users access information conveniently and finish tasks efficiently.

App → Bot
• A bot is responsible for a “single” domain, similar to an app
Users can initiate dialogues instead of following the GUI design

Task-Oriented Dialogue System (Young, 2000)
http://rsta.royalsocietypublishing.org/content/358/1769/1389.short
Speech Signal → Speech Recognition → Hypothesis: are there any action movies to see this weekend
Text Input: Are there any action movies to see this weekend?
→ Language Understanding (LU)
  • Domain Identification
  • User Intent Detection
  • Slot Filling
→ Semantic Frame: request_movie genre=action, date=this weekend
→ Dialogue Management (DM)
  • Dialogue State Tracking (DST)
  • Dialogue Policy
  Backend Action / Knowledge Providers
→ System Action/Policy: request_location
→ Natural Language Generation (NLG)
→ Text response: Where are you located?

Interaction Example
User: find a good eating place for taiwanese food
Intelligent Agent: Good Taiwanese eating places include Din Tai Fung, Boiling Point, etc. What do you want to choose? I can help you go there.
Q: How does a dialogue system process this request?

1. Domain Identification Requires Predefined Domain Ontology
User: find a good eating place for taiwanese food
Intelligent Agent: Organized Domain Knowledge (Database)
• Movie DB
• Restaurant DB
• Taxi DB
Classification!
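As a rough illustration of domain identification as classification, the sketch below scores an utterance against keyword sets for the three databases on the slide. The keyword lists and the function name `identify_domain` are hypothetical; a real system would use a trained classifier rather than hand-picked keywords.

```python
# Hypothetical keyword sets per domain (illustrative only, not from the talk).
DOMAIN_KEYWORDS = {
    "movie": {"movie", "movies", "film", "showtime"},
    "restaurant": {"eating", "food", "restaurant", "dinner"},
    "taxi": {"taxi", "ride", "pickup"},
}


def identify_domain(utterance):
    """Return the domain whose keyword set overlaps the utterance the most."""
    tokens = utterance.lower().split()
    scores = {
        domain: sum(1 for tok in tokens if tok in keywords)
        for domain, keywords in DOMAIN_KEYWORDS.items()
    }
    # Iterating over the fixed domain order keeps ties deterministic.
    return max(DOMAIN_KEYWORDS, key=lambda d: scores[d])


domain = identify_domain("find a good eating place for taiwanese food")
```

Here `domain` comes out as the restaurant domain because "eating" and "food" both match its keyword set.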

2. Intent Detection Requires Predefined Schema
User: find a good eating place for taiwanese food
Intelligent Agent: Restaurant DB
• FIND_RESTAURANT
• FIND_PRICE
• FIND_TYPE
• …
Classification!

3. Slot Filling Requires Predefined Schema
User:  find  a  good      eating  place  for  taiwanese  food
Tags:  O     O  B-rating  O       O      O    B-type     O

Restaurant DB
Restaurant | Rating | Type
Rest 1     | good   | Taiwanese
Rest 2     | bad    | Thai
:          | :      | :

Semantic Frame: FIND_RESTAURANT rating=“good” type=“taiwanese”
Intelligent Agent: SELECT restaurant { rest.rating=“good” rest.type=“taiwanese” }
Sequence Labeling
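To show how a tagged utterance becomes a semantic frame, here is a minimal sketch that collects slot values from IOB tags. The helper name `iob_to_slots` and the dictionary layout of the frame are assumptions for illustration; the tags themselves would come from a sequence labeling model.

```python
def iob_to_slots(tokens, tags):
    """Collect slot values from parallel token and IOB tag lists."""
    slots = {}
    current = None  # name of the slot currently being extended, if any
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag opens a new slot value.
            current = tag[2:]
            slots[current] = token
        elif tag.startswith("I-") and current == tag[2:]:
            # An I- tag continues the slot opened by the preceding B- tag.
            slots[current] += " " + token
        else:
            current = None
    return slots


tokens = "find a good eating place for taiwanese food".split()
tags = ["O", "O", "B-rating", "O", "O", "O", "B-type", "O"]
frame = {"intent": "FIND_RESTAURANT", "slots": iob_to_slots(tokens, tags)}
```

The resulting frame carries the same rating and type slots as the slide's FIND_RESTAURANT example.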

Elements of Dialogue Management
(Figure from Gašić)

State Tracking Requires Hand-Crafted States
User: find a good eating place for taiwanese food
User: i want it near to my office
Intelligent Agent states: NULL; location; rating; type; rating, type; loc, type; loc, rating; all
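One simple way to read the hand-crafted states is as the set of slots filled so far. The sketch below merges each turn's slots into the running state; the function name `update_state` and the location value are hypothetical, and it assumes the language understanding output is already a slot dictionary.

```python
def update_state(state, new_slots):
    """Return a new dialogue state with the latest turn's slots merged in."""
    merged = dict(state)
    merged.update(new_slots)
    return merged


state = {}  # corresponds to the NULL state
# Turn 1: "find a good eating place for taiwanese food"
state = update_state(state, {"rating": "good", "type": "taiwanese"})
after_first_turn = sorted(state)
# Turn 2: "i want it near to my office" (slot value is an illustrative guess)
state = update_state(state, {"location": "near my office"})
after_second_turn = sorted(state)
```

After the first turn the filled slots are rating and type; after the second turn all three slots are filled, which matches the "all" state.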

State Tracking Handling Errors and Confidence
User: find a good eating place for taixxxx food
Hypotheses:
• FIND_RESTAURANT rating=“good” type=“taiwanese”
• FIND_RESTAURANT rating=“good” type=“thai”
• FIND_RESTAURANT rating=“good”
Candidate states, each with uncertain confidence (?):
• rating=“good”, type=“thai”
• rating=“good”, type=“taiwanese”
• rating=“good”
Intelligent Agent states: NULL; location; rating; type; rating, type; loc, type; loc, rating; all
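When recognition is noisy, the tracker can keep a belief over competing hypotheses instead of a single state. The sketch below normalizes scored hypotheses into a per-slot belief. The confidence scores are made-up illustrative numbers, and `slot_beliefs` is a hypothetical helper, not a method from the talk.

```python
def slot_beliefs(hypotheses):
    """Turn (slots, score) hypotheses into a normalized belief per slot value."""
    total = sum(score for _, score in hypotheses)
    beliefs = {}
    for slots, score in hypotheses:
        for name, value in slots.items():
            per_slot = beliefs.setdefault(name, {})
            per_slot[value] = per_slot.get(value, 0.0) + score / total
    return beliefs


# Scores are invented for illustration; a recognizer would supply them.
hypotheses = [
    ({"rating": "good", "type": "taiwanese"}, 0.5),
    ({"rating": "good", "type": "thai"}, 0.3),
    ({"rating": "good"}, 0.2),
]
beliefs = slot_beliefs(hypotheses)
best_type = max(beliefs["type"], key=beliefs["type"].get)
```

All three hypotheses agree on the rating, so that slot is certain, while the type slot stays split between the two candidates.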

Dialogue Policy for Agent Action
• Inform(location=“Taipei 101”)
  “The nearest one is at Taipei 101”
• Request(location)
  “Where is your home?”
• Confirm(type=“taiwanese”)
  “Did you want Taiwanese food?”
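The three action types above can be produced by a small rule-based policy. This is only a sketch under stated assumptions: the required slots, the confidence threshold, and the `choose_action` signature are hypothetical, and learned policies (for example via reinforcement learning) are a common alternative.

```python
REQUIRED_SLOTS = ("type", "location")  # hypothetical task schema
CONFIRM_THRESHOLD = 0.6  # hypothetical confidence cutoff


def choose_action(state, confidence, result=None):
    """Pick the next system action as an (act, slot, value) tuple."""
    for slot in REQUIRED_SLOTS:
        if slot not in state:
            # Missing information: ask the user for it.
            return ("Request", slot, None)
        if confidence.get(slot, 1.0) < CONFIRM_THRESHOLD:
            # Low confidence: check the value before using it.
            return ("Confirm", slot, state[slot])
    # Everything needed is known: report the backend result.
    return ("Inform", "location", result)


ask = choose_action({"type": "taiwanese"}, {"type": 0.9})
check = choose_action({"type": "taiwanese"}, {"type": 0.5})
tell = choose_action({"type": "taiwanese", "location": "home"}, {}, result="Taipei 101")
```

The three calls yield a Request, a Confirm, and an Inform action respectively.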

Output / Natural Language Generation
• Goal: generate natural language or GUI given the selected dialogue action for interactions
• Inform(location=“Taipei 101”)
  “The nearest one is at Taipei 101” vs. [GUI]
• Request(location)
  “Where is your home?” vs. [GUI]
• Confirm(type=“taiwanese”)
  “Did you want Taiwanese food?” vs. [GUI]
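A template lookup is the simplest way to turn a dialogue action into text. The sketch below reproduces the three sentences from the slide; the `TEMPLATES` table and the `generate` function are illustrative names, and neural generators are another option.

```python
# One template per (act, slot) pair, using the sentences from the slide.
TEMPLATES = {
    ("Inform", "location"): "The nearest one is at {value}",
    ("Request", "location"): "Where is your home?",
    ("Confirm", "type"): "Did you want {value} food?",
}


def generate(act, slot, value=None):
    """Fill the template for a dialogue action with its slot value."""
    template = TEMPLATES[(act, slot)]
    # Templates without a {value} placeholder simply ignore the argument.
    return template.format(value=value)


inform_text = generate("Inform", "location", "Taipei 101")
request_text = generate("Request", "location")
confirm_text = generate("Confirm", "type", "Taiwanese")
```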

Deep Learning for Dialogue Systems

Machine Learning ≈ Looking for a Function
• Speech Recognition: f([audio]) = “你好 (Hello)”
• Image Recognition: f([image]) = cat
• Go Playing: f([board]) = 5-5 (next move)
• Chat Bot: f(“Where is GTC?”) = “The address is …”

A Single Neuron
Inputs $x_1, x_2, \dots, x_N$ with weights $w_1, w_2, \dots, w_N$ and bias $b$:
$z = w_1 x_1 + w_2 x_2 + \dots + w_N x_N + b$
Activation function: $y = \sigma(z)$
Sigmoid function: $\sigma(z) = \frac{1}{1 + e^{-z}}$
w, b are the parameters of this neuron
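The neuron's computation, a weighted sum plus bias followed by a sigmoid, can be written in a few lines. The function names and the example weights below are illustrative choices.

```python
import math


def sigmoid(z):
    """Sigmoid activation: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, b):
    """Weighted sum of the inputs plus bias, passed through the sigmoid."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)


# 0.5 * 1.0 + 0.25 * (-2.0) + 0.0 = 0, and sigmoid(0) is 0.5.
y = neuron([1.0, -2.0], [0.5, 0.25], 0.0)
```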

A Single Neuron
$f: R^N \to R^M$
$z = w_1 x_1 + w_2 x_2 + \dots + w_N x_N + b$, $y = \sigma(z)$
is “2” if $y \ge 0.5$
not “2” if $y < 0.5$
A single neuron can only handle binary classification

A Layer of Neurons
$f: R^N \to R^M$
• Handwriting digit classification
Inputs $x_1, x_2, \dots, x_N$; one output per class:
$y_1$: “1” or not
$y_2$: “2” or not
$y_3$: “3” or not
…
10 neurons/10 classes. Which one is max?
A layer of neurons can handle multiple possible outputs; the result is the class whose output is the largest.
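Picking the class with the largest output can be sketched as follows. The layer here uses three neurons with hand-set weights purely for illustration (the slide's digit classifier would have ten), and `layer` and `predict` are hypothetical names.

```python
import math


def layer(x, weights, biases):
    """One sigmoid output per neuron; weights holds one row per neuron."""
    outputs = []
    for row, b in zip(weights, biases):
        z = sum(wi * xi for wi, xi in zip(row, x)) + b
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


def predict(x, weights, biases):
    """Return the index of the neuron with the largest output."""
    y = layer(x, weights, biases)
    return max(range(len(y)), key=lambda i: y[i])


# Hand-set toy parameters, not trained values.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0]
winner = predict([0.2, 2.0], weights, biases)
```

For this input the second neuron has the largest pre-activation (2.0), so `winner` is index 1.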

Deep Neural Networks (DNN)
$f: R^N \to R^M$
• Fully connected feedforward network
Input vector $x_1, x_2, \dots, x_N$ → Layer 1 → Layer 2 → … → Layer L → Output vector $y_1, y_2, \dots, y_M$
Deep NN: multiple hidden layers

Recurrent Neural Network (RNN)
[Figure: RNN unrolled over time; activation: tanh, ReLU]
RNN can learn accumulated sequential information (time-series)
http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/
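The slide's formula did not survive extraction, so the sketch below assumes the standard simple recurrence, where each hidden state depends on the current input and the previous hidden state through tanh. It uses scalar weights for readability; the names and default values are illustrative.

```python
import math


def rnn_step(x_t, h_prev, w_x, w_h, b):
    """Assumed scalar recurrence: h_t = tanh(w_x * x_t + w_h * h_prev + b)."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)


def run_rnn(inputs, w_x=1.0, w_h=0.5, b=0.0):
    """Run the recurrence over a sequence and return every hidden state."""
    h = 0.0  # initial hidden state
    states = []
    for x_t in inputs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states


states = run_rnn([1.0, 0.0])
```

The second input is zero, yet the second hidden state is still positive because it carries information forward from the first step.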

Deep Learning for LU
• IOB Sequence Labeling for Slot Filling: (a) LSTM, (b) LSTM-LA, (c) bLSTM
• Intent Classification: (d) Intent LSTM
[Figure: network diagrams with inputs x, hidden states h, and outputs z at each time step]
