Chatbots as active members of our society
Proseminar Data Mining
Luca Dombetzki
Fakultät für Informatik, Technische Universität München
Email: luca.dombetzki@tum.de
AGENDA
Introduction
Definition
Brief history of chatbots
Use cases
Main Problems
Mechanics
Detection
Example of a Seq2Seq Model using TF
Conclusion
Luca Dombetzki, Proseminar Datamining, TU Munich 9th June, 2017
Chatbots today: Microsoft Tay
Introduction
● What can chatbots do?
● How far have they come?
● What limits still constrain them while impacting our society?
Definition - Chatbot
● Alias: Chatterbot
● Computer program
● Uses textual methods
● Interacts with human beings
● Aim 1: Tool, known as a bot
● Aim 2: Convincingly participate in human conversation
Brief history of chatbots
Timeline (axis from 1950 to 2025), in order of appearance on the slides:
● Turing Test
● ELIZA (DOCTOR)
● PARRY
● Loebner Prize
● A.L.I.C.E (AIML)
● Cleverbot
● Facebook
● Whatsapp
● Tay
Brief history of chatbots
Now:
● Companies interested in chatbots
● Many different chatbots on the market
● Several use cases
● Deep learning
● Malicious bots
Use Cases
● Customer service
● Information acquisition
● Research
● Malicious intent
Use Cases – Customer Service
● Goals:
  ● Customer closeness
  ● Reliably understand the customer
  ● Integrate seamlessly
  => human-like appearance not necessary
● Implementation:
  ● Pattern based approach
  ● Instant messaging platform APIs with extra features
  ● Closed domain
Use Cases – Information Acquisition
● Goals:
  ● Simple implementation
  ● Ease of use for the customer
  => human-like appearance not necessary
● Implementation:
  ● Pattern based approach
  ● Instant messaging platform APIs with extra features
  ● Closed domain
Use Cases – Research
● Natural Language Processing as main topic
  => Access to a lot of data to train, analyze and learn from
● Opinion mining / sentiment analysis
  => Negobot, a chatbot trained to find pedophiles
Use Cases – Malicious Intent
● Advertisement / spam
● Phishing attacks
  => Disclosure of private information
● Spreading of misinformation
  => Manipulation of public opinion
  => "The better click bots"
Main Problems
● Validation
● Coherent personality (same answer to semantically identical questions)
● Context
● Intention and diversity
Mechanics
A complex chatbot, broken down into categories of interest:
● Response
● Intent
● Context
● Domain
Mechanics - Response
Retrieval based:
● Database as a backend
● Retrieval algorithm
● No new text generated
+ Spelling mistakes preventable
+ Reliable
- Open domain impossible
Generative based:
● Generate complete text
● Recurrent Neural Networks (LSTM / GRU)
+ Open domain learnable (in theory)
- Unreliable
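The retrieval based approach can be illustrated with a minimal sketch. It assumes a tiny in-memory "database" and uses Jaccard token overlap as the retrieval algorithm; the stored prompts, the `retrieve` function and the threshold of 0.3 are hypothetical choices for illustration, not taken from the talk.

```python
import re

# Hypothetical mini "database" of (stored prompt, canned response) pairs.
RESPONSES = [
    ("what are your opening hours", "We are open from 9 am to 5 pm."),
    ("how can i reset my password", "Use the 'Forgot password' link on the login page."),
    ("where is my order", "Please send me your order number."),
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(message, threshold=0.3):
    """Return the stored response whose prompt overlaps most with the message.

    No new text is generated: the bot can only answer with what is stored,
    and falls back to a fixed reply when nothing is similar enough.
    """
    query = tokenize(message)
    best_score, best_response = 0.0, None
    for prompt, response in RESPONSES:
        score = jaccard(query, tokenize(prompt))
        if score > best_score:
            best_score, best_response = score, response
    if best_score < threshold:
        return "Sorry, I did not understand that."
    return best_response
```

Calling `retrieve("Where is my order?")` returns the stored order reply, while an unrelated message such as "Tell me a joke" hits the fallback, which shows why this design is reliable but limited to a closed domain.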
Mechanics - Intent
Pattern approach (AIML):
● Symbolic reduction
● Divide and conquer
● Synonyms
● Spelling / grammar correction
+ Reliable, verifiable
– Manual
Classification approach:
● E.g. Recurrent Neural Networks (LSTM / GRU) produce an "intent vector"
+ Fully automatic and scalable
– Intent vector not human readable => decoder required
Mechanics – Intent (Code)
<!-- Symbolic reduction: rewrite the question into the simpler form "WHO IS *" -->
<category>
  <pattern>DO YOU KNOW WHO * IS</pattern>
  <template><srai>WHO IS <star/></srai></template>
</category>
<!-- Divide and conquer: answer "YES" first, then process the rest via <sr/> -->
<category>
  <pattern>YES *</pattern>
  <template><srai>YES</srai> <sr/></template>
</category>
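To make the symbolic reduction in the first AIML category more concrete, here is a rough Python sketch of how such a rule could be evaluated. It is not an AIML interpreter: the `RULES` table, the `SRAI ` prefix convention and the answer template for "WHO IS *" are assumptions made for this illustration, and the divide-and-conquer rule ("YES *") is left out.

```python
import re

# Hypothetical rule table: (pattern, template).
# "*" is a wildcard, "<star/>" is replaced by the text the wildcard matched.
# A template starting with "SRAI " is fed back into the matcher.
RULES = [
    ("DO YOU KNOW WHO * IS", "SRAI WHO IS <star/>"),
    ("WHO IS *", "I do not know who <star/> is."),
]
FALLBACK = "I have no answer for that."


def normalize(text):
    """Uppercase, drop punctuation and collapse whitespace."""
    cleaned = re.sub(r"[^A-Z0-9 ]", "", text.upper())
    return " ".join(cleaned.split())


def pattern_to_regex(pattern):
    """Translate a wildcard pattern into an anchored regular expression."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.+)".join(parts) + "$")


def respond(message, depth=0):
    """Answer a message, following SRAI reductions up to a small depth."""
    if depth > 10:  # guard against rules that reduce to each other forever
        return FALLBACK
    text = normalize(message)
    for pattern, template in RULES:
        match = pattern_to_regex(pattern).match(text)
        if match:
            star = match.group(1) if match.groups() else ""
            output = template.replace("<star/>", star)
            if output.startswith("SRAI "):
                # Symbolic reduction: re-match the simplified input.
                return respond(output[len("SRAI "):], depth + 1)
            return output
    return FALLBACK
```

With these rules, "Do you know who Alan Turing is?" is first reduced to "WHO IS ALAN TURING" and then answered by the second rule, so only one answer template has to be maintained by hand.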
Mechanics - Context
Rule based (AIML):
● State machine, variables
● Conditionals
+ Human readable
- Human planning
Machine learning based:
● Context layer in RNN
● Context vector together with input data
+ Artificial intelligence => human behaviour
– Unverifiable, unstable
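The rule based variant (state machine plus variables) could look roughly like the sketch below. The `OrderBot` class, its states and its replies are hypothetical and written in plain Python rather than AIML; they only show how a state and stored variables carry context from one message to the next.

```python
class OrderBot:
    """Tiny rule-based dialogue: a state plus variables hold the context."""

    def __init__(self):
        self.state = "START"   # current node of the state machine
        self.variables = {}    # facts remembered from earlier messages

    def reply(self, message):
        text = message.strip()
        if self.state == "START":
            self.state = "ASK_NAME"
            return "Hello! What is your name?"
        if self.state == "ASK_NAME":
            self.variables["name"] = text
            self.state = "ASK_ITEM"
            return f"Nice to meet you, {text}. What would you like to order?"
        if self.state == "ASK_ITEM":
            self.variables["item"] = text
            self.state = "DONE"
            return f"Thanks, {self.variables['name']}. One {text} is on its way."
        return "Your order is already placed."
```

Every transition is written out by hand, which makes the behaviour readable and verifiable, but it also means that a human has to plan each possible path through the conversation.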
Mechanics - Domain
● Closed domain
  => fewer possibilities
  => more fitting replies
● Open domain
  => infinite possibilities + topic switches
Mechanics - Architectures
● Chatbot API: Many ready-to-use features
● Seq2Seq: Two connected RNNs
● Cleverbot: Search on a database of human responses
● A.L.I.C.E: AIML script
Detection
Passive Detection:
● Message sizes
● Inter-message delay
● Repetition
● Evasiveness
Active Detection:
● General questions
● URL probes
● Subcognitive probes
● Rating games
● Social/Emotional probes
● Ambiguity probes / Keyword targeting
Social Detection:
● Followers to following ratio
● Activity
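Some of the passive and social signals listed above can be computed directly from a message log. The sketch below assumes a log of `(timestamp_seconds, text)` tuples; the function names and feature names are hypothetical, and no decision thresholds are given because the slides do not state any.

```python
from statistics import mean, pstdev


def passive_features(messages):
    """Compute simple passive-detection features.

    messages: non-empty list of (timestamp_seconds, text), ordered by time.
    """
    if not messages:
        raise ValueError("at least one message is required")
    texts = [text for _, text in messages]
    times = [ts for ts, _ in messages]
    delays = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        # Message sizes: average length in characters.
        "mean_length": mean(len(text) for text in texts),
        # Inter-message delay: average gap and how much it varies.
        "mean_delay": mean(delays) if delays else 0.0,
        "delay_stdev": pstdev(delays) if delays else 0.0,
        # Repetition: share of messages that duplicate an earlier one.
        "repetition": 1 - len(set(texts)) / len(texts),
    }


def follower_ratio(followers, following):
    """Social detection: followers-to-following ratio."""
    return followers / following if following else float("inf")
```

A log with very regular delays (standard deviation near zero) and a high repetition share is the kind of pattern these features are meant to expose; how to weigh them against each other is left open here.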
Example of a Seq2Seq Model using TF
1) Cornell Movie Corpus
2) Transform to data accepted by TensorFlow (https://github.com/b0noI/dialog_converter)
3) Train the TF translate model with this data
Results by number of training iterations:
● 20000 it.: Intent and diversity problem (underfit)
● 45000 it.: Long sentences that make sense
● 60000+ it.: Special answers taken exactly from the training data (overfit)
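Step 2 can be pictured as turning each conversation into aligned (input, reply) pairs, so that a translation-style model learns to "translate" an utterance into its answer. The sketch below is a generic illustration under that assumption; it does not reproduce the linked converter, and the function names and the sample dialogue are hypothetical.

```python
def to_pairs(conversation):
    """Pair every utterance with the utterance that follows it."""
    return list(zip(conversation, conversation[1:]))


def build_parallel_data(conversations):
    """Return two line-aligned lists: encoder inputs and decoder targets."""
    inputs, targets = [], []
    for conversation in conversations:
        for source, target in to_pairs(conversation):
            inputs.append(source)
            targets.append(target)
    return inputs, targets


# Hypothetical sample standing in for conversations from a dialogue corpus.
sample = [
    ["Hi.", "Hello.", "How are you?"],
    ["Bye."],  # a single utterance yields no training pair
]
encoder_lines, decoder_lines = build_parallel_data(sample)
```

Here `encoder_lines[i]` is the prompt and `decoder_lines[i]` the expected reply, the same line-by-line alignment a translation model expects from its source and target sides.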