Intelligent Chatbot on WeChat WeChat AI NLP 2017.05.09
WeChat is the leading mobile social network in China. In 6 years, WeChat has gained…
• 889 million monthly active users
• 10 million Official Accounts
• 200 thousand developers
• 600 million WeChat Pay users
Data: Tencent Financial Reports
50% of users spend more than 1 hour on WeChat every day. Data: Penguin Intelligence
WeChat Overview The WeChat Lifestyle WeChat is not just a mobile messaging app. It’s a new lifestyle, connecting people with people, services, devices and more.
Chinese New Year 2017
• Red Packet (Jan 27 – Feb 01): 46 billion
• Emoji (Jan 27 – Feb 01): 16 billion
• Voice Call (Jan 27 – Jan 28): 2.1 billion minutes
The new way for businesses to interact with their customers. Powered by WeChat
Service Accounts: China Merchants Bank case
Messaging, account management (can be automated)
China Merchants Bank: over 10 million followers
• Open an account
• Pay bill/loan
• Receive payment notifications
• Receive CRM promotions
Service Accounts: China Southern Airlines case
Messaging, account management
China Southern Airlines
• Buy tickets
• Check-in
• Choose seats
• Flight status update
• Frequent flyer services
WeChat AI Growth in 6 years
[Timeline figure covering 2012.10 to 2015.4 and WeChat versions 4.3, 4.5, 5.0, 5.2, 5.3, 5.4, 6.0, 6.2, up to now. Milestones shown: Voice Input, Voice Reminder, Voice to Text, Voice Search, Voice Print, Scan, Scan Cards, Scan Cover/Word, Shake TV, Shake Music, Data Mining, Amber Platform, BOTs Platform, Smart Open Platform.]
WeChat Amber Platform
Highlights
• Data/model parallelism
• Flexible resource management and scheduling
• Compile the graphs
• Best-effort concurrent operations
• Limited memory reuse
• Consistent data streaming
• Kernels merge
Applications
• Machine Learning
• Deep Learning
• Data Mining
Experiments
– GoogLeNet
Speech Recognition
Features
• End-to-end deep learning
• Above 95% accuracy
• Cloud based & embedded ASR
• User defined vocabulary and grammar
Infrastructure
• Clusters+CUDA+MPI
• Latency control with InfiniBand
• Training with Tesla M40
• Inference with Tesla P4
Applications
• WeChat voice input
• Keyword spotting
• Speech retrieval
• Large vocabulary continuous speech recognition
Image Recognition
Applications
• OCR
• Identity Card Recognition
  – Key personal information extraction and verification
• Image Understanding
  – Classify tens of millions of images daily
  – Supports 3 levels and 1,000 categories
Algorithms
– CNN/RNN/LSTM
– End-to-end deep learning
Chatbot on WeChat
Chatbot
• Natural way to serve customers
• Powerful for users to acquire service, information, knowledge, etc.
Examples
• WeBank
• WeChat official account
• Tencent games
• Xian’er Mechanical Monk
Work Flow of WeChat Chatbot
[Flow diagram. Stages: Question Parsing, Question Understanding, Output. Components: Question, Context, Rule, Chitter Chat Model, QnA Match, Knowledge Base, Answer Candidates, Answer Ranking, Answer.]
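The flow above (several answer sources feeding a shared candidate pool, followed by ranking) can be sketched roughly as below. This is a minimal illustration, not the WeChat implementation: the `Candidate` type, the three generator functions, the lookup tables and the fixed scores are all hypothetical, and ranking is reduced to picking the highest score.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    text: str     # answer text
    source: str   # which component produced it
    score: float  # higher is better; the scale is arbitrary in this sketch


def rule_candidates(query: str) -> List[Candidate]:
    # Hypothetical hand-written rule table.
    rules = {"hello": "Hello! How can I help you?"}
    hit = rules.get(query.strip().lower())
    return [Candidate(hit, "rule", 1.0)] if hit else []


def qna_candidates(query: str) -> List[Candidate]:
    # Hypothetical knowledge-base entry, matched by exact normalized text.
    knowledge_base = {"where do you live": "Bu'er Temple"}
    hit = knowledge_base.get(query.strip().lower().rstrip("?"))
    return [Candidate(hit, "qna", 0.8)] if hit else []


def chitchat_candidates(query: str) -> List[Candidate]:
    # Stand-in for a chit-chat model: always offers a low-scored fallback.
    return [Candidate("Tell me more.", "chitchat", 0.1)]


GENERATORS: List[Callable[[str], List[Candidate]]] = [
    rule_candidates,
    qna_candidates,
    chitchat_candidates,
]


def answer(query: str, generators=GENERATORS) -> str:
    # Pool the candidates from every component, then rank by score.
    candidates = [c for gen in generators for c in gen(query)]
    if not candidates:
        return ""
    return max(candidates, key=lambda c: c.score).text
```

For example, `answer("Where do you live?")` picks the knowledge-base candidate over the chit-chat fallback because it carries the higher score.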
Chatbot Architecture in Progress
[Architecture diagram. Stages: Question Parsing, Question Understanding, Output. Components: Question, Sentiment Analysis, Context Analysis, Rule, Chitter Chat Model, QnA Match, Document Content, Knowledge Graph, Personalization, Answer Candidates, Answer Ranking, Answer.]
Under development
• Sentiment analysis
• Knowledge graph
• Doc-chat
• Personalization
• Expose the platform to public
Conversational Chatbot
Example questions: Why am I so busy? How can I be happy?
Hard Problems for Conversational Chatbot
Knowledge Representation:
• Notarial certificates, executed in the mainland, and to be used in Hong Kong Special Administrative Region, shall be acknowledged by the Consular Department of the Ministry of Foreign Affairs of the People's Republic of China
• 转心 (transform the heart),就是心里要去拿起一个正确的东西,否则心在烦恼 (affliction) 中时是很难转动的。要不断培养自己的发心 (bodhicitta-samutpada),让它越来越宽广,越来越清净,烦恼自然就越来越少。恨 (hatred) 也好,念 (obsession) 也好,都是妄想 (delusion),消耗心力、迷障未来。
Question Understanding:
• 干啥呐? (what are you doing?)
• 干啥的? (what is your job?)
• 你哪里好? (why do you think you are good?)
• 你在哪里? (where are you?)
• 你师父呢? (where is your master?)
• 师父在忙 (master is busy)
• 他在忙啥? (what is he doing?)
• 闻何法啊? (how do you practice Dharma?)
• 破除我执 (being not obsessive)
• 如何破除呢? (how?)
Answer Generation: avoid trivial and boring answers
• 忙呢 (busy now)
• 你忙 (take your time)
• 再见 (see you later)
• 狗狗很可爱 (dogs are cute)
• 是很可爱 (yes, they are cute)
Sentence Modeling by Recurrent Neural Network
[Figure: two network diagrams. In each, the input tokens (x_0 … x_n in one panel, x_0 … x_8 in the other) pass through an Embedding Layer to give vectors V_0 … V_n (V_0 … V_8); the recurrent panel also shows hidden states h_0 … h_n.]
Anaphora Resolution
q' = H(q, C)
About 5% of the total queries
Input:
  q: current query
  C: context
Output:
  q': current query after anaphora resolution
H: replace pronouns in the current query with noun phrases in the context
Examples:
  C1: 你是陈奕迅粉丝吗? (are you a fan of Eason Chan?)
  C2: 更喜欢张学友 (I like Jacky Cheung more)
  q: 为什么更喜欢他? (Why like him more?)
  q': 为什么更喜欢张学友 (Why like Jacky Cheung more?)

  C1: 你住哪儿? (where do you live?)
  C2: 不二寺。 (Bu’er Temple)
  q: 那在哪儿? (Where is it?)
  q': 不二寺在哪儿? (Where is Bu’er Temple?)
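As a rough illustration of the mapping q' = H(q, C), the sketch below replaces the first pronoun in the query with the most recently mentioned entity from the context. It is a heuristic stand-in, not the RNN model described in these slides: the pronoun list, the "most recent entity" rule and the function name `resolve_anaphora` are assumptions, and the context entities are assumed to come from a separate entity tagger.

```python
from typing import List

# Assumption: a small hand-picked pronoun list, enough for the slide examples.
PRONOUNS = ("他", "她", "它", "那")


def resolve_anaphora(query: str, context_entities: List[str]) -> str:
    """Return q': the query with its first pronoun replaced by an antecedent.

    context_entities holds the noun phrases tagged in the context C,
    in the order they were mentioned.
    """
    if not context_entities:
        return query
    antecedent = context_entities[-1]  # heuristic: most recent mention wins
    for pronoun in PRONOUNS:
        if pronoun in query:
            return query.replace(pronoun, antecedent, 1)
    return query  # no pronoun found, nothing to resolve


# The two examples from the slide, with entities listed in mention order.
print(resolve_anaphora("为什么更喜欢他?", ["陈奕迅", "张学友"]))
print(resolve_anaphora("那在哪儿?", ["不二寺"]))
```

A learned model would score each candidate entity against the query instead of always taking the latest one, which matters when the context mentions several plausible antecedents.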
RNN for Anaphora Resolution
Example:
  C1: 你是陈奕迅粉丝吗?
  C2: 更喜欢张学友
  q: 为什么更喜欢他?
  q': 为什么更喜欢张学友
[Figure: the RNN reads the query tokens (为什么 更 喜欢 他) and the context tokens (陈奕迅 粉丝 更 喜欢 张学友). The pronoun 他 (him) is linked to the entity 张学友 (Jacky Cheung), chosen as max P(张学友 | 为什么更喜欢他), which gives q' = 为什么更喜欢张学友.]
• 100K training data
• Accuracy: 90%
• Majority of the errors are caused by the mistakes of entity tagging
A bad case:
  C1: 你认识贤三吗? (do you know Xiansan?)
  C2: 当然认识。 (of course I do)
  q: 他是你什么人? (what is he to you?)
  q': 三是你什么人? (the name 贤三 was tagged only as 三)
Query Complement
q' = H(q, C)
About 15% of the total queries
Input:
  q: current query
  C: context
Output:
  q': current query after query complement
H: complete the current query with information in the context
Examples:
  C1: 那你会发表情包吗? (can you send emojis?)
  C2: 一般不发 (usually I don’t send emojis)
  q: 为什么? (Why?)
  q': 为什么不发表情包 (Why not send emojis?)

  C1: 讲个故事给我听 (tell me a story)
  C2: 等我学会了给你讲哦。 (I’ll tell you a story once I learn how to)
  q: 我等着 (I’m waiting)
  q': 我等着听故事 (I’m waiting for the story)
RNN for Query Completion
Training sample:
  C1: 讲个故事给我听
  C2: 等我学会了给你讲哦。
  q: 我等着
  q': 我等着听故事
  x: 讲 个 故 事 给 我 听 _E_ 等 ...
  y: 我 等 着 听 故 ...
• 100,000 training instances
• Accuracy: 70%
• Increased the engagement of Xian’er Mechanical Monk by 11%
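One way to read the training sample above is as a character-level sequence pair: the context utterances joined by the `_E_` marker form the input x, and the completed query q' forms the target y. The sketch below builds such a pair. The `_E_` separator comes from the slide; the function name `build_sample` and the choice to append the current query q after the last separator are assumptions, since the slide truncates x.

```python
from typing import List, Tuple

SEP = "_E_"  # utterance separator shown on the slide


def build_sample(context: List[str], query: str,
                 completed: str) -> Tuple[List[str], List[str]]:
    """Build one character-level (x, y) pair for a sequence-to-sequence model."""
    x: List[str] = []
    for utterance in context:
        x.extend(utterance)  # extending with a str adds one character at a time
        x.append(SEP)
    x.extend(query)          # assumption: the current query closes the input
    y = list(completed)      # target: the completed query q'
    return x, y


x, y = build_sample(["讲个故事给我听", "等我学会了给你讲哦"], "我等着", "我等着听故事")
print(" ".join(x))
print(" ".join(y))
```

Keeping the separator as a single token lets the model see utterance boundaries without a separate segment input.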
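Query Completion Results in Real Dialogs
[Screenshot of dialogs. Recoverable example: after "Does your master like you?" and the reply "Need to ask him", the follow-up query "ask" is completed to 你去问问师父喜欢你吗 (Ask your master if he likes you). Other text in the screenshot: 不会的,问你师父去; 什么时候问必要]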
Sentence Similarity Computation
Unsupervised word embedding approach is not good enough

Sentence 0 | Sentence 1 | Similarity based on Word Embedding | Similar Enough?
你是谁 (who are you) | 我是谁 (who am I) | 0.93 | No
我爱你 (I love you) | 你爱我 (you love me) | 0.89 | No
吃饭了吗 (have you had lunch?) | 吃饭了 (just had lunch) | 0.84 | No
你干嘛的 (what is your job?) | 你干嘛呢 (what are you doing?) | 0.93 | No
有轮回吗? (is reincarnation true?) | 轮回有结束吗 (will the cycle of life end?) | 0.73 | No
会不会轮回 (will reincarnation happen?) | 会不会轮回结束 (will reincarnation end?) | 0.84 | No
随喜您 (you did it well) | 您做的很好 (you did it well) | 0.20 | Yes
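The failure mode in the table can be reproduced with a toy example. The sketch below averages per-character vectors and compares sentences by cosine similarity; because averaging ignores word order, 我爱你 and 你爱我 come out as essentially identical. The slides do not say how the embedding-based score was computed, so the averaging scheme is an assumption, and the three-dimensional vectors in `toy` are made up for illustration.

```python
import math
from typing import Dict, List


def sentence_vector(tokens: List[str],
                    embeddings: Dict[str, List[float]]) -> List[float]:
    """Average the token vectors (a bag-of-words representation)."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for token in tokens:
        for i, value in enumerate(embeddings[token]):
            total[i] += value
    return [value / len(tokens) for value in total]


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors; real embeddings would be learned from a corpus.
toy = {
    "我": [1.0, 0.2, 0.0],
    "爱": [0.1, 1.0, 0.3],
    "你": [0.9, 0.1, 0.4],
}

sim = cosine(sentence_vector(list("我爱你"), toy),
             sentence_vector(list("你爱我"), toy))
print(round(sim, 6))
```

Any order-insensitive combination of word vectors has this property, which is one motivation for the supervised, order-aware models on the next slide.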
Supervised Learning for Sentence Similarity
RNN for sentence similarity
[Figure: Question 0 and Question 1 each pass through an Embedding layer into the RNN.]
Feature Model
• Sentence features
  – unigrams
  – bi-grams
• Comparison features
  – word pairs, one word from each of the two sentences
  – edit operations
Example: 什么 含义 vs. 什么 意思 (both ask: what does it mean?)
  match-什么-什么
  replace-含义-意思
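The feature types named above can be extracted along the following lines. This is a sketch under assumptions: the feature-name prefixes `uni-`, `bi-` and `pair-` are invented here (only the `match-…` and `replace-…` forms appear on the slide), sentences are assumed to be already word-segmented, and Python's standard `difflib.SequenceMatcher` is used as a convenient way to obtain edit operations, which may differ from the alignment the authors used.

```python
from difflib import SequenceMatcher
from typing import List


def sentence_features(tokens: List[str]) -> List[str]:
    """Unigram and bi-gram features for one sentence."""
    feats = [f"uni-{t}" for t in tokens]
    feats += [f"bi-{a}-{b}" for a, b in zip(tokens, tokens[1:])]
    return feats


def word_pair_features(a: List[str], b: List[str]) -> List[str]:
    """Every pairing of one word from each sentence."""
    return [f"pair-{x}-{y}" for x in a for y in b]


def edit_features(a: List[str], b: List[str]) -> List[str]:
    """Edit operations that turn token list a into token list b."""
    feats = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        name = "match" if op == "equal" else op  # slide uses "match" for equal spans
        feats.append(f"{name}-{''.join(a[i1:i2])}-{''.join(b[j1:j2])}")
    return feats


q0 = ["什么", "含义"]
q1 = ["什么", "意思"]
print(sentence_features(q0))
print(word_pair_features(q0, q1))
print(edit_features(q0, q1))
```

On the slide's example pair, `edit_features` yields the same two operations shown there: a match on 什么 and a replacement of 含义 by 意思.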