How GPUs Power Comcast's X1 Voice Remote and Smart Video Analytics Jan Neumann Comcast Labs DC May 10th, 2017
Comcast Applied Artificial Intelligence Lab
[Diagram labels: Media & Video Analytics, Voice & NLP, Deep Learning, Data Science, Recommendations & Search, Smart TV, Smart Home, Smart Internet]
Today: How Comcast Uses AI to Evolve and Reinvent the TV Experience
AI for Content Discovery – Voice Search
[Background labels: Online Video, Netflix, Live TV]
X1 Smart TV with Voice
• Query: “HBO”
[Diagram: Voice remote → ASR → query → NLP modules → Answer Selector → action → Set-top Box → TV]
Open NLP: Multiple Domains with Voice
[Diagram: query → Domain Selector → domains (TV, HOME, NEWS, CUSTOMER CARE, ...) → Answer Selector → response]
Open NLP: Multiple Domains with Voice
• Query: “turn on the heat”
• Domain Selector scores: TV 0.15, HOME 0.80, NEWS 0.02, CUSTOMER CARE 0.03
• Threshold = 0.10
• Selected = {TV, Home}; Applicable = {TV, Home}
• Precision = 100%, Recall = 100%
Open NLP: Multiple Domains with Voice
• Query: “Show me my password”
• Domain Selector scores: TV 0.04, HOME 0.03, NEWS 0.03, CUSTOMER CARE 0.90
• Threshold = 0.10
• Selected = {Customer Care}; Applicable = {Customer Care}
• Precision = 100%, Recall = 100%
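The two threshold examples above can be reproduced with a few lines of Python. This is a minimal sketch, not the production selector: the function names are hypothetical, and treating a score equal to the threshold as selected (`>=`) is an assumption the slides do not settle.

```python
def select_domains(scores, threshold=0.10):
    """Return every domain whose score reaches the threshold (>= is an assumption)."""
    return {domain for domain, score in scores.items() if score >= threshold}


def precision_recall(selected, applicable):
    """Precision and recall of the selected set against the applicable set."""
    hits = len(selected & applicable)
    precision = hits / len(selected) if selected else 1.0
    recall = hits / len(applicable) if applicable else 1.0
    return precision, recall


# Scores copied from the two slides.
heat_scores = {"TV": 0.15, "HOME": 0.80, "NEWS": 0.02, "CUSTOMER CARE": 0.03}
password_scores = {"TV": 0.04, "HOME": 0.03, "NEWS": 0.03, "CUSTOMER CARE": 0.90}

heat_selected = select_domains(heat_scores)
password_selected = select_domains(password_scores)

print(heat_selected, precision_recall(heat_selected, {"TV", "HOME"}))
print(password_selected, precision_recall(password_selected, {"CUSTOMER CARE"}))
```

A lower threshold sends the query to more domains, which tends to raise recall at the cost of precision; a higher threshold does the opposite.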
Domain Selector in Practice
• Cascade of Deep Learning Models of increasing complexity
[Diagram: the query “HBO” passes through an Entity Detection Service, then a Simple Model, then a Complex Model; each stage has YES/NO branches that end in SEND TO DOMAIN or DO NOT SEND TO DOMAIN]
Domain Selector in Practice
• Cascade of Deep Learning Models of increasing complexity
[Diagram: the query “Show me funny comedies” passes through an Entity Detection Service, then a Simple Model, then a Complex Model; each stage has YES/NO branches that end in SEND TO DOMAIN or DO NOT SEND TO DOMAIN]
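One way to read the cascade is sketched below. The exact branching is not recoverable from the slides, so the control flow is an assumption: a known entity sends the query straight to the domain, a cheap model decides the confident cases, and only uncertain queries reach the expensive model. All function names, thresholds and toy models are hypothetical stand-ins for the deep learning models.

```python
KNOWN_ENTITIES = {"hbo"}  # toy stand-in for the entity detection service


def has_known_entity(query):
    return any(token in KNOWN_ENTITIES for token in query.lower().split())


def simple_model(query):
    """Toy cheap model: returns an assumed probability that the query fits the domain."""
    tokens = query.lower().split()
    if "watch" in tokens:
        return 0.95
    if "password" in tokens:
        return 0.05
    return 0.5  # uncertain


def complex_model(query):
    """Toy expensive model, consulted only when the simple model is unsure."""
    tokens = query.lower().split()
    return 0.9 if {"comedies", "movies"} & set(tokens) else 0.1


def cascade_decision(query, low=0.2, high=0.8):
    """Return (send_to_domain, stage_that_decided)."""
    if has_known_entity(query):
        return True, "entity"
    p = simple_model(query)
    if p >= high:
        return True, "simple"
    if p <= low:
        return False, "simple"
    return complex_model(query) >= 0.5, "complex"


for q in ["HBO", "Show me funny comedies", "Show me my password"]:
    print(q, cascade_decision(q))
```

The design goal of such a cascade is that most traffic is resolved by the cheap stages, so the complex model only runs on the queries that need it.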
X1 Smart TV with Voice
• Query: “who plays the oracle in matrix”
[Diagram: Voice remote → ASR → query → NLP modules, including QA (Question as text → Answer as id or text) → action → Set-top Box → TV]
First-order Question Answering
• Given:
 – Question in natural-language form q
 – Structured knowledge base that contains a list of facts: [ subject – relation – (attribute) – object ]
• Return:
 – Answer to q
• Assuming:
 – q is answerable by a single fact.
 – The source entity is mentioned in q.
 – The answer is a neighbor of the source entity node.
[Diagram: example graph with the nodes “Tom Hanks”, “9/1/1956”, “Keanu Reeves” and “Matrix” and the attribute “Neo”, labeled subject, object and attribute]
Question Answering with Knowledge Graph
[Diagram: Question (“How old is Tom Hanks?”) → Extract Entities (search names / titles) → [ e1, …, eN ]; Predict Relation → r; Structured Query (Subj = e1, Rel = r, Obj = ?) → Knowledge Graph → fact e1 | r | e2 → Generate Answer (subj | rel | obj; trained on text) → Text answer]
Question Answering with Knowledge Graph
• Question: “How old is Tom Hanks?”
• Extract Entities: search names / titles → Tom Hanks
• Predict Relation → birth
• Structured Query: Subj = Tom Hanks, Rel = birth, Obj = ?
• Knowledge Graph fact: Tom Hanks | birth | 1956
• Generate Answer (subj | rel | obj), with the training text shown as “Tom Hanks is 55 years old.”
• Text answer: “Tom Hanks is 59 years old”
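The pipeline can be sketched end to end on a toy knowledge base. Everything here is a simplified stand-in: substring matching replaces entity search, a keyword rule replaces the trained relation predictor, and a string template replaces the trained answer generator. The relation name `acted_in` and the parameter `as_of_year` are hypothetical, and the reference year 2015 is chosen only so that the year-only arithmetic reproduces the “59 years old” text on the slide.

```python
# Toy knowledge base of (subject, relation, object) facts.
FACTS = [
    ("Tom Hanks", "birth", "1956"),
    ("Keanu Reeves", "acted_in", "Matrix"),  # hypothetical relation name
]


def extract_entities(question):
    """Find knowledge-base subjects mentioned in the question (substring search)."""
    subjects = sorted({subj for subj, _, _ in FACTS})
    return [s for s in subjects if s.lower() in question.lower()]


def predict_relation(question):
    """Keyword rule standing in for the trained relation classifier."""
    return "birth" if "how old" in question.lower() else None


def query_kb(subject, relation):
    """Structured query: Subj = subject, Rel = relation, Obj = ?"""
    return [obj for subj, rel, obj in FACTS if subj == subject and rel == relation]


def answer(question, as_of_year):
    entities = extract_entities(question)
    relation = predict_relation(question)
    if not entities or relation is None:
        return None
    subject = entities[0]
    objects = query_kb(subject, relation)
    if not objects:
        return None
    # Year-only subtraction is approximate: it ignores the month and day of birth.
    age = as_of_year - int(objects[0])
    return f"{subject} is {age} years old."


print(answer("How old is Tom Hanks?", as_of_year=2015))
```

Because the question is assumed to be answerable by a single fact, the lookup never needs to traverse more than one edge from the source entity.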
Question Answering with Knowledge Graph using Recurrent Neural Networks (RNNs)
• Entity Detection ~ Tagging
• Relation Prediction ~ Classification
• Structured Query: subj = e, rel = r, attr = ?, obj = ?
• Example: “where Tom Hanks was born”
 – Entity tags: where = NA, Tom = Subj, Hanks = Subj, was = NA, born = NA
 – Predicted relation: place of birth
[Diagram: two RNNs with memory read the question word by word; one emits a tag per word, the other emits a single relation]
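Recurrent Neural Networks
[Diagram: RNN unrolled over the input words “washington” and “heights”, with input, hidden (memory) and output layers; output probabilities shown as LOC 0.39 / PER 0.61 and LOC 0.89 / PER 0.11]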
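The role of the memory can be illustrated with a tiny Elman-style RNN tagger written in plain Python. This is a sketch under loud assumptions: the weights are arbitrary and untrained, the inputs are one-hot vectors over a two-word vocabulary, and the printed probabilities will not match the numbers on the slide. It only shows that the output for “heights” depends on the hidden state left behind by “washington”.

```python
import math

LABELS = ["LOC", "PER"]
VOCAB = {"washington": [1.0, 0.0], "heights": [0.0, 1.0]}  # one-hot inputs

# Arbitrary, untrained weights (assumption, for illustration only).
W_xh = [[0.5, -0.3], [0.8, 0.2]]    # input -> hidden
W_hh = [[0.1, 0.4], [-0.2, 0.3]]    # hidden -> hidden (the memory)
W_hy = [[1.0, -1.0], [-0.5, 0.7]]   # hidden -> output logits


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tag(words):
    """Run the RNN over the words and return one label distribution per word."""
    hidden = [0.0, 0.0]
    outputs = []
    for word in words:
        x = VOCAB[word]
        pre = [a + b for a, b in zip(matvec(W_xh, x), matvec(W_hh, hidden))]
        hidden = [math.tanh(p) for p in pre]  # new memory state
        outputs.append(dict(zip(LABELS, softmax(matvec(W_hy, hidden)))))
    return outputs


with_context = tag(["washington", "heights"])
without_context = tag(["heights"])
print(with_context[1], without_context[0])
```

The same word receives a different label distribution depending on what preceded it, which is the property that lets a trained tagger resolve ambiguous words from context.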
AI for Content Discovery – Automatic Content Analysis
Most metadata is at the asset level
• Genres
• Credits
• Synopsis
• Keywords
Much more data exists within the asset
• Chapters
• Moments
• Annotations
[Diagram: levels within an asset: Movie, Chapter, Scene, Shot, Frame]
Why is this useful?
• In-game highlight navigation
• What are the best moments on TV?
• Who is in this scene?
• Search & Recommendations
How does Automatic Content Analysis work?
[Diagram: Video & Audio Analysis → Computer Vision, AI & Machine Learning, Natural Language Processing → Chaptering, Scene-level Annotations, Frame-level Annotations]
Why is it possible now?
• Better Algorithms (Deep learning)
• Big Data
• Cloud/GPU Computing
[Chart: Large-scale image recognition performance]
Super-human accuracy in speech and image recognition!
New experiences!
Example Application: In-Game Highlights
• Place highlights over games recorded onto customers’ DVRs for football, baseball, hockey, basketball and soccer.
• The “In-Game Highlights” feature for NFL was released on Comcast X1 last fall.
• “I’ll record as many games as I can. When I don’t want to watch the whole game, it’s a great way to do it.” – Customer Testimonial
AI for Content Discovery – Personalization
Personalized Entertainment Experiences
What is popular right now? + What do you like? = Personalized Recommendations
What should I watch right now?
Deep learning-based recommender system for Live TV
• Training a joint embedding space to combine the scores
• Channel- and program-based recommendations
• Time-dependent recommendations
• Trending/popular and personal favorite channels, programs and sports teams
• Rich content descriptions from automatic content analysis
[Diagram: Favorite Channels, Favorite Programs, Content Descriptions, Collaborative Filtering and Trending Popularity feed the Live TV Recommender System]
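The idea of combining several signals into one time-dependent ranking can be sketched as follows. Note the substitution: the slide describes a trained joint embedding space, while this sketch uses a fixed weighted sum as a much simpler stand-in. The weights, signal names, program titles and airing hours are all hypothetical.

```python
# Hypothetical weights; in a learned system these would come from training.
WEIGHTS = {
    "favorite_channel": 0.3,
    "favorite_program": 0.3,
    "trending": 0.2,
    "collaborative_filtering": 0.2,
}

# Toy candidates with airing windows (24-hour clock) and per-signal scores in [0, 1].
CANDIDATES = [
    {"title": "Live Football", "start": 19, "end": 22,
     "signals": {"favorite_channel": 1.0, "trending": 0.9, "collaborative_filtering": 0.4}},
    {"title": "Evening News", "start": 18, "end": 19,
     "signals": {"favorite_channel": 1.0, "favorite_program": 1.0}},
    {"title": "Cooking Show", "start": 20, "end": 21,
     "signals": {"favorite_program": 1.0, "collaborative_filtering": 0.6}},
    {"title": "Late Movie", "start": 20, "end": 23,
     "signals": {"trending": 0.5}},
]


def combined_score(candidate, weights):
    """Weighted sum of the available signals; missing signals count as 0."""
    return sum(w * candidate["signals"].get(name, 0.0) for name, w in weights.items())


def recommend(candidates, hour, weights=WEIGHTS, top_k=2):
    """Rank only the programs airing at the given hour (time-dependent step)."""
    airing = [c for c in candidates if c["start"] <= hour < c["end"]]
    ranked = sorted(airing, key=lambda c: (-combined_score(c, weights), c["title"]))
    return [c["title"] for c in ranked[:top_k]]


print(recommend(CANDIDATES, hour=20))
```

Filtering by airing time before scoring keeps a strongly preferred program that has already ended from crowding out what can actually be watched right now.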
Deep Learning Infrastructure
• Deep Learning Frameworks
 – Keras, TensorFlow, Theano, PyTorch, Caffe (older models)
• All deployments use nvidia-docker
 – Thanks to the Nvidia solutions team for help with best practices
• All deep learning training is done on multi-GPU servers
 – Nvidia Tesla (Production) and 8x Titan X (Dev) GPUs
 – Nvidia DGX-1 for large-scale training (video and NLP)
• Next steps
 – Container scheduler: Kubernetes and HashiCorp Nomad
 – Network compression/simplification for increased efficiency (TensorRT)
Deep Learning-based ML is applied everywhere at Comcast
[Diagram: Machine Learning, Data Science, Big Data and AI improving customer experience everywhere at Comcast/NBCU: High Speed Internet, Video, IP Telephony, Home Security / Automation, Universal Parks, Media Properties]
For more info see: dclabs.comcast.com