Deep Text Mining of Instagram Data Without Strong Supervision
WI 2018 Santiago, International Conference on Web Intelligence
Kim Hammar, Shatha Jaradat, Nima Dokoohaki, and Mihhail Matskin
KTH Royal Institute of Technology, kimham@kth.se
December 4, 2018
Key enabler for Deep Learning: Data growth
[Figure: Annual size of the global datasphere in zettabytes, 2009-2026 (projected). Source: IDC.]
But what about Labeled Data?
Supervised learning: iteratively minimize the loss function L(ŷ, y) between the prediction ŷ and the ground truth y.
[Figure: a small feed-forward network with inputs x, biases b, and output ŷ.]
Labeled training data is still a bottleneck.
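For concreteness, a minimal sketch (not from the paper) of what "iteratively minimize the loss" means: logistic regression fit by gradient descent on toy labeled data. The expensive ingredient is exactly the label vector y.

```python
# Minimal sketch: supervised learning as iterative minimization of L(y_hat, y).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=200):
    """Fit weights w by gradient descent on the logistic loss L(y_hat, y)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        y_hat = sigmoid(X @ w)             # prediction
        grad = X.T @ (y_hat - y) / len(y)  # gradient of the loss
        w -= lr * grad                     # descend
    return w

# Toy labeled data: the labels y are the bottleneck in practice.
X = np.array([[1.0, 0.2], [0.9, 0.1], [0.1, 0.9], [0.2, 1.0]])
y = np.array([0, 0, 1, 1])
w = train_logistic(X, y)
print(sigmoid(X @ w))  # predictions approach the ground truth
```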
Research Problem: Clothing Prediction on Instagram
[Figure: an Instagram post is fed to an image model and a text model, which jointly predict the clothing items present, e.g., dress = 0, coat = 1, ..., skirt = 0.]
This Paper: Text Classification Without Labeled Data
[Figure: pipeline from text mining to analytics: word embeddings feed a neural network, whose predictions drive analytics such as monthly mentions of brand "foo" from 04.2017 to 03.2018.]
Applications: trend detection, user recommendations.
Example Instagram Post
[Figure: screenshot of an example Instagram post.]
Challenge: Noisy Text and No Labels
A case study of a corpus with 143 fashion accounts, 200K posts, and 9M comments.

Challenge 1: Noisy Text with a Long-Tail Distribution
[Figure: log-log plot of the frequency of text per post (comments and words); the heads of the distributions are posts with 0 comments and posts with 0 words (comments + caption + tags).]

Text              Fraction of corpus   Average per post
Emojis                  0.15                 48.63
Hashtags                0.03                  9.14
User-handles            0.06                 18.62
Google-OOV words        0.46                145.02
Aspell-OOV words        0.47                147.61

Challenge 2: Lack of Labeled Training Data
Raw Instagram text is abundant; human annotations are expensive and unavailable.
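As an illustration of how such per-post statistics might be computed, a rough sketch using simple regular expressions (the paper's exact tooling, e.g., the Google and Aspell out-of-vocabulary checks, is not reproduced here):

```python
# Rough sketch: count hashtags, user-handles, and emojis in a post's text.
import re

HASHTAG = re.compile(r"#\w+")
HANDLE = re.compile(r"@\w+")
# Crude emoji heuristic: code points in the main emoji blocks.
EMOJI = re.compile(r"[\U0001F000-\U0001FAFF]")

def post_stats(text):
    return {
        "hashtags": len(HASHTAG.findall(text)),
        "user_handles": len(HANDLE.findall(text)),
        "emojis": len(EMOJI.findall(text)),
        "tokens": len(text.split()),
    }

print(post_stats("I love the bag! Is it Gucci? #goals @username"))
```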
Alternative Sources of Supervision That Are Cheap but Weak
Strong supervision: manual annotation by an expert.
Weak supervision: a signal that does not have full coverage or perfect accuracy.
Sources of weak supervision: domain heuristics, databases, APIs, crowdworkers; their outputs are fed to a combiner.
Weak Supervision in the Fashion Domain
Open APIs.
Pre-trained clothing classification models: DeepDetect [1].
A text mining system based on a fashion ontology and word embeddings:
[Figure: for each Instagram post p ∈ P, terms t from the caption, comments, user-tags, and hashtags are matched against the ontology O (items, brands, materials, patterns, styles; backed by ProBase) using edit distance and word embeddings V. A linear combination of tfidf(w_i, p, P) and embedding term-scores produces word rankings, yielding ranked noisy labels, e.g., Items: ⟨(bag, 0.63), (jeans, 0.3), (top, 0.1)⟩; Brands: ⟨(Gucci, 0.8), (Zalando, 0.3)⟩; Material: ⟨(Denim, 1.0)⟩. Example caption: "Happy Monday! Here is my outfit of the day #streetstyle #me #canada #goals #chic #denim"; example comments: "I love the bag! Is it Gucci? #goals @username", "I #want the #baaag", "Wow! The #jeans", "You are suclh an inspirationn, can you follow me back?"]

[1] https://github.com/jolibrain/deepdetect
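A rough sketch of the ranking idea, not the paper's exact system: score each ontology term for a post by a linear combination of tf-idf and embedding similarity. The ontology subset, the toy embedding vectors, and the weight alpha are all hypothetical.

```python
# Sketch: rank ontology items by alpha*tfidf + (1-alpha)*embedding similarity.
import math
from collections import Counter

ONTOLOGY_ITEMS = ["bag", "jeans", "top", "dress", "coat"]  # hypothetical subset

def tfidf(term, post_tokens, corpus):
    tf = Counter(post_tokens)[term] / max(len(post_tokens), 1)
    df = sum(1 for doc in corpus if term in doc)
    return tf * math.log(len(corpus) / (1 + df))

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_items(post_tokens, corpus, vectors, alpha=0.5):
    """Score each ontology item; higher means a stronger noisy label."""
    scores = {}
    for item in ONTOLOGY_ITEMS:
        sim = max((cosine(vectors[w], vectors[item])
                   for w in post_tokens if w in vectors and item in vectors),
                  default=0.0)
        scores[item] = alpha * tfidf(item, post_tokens, corpus) + (1 - alpha) * sim
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Toy vectors and corpus, purely illustrative.
vectors = {"bag": [1.0, 0.1], "denim": [0.1, 1.0], "jeans": [0.2, 1.0],
           "top": [0.5, 0.5], "dress": [0.4, 0.3], "coat": [0.3, 0.4]}
corpus = [["love", "the", "bag"], ["denim", "jeans", "outfit"]]
print(rank_items(["love", "the", "bag", "denim"], corpus, vectors))
```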
How To Combine Several Sources Of Weak Supervision?
Simplest way to combine many weak signals: majority vote.
Recent research on combining weak signals: data programming [2].

[2] Alexander J. Ratner et al. "Data Programming: Creating Large Training Sets, Quickly". In: Advances in Neural Information Processing Systems 29, ed. by D. D. Lee et al. Curran Associates, Inc., 2016, pp. 3567-3575. URL: http://papers.nips.cc/paper/6523-data-programming-creating-large-training-sets-quickly.pdf
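A minimal sketch of the majority-vote baseline over a labeling matrix; the {-1, 0, +1} encoding follows the data-programming convention, where 0 means the function abstains:

```python
# Majority vote over L with shape (n_functions, n_items), entries in {-1, 0, +1}.
import numpy as np

def majority_vote(L):
    votes = L.sum(axis=0)  # net vote per item; abstentions contribute nothing
    return np.sign(votes)  # -1, 0 (tie or all abstain), or +1

L = np.array([[ 1, -1,  0],
              [ 1,  1, -1],
              [-1,  1, -1]])
print(majority_vote(L))    # -> [ 1  1 -1]
```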
Model Weak Supervision With a Generative Model
[Figure: labeling functions λ_1, ..., λ_n applied to unlabeled data produce a matrix of weak labels, which the generative model π_{α,β}(Λ, Y) combines into labels w_1, ..., w_n.]
Model each source of weak supervision as a labeling function λ_i: λ_i(unlabeled data) → label.
Learn a generative model π_{α,β}(Λ, Y) over the labeling process:
- From conflicts between labeling functions, estimate each function's accuracy α_i.
- From each function's empirical coverage, estimate its coverage β_i.
Given α and β, the model combines the weak labels into a single probabilistic label:
- High-accuracy functions get more weight.
- Much disagreement → low-probability label.
- Full agreement → high-probability label.
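A hedged sketch of the combination step, illustrating the weighting idea rather than the paper's exact model: given the vote matrix and accuracies α (as the generative model would estimate them), combine votes into a probabilistic label under a conditional-independence assumption.

```python
# Naive-Bayes-style weighted vote. L: (n_functions, n_items) with entries in
# {-1, 0, +1} (0 = abstain); alpha: estimated accuracy per labeling function.
import numpy as np

def combine(L, alpha, prior=0.5):
    """P(y = +1 | L): each vote is weighted by its function's log-odds."""
    w = np.log(alpha / (1 - alpha))  # high accuracy => large weight
    log_odds = np.log(prior / (1 - prior)) + (L.T * w).sum(axis=1)
    return 1.0 / (1.0 + np.exp(-log_odds))

L = np.array([[ 1,  1],    # function 1 votes on two items
              [-1,  1],    # function 2
              [-1,  1]])   # function 3
alpha = np.array([0.80, 0.62, 0.62])  # hypothetical estimated accuracies
print(combine(L, alpha))   # item 1: ~0.60 (disagreement); item 2: ~0.91 (agreement)
```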
Data Programming Intuition
Example: low-accuracy labeling functions vote "it is not a coat", while a high-accuracy labeling function votes "it is a coat".
Probabilistic label: 0.6 probability that it is a coat.
Majority vote: 1.0 probability that it is not a coat.
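To make the contrast concrete, the same arithmetic worked by hand; the accuracies 0.80 and 0.62 are hypothetical values chosen to reproduce the 0.6 on the slide, assuming the naive-Bayes combination sketched above:

```python
# One function with accuracy 0.80 votes "coat" (+1); two functions with
# accuracy 0.62 vote "not a coat" (-1). Weighted log-odds, then sigmoid.
import math

log_odds = math.log(0.80 / 0.20) - 2 * math.log(0.62 / 0.38)
p_coat = 1.0 / (1.0 + math.exp(-log_odds))
print(round(p_coat, 2))  # ~0.60: a hedged label, unlike the majority vote's 1.0
```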
Extension of Data Programming to Multi-Label Classification
Problem: data programming is only defined for binary classification in the original paper, where a labeling function outputs λ_i → k_i ∈ {-1, 0, 1}.
Multi-class setting: model the labeling function as λ_i → k_i ∈ {0, ..., N}.
Idea 1 for multi-label: model the labeling function as λ_i → k̄_i = (v_0, ..., v_n) with v_j ∈ {-1, 0, 1}.
Idea 2 for multi-label: learn a separate generative model for each class, and let each labeling function give a binary output per class: λ_{i,j} → k_{i,j} ∈ {-1, 0, 1} (see the sketch below).
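A minimal sketch of Idea 2, assuming the naive-Bayes-style combiner from the earlier sketch and hypothetical per-class accuracies; each class gets its own independent binary combination, so the output is multi-label:

```python
# One binary combiner per class; classes are combined independently.
import numpy as np

def combine(L, alpha):
    w = np.log(alpha / (1 - alpha))
    return 1.0 / (1.0 + np.exp(-(L.T * w).sum(axis=1)))

def multi_label(L_per_class, alpha_per_class):
    """L_per_class[c]: (n_functions, n_items) vote matrix for class c."""
    return {c: combine(L, alpha_per_class[c]) for c, L in L_per_class.items()}

# Hypothetical per-class votes and accuracies for two items.
L_per_class = {"coat":  np.array([[1, -1], [1, 1]]),
               "jeans": np.array([[-1, 1], [0, 1]])}
alpha_per_class = {"coat":  np.array([0.9, 0.7]),
                   "jeans": np.array([0.6, 0.8])}
print(multi_label(L_per_class, alpha_per_class))
```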
Trained Generative Models: Labeling Functions' Accuracy Differs Between Classes
[Figure: predicted accuracy (roughly 0.4 to 1.0) of each labeling function (Clarifai, Deepomatic, DeepDetect, Google Cloud Vision, SemCluster, KeywordSyntactic, KeywordSemantic) per class (accessories, bags, blouses, coats, dresses, jackets, jeans, cardigans, shoes, skirts, tights, tops, trousers). Multiple generative models can capture a different accuracy for labeling functions for different classes.]
Putting Everything Together
1. Apply weak supervision to unlabeled data (open APIs, pre-trained models, domain heuristics, etc.).
2. Combine the labels using majority voting or generative modelling (data programming).
3. Use the combined labels to train a discriminative model with supervised machine learning (sketched below).
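The three steps in one hedged end-to-end sketch; the votes, accuracies, and features are toy values, and sklearn is used only for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Step 1: weak supervision applied to unlabeled data -> vote matrix.
L = np.array([[ 1,  1, -1, -1],
              [ 1, -1, -1,  1],
              [ 1,  1, -1,  0]])
alpha = np.array([0.8, 0.6, 0.7])  # e.g., estimated by the generative model

# Step 2: combine votes into probabilistic labels (naive-Bayes weighted vote).
w = np.log(alpha / (1 - alpha))
p = 1.0 / (1.0 + np.exp(-(L.T * w).sum(axis=1)))

# Step 3: train a discriminative model; label confidence becomes a sample weight.
X = np.array([[0.9, 0.1], [0.8, 0.3], [0.1, 0.9], [0.4, 0.6]])  # toy features
y = (p >= 0.5).astype(int)
clf = LogisticRegression().fit(X, y, sample_weight=np.abs(p - 0.5) * 2)
print(clf.predict_proba(X)[:, 1])
```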