
From AI K to AI D: Acquiring Social Media Intelligence via `Big’ Data
Huan Liu
Arizona State University
SBP-BRiMS2017, DC
AI, Social Media Intelligence, Big Data
Data Mining and Machine Learning Lab

Thanks to Former & Current PhD Students
• Robert Trevino, AFRL
• Reza Zafarani, Asst Prof, Syracuse U
• Yunzhong Liu, LeEco, US
• Xia Hu, Asst Prof, Texas A&M U
• Magdiel Galan, Intel
• Somnath Shahapurkar, FICO
• Shamanth Kumar, Castlight Health
• Fred Morstatter, USC ISI
• Pritam Gundecha, IBM Res Almaden
• Christophe Faucon
• Jiliang Tang, Asst Prof, MSU
• Isaac Jones
• Huiji Gao, LinkedIn
• Suhas Ranganath
• Ali Abbasi, Machine Zone
• Suhang Wang
• Salem Alelyani, Asst Prof, King Khalid U
• Tahora Nazer
• Xufei Wang, LinkedIn
• Jundong Li
• Geoffrey Barbier, AFRL
• Liang Wu
• Lei Tang, Clari
• Ghazaleh Beigi
• Zheng Zhao, Google
• Kai Shu
• Nitin Agarwal, Chair Prof, UALR
• Justin Sampson
• Sai Moturu, PostDoc, MIT Media Lab
• Lei Yu, Assc Prof, Binghamton U, NY

A Tortuous but Fortuitous Path to Social Computing

From AI K to AI D
• “Knowledge is Power”: AI was then solely about K
– Expert Systems or Rule-based Systems
• “Intelligence is ten million rules.”
– Knowledge-based Systems (Cyc)
• “Data is the New Oil”: AI is now hyped up with D
– Big data is ubiquitous
– CS, Statistics, Information Science → Data Science
• Recent surge of AI is powered by Data
– Machine Learning (including Deep Learning)
– For any learning algorithm to work, data is key

Big Social Media Data
• Twitter
– 300 million users
– 500 million tweets / day
– 1% (5 million) released for research
• Facebook
– 2 billion users
– 422 million updates / day
– 196 million photos / day
• Instagram
– 700 million users
– 80 million photos / day
[Figures: Facebook Degree Distribution; Instagram Users over Time]

Discovering Social Media Intelligence
• Graph Theories
• Network Measures and Models
• Data Mining, NLP, and Visual Analytics
• Community Detection and Analysis
• Information Diffusion
• Influence and Homophily
• Recommender Systems
• Behavior Analytics
– Sentiment Analysis

Some Challenges in Acquiring SM Intelligence
• Social media data seems really big, but why are we often still short of data?
– How can we make data `bigger’?
• Data is power, so it can produce any result
– Can we algorithmically evaluate the results from big data?
• We don’t know what we don’t know
– How can we know if our result of social media analysis is of any value?

Making Big Data “Bigger”
• What is big data?
– A conventional answer is 4Vs
– A practitioner’s answer is more nuanced
• Big data can be actually little or thin
• For machine learning or data mining to work, the more data, the better
– Make little data bigger
– Make thin data thicker

Curse of Dimensionality: Required Samples
• Sparsity becomes exponentially worse as feature dimensionality increases
– Conventional distance metric becomes ineffective as far and near neighbors have similar distances
[Figure: 3 samples per unit region; 1 sample per region; 1/3 sample per region. Source: http://nikhilbuduma.com/2015/03/10/the-curse-of-dimensionality/]
Arizona State University, Michigan State University
Recent Advances in Feature Selection: A Data Perspective
KDD2017 Tutorial, Halifax, Canada
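The three density figures on this slide (3, 1, and 1/3 samples per region) are consistent with a fixed set of samples spread over a grid whose cell count grows exponentially with the number of features. A minimal sketch, assuming 9 samples and 3 equal bins per feature (both values are assumptions chosen to reproduce the figures; the slide does not state them):

```python
from fractions import Fraction


def samples_per_region(n_samples: int, bins_per_feature: int, n_features: int) -> Fraction:
    """Average number of samples per grid cell when every feature is cut into equal bins."""
    n_regions = bins_per_feature ** n_features  # cell count grows exponentially with dimension
    return Fraction(n_samples, n_regions)


# Assumed values (not from the slide): 9 samples, 3 bins per feature.
dimensions = (1, 2, 3)
densities = [samples_per_region(9, 3, d) for d in dimensions]

for d, density in zip(dimensions, densities):
    print(f"{d} feature(s): {density} sample(s) per region")
```

With the sample count held fixed, each added feature divides the average density by the number of bins, which is the sparsity effect the slide describes.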

Relevant, Redundant and Irrelevant Features
• Feature selection retains relevant features for learning and removes redundant or irrelevant ones
• For a binary classification task below, f1 is relevant, f2 is redundant given f1, and f3 is irrelevant
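The three feature roles can be illustrated on a toy dataset. The sketch below is not taken from the slides: the data is invented, and mutual information is used as one possible relevance criterion (an assumption; the slide does not name a criterion). Here f1 equals the label, f2 duplicates f1, and f3 is independent of the label.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * log2(c / n) for c in counts.values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return (entropy(list(zip(x, z))) + entropy(list(zip(y, z)))
            - entropy(list(zip(x, y, z))) - entropy(z))


# Hypothetical toy data for a binary classification task.
label = [0, 0, 0, 0, 1, 1, 1, 1]
f1 = [0, 0, 0, 0, 1, 1, 1, 1]   # relevant: determines the label
f2 = list(f1)                   # redundant given f1: an exact copy
f3 = [0, 1, 0, 1, 0, 1, 0, 1]   # irrelevant: independent of the label

relevance_f1 = mutual_information(f1, label)
relevance_f3 = mutual_information(f3, label)
extra_from_f2 = conditional_mutual_information(f2, label, f1)

print(relevance_f1, relevance_f3, extra_from_f2)
```

On this toy data, f1 carries one full bit about the label, f3 carries none, and f2 adds nothing once f1 is known, even though f2 on its own is as informative as f1.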

Feature Selection
Feature selection selects an `optimal’ subset of relevant features from the original high-dimensional data given a certain criterion

Feature Selection and scikit-feature
• Feature selection can make data `bigger’
– Assuming all binary attribute values in our toy example
– Before FS, 5/2^10 = 5/1024; after FS, 5/2^3 = 5/8
• Does FS always work?
– Yes, for most high-d data
• Where can we find it?
• scikit-feature, an open-source repository in Python
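The arithmetic on this slide can be checked directly: with binary attributes, d features define 2^d possible value combinations, so 5 samples cover at most 5/2^d of that space. A small sketch (the function name `coverage` is illustrative only and is not part of scikit-feature):

```python
from fractions import Fraction


def coverage(n_samples: int, n_binary_features: int) -> Fraction:
    """Upper bound on the fraction of binary feature combinations that the samples can cover."""
    return Fraction(n_samples, 2 ** n_binary_features)


before_fs = coverage(5, 10)  # 10 binary features before feature selection
after_fs = coverage(5, 3)    # 3 binary features kept after feature selection
improvement = after_fs / before_fs

print(before_fs, after_fs, improvement)
```

Dropping 7 of the 10 binary features shrinks the space by a factor of 2^7 = 128, which is the sense in which the same 5 samples become `bigger’.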

Making Thin Data
• Most people like many of us are in the long tail
– Our data is thin or sparse
– With little data, machine learning is powerless
• Social media data offers new opportunities
– Multiple facets: posts, profile, linked information
– Multiple platforms that offer different functions
• Two case studies
– Feature selection using social network information
– Connecting users across more than one social media site

Making Sense of Big Data
• For big social-media data, we want to automatically get a sense of what it is
– User needs, sentiment, opinions, behavior, and trends
• A big part of big data is TEXT
• NLP and text mining can help extract topics from text
• If these machine-learned topics are for human consumption, are they actually comprehensible?
– How can comprehensibility be measured?

Measuring Topic Interpretability
• How to measure interpretability of topics generated from machine learning?
• One common way is to indirectly measure predictive performance of these learned topics
– The higher the performance (say, accuracy), the better
– Does it really measure interpretability?
– Human experts seem to be the best evaluator
• But involving human experts in evaluation may not be scalable and reproducible
• Hence, it is a challenging problem

Big Text Data
• Some example corpora:

Source            Size
Wikipedia         36 million articles
World Wide Web    100+ billion static web pages
Social Media      500 million new tweets each day

• Too much data to read
• How can we begin to understand all of these large bodies of text data?

Topic Models

Measuring Interpretability
• How do we measure the interpretability of statistical topic models?
• A dilemma
– Experts are credible, but not scalable
– Crowdsourcing needs no experts, so it is scalable, but it has no expertise, and thus is not credible

A Measure of Topic Interpretability
• Model Precision
• It shows a Turker 6 words in random order
– Top 5 words from the topic
– 1 “Intruded” word
– Ask the Turker to identify the “Intruded” word

MP_{model,topic} = # Correct Guesses / Total # Guesses

Topic i: cat dog bird truck horse snake

Chang, Jonathan, Sean Gerrish, Chong Wang, Jordan L. Boyd-Graber, and David M. Blei. "Reading Tea Leaves: How Humans Interpret Topic Models." In Advances in Neural Information Processing Systems, pp. 288-296. 2009.
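The Model Precision ratio is simple to compute once the Turkers' answers are collected. A minimal sketch using the slide's six example words; the choice of "truck" as the intruded word and the list of guesses are assumptions made for illustration (the extracted text does not mark the intruder, and no response data is given):

```python
def model_precision(guesses, intruder):
    """Fraction of guesses that correctly identify the intruded word."""
    if not guesses:
        raise ValueError("at least one guess is required")
    correct = sum(1 for guess in guesses if guess == intruder)
    return correct / len(guesses)


# Words shown to each Turker (from the slide), in random order.
topic_words = ["cat", "dog", "bird", "truck", "horse", "snake"]

# Assumption: "truck" is the intruded word, as the only non-animal.
intruder = "truck"

# Hypothetical answers from four Turkers.
guesses = ["truck", "truck", "snake", "truck"]

mp = model_precision(guesses, intruder)
print(mp)
```

Three of the four hypothetical Turkers pick the intruder, so this topic would score 0.75 under the assumed responses.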

Observing Model Precision (MP)
What does Model Precision measure?
What doesn’t Model Precision measure?
It seems we need another measure

Measuring Coherence – Another Measure
• Model Precision Choose Two
• Nearly the same setup as Model Precision:
– Difference: A Turker is asked to choose top two words
• Intuition: if the topic is coherent, then it would be difficult to consistently choose a second word
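The slide gives the intuition but not a formula for scoring how consistently Turkers choose a second word. One plausible way to quantify it, assumed here for illustration, is the Shannon entropy of the non-intruder words chosen: a coherent topic spreads second choices evenly (high entropy), while a topic with an obvious second outlier concentrates them (low entropy). The responses below are hypothetical.

```python
from collections import Counter
from math import log2


def second_choice_entropy(responses, intruder):
    """Entropy (in bits) of the chosen words other than the true intruder.

    `responses` is a list of (word, word) pairs, one pair per Turker.
    This scoring rule is an assumption; the slide does not specify one.
    """
    second_choices = [word for pair in responses for word in pair if word != intruder]
    counts = Counter(second_choices)
    total = len(second_choices)
    return -sum(c / total * log2(c / total) for c in counts.values())


intruder = "truck"  # assumed intruded word

# Hypothetical coherent topic: Turkers agree on the intruder but not on a second word.
coherent = [("truck", "cat"), ("truck", "dog"), ("truck", "bird"), ("truck", "snake")]

# Hypothetical less coherent topic: everyone singles out the same second word.
less_coherent = [("truck", "snake")] * 4

high = second_choice_entropy(coherent, intruder)
low = second_choice_entropy(less_coherent, intruder)
print(high, low)
```

Under this assumed scoring, the coherent topic reaches the maximum entropy for four distinct second choices (2 bits), and the less coherent topic scores 0.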

A Comparative Example
[Figure panels: Model Precision; Model Precision Choose Two]
