Leveraging Artificial Intelligence and Big Data to Create Value


  1. Leveraging Artificial Intelligence and Big Data to Create Value • Dr. Sudha Ram • Director, INSITE Center for Business Intelligence and Analytics • Anheuser-Busch Professor of MIS, Entrepreneurship & Innovation • Professor of Computer Science • Eller College of Management • Email: ram@eller.arizona.edu • August 19, 2020 • EROSS-2020

  2. BIG DATA: From Petabytes to Zettabytes

  3. Meaning of “BIG”

  4. Meaning of “BIG”

  5. Big Data – Traditionally Defined • VOLUME • VARIETY • VELOCITY • VERACITY • VALUE

  6. Diverse Sources of Data: Many Different Sources Generating Data

  7. An Internet Minute

  8. PARADIGM SHIFT! • Sensors embedded in Physical Objects • “Datafication” of the world • IP Protocol based communication

  9. Health Internet of Things

  10. Paradigm Shift • Billions of Users and Objects Leaving Massive Traces of Activity • “Laboratory” for understanding the pulse of humanity • Temporal and Spatial Dimensions

  11. QUEST for the HOLY GRAIL: Predicting the Future

  12. INSITE Center for Business Intelligence and Analytics • Interdisciplinary Research Center at University of Arizona • www.insiteua.org

  13. Creating a Smarter/Better World • Data Science and Network Science • Visualizations Using Time and Space • Scalable techniques for network analysis and graph mining • Predictive Modeling • Train students in Data Science • Work on interesting research projects with industry partners to solve real world problems

  14. RESEARCH PROJECTS (SOCIAL IMPLICATIONS) • Health Care • Education • News Media/Journalism • Crowdfunding • Crowdsourcing • Internet of Things and Wearable devices • Social Media

  15. Leveraging Data Science • Define a problem/challenge • Identify signals • Use data science methods • Solve the problem • Repurposing Data is Key

  16. PREDICTION MODELS • Predict Emergency Department Visits in near Real Time Using Big Data • Freshman Retention Prediction • COVID-19 Research

  17. Leverage Big Data: Big Data is not just about volume • Social media • Internet search • Environmental sensors • Wearable sensors • Spatial and Temporal Dimensions • Fine Grained - Spatial/Temporal

  18. Focus on Asthma • 25 million people affected in the United States • 2 million emergency department (ED) visits • 0.5 million hospitalizations • 3,500 deaths • 50 billion dollars in medical costs annually • 11 million missed school days every year • 14 million missed work days every year • Source: CDC Reports (2011, 2012)

  19. Pediatric asthma ER Visits, USA, 2011

  20. Our Research Objective: Develop robust models to predict asthma-related Emergency Department visits in near real time using Big Data • Partner: Parkland Center for Clinical Innovation • Joint work with Wenli Zhang, Dr. Yolande Pengetenze, Max Williams; funded in part by Parkland Center for Clinical Innovation

  21. Leverage Big Data: Big Data is not just about volume • Social media • Internet search • Environmental sensors • Wearable sensors • Spatial and Temporal Dimensions • Fine Grained - Spatial/Temporal

  22. EXTRACTING SIGNAL from Noisy Data • True asthma related tweets • Not actually related to asthma

  23. Asthma Related Tweets

  24. Asthma Related Tweets

  25. Asthma Keywords • Asthma • Inhaler • Wheezing • Sneezing • Runny Nose

  26. Asthma Keywords • Asthma • Inhaler • Wheezing • Sneezing • Runny Nose

  27. Asthma-Related Stream • Twitter Asthma Stream, United States • Asthma related tweets, United States (Asthma stream, 11 Oct 2013 – 31 Dec 2013)

  28. Extracting Signals: Distinguish tweets that are relevant to asthma from tweets that mention asthma in an irrelevant context. 1. Tweets indicating awareness of disease, e.g., “Hope I don’t get an asthma attack again today..” 2. Tweets using the disease as rhetoric, e.g., “He is so cute I think I got asthma”
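
The distinction drawn in slide 28 can be illustrated with a small rule-based filter. This is only a sketch: the slides do not say how tweets were actually classified, the keyword list is taken from the "Asthma Keywords" slide, and the `RHETORICAL_CUES` list and both function names are hypothetical choices made for this example.

```python
import re

# Keywords from the "Asthma Keywords" slide.
ASTHMA_TERMS = ("asthma", "inhaler", "wheezing", "sneezing", "runny nose")

# Hypothetical cues suggesting the disease is used as a figure of speech.
# A real system would more likely learn this from labeled tweets.
RHETORICAL_CUES = ("so cute", "so funny", "gave me asthma")


def mentions_asthma(tweet: str) -> bool:
    """True if the tweet contains any asthma keyword as a whole word/phrase."""
    text = tweet.lower()
    return any(re.search(rf"\b{re.escape(term)}\b", text) for term in ASTHMA_TERMS)


def is_relevant(tweet: str) -> bool:
    """Keep keyword matches unless a rhetorical cue is also present."""
    text = tweet.lower()
    if not mentions_asthma(text):
        return False
    return not any(cue in text for cue in RHETORICAL_CUES)
```

Applied to the two example tweets on the slide, the first is kept and the second is filtered out because it contains the cue "so cute".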

  29. Emergency Room Visits and Tweets

  30. Air Quality Sensor Data • Identify and include AQI data from a specific geographic region. • Collected pollution data from 27 air quality sites around the Dallas area. • Selected sites closest to the zip codes of the ED asthma patients in our ED visits dataset. • Using this data, we calculated daily average AQI for our model.
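
The daily-average step described in slide 30 could look roughly like the following. It is a sketch under assumptions: the record layout `(date, site_id, aqi)`, the function name, and the idea that site selection has already produced a set of site IDs are all hypothetical, and the slides do not specify how "closest" was computed.

```python
from collections import defaultdict
from statistics import mean


def daily_average_aqi(readings, selected_sites):
    """Average AQI per day over the selected monitoring sites.

    readings: iterable of (date_str, site_id, aqi) tuples (assumed layout).
    selected_sites: set of site IDs already chosen as closest to patient zip codes.
    """
    by_day = defaultdict(list)
    for day, site, aqi in readings:
        if site in selected_sites:
            by_day[day].append(aqi)
    # One value per day, in chronological order for ISO-formatted dates.
    return {day: mean(values) for day, values in sorted(by_day.items())}
```

Days with no reading from any selected site are simply absent from the result, so a downstream model would need to decide how to handle such gaps.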

  31. Pollutants • CO : Carbon monoxide • NO2 : Nitrogen dioxide • O3 : Ozone • Pb : Lead • PM2.5 : Atmospheric particulate matter, diameter of 2.5 micrometres or less • PM10 : Atmospheric particulate matter, diameter of 10 micrometres or less • SO2 : Sulfur dioxide

  32. EPA Pollution Sensor Data and Emergency Visits

  33. Prediction Models Using Streaming Data • Air Quality Sensor data streams • Tweets • Google Trends search data • Machine Learning Techniques to predict number of ED visits per day with high accuracy

  34. Best Predictors: Successfully predicted with 80% accuracy • # of asthma tweets • CO • NO2 • PM2.5
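
To make the idea of predicting daily ED visits from the four best predictors concrete, here is a deliberately simple stand-in. The slides do not name the machine learning technique used, so this sketch substitutes a k-nearest-neighbour average; the function name, the feature order `(asthma_tweets, CO, NO2, PM2.5)`, and the input layout are assumptions for illustration only.

```python
import math


def predict_visits(history, features, k=3):
    """Predict ED visits as the mean over the k most similar past days.

    history: list of (feature_tuple, visits) pairs for past days, where
             feature_tuple = (asthma_tweets, co, no2, pm25) (assumed order).
    features: feature tuple for the day to predict.
    Note: features are compared on their raw scale here; a real model
    would normally standardize them first.
    """
    if not history:
        raise ValueError("history must contain at least one past day")
    ranked = sorted(history, key=lambda row: math.dist(row[0], features))
    nearest = ranked[:k]
    return sum(visits for _, visits in nearest) / len(nearest)
```

Because the prediction is an average of observed days, it can never fall outside the range of visit counts already seen, which is one reason a production model would likely use something more expressive.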

  35. USEFUL for Public Health NOTIFICATION I. Epidemiologic surveillance of asthma disease activity in the community, e.g., for the department of health and human services (DHHS) II. Stakeholder notifications of community-level asthma disease activity and risk factors

  36. Hospital/ED Preparedness: Predicting asthma ED visits and staffing the ED accordingly

  37. Targeted Patient Interventions: Targeted patient interventions using patient address and geolocation data for tweets, e.g., patient alerts about asthma risks and counseling on preventive methods.

  38. Contributions • Promising Results • Demonstrate the utility and value of linking big data from diverse sources in developing predictive models for non-communicable diseases • Specific focus on asthma • Relevant for other chronic conditions – Diabetes, Cardiac problems, Obesity

  39. Internet of Things and Big Data • Big Data for Improving Education • Internet of Things: Smart Cards, Wifi Logs, Mobile Apps

  40. BUILDING A SMARTER CAMPUS

  41. Combining Network Science and Machine Learning • Societal Challenge: Student Retention • Proactive Prediction is very Important • Social Science theories indicate: • Social Interactions • Regularity of Routine

  42. Objective • Predict freshman retention at individual level • Make proactive prediction before knowing first term GPA • Learn students’ behavioral patterns from their CatCard transactions • Provide actionable suggestions for retention management

  43. BIG DATA • Institutional Student Dataset: ~7000 full-time registered freshmen; 6500 are left after removing international students for whom SAT scores or high school GPAs were not available; 479 (7.37%) drop out after Fall and 843 (12.98%) drop out at the end of Spring • SmartCard Transaction Dataset: 1.8 million transactions made by freshmen from Aug 2012 thru May 2013; 271 different locations, including restaurants, vending machines, printers, parking, labs.

  44. Behavior and Interactions

  45. Patterns and Differences

  46. Movement and Behavior

  47. COMPUTATIONAL and NETWORK SCIENCE APPROACH • Fills gaps in behavioral and extant data-driven approaches • New prediction approach: CatCard transactions → implicit social networks and spatial sequences • Proactive prediction: Predicting retention before the end of 1st semester with 90% recall
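
One plausible way to turn card transactions into an implicit social network is to link two students whenever they transact at the same location within a short time window. The sketch below follows that idea; the slides do not describe the actual construction, so the window length, the record layout `(student_id, location, minute)`, and the function name are all assumptions.

```python
from collections import Counter
from itertools import combinations


def co_occurrence_edges(transactions, window_minutes=1):
    """Count how often each pair of students transacts together.

    transactions: iterable of (student_id, location, minute_timestamp).
    Two transactions co-occur if they share a location and are at most
    window_minutes apart. Returns Counter mapping (id_a, id_b) -> count.
    """
    by_location = {}
    for student, location, minute in transactions:
        by_location.setdefault(location, []).append((minute, student))

    edges = Counter()
    for events in by_location.values():
        events.sort()  # chronological, so t2 >= t1 for every pair below
        # Quadratic per location; fine for a sketch, too slow for millions of rows.
        for (t1, s1), (t2, s2) in combinations(events, 2):
            if s1 != s2 and t2 - t1 <= window_minutes:
                edges[tuple(sorted((s1, s2)))] += 1
    return edges
```

Edge weights from such a count could then serve as a rough proxy for the "Social Interactions" signal mentioned earlier in the deck, while the ordered list of locations per student would give the spatial sequences.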

  48. COVID-19 Related Research Projects

  49. • What is Contact Tracing? • Digital vs. Manual Methods • Three Different Methods: a. Manual contact tracing b. Manual with digital assistance from Prompted Mobility Pathway, aka Memory Jogger c. Digital: Bluetooth app for exposure notification

  51. Memory Jogger using Wifi Logs

  52. • Working with Jeremy Frumkin, Research and Discovery Technologies • Using Wifi network logs with CatCard data to support strategic efforts related to congestion tracking on campus and managing campus foot traffic • Understanding movement patterns among campus spaces • Complementing app-based and manual contact tracing efforts with the additional insights that can be gained through the Wifi logs • Design a Memory Jogger (prompted mobility pathway) tool to enhance manual contact tracing
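
A prompted mobility pathway can be thought of as a person's Wifi log collapsed into an ordered list of stays. The following is a minimal sketch of that idea, not the project's tool: the log layout `(device_id, building, minute)` and the function name are hypothetical, and real Wifi logs would record access points that must first be mapped to buildings.

```python
def mobility_pathway(log_entries, device_id):
    """Collapse one device's log into consecutive stays per building.

    log_entries: iterable of (device_id, building, minute_timestamp).
    Returns a chronological list of (building, first_seen, last_seen).
    """
    own = sorted((minute, building)
                 for device, building, minute in log_entries
                 if device == device_id)
    stays = []
    for minute, building in own:
        if stays and stays[-1][0] == building:
            # Same building as the previous record: extend the current stay.
            stays[-1] = (building, stays[-1][1], minute)
        else:
            stays.append((building, minute, minute))
    return stays
```

Shown to a person during a manual contact tracing interview, a list like this could prompt recall of where they were and when, which is the role the slides assign to the Memory Jogger.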

  53. Traffic/Crowd Analysis • Select Date: Feb 3, 2020; Time: 8 am-9 am • Building traffic on campus between 8 am and 9 am • Top ten traffic spots visualized and compared with selected building (in red) • User types • Comparison of hourly traffic in selected building
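
The "top ten traffic spots" view on slide 53 amounts to ranking buildings by traffic within a chosen hour. A rough sketch follows, assuming traffic means the number of distinct devices seen; the slides do not define the metric, and the log layout `(device_id, building, minute)` and function name are hypothetical.

```python
from collections import defaultdict


def top_traffic_spots(log_entries, start_minute, end_minute, n=10):
    """Rank buildings by distinct devices seen in [start_minute, end_minute).

    log_entries: iterable of (device_id, building, minute_timestamp).
    Returns up to n (building, device_count) pairs, busiest first;
    ties are broken alphabetically by building name.
    """
    devices = defaultdict(set)
    for device, building, minute in log_entries:
        if start_minute <= minute < end_minute:
            devices[building].add(device)
    counts = {building: len(seen) for building, seen in devices.items()}
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]
```

Counting distinct devices rather than raw log rows avoids inflating buildings where a single device reconnects many times within the hour.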

  54. • To compare the three methods for contact tracing and exposure notification. • How do the three contact tracing approaches differ in their outcomes such as timeliness and coverage of contacts and other metrics? • How do these methods complement each other and what are their relative strengths and weaknesses? • How do these methods perform overall in preserving privacy while allowing for comprehensive contact tracing? What are the tradeoffs? • How acceptable are these three strategies to the community and what is an effective path to deploying comprehensive contact tracing?
