SNAP: Social Media and Network Analytics for Public Health. Henry Kautz, Goergen Institute for Data Science, University of Rochester. Idea: social media users = distributed sensor network. Challenge: discovering signal in noise.

  1. SNAP: Social Media and Network Analytics for Public Health • Henry Kautz • Goergen Institute for Data Science, University of Rochester • Idea • Social media users = distributed sensor network • Challenge • Discovering signal in noise • Method • Machine learning • Applications • Tracking influenza • Measuring alcohol use • Improving food safety • [Figure: time-lapse heatmap of tweets from NYC]

  2. Analyzing Tweets • Goal: find rare tweets about specific disease symptoms (about 1 in 50,000) • Previous approach: keywords • Problems: “sick of homework”, “under the weather” • Our approach: machine learning • Use “Mechanical Turk” workers to create training data • 98% accuracy • [Diagram: Training Data, Machine Learning System, Sick Tweets; example features: contains “sneeze”? “sick”? “tired”?]
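
To make the contrast between keyword matching and a learned classifier concrete, here is a minimal sketch in plain Python. It is not the system described on the slide: the choice of a naive Bayes model, the `NaiveBayes` class, the `keyword_filter` helper, and the eight toy training tweets are all assumptions made for illustration. In practice the labels would come from crowd workers, as the slide notes.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_filter(text, keywords=("sick", "sneeze", "tired")):
    """Keyword baseline: flag a tweet if it contains any keyword."""
    return any(tok in keywords for tok in tokenize(text))


class NaiveBayes:
    """Tiny bag-of-words naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.word_counts = {0: Counter(), 1: Counter()}
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set(self.word_counts[0]) | set(self.word_counts[1])
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        scores = {}
        for label in (0, 1):
            n_words = sum(self.word_counts[label].values())
            score = math.log(self.class_counts[label] / total)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[label][tok]
                    score += math.log((count + 1) / (n_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical hand-labeled tweets (1 = reports symptoms, 0 = does not).
texts = [
    "i feel sick and have a fever",
    "can't stop sneezing, my throat hurts",
    "so tired and sick, staying in bed with the flu",
    "fever and cough all night",
    "sick of homework",
    "so sick of this homework and traffic",
    "tired of waiting for the bus",
    "that concert was sick",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

model = NaiveBayes().fit(texts, labels)
```

On this toy data, `keyword_filter("sick of homework again")` flags the tweet because it contains "sick", while `model.predict("sick of homework again")` labels it as not sick, since "of" and "homework" only appear in the negative examples.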

  3. Twitterhealth • Using a “flu symptom” classifier, we can: • Measure flu levels accurately and quickly • Predict risk of particular users catching the flu • Discover correlations between flu and factors such as air pollution
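
One way to picture "measuring flu levels" and "discovering correlations" is to aggregate classifier output per day and compare it with an external series. The sketch below assumes a Pearson correlation and a daily fraction-of-sick-tweets measure; neither is confirmed by the slides, and the helper names, the flags, and the pollution values are invented placeholders.

```python
import math


def daily_sick_rate(flags_by_day):
    """Map each day to the fraction of its tweets flagged as sick (1) vs. not (0)."""
    return {day: sum(flags) / len(flags)
            for day, flags in flags_by_day.items() if flags}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Hypothetical classifier output per day and a made-up pollution index.
flags_by_day = {"mon": [0, 0, 0, 1], "tue": [0, 1, 0, 1], "wed": [1, 1, 0, 1]}
pollution = {"mon": 10, "tue": 20, "wed": 30}

rates = daily_sick_rate(flags_by_day)
days = sorted(rates)
r = pearson([rates[d] for d in days], [pollution[d] for d in days])
```

A coefficient near 1 or -1 on real data would only suggest an association worth investigating; it would not by itself show that pollution causes illness.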

  4. nEmesis • Listen to tweets to find possible cases of food poisoning • Use results to prioritize restaurant inspections • 3-month trial in Las Vegas: proved effective in a double-blind trial • CDC funding 5-year expansion
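
The prioritization step can be sketched as a simple ranking. This assumes each flagged tweet has already been linked to a venue, and that ranking by raw count is acceptable; how the actual system matches tweets to restaurants or scores them is not stated in the slides. The function name and venue names are hypothetical.

```python
from collections import Counter


def inspection_queue(flagged_venues):
    """Rank venues by the number of tweets flagged as possible food poisoning."""
    return Counter(flagged_venues).most_common()


# Hypothetical venue labels, one per flagged tweet.
flagged = ["Venue A", "Venue B", "Venue A", "Venue C", "Venue A", "Venue B"]
queue = inspection_queue(flagged)
```

Here `queue` lists "Venue A" first with three flagged tweets, so it would be inspected before the others.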

  5. GeoDrink • Understanding patterns of alcohol use in communities • Infer locations of users’ homes and the exact time and place of drinking
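
As an illustration of home inference, a common heuristic is to take the place a user most often posts from at night. The sketch below uses that heuristic as an assumption; the slides do not say how GeoDrink infers homes. The `infer_home` function, its night-hour window, the grid precision, and the coordinates are all made up for the example.

```python
from collections import Counter


def infer_home(tweets, night_start=22, night_end=6, precision=2):
    """Guess a home location as the most frequent night-time grid cell.

    `tweets` is a list of (hour, latitude, longitude) tuples. Coordinates are
    rounded to `precision` decimals so nearby points fall into the same cell.
    """
    cells = Counter()
    for hour, lat, lon in tweets:
        if hour >= night_start or hour < night_end:
            cells[(round(lat, precision), round(lon, precision))] += 1
    return cells.most_common(1)[0][0] if cells else None


# Hypothetical geotagged tweets for one user: (hour of day, lat, lon).
tweets = [
    (23, 43.1566, -77.6088),
    (2, 43.1571, -77.6079),
    (14, 43.1284, -77.6300),
    (1, 43.1569, -77.6091),
    (20, 43.1300, -77.6200),
]
home = infer_home(tweets)
```

With these points the three night-time tweets share one grid cell, so `home` is that cell; the two daytime tweets are ignored. A tweet classified as drinking could then be compared with `home` to tell drinking at home from drinking elsewhere.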


