Inferring Travel from Social Media. Alessio Signorini <alessio-signorini@uiowa.edu>, Alberto Maria Segre <alberto-segre@uiowa.edu>, Philip Polgreen <philip-polgreen@uiowa.edu>
ONCE UPON A TIME...
H1N1 TWEET VOLUME [Chart annotations: CDC recommends canceling travel plans; Pandemic level raised to 5; Number of confirmed cases reaches 1000]
AMERICAN IDOL 2009: More Positive Tweets about Kris Allen [Chart: % positive tweets by day, ALLEN vs. LAMBERT]
REAL-TIME ILI% ESTIMATE: Reported vs. Predicted Weekly ILI%, Flu Season 2009-2010, United States [Chart: % ILI by week, 09/40 to 10/20, Predicted vs. Reported] 1-fold validation error: avg=0.28%, min=0.04%, max=0.93%, std=0.23%
DEFINITELY A GOOD IDEA!
SICK PEOPLE STILL TRAVEL: CURRENT FLU MAP + TRAVEL MODEL
TRAVEL MODELS: CENSUS, TRAFFIC, TICKETS, MONEY, CELL PHONES
GPS ADDED TO CELL PHONES STEVE JOBS
LOCATION-BASED APPS
FOCUSING ON THE MOST POPULAR: CHECK-IN TO PLACES TO EARN BADGES; TWEETS CAN BE GEO-LOCATED
FOLLOW PEOPLE EVERYWHERE: RESTAURANT, BAR, DOCTOR, GYM, OFFICE, STARBUCKS
DATA COLLECTED: Number of Locations: 76 MILLION; Number of Users: 6 MILLION
DATA CLEANUP: CASUAL USERS (<5 locations); TOO FREQUENT (>1 every 5 secs); TOO FAST (>1800 km/h)
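The three cleanup rules on this slide could be applied roughly as in the sketch below. It is only an illustration under stated assumptions: the record layout `(user_id, unix_seconds, lat, lon)`, the function names, the choice to count distinct coordinates as "locations", and the choice to drop the offending check-in rather than the whole user are all hypothetical, since the slide gives only the thresholds.

```python
import math
from collections import defaultdict

# Thresholds taken from the slide.
MIN_LOCATIONS = 5      # casual users: fewer than 5 locations
MIN_GAP_SECONDS = 5    # too frequent: more than 1 check-in every 5 seconds
MAX_SPEED_KMH = 1800   # too fast: implied speed above 1800 km/h


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def clean_checkins(checkins):
    """Return {user_id: [(unix_seconds, lat, lon), ...]} after cleanup.

    checkins: iterable of (user_id, unix_seconds, lat, lon) tuples
    (an assumed layout, not the authors' format).
    """
    by_user = defaultdict(list)
    for user, ts, lat, lon in checkins:
        by_user[user].append((ts, lat, lon))

    cleaned = {}
    for user, points in by_user.items():
        points.sort()  # chronological order
        # Rule 1: skip casual users (assumption: count distinct coordinates).
        if len({(lat, lon) for _, lat, lon in points}) < MIN_LOCATIONS:
            continue
        kept = [points[0]]
        for ts, lat, lon in points[1:]:
            prev_ts, prev_lat, prev_lon = kept[-1]
            gap = ts - prev_ts
            # Rule 2: drop check-ins that follow the previous one too closely.
            if gap < MIN_GAP_SECONDS:
                continue
            # Rule 3: drop check-ins that imply an impossible speed.
            speed_kmh = haversine_km(prev_lat, prev_lon, lat, lon) / (gap / 3600.0)
            if speed_kmh > MAX_SPEED_KMH:
                continue
            kept.append((ts, lat, lon))
        cleaned[user] = kept
    return cleaned
```

Comparing each check-in against the last kept one (rather than the last seen one) means a single bad point does not cause the next valid point to be rejected as well.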
DISTANCE TRAVELLED [Chart: % Trips and % Cumulative by distance bucket: < 1 mile, 1 < 10 miles, 10 < 100 miles, 100 < 1000 miles, 1000 < 10000 miles. Cumulative data labels: 50%, 85%, 97%, 99%, 100%]
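A distance breakdown like this one can be computed by bucketing the great-circle distance between consecutive check-ins. The sketch below is a minimal illustration, not the authors' code: the bucket edges come from the chart's axis labels, while the function names and the input (a plain list of trip distances in miles) are assumptions.

```python
import math
from bisect import bisect_right

# Upper edges of the buckets shown on the chart's x-axis, in miles.
BUCKET_EDGES_MILES = [1, 10, 100, 1000, 10000]
BUCKET_LABELS = [
    "< 1 mile",
    "1 < 10 miles",
    "10 < 100 miles",
    "100 < 1000 miles",
    "1000 < 10000 miles",
]


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points."""
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def distance_histogram(distances_miles):
    """Return [(label, % of trips, cumulative %), ...] for the chart's buckets."""
    counts = [0] * len(BUCKET_LABELS)
    for d in distances_miles:
        i = bisect_right(BUCKET_EDGES_MILES, d)  # index of the bucket containing d
        if i < len(counts):  # trips of 10000 miles or more fall outside the chart
            counts[i] += 1
    total = sum(counts)
    rows, running = [], 0.0
    for label, count in zip(BUCKET_LABELS, counts):
        share = 100.0 * count / total if total else 0.0
        running += share
        rows.append((label, share, running))
    return rows
```

For example, `distance_histogram([0.2, 0.5, 3, 50])` puts half of the trips in the first bucket and a quarter in each of the next two.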
TIME INTERVAL [Chart: % Trips and % Cumulative by time between check-ins: 10s, 30s, 1m, 2m, 5m, 10m, 15m, 30m, 1h, 2h, 6h, 12h, 1d, 2d, 1w. Cumulative data labels: 0%, 1%, 4%, 8%, 15%, 21%, 24%, 31%, 38%, 46%, 59%, 69%, 81%, 89%, 97%]
TRIPS vs. DISTANCE [Chart: % Trips and Miles by day of week, Monday to Sunday. Data labels as extracted (day order not recoverable): Miles 22.3, 22.0, 21.6, 21.2, 20.8, 19.8, 18.5; % Trips 17%, 16%, 16%, 14%, 14%, 12%, 11%]
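A per-weekday summary of this kind (share of trips and average miles per trip) could be derived as sketched below. The input layout `(unix_seconds, miles)`, the function name, and the use of UTC to assign a weekday are all assumptions made for illustration; the slide does not say which time zone or aggregation the authors used.

```python
from collections import defaultdict
from datetime import datetime, timezone

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def weekday_summary(trips):
    """Return {day: (% of all trips, mean miles per trip)} for days with data.

    trips: iterable of (unix_seconds, miles) pairs (an assumed layout).
    """
    count = defaultdict(int)
    miles = defaultdict(float)
    for ts, distance in trips:
        # Assumption: weekdays are assigned in UTC; local time may be more apt.
        day = DAYS[datetime.fromtimestamp(ts, tz=timezone.utc).weekday()]
        count[day] += 1
        miles[day] += distance
    total = sum(count.values())
    return {
        day: (100.0 * count[day] / total, miles[day] / count[day])
        for day in DAYS
        if count[day]
    }
```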
TYPICAL NEW YORK CITY DAY 6 AM 2 PM 8 PM
TRACKING INDIVIDUALS
AGGREGATES BETWEEN U.S. STATES
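State-to-state aggregates can be built by counting, for every user, each move between consecutive check-ins that fall in different states. The sketch below assumes each check-in has already been mapped to a state code (that reverse-geocoding step is not shown), and the function name and input layout are hypothetical.

```python
from collections import Counter


def state_flows(visits_by_user):
    """Count moves between different states across all users.

    visits_by_user: {user_id: [state_code, ...]} in chronological order
    (an assumed layout; states are taken as already resolved from coordinates).
    Returns a Counter keyed by (origin_state, destination_state).
    """
    flows = Counter()
    for states in visits_by_user.values():
        # Pair each check-in with the next one for the same user.
        for origin, destination in zip(states, states[1:]):
            if origin != destination:  # ignore moves within a state
                flows[(origin, destination)] += 1
    return flows
```

Keeping origin and destination ordered preserves direction, so flows from Iowa to Illinois and back are counted separately.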
WHERE TO GET MORE INFORMATION: Alessio Signorini <alessio-signorini@uiowa.edu> (http://www.cs.uiowa.edu/~asignori/); UIOWA Computational Epidemiology Group (http://compepi.cs.uiowa.edu). The paper and datasets will soon be available on the CompEpi website.