Not All Apps Are Created Equal: Analysis of Spatiotemporal Heterogeneity in Nationwide Mobile Service Usage
Cristina Marquez and Marco Gramaglia (Universidad Carlos III de Madrid); Marco Fiore (CNR-IEIIT); Albert Banchs (Universidad Carlos III de Madrid and Institute IMDEA Networks); Cezary Ziemlicki and Zbigniew Smoreda (Orange Labs)
INTRODUCTION
• Current understanding of mobile service usage:
  – Superficial
  – Restricted to a small set of coarse-grained datasets
• Aim: characterize the usage of mobile services at a national scale, given a large dataset
• Analysis of the traffic behavior of services across time and space
• Motivation:
  – Properly dimensioning and orchestrating the mobile network
  – Supporting data mining techniques
  – Understanding social behaviors
DATASET
• Collected in the Orange core network
• Data recorded by passive probes at the Gn and S5/S8 interfaces of the GGSN and P-GW
• 1 week, starting September 24, 2016
• User population: ~30 million individuals
• Distributed over > 550,000 km²
• Granularity of 5 minutes
• ~25,000 base stations, distributed over > 36,000 communes (~16 km² each)
• Aim: a mobile service overview, so we aggregated the data per commune
DATASET: DEEP VIEW
Traffic records:

time        commune  service  ul     dl
1475067900  01001    5        3780   151200
1475078100  .        6        26875  328412
1475094840  .        7        21768  715481
1475051700  .        8        5654   111236
1475063520  97424    9        2584   20596

Service identifiers:

service  Description
1        YouTube WEB
.        Instagram
.        Web Advertising
.        Wikipedia
500      Shazam

• 500 distinct services in 7 macro-categories: an extensive dataset with high granularity
• Selection of the 20 main categories (the most representative)
• Example category, YouTube: YouTube WEB, YouTube Streaming HTTP, YouTube TLS, YouTube Streaming MP4, YouTube Apple
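The per-commune aggregation described on the dataset slides could be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes that `time` is a Unix timestamp in seconds and that `ul` and `dl` are additive traffic counters. The `CATEGORY_OF` mapping, the sample records and the function name `aggregate` are all hypothetical.

```python
from collections import defaultdict

BIN_SECONDS = 5 * 60  # 5-minute granularity, as stated on the dataset slide

# Hypothetical mapping from service ID to category (illustrative IDs only).
CATEGORY_OF = {1: "YouTube", 2: "YouTube", 3: "Instagram"}


def aggregate(records):
    """Sum uplink/downlink volume per (time bin, commune, category)."""
    totals = defaultdict(lambda: [0, 0])
    for time, commune, service, ul, dl in records:
        bin_start = time - time % BIN_SECONDS  # floor to the 5-minute bin
        category = CATEGORY_OF.get(service, "Other")
        key = (bin_start, commune, category)
        totals[key][0] += ul
        totals[key][1] += dl
    return {key: tuple(value) for key, value in totals.items()}


# Illustrative records in the (time, commune, service, ul, dl) layout.
records = [
    (1475067900, "01001", 1, 3780, 151200),
    (1475067960, "01001", 2, 1000, 50000),   # same bin, same category
    (1475067930, "01001", 3, 500, 2000),     # same bin, other category
    (1475068200, "01001", 1, 10, 20),        # next 5-minute bin
]
result = aggregate(records)
```

Grouping several service IDs under one category mirrors the YouTube example, where multiple YouTube services are treated as a single category.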
ANALYSIS
TIME SERIES ANALYSIS
• Focus on the weekly demand of each traffic category, aggregated over communes
• Each time series shows a variety of fluctuations
• In all cases, activity is higher during the day and reduced at night
• Distinctive dichotomy between weekdays and weekends
• Temporal patterns differ between categories, including between similar services
• Examples shown: Apple Store, YouTube, SnapChat, Facebook
ARE THEY REALLY SIMILAR?
• K-Shape time series clustering: check the goodness of fit with distinct quality indices versus the number of clusters k
  – Davies-Bouldin (top graphs): to be minimized
  – Dunn, Silhouette (bottom graphs): to be maximized
• Shown for uplink and downlink, with all possible k considered
• Best option? 19 clusters. With 20 categories, almost every category ends up in its own cluster
• Answer: NOT QUITE SIMILAR!
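To make the two kinds of index concrete, here is a minimal sketch of the Davies-Bouldin index (lower is better) and the Dunn index (higher is better). It swaps in plain Euclidean distance for simplicity, whereas K-Shape relies on a shape-based distance, so it illustrates the indices only and does not reproduce the clustering on the slide. The toy clusters and function names are assumptions.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Element-wise mean of the series in one cluster."""
    return [sum(col) / len(points) for col in zip(*points)]


def davies_bouldin(clusters):
    """Average worst-case ratio of within-cluster scatter to centroid separation."""
    cents = [centroid(c) for c in clusters]
    scatter = [sum(dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, cents)]
    k = len(clusters)
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in range(k) if j != i)
             for i in range(k)]
    return sum(worst) / k


def dunn(clusters):
    """Smallest inter-cluster distance divided by the largest cluster diameter.

    Assumes at least one cluster holds two distinct points (non-zero diameter).
    """
    k = len(clusters)
    min_sep = min(dist(p, q)
                  for i in range(k) for j in range(i + 1, k)
                  for p in clusters[i] for q in clusters[j])
    max_diam = max(dist(p, q) for c in clusters for p in c for q in c)
    return min_sep / max_diam


# Toy 2-point "series": a compact grouping versus a deliberately poor one.
good = [[[0, 0], [0, 1]], [[10, 0], [10, 1]]]
bad = [[[0, 0], [10, 0]], [[0, 1], [10, 1]]]
```

Evaluating both indices for every candidate k and looking for a low Davies-Bouldin value together with high Dunn and Silhouette values is one way to pick the number of clusters.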
PEAKS DETECTED
• Same macro-category, different behavior
• Example shown: Apple Store
SERVICE USAGE GEOGRAPHY
• Significant peaks of activity also in space
• Similar geographical pattern across services, except 2 outliers
• Maps of bytes per subscriber shown for Twitter and NetFlix
• Map annotations: "It is ubiquitous", "It is used outdoors"
DOES SPACE INFLUENCE TIME DYNAMICS?
• Figure: INSEE urbanization distribution
ARE TIME SERIES RELATED?
• Correlation of mobile service time series across urbanization levels
• Each bar shows the average r² value
• In all cases but TGV, the correlation is extremely high: the urbanization level has little impact on the temporal dynamics of category usage
• Service usage changes when people are aboard the TGV, where it depends on the train's schedule
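As an illustration of the correlation measure, the sketch below computes the squared Pearson correlation between two time series. It assumes the value averaged in the bars is this coefficient of determination. The series values and the function name `r_squared` are made up for the example.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Illustrative demand samples for one category at three urbanization levels.
urban = [1, 2, 3, 4]
rural = [2, 4, 6, 8]   # same shape as urban, different scale
tgv = [4, 1, 3, 2]     # different shape

same_shape = r_squared(urban, rural)
different_shape = r_squared(urban, tgv)
```

Because r² ignores scale, two areas with very different traffic volumes can still be perfectly correlated in time, which is why volume is examined separately.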
SIMILAR USAGE IN TERMS OF BYTES/SUBSCRIBER?
• Slope of the least-squares regression of per-subscriber time series
• Findings:
  – Semi-urban and urban areas show similar per-subscriber service usage levels
  – Subscribers in rural areas consume around half of the mobile service data consumed in urban areas
  – Users on the TGV generate on average twice or more the traffic volume of users in urban areas
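The slope comparison can be illustrated with a small least-squares fit. This sketch assumes each urbanization level is regressed against the urban per-subscriber series, so a slope of 0.5 reads as "half the urban usage". That pairing, the sample values and the name `ls_slope` are assumptions made for illustration.

```python
def ls_slope(x, y):
    """Slope b of the ordinary least-squares fit y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Illustrative bytes-per-subscriber samples over the same time bins.
urban = [10, 20, 30, 40]
rural = [5, 10, 15, 20]
tgv = [20, 40, 60, 80]

rural_vs_urban = ls_slope(urban, rural)
tgv_vs_urban = ls_slope(urban, tgv)
```

A slope near 1 would indicate similar per-subscriber usage, as reported for semi-urban versus urban areas.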
CONCLUSIONS
• We studied, at a national scale, the temporal, spatial and hybrid dynamics of mobile services at category granularity, finding new and interesting macroscopic properties of traffic
• Findings:
  – No two services exhibit similar time patterns
  – Mobile services have very comparable geographical distributions
  – The urbanization level influences how much users consume mobile services, but has limited influence on when they do so
  – Unique time dynamics on high-speed trains
Cristina Marquez, mcmarque@pa.uc3m.es
Dec 13th, 2017, Not All Apps Are Created Equal
3G/4G NETWORK
• Data recorded by passive probes at the Gn and S5/S8 interfaces of the GGSN and P-GW
• DPI techniques classify 88% of the mobile traffic
• IP sessions are geo-referenced by examining the ULI (User Location Information)