APPLYING WEAK SUPERVISION TO MOBILE SENSOR DATA: EXPERIENCES WITH TRANSPORT MODE DETECTION
Jonathan Fürst¹, Mauricio Fadel Argerich¹, Kalyanaraman Shankari², Gürkan Solmaz¹, Bin Cheng¹
jonathan.fuerst@neclab.eu, mauricio.fadel@neclab.eu, bin.cheng@neclab.eu
¹ NEC Labs Europe, ² UC Berkeley
AGENDA
• ML in IoT
• Our domain
• Transport Mode Detection
• Weak Supervision for Transport Mode Detection
• Evaluation & Results
• Takeaways & Future Work
ML IN IOT
• IoT is expanding to new domains
• ML is essential to exploit the power of IoT
• Data quality challenges: labeled data is very expensive
• But we can label data using noisy programmable functions that express external knowledge (e.g., location, time, domain knowledge), and then re-train our model
[Figure: labeled data feeds an ML model that is re-trained over time (t0, t1), with labels produced from external knowledge and location instead of manual annotation]
OUR DOMAIN: TRANSPORT
• The city of Heidelberg wants to improve public transportation
• They need insights about how people move in the city
• Our solution was to create a mobile app
  • Citizens get transport recommendations (individual travel insights)
  • The city gets an aggregated view of transportation (overall travel insights)
• We need to know:
  • Location of users (start, trajectory and end point) → GPS
  • Transport mode of users → manually labeled? Or can we infer it?
TRANSPORT MODE DETECTION
• Transport mode detection is fundamental for optimizing urban multimodal human mobility
• It requires two steps:
  1. Segmentation
  2. Classification
• Current studies have used GPS, accelerometer, barometer and GIS data to train supervised ML models
  • If data is labeled manually: training sets are small (labeling is expensive) and the model overfits
  • If data is labeled semi-automatically: the training set can be much larger, with less overfitting
• The more data we have, the less per-sample data quality we need
• But data quality is limited by battery and OS constraints
• Our take: improve data availability using weak supervision
WEAK SUPERVISION FOR TRANSPORT MODE DETECTION
[Pipeline overview]
• Pre-processing phase: smartphone data collection (Location APIs, Activity Detection APIs) → filtering and resampling → trip segmentation → dwell-time-supported walk-point segmentation → candidate section segmentation
• Label & training phase: labeling functions F1([s1, s2, ..., si]) ... Fn([s1, s2, ..., si]) → learn generative model → train transport mode classifier
• Classification phase: transport mode classification → re-segment sections based on classified modes → classified trips & sections delivered to the user's smartphone
MOBILE SENSOR DATA COLLECTION
• Collecting data from mobile sensors drains a lot of battery
  • Sensing location with GPS
  • Accelerometer and barometer → high sampling frequency
• Instead, we use the native Android and iOS APIs (Location and Activity)
  • Highly optimized for battery consumption
  • BUT the sensor data is sparse and noisy
TIME SERIES SEGMENTATION
1. Filter and re-sample the data
   • Sparse data; location and activity samples are not aligned
   • No fixed sampling interval
2. Segment the time series into trips
   • Dwell-time heuristics (e.g., Trip 1 vs. Trip 2)
3. Segment trips into sections
   • Walk-point-based (e.g., Segment 1 vs. Segment 2 within a trip)
A sketch of these three steps follows below.
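The following is a minimal sketch of the three pre-processing steps in Python/pandas. The column names (ts, lat, lon, activity), the 10-minute dwell threshold and the 30 s resampling interval are illustrative assumptions rather than the exact values of our pipeline, and resampling is done per trip so that forward filling does not bridge dwell gaps.

```python
# Minimal pre-processing sketch (assumed column names and thresholds).
import pandas as pd

DWELL_GAP = pd.Timedelta(minutes=10)   # assumed dwell-time threshold between trips


def segment_trips(samples: pd.DataFrame) -> list:
    """Split the sparse time series into trips wherever a dwell gap occurs."""
    samples = samples.sort_values("ts")
    trip_ids = (samples["ts"].diff() > DWELL_GAP).cumsum()
    return [trip for _, trip in samples.groupby(trip_ids)]


def resample_trip(trip: pd.DataFrame, freq: str = "30s") -> pd.DataFrame:
    """Align the location and activity samples of one trip on a fixed interval."""
    return (trip.set_index("ts")
                .resample(freq)
                .ffill()
                .dropna()
                .reset_index())


def segment_sections(trip: pd.DataFrame) -> list:
    """Split a trip into candidate sections at walk points; simplified here to
    consecutive runs of the same detected activity."""
    run_ids = (trip["activity"] != trip["activity"].shift()).cumsum()
    return [section for _, section in trip.groupby(run_ids)]


# Usage (assuming `samples` is a DataFrame of raw phone readings with a
# datetime column `ts` and an `activity` column from the Activity API):
# sections = [s for t in segment_trips(samples)
#             for s in segment_sections(resample_trip(t))]
```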
LABELING, TRAINING AND CLASSIFICATION
• Example labeling heuristics:
  • "If the maximum speed of a segment is less than 3 m/s, then it's probably a walking segment."
  • "Instead, if it's higher than 3 m/s but less than 10 m/s, then it's probably a bike segment."
• We use Data Programming (Ratner et al. 2017)
• Labeling functions:
  • Programmable functions that use external knowledge
  • Cast a (noisy) vote on each data point
• The votes create a labeling matrix (LM), e.g. for labeling functions 1-3:
  • data point 1: [ 0  1  1]
  • data point 2: [ 0  1  0]
  • data point 3: [ 0 -1  0]
  • data point 4: [ 1 -1 -1]
• LM + labeling propensity + accuracy + correlation → generative model
• We label the data points with the generative model and use them to train an end model (see the sketch below)
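A minimal sketch of this step using the Snorkel library, which implements data programming. The class encoding, the max_speed feature and the third labeling function are illustrative assumptions; the first two labeling functions mirror the heuristics quoted above.

```python
# Data-programming sketch with Snorkel (assumed feature names and class encoding).
import pandas as pd
from snorkel.labeling import labeling_function, PandasLFApplier
from snorkel.labeling.model import LabelModel

ABSTAIN, WALK, BIKE, CAR, TRAIN = -1, 0, 1, 2, 3


@labeling_function()
def lf_walk_speed(section):
    # max speed below 3 m/s -> probably walking
    return WALK if section.max_speed < 3.0 else ABSTAIN


@labeling_function()
def lf_bike_speed(section):
    # max speed between 3 m/s and 10 m/s -> probably bike
    return BIKE if 3.0 <= section.max_speed < 10.0 else ABSTAIN


@labeling_function()
def lf_car_speed(section):
    # high speed -> probably car (illustrative, not from the slide)
    return CAR if section.max_speed >= 15.0 else ABSTAIN


lfs = [lf_walk_speed, lf_bike_speed, lf_car_speed]

# Toy sections; real features come from the pre-processing phase.
sections = pd.DataFrame({"max_speed": [2.1, 7.5, 22.0, 1.2]})

# Apply all labeling functions: their votes form the labeling matrix L.
L = PandasLFApplier(lfs=lfs).apply(df=sections)

# The generative label model estimates accuracies/correlations of the
# labeling functions and turns the noisy votes into training labels.
label_model = LabelModel(cardinality=4, verbose=False)
label_model.fit(L, n_epochs=500, seed=42)
weak_labels = label_model.predict(L, tie_break_policy="abstain")
```

The end classifier is then trained on these weak labels (or on the probabilistic labels from predict_proba) instead of on manual annotations.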
EVALUATION & RESULTS (1)
• Our data
  • 8 users collected data for 4 months: 300k data points
  • Features: GPS location (through the iOS and Android Location APIs) and accelerometer-based activity data (through the Activity APIs)
  • Users partially labeled their data using a visual labeling tool
  • 4 transport modes: walk, bike, car, train
  • Train/test split: 50/50
• We implemented 7 labeling functions using
  • Sensed speed (S)
  • Velocity calculated from GPS (V)
  • Activity data (A)
  • OpenStreetMap, to check train stops (OSM)
• We tested the generative model accuracy with different sets of labeling functions
[Chart: generative model accuracy by labeling function set — V: 64.00%, V+S: 70.35%, V+S+A: 72.40%, V+S+A+OSM: 74.10%]
EVALUATION & RESULTS (2)
• We label all the training data using the generative model and train a Random Forest (WS-RF) and a Neural Network (WS-NN)
• We also train a Random Forest using only the manually labeled data from users (Sup-RF)
[Chart: F1 score — Gen. Model: 74.10%, WS-RF: 81.00%, WS-NN: 80.20%, Sup-RF: 78.40%]
A rough end-to-end sketch of this comparison follows below.
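A rough, self-contained sketch of this comparison, using synthetic placeholder features, labeling matrix and labels (so the printed scores will not reproduce the numbers above); in the real pipeline these come from the pre-processing phase and the 7 labeling functions. A WS-NN can be trained the same way, e.g. with sklearn's MLPClassifier.

```python
# End-model comparison sketch: weakly supervised RF vs. supervised RF.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from snorkel.labeling.model import LabelModel

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))            # placeholder section features
L = rng.integers(-1, 4, size=(2000, 7))   # placeholder labeling matrix (7 LFs, 4 classes)
y = rng.integers(0, 4, size=2000)         # placeholder ground truth (user labels)

# 50/50 train/test split, as in the evaluation.
X_tr, X_te, L_tr, L_te, y_tr, y_te = train_test_split(
    X, L, y, test_size=0.5, random_state=0
)

# Generative model produces weak labels for the whole training half.
label_model = LabelModel(cardinality=4, verbose=False)
label_model.fit(L_tr, n_epochs=500, seed=0)
y_weak = label_model.predict(L_tr, tie_break_policy="random")

# WS-RF: trained on the weak labels (large training set).
ws_rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_weak)

# Sup-RF: trained only on a small manually labeled subset (simulated here
# by sampling 10% of the training half).
manual_idx = rng.choice(len(X_tr), size=len(X_tr) // 10, replace=False)
sup_rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(
    X_tr[manual_idx], y_tr[manual_idx]
)

print("WS-RF  F1:", f1_score(y_te, ws_rf.predict(X_te), average="weighted"))
print("Sup-RF F1:", f1_score(y_te, sup_rf.predict(X_te), average="weighted"))
```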
LESSONS LEARNT & FUTURE WORK
• Extensive manual labeling is not necessary for IoT data if we use external knowledge
  • Domain/expert knowledge
  • Physical knowledge
• Access to external knowledge is not always easy
• Open question: the granularity at which IoT time series should be labeled
• We will gather more data to continue the evaluation of our application in Heidelberg
• We will evaluate our approach with data from other cities to test its generalizability
Thank you! mauricio.fadel@neclab.eu bin.cheng@neclab.eu jonathan.fuerst@neclab.eu