SensePresence: Infrastructure-less Occupancy Detection for Opportunistic Sensing Applications
16th IEEE International Conference on Mobile Data Management
Md Abdullah Al Hafiz Khan, H M Sajjad Hossain, Nirmalya Roy
Mobile, Pervasive & Sensor Computing Lab, University of Maryland Baltimore County
June 15, 2015
Md Abdullah Al Hafiz Khan (UMBC) SensePresence June 15, 2015

Outline
1 Motivation
◮ Class Participation
◮ Social Gathering
◮ Group member's participation
◮ Solutions
2 Ubiquitous Voice Sensing
3 Ubiquitous Sensing
◮ What we have already?
◮ Voice Centric Sensing
4 Goals and Challenges
5 Overview of SensePresence
◮ Methodology
6 Speaker Counting Algorithm
◮ Locomotive Counting
7 Experimental Setup and Results
8 Discussion & Future Work

Which class is interactive?
Helps to solve problems and theories.
Helps gain knowledge.
Total interactive participants.

How many people are there?
Is the party enjoyable?

Group member's participation
How many people participate in the meeting?
Do all the members participate?

Solutions
People Count!
◮ Which class is interactive?
⋆ Check how many students ask questions.
◮ Where is the party?
⋆ Find the place where most people speak.
◮ Is the meeting effective?
⋆ How many members participate?
Microphone + Accelerometer Sensors

Ubiquitous Sensing
[Figure: wearable and mobile devices: Brace, Smart Watch, Necklace, Phone]
Which sensors are available today?
Which smart devices belong to everyone?

What we have already?
Accelerometer
Microphone
Gyroscope
etc.

Voice Centric Sensing
[Figure: applications of voice-centric sensing: Speaker Counting, Speaker Identification, Emotion Detection, Stress Detection, Speaker Recognition]
What are the different types of applications that use voice-centric sensing?
"Blind Speaker clustering", Iyer, IEEE ISPACS (2006)
"Crowd++: Unsupervised speaker count with smartphones", Chenren Xu, UbiComp (2013): static segmentation, controlled scenario where all speakers are active

Goals and Challenges
Challenge                          Solution
No prior knowledge of speakers     Best feature extraction
Background noise                   Filter
Some people might remain silent    Other sensor (accelerometer)
Speech overlap                     Overlap detection
Privacy concern                    Use encryption (steganographic, stego)

SensePresence Architecture
[Figure: system architecture diagram]
Mobile-side Architecture (n mobile phones)
◮ Microphone, Accelerometer
◮ AFP Trigger, Sink, Signature Collection
◮ Pre-Processing, Estimate Proximity
Server-side Architecture
◮ Pitch Estimation, Gender Detection, Feature Extraction (MFCC)
◮ Node List, Optimum Node
◮ Speaker Count, Speaker Estimation, Occupancy Estimation
Context models
◮ Acoustic Context Model (ACM)
◮ Locomotive Context Model (LCM)
◮ Occupancy Context Model (OCM)
Output: People Count

Methodology
Acoustic methodology
◮ Create segments from the raw audio
◮ Find male and female segments
◮ Audio processing
Locomotive methodology
◮ Select sensor data based on the speaker count and node list
◮ Calculate the magnitude
◮ Detect abrupt changes in the signal
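
The "Calculate the magnitude" step of the locomotive methodology can be sketched as follows. This is a minimal illustration, assuming each accelerometer sample is an (x, y, z) triple and that the magnitude is the Euclidean norm; the function name and the sample values are hypothetical and not taken from the slides.

```python
import math

def magnitude(samples):
    """Return the Euclidean magnitude of each (x, y, z) accelerometer sample."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]

# Made-up readings, for illustration only.
readings = [(0.0, 0.0, 9.8), (3.0, 4.0, 0.0)]
print(magnitude(readings))
```

Collapsing the three axes into one magnitude signal makes the later change detection independent of how the phone is oriented.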

Case 1: When People are Conversing
Dynamic Segmentation
[Figure: Confidence Score vs. Segment Length; x-axis: segment length, 0.32 to 3.36 seconds; y-axis: confidence score, 0.91 to 1.01]
What is the minimum or maximum segment length?
Prefer the segment length with the higher confidence score
Which segment should be chosen when multiple segments have the same confidence (e.g. 2.72 vs. 3.36 seconds)?

Gender Detection
Calculate the pitch
The human voice ranges from 50 Hz to 450 Hz
Male pitch falls between 100 Hz and 146 Hz
Female pitch falls between 188 Hz and 221 Hz
Build the male and female segment sets
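
The pitch ranges above can be turned into a simple segment classifier. The sketch below assumes a pitch estimate in Hz is already available for each segment; how segments with a pitch outside both ranges are handled is not stated on the slide, so the "unknown" label and all function names here are assumptions.

```python
MALE_RANGE_HZ = (100.0, 146.0)    # from the slide
FEMALE_RANGE_HZ = (188.0, 221.0)  # from the slide

def classify_pitch(pitch_hz):
    """Label a pitch estimate as 'male', 'female', or 'unknown' (assumed fallback)."""
    if MALE_RANGE_HZ[0] <= pitch_hz <= MALE_RANGE_HZ[1]:
        return "male"
    if FEMALE_RANGE_HZ[0] <= pitch_hz <= FEMALE_RANGE_HZ[1]:
        return "female"
    return "unknown"

def split_segments(segment_pitches):
    """Split {segment_id: pitch_hz} into male and female segment id lists."""
    male = [s for s, p in segment_pitches.items() if classify_pitch(p) == "male"]
    female = [s for s, p in segment_pitches.items() if classify_pitch(p) == "female"]
    return male, female

# Hypothetical pitch estimates, for illustration only.
male_set, female_set = split_segments({"s1": 120.0, "s2": 200.0, "s3": 160.0})
print(male_set, female_set)
```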

Audio Processing
[Figure: pipeline: Audio Signal, Dynamic Segmentation, Filter (300 Hz to 4 kHz), Framing, Windowing, Frames]
Hamming window (50% overlap)
Frame length: 32 ms
Band-pass filter (300 Hz to 4000 Hz)
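
The framing and windowing steps can be sketched as below, using the 32 ms frame length and 50% overlap from this slide and the 16 kHz audio sampling rate reported in the experimental setup. This is an illustrative sketch only: the band-pass filter is omitted, and the function names are hypothetical.

```python
import math

def hamming(n):
    """Standard Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def frame_signal(signal, sample_rate=16000, frame_ms=32, overlap=0.5):
    """Cut a signal into overlapping frames and apply a Hamming window to each."""
    frame_len = int(sample_rate * frame_ms / 1000)  # 512 samples at 16 kHz
    hop = int(frame_len * (1.0 - overlap))          # 256 samples for 50% overlap
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames

# A constant dummy signal, for illustration only.
frames = frame_signal([1.0] * 1024)
print(len(frames), len(frames[0]))
```

Trailing samples that do not fill a complete frame are dropped in this sketch; padding them instead would be an equally reasonable choice.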

Mel-frequency cepstral coefficients
[Figure: pipeline: Frames, DFT, Mel-filter Bank, Discrete Cosine Transform, Delta Energy & Spectrum, Output: MFCC]
Take the Fourier transform of each frame
Apply a triangular mel-filter bank to map the power of the spectrum, then take the log
Apply the discrete cosine transform
The amplitudes of the resulting spectrum are the MFCCs
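
The steps above can be sketched in plain Python as follows. This is a rough illustration rather than the authors' implementation: the number of filters (20), the number of coefficients (12), the mel-scale formula, and all function names are assumptions, while the 300 Hz to 4000 Hz band and the 16 kHz sampling rate come from the slides. The delta and energy features shown in the diagram are not computed here.

```python
import cmath
import math

def hz_to_mel(freq_hz):
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive O(n^2) DFT power spectrum; acceptable for one short frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum

def mel_filterbank(num_filters, n_fft, sample_rate, low_hz, high_hz):
    """Triangular filters with centers spaced evenly on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    bins = [int((n_fft + 1) * mel_to_hz(low_mel + i * step) / sample_rate)
            for i in range(num_filters + 2)]
    bank = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        weights = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):       # rising edge
            weights[k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge
            weights[k] = (right - k) / (right - center)
        bank.append(weights)
    return bank

def mfcc(frame, sample_rate=16000, num_filters=20, num_coeffs=12,
         low_hz=300.0, high_hz=4000.0):
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate, low_hz, high_hz)
    # Log filter-bank energies, floored to avoid log(0).
    log_e = [math.log(max(sum(w * p for w, p in zip(ws, spectrum)), 1e-10)) for ws in bank]
    # DCT-II of the log energies.
    return [sum(e * math.cos(math.pi * i * (j + 0.5) / num_filters)
                for j, e in enumerate(log_e))
            for i in range(num_coeffs)]

# A synthetic 1 kHz tone, one 32 ms frame at 16 kHz, for illustration only.
frame = [math.sin(2.0 * math.pi * 1000.0 * t / 16000.0) for t in range(512)]
coeffs = mfcc(frame)
print(len(coeffs))
```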

Segment Sets Sorting
[Figure: segments s1, s2, s3, ..., sn pass through "Calculate intra-frame angles" and "Sort segments", giving a reordered list such as s1, s4, sn, s3]
Calculate the intra-frame cosine angles
Take the average of the intra-frame angles
Sort the segments by average angle
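
A possible reading of the sorting step is sketched below. It assumes each segment is a list of per-frame feature vectors (for example MFCC vectors) and that "intra-frame angle" means the cosine angle between consecutive frames of the same segment; both points are assumptions, as are the function names and the toy data.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between two non-zero vectors, from their cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))

def average_intra_angle(frames):
    """Mean angle between consecutive frames of one segment (needs >= 2 frames)."""
    angles = [angle_deg(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]
    return sum(angles) / len(angles)

def sort_segments(segments):
    """Return segment ids ordered by increasing average intra-frame angle."""
    return sorted(segments, key=lambda name: average_intra_angle(segments[name]))

# Toy two-dimensional "feature vectors", for illustration only.
segments = {
    "s1": [[1.0, 0.0], [0.0, 1.0]],              # 90 degrees apart
    "s2": [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]],  # identical frames
}
print(sort_segments(segments))
```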

Grouping of Human Speakers based on Proximity
[Figure: segments S1 to S6 grouped into Bucket 1, Bucket 2, and Bucket 3]
Calculate the inter-frame cosine distance
Segments from the same person have a distance of at most 15 degrees
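
The bucketing idea can be sketched with a simple greedy pass. The 15 degree threshold is from the slide; representing each segment by a single feature vector, comparing against the first member of each bucket, and the function names are all assumptions made for this illustration.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between two non-zero vectors, from their cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))

def group_segments(vectors, threshold_deg=15.0):
    """Greedily place each segment vector into the first bucket within the threshold."""
    buckets = []
    for vec in vectors:
        for bucket in buckets:
            # Compare with the bucket's first member (assumed representative).
            if angle_deg(vec, bucket[0]) <= threshold_deg:
                bucket.append(vec)
                break
        else:
            buckets.append([vec])
    return buckets

# Toy segment vectors, for illustration only.
vectors = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]]
buckets = group_segments(vectors)
print(len(buckets))  # number of buckets, read as the estimated speaker count
```

Under this reading, the number of buckets is the speaker count estimate.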

Case 2: People are not Conversing
Use change points to capture locomotive movements
Use change points to find stray movements
Bayesian change point detection algorithm
◮ Calculate the a priori probability of two successive change points at distance d (run length)
◮ Use a Gaussian-based log-likelihood model to compute the log-likelihood of the data sequence [s, d] where no change point has been detected
◮ Calculate the log-likelihood for the entire signal $S[t, n]$, the log-likelihood of the data sequence $S_s[t, s]$ where no change point occurs, and $\pi[i, t]$, the log-likelihood where a change point occurs
◮ Sum up the log-likelihoods for that sequence at time t
◮ Set a threshold $\delta_{th}$
◮ Count the number of change points to assign a movement score
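
To make the Gaussian segment log-likelihood concrete, the sketch below fits a Gaussian to a segment by maximum likelihood and uses it to locate a single change point by comparing "split" against "no split". It is a deliberately simplified, non-Bayesian stand-in: it has no run-length prior and no posterior probabilities, and the variance floor, minimum segment length, function names, and data are all assumptions.

```python
import math

def gaussian_loglik(segment):
    """Log-likelihood of a segment under a Gaussian fitted to that same segment."""
    n = len(segment)
    mean = sum(segment) / n
    sq = sum((x - mean) ** 2 for x in segment)
    var = max(sq / n, 1e-6)  # variance floor (assumption) to avoid log(0)
    return -0.5 * n * math.log(2.0 * math.pi * var) - 0.5 * sq / var

def best_single_split(signal, min_len=2):
    """Return (index, gain) of the split that most improves the log-likelihood."""
    no_split = gaussian_loglik(signal)
    best_index, best_gain = None, 0.0
    for s in range(min_len, len(signal) - min_len + 1):
        gain = gaussian_loglik(signal[:s]) + gaussian_loglik(signal[s:]) - no_split
        if gain > best_gain:
            best_index, best_gain = s, gain
    return best_index, best_gain

# Made-up magnitude signal with an abrupt level shift at index 5.
signal = [0.0, 0.1, -0.1, 0.05, -0.05, 5.0, 5.1, 4.9, 5.05, 4.95]
index, gain = best_single_split(signal)
print(index, round(gain, 2))
```

Because a split never lowers the fitted likelihood, the gain would normally be compared against a threshold before a change point is accepted.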

Change Point Detection Result
[Figure: two panels over the number of samples (0 to 2000): signal magnitude (0 to 30) and change point probabilities (0 to 1)]
Change points are shown with their probability values
Count the number of change points as the movement score
Set a probability threshold to discard low-probability change points
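
Counting thresholded change points can be sketched as below, assuming a per-sample change point probability is already available. The default threshold of 0.5 and the choice to count a run of consecutive above-threshold samples as one change point are assumptions; the slides do not give either detail.

```python
def movement_score(changepoint_probs, threshold=0.5):
    """Count runs of consecutive samples whose probability reaches the threshold."""
    count = 0
    inside_run = False
    for p in changepoint_probs:
        above = p >= threshold
        if above and not inside_run:
            count += 1  # a new run starts here
        inside_run = above
    return count

# Made-up probabilities, for illustration only.
probs = [0.0, 0.1, 0.9, 0.8, 0.05, 0.3, 0.7, 0.0]
print(movement_score(probs))
```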

Experimental Setup and Results
Data Collection:
◮ Natural conversation data were collected and properly anonymized (lab meetings, general discussions in the lobby and corridor)
◮ Data were collected from groups of 1 to 10 people (5 males and 5 females) aged 18 to 50 years
◮ Audio sampling rate: 16 kHz at 16-bit PCM
◮ Locomotive sampling rate: 5 kHz
Evaluation Metric:
◮ We use the average error count as the normalized predicted occupancy metric
◮ Error Count: $\frac{|EC - AC|}{N}$, where EC, AC, and N denote the estimated people count, the actual people count, and the number of samples, respectively
◮ We use the absolute value so that positive and negative errors do not cancel out
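
The evaluation metric can be sketched as follows. The slide gives the per-sample term |EC − AC| / N; summing it over all N samples to obtain the average is an assumption of this sketch, as are the function name and the example counts.

```python
def average_error_count(estimated, actual):
    """Mean absolute difference between estimated and actual people counts."""
    if not estimated or len(estimated) != len(actual):
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(e - a) for e, a in zip(estimated, actual)) / len(estimated)

# Made-up counts for three test samples.
print(average_error_count([3, 4, 2], [3, 5, 4]))
```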

Occupancy Counting Results
[Figure: two panels. Axes: Average Error Count and Average Error Count Distance, plotted against Similarity Measure Threshold (10 to 30 degrees) and Number of Speakers (2 to 4); legend: Table, Pocket]
The left figure depicts the effect of the cosine distance similarity measure
The similarity measure threshold is 15 degrees
The right figure reports an average error count distance of 0.5 with respect to different phone positions

Occupancy Counting Results
[Figure: two panels. Average Error Count Distance plotted against Distance (1 to 3 m) and against Number of Speakers (2 to 10)]
The left figure shows that the error count increases as the leader's distance from the other occupants increases
The right figure presents speaker counting performance (both overlapped and non-overlapped conversation)

Occupancy Counting Results
[Figure: two panels. Axes: Binary Occupancy, Number of Occupants, Test Cases, Sensor Number; legends: Prediction, Ground Truth; Acc. Estimated Count, Acc. Ground Truth, Acoustic Estimated Count, Acoustic Ground Truth, Combined Count, Combined Ground Truth]
The left figure shows binary occupancy counting
The right figure presents locomotive-augmented acoustic occupancy counting
Example: 6 people converse and 4 remain silent. Acoustic sensing estimates 5 and locomotive sensing estimates 4, so the total estimated occupancy is 9 out of 10 people