Privacy-Preserving Inference in Crowdsourcing Systems
Liyao Xiang (Supervisor: Baochun Li)
University of Toronto, Oct. 9, 2017
Localization via Crowdsourcing

[Figure: three users A, B, C with pairwise distance observations d_AB, d_AC, d_BC; the positions of A and B are unknown.]

‣ In a crowd, some users know their locations while some don't. With distance observations between them, how do we localize each user?
Localization via Crowdsourcing

[Figure: at time t, user i uploads its prior estimate Z_{i,t} and distance observations D_ij, and user j uploads Z_{j,t} and D_ji; the server runs the inference algorithm and returns Z*_{i,t} and Z*_{j,t}.]

‣ Each user sends their prior estimate and distance observations to a central server, which returns the most likely position for each.
‣ What if users would like to keep their locations private?
Privacy-Preserving Localization

[Figure: three users A, B, C with pairwise distance observations d_AB, d_AC, d_BC; the positions of A and B are unknown.]

‣ In a crowd, some users know their locations while some don't. With distance observations between them, how do we localize each user without breaching privacy?
Particle Representation

‣ A user's location is represented by a set of particles Z_{i,t} = {z_1, …, z_R}, with Z_t = {Z_{1,t}, …, Z_{N,t}}.
‣ At time t, the server finds the most likely distribution of Z_t given Z_{t-1} and D:

Z^*_t = \arg\max_{Z_t} P(Z_t \mid Z_{t-1}, D).
First Attempt

‣ Encrypt all particles and run the inference in the encrypted domain. However, operations on encrypted data are constrained.
Particle Representation

‣ A user's location is represented by a set of particles Z_{i,t} = {z_1, …, z_R}. Each particle is associated with a weight, {w_1, …, w_R}.
‣ For example, if the location estimate is {z_1, z_2, z_3} with probabilities {0.6, 0.2, 0.2}, then the location is more likely to be z_1 than z_3.
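The weighted-particle representation above can be sketched in a few lines. This is a minimal illustration with hypothetical values, not the system's actual data layout: the particle coordinates and weights below are made up for the example.

```python
import numpy as np

# Hypothetical particle set for one user: R = 3 candidate 2-D positions,
# each carrying a weight (weights sum to 1), as in the {0.6, 0.2, 0.2} example.
particles = np.array([[1.0, 2.0],   # z_1
                      [4.0, 0.5],   # z_2
                      [3.0, 3.0]])  # z_3
weights = np.array([0.6, 0.2, 0.2])  # w_1, w_2, w_3

# A point estimate of the location: the highest-weight particle,
# or alternatively the weighted mean of all particles.
best = particles[np.argmax(weights)]
mean = weights @ particles
print(best)  # [1. 2.]  (z_1, the most likely particle)
print(mean)  # [2.  1.9]
```

Either summary can serve as the "most likely position" returned to the user; the inference below works on the full weighted set rather than a single point.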
Particle Representation

‣ Users upload each particle's weight {E(w_1), …, E(w_R)} and distance observations to others E(D) in encrypted form.
‣ The server updates each particle's weight.
Privacy-Preserving Inference

‣ The server computes partial information c_{i,r} for each particle r of each user i (j is observed by i):

c_{i,r} = \prod_{j \in N(i)} \prod_{s \in \{1,\dots,R\}} E_{pk}(\ln w_{j,s}) \cdot E_{pk}\big(d(z_{i,r}, z_{j,s})^2\big)^{-\frac{1}{2\sigma^2}} \cdot E_{pk}(D_{ij}^2)^{-\frac{1}{2\sigma^2}} \cdot E_{pk}(D_{ij})^{\frac{d(z_{i,r}, z_{j,s})}{\sigma^2}}
= E_{pk}\Big[\sum_{j \in N(i)} \sum_{s \in \{1,\dots,R\}} \big(\ln w_{j,s} - (d(z_{i,r}, z_{j,s}) - D_{ij})^2 / 2\sigma^2\big)\Big].
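The two sides of this equation agree because, in an additively homomorphic scheme, multiplying ciphertexts adds their plaintexts and raising a ciphertext to a scalar multiplies its plaintext; the exponents on the left are exactly the terms of the expanded square on the right. A plaintext-only numeric sketch of that identity, with made-up weights and distances (no actual encryption):

```python
import numpy as np

rng = np.random.default_rng(3)
sigma, R = 0.5, 5
D_ij = 2.0                            # hypothetical distance observation
w = rng.dirichlet(np.ones(R))         # neighbor j's particle weights
d = rng.uniform(0.5, 3.5, size=R)     # d(z_{i,r}, z_{j,s}) for s = 1..R

# Left-hand side, term by term as the plaintext exponents of the
# ciphertext product: ln w - d^2/(2s^2) - D^2/(2s^2) + d*D/s^2
lhs = np.sum(np.log(w)
             - d ** 2 / (2 * sigma ** 2)
             - D_ij ** 2 / (2 * sigma ** 2)
             + d * D_ij / sigma ** 2)

# Right-hand side: the collapsed form ln w - (d - D)^2 / (2 sigma^2)
rhs = np.sum(np.log(w) - (d - D_ij) ** 2 / (2 * sigma ** 2))
```

Since the two sums are algebraically equal, the server can evaluate the right-hand expression entirely through ciphertext products and scalar exponentiations.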
Privacy-Preserving Inference

‣ With the secret key sk, user i updates the weight w_{i,r} for its particle r (d_{js} is the calculated distance between particle s of user j and particle r of user i):

w^k_{i,r} = w^{k-1}_{i,r} \exp[E_{sk}(c_{i,r})]
= w^{k-1}_{i,r} \exp\Big[\sum_{j \in N(i)} \sum_{s \in \{1,\dots,R\}} \big(\ln w_{j,s} - (d_{js} - D_{ij})^2 / 2\sigma^2\big)\Big]
= w^{k-1}_{i,r} \prod_{j \in N(i)} \prod_{s \in \{1,\dots,R\}} \exp\big(\ln w_{j,s} - (d_{js} - D_{ij})^2 / 2\sigma^2\big)
= w^{k-1}_{i,r} \prod_{j \in N(i)} \prod_{s \in \{1,\dots,R\}} w_{j,s} \cdot \exp\Big(\frac{-(d_{js} - D_{ij})^2}{2\sigma^2}\Big)
\simeq w^{k-1}_{i,r} \prod_{j \in N(i)} \prod_{s \in \{1,\dots,R\}} \Pr(z_{i,r}, z_{j,s} \mid D_{ij,t}).
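The update on the user side is a single multiplicative step: decrypt c_{i,r} and multiply the previous weight by its exponential. A minimal sketch with one hypothetical neighbor j and made-up distances (the decryption itself is elided; c_{i,r} is computed directly in plaintext):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, R = 0.5, 4
D_ij = 2.0                              # observed distance to neighbor j
w_j = rng.dirichlet(np.ones(R))         # neighbor j's particle weights
d_js = rng.uniform(1.0, 3.0, size=R)    # distances from particle r of i to particles of j

# Decrypted partial information c_{i,r}:
# sum over s of ln w_{j,s} - (d_{js} - D_ij)^2 / (2 sigma^2)
c_ir = np.sum(np.log(w_j) - (d_js - D_ij) ** 2 / (2 * sigma ** 2))

w_prev = 0.25                           # w^{k-1}_{i,r}
w_new = w_prev * np.exp(c_ir)           # multiplicative weight update

# Equivalent product form from the derivation:
# w^{k-1} * prod_s [ w_{j,s} * exp(-(d_{js} - D_ij)^2 / (2 sigma^2)) ]
product = np.prod(w_j * np.exp(-(d_js - D_ij) ** 2 / (2 * sigma ** 2)))
```

In practice the new weights would then be renormalized across the R particles before the next iteration.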
Privacy-Preserving Localization with Crowdsourcing

[Figure: protocol timeline for user i.]

‣ At time t: with prior Z_{i,t}, user i uploads Z_{i,t}, E(w), and E(D); the server runs the inference; user i downloads c_{i,t}, decrypts it, and updates the prior with Z*_{i,t}.
‣ At time t+1: with prior Z_{i,t+1}, user i uploads Z_{i,t+1}, E(w), and E(D), and downloads c_{i,t+1}.
But with R particles, an adversary can still guess the correct location with probability 1/R.
Data Perturbation

‣ Idea: perturb Z_{i,t} = {z_1, …, z_R} into Y_{i,t} = {y_1, …, y_R}.
‣ Perturbation: add Gaussian noise N(0, σ²) to Z_{i,t} so that the result satisfies location differential privacy.
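The perturbation step itself is just i.i.d. Gaussian noise on every particle coordinate. A minimal sketch, with a made-up particle set and an illustrative σ:

```python
import numpy as np

rng = np.random.default_rng(42)
sigma, R = 0.5, 100

# Hypothetical particle set Z_{i,t}: R two-dimensional positions.
Z = rng.uniform(0.0, 10.0, size=(R, 2))

# Report Y_{i,t}: each coordinate perturbed with independent
# Gaussian noise N(0, sigma^2).
Y = Z + rng.normal(0.0, sigma, size=Z.shape)
```

Only Y leaves the device; the privacy level achieved depends on σ through the bounds on the next slides.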
Privacy Definition

‣ Location Differential Privacy: A mechanism M satisfies (ε, δ)-differential privacy iff for all z, z' that are d(z, z') apart:

\Pr[M(z) \in Y] \le e^{\epsilon} \Pr[M(z') \in Y] + \delta,

and \epsilon = \rho\, d^2(z, z') + 2\sqrt{\rho \log(1/\delta)}\, d(z, z'), where ρ is a constant specific to the perturbation mechanism we adopt.
Interpretation of Privacy Definition

[Figure: nearby locations z, z', z'' and their projected distributions M(z)(Y), M(z')(Y), M(z'')(Y); the distances d(z, z') < d(z, z'') give ε₁ < ε₂.]

‣ Location Differential Privacy: the projected distributions of all the points within the same dotted circle are at most ε apart from each other.
‣ The smaller the distance between two locations, the smaller ε is, indicating that it is harder to distinguish the two locations, i.e., a higher privacy level.
Privacy Definition

‣ User Differential Privacy: If we report Z = (z_1, …, z_R) as Y = (y_1, …, y_R), then the probability of reporting Y given Z is:

\Pr[M(Z) \in Y] = \prod_i \Pr[M(z_i) \in Y].

The user enjoys (ε', δ)-differential privacy with

\epsilon' = \rho R\, d^2(Z, Z') + 2\sqrt{\rho \log(1/\delta)\, R\, d^2(Z, Z')}.
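Both bounds are closed-form, so the privacy level for a chosen mechanism is directly computable. A small sketch of the two formulas; the function names and the numeric values of ρ, δ, and the distances are illustrative, not from the talk:

```python
import math

def location_eps(dist, rho, delta):
    # eps = rho * d^2(z, z') + 2 * sqrt(rho * log(1/delta)) * d(z, z')
    return rho * dist ** 2 + 2.0 * math.sqrt(rho * math.log(1.0 / delta)) * dist

def user_eps(d2, R, rho, delta):
    # eps' = rho * R * d^2(Z, Z') + 2 * sqrt(rho * log(1/delta) * R * d^2(Z, Z'))
    return rho * R * d2 + 2.0 * math.sqrt(rho * math.log(1.0 / delta) * R * d2)

# Illustrative numbers: closer locations get a smaller eps (higher privacy),
# and with R = 1 the user-level bound reduces to the location-level bound.
rho, delta = 0.5, 1e-4
e_near = location_eps(0.5, rho, delta)
e_far = location_eps(1.0, rho, delta)
```

As a sanity check, `user_eps(d**2, 1, rho, delta)` coincides with `location_eps(d, rho, delta)`, since the R-particle bound is the single-particle bound with R = 1.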
Perturbed Private Inference

‣ Collecting Y, the server computes the pairwise distance between each pair of perturbed particles as:

\tilde{d}(y, y') = \sqrt{\|y - y'\|_2^2 - 4\sigma^2}.
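The −4σ² correction removes the noise inflation: in 2-D, each of the two coordinate differences of y − y' carries variance 2σ², so E‖y − y'‖² = d²(z, z') + 4σ². A Monte Carlo sketch of this correction, with made-up true positions 5 m apart:

```python
import numpy as np

rng = np.random.default_rng(7)
sigma = 0.5
z1 = np.array([0.0, 0.0])
z2 = np.array([3.0, 4.0])   # true distance d(z1, z2) = 5, so d^2 = 25

# Perturb both points many times and average the corrected squared distance.
n = 200_000
y1 = z1 + rng.normal(0.0, sigma, size=(n, 2))
y2 = z2 + rng.normal(0.0, sigma, size=(n, 2))
d2_tilde = np.sum((y1 - y2) ** 2, axis=1) - 4 * sigma ** 2
print(d2_tilde.mean())  # close to 25
```

Without the −4σ² term the average would settle near 26 here (25 + 4σ² with σ = 0.5), biasing the inference toward longer distances.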
How can we guarantee that the inference result is the same as in the unperturbed case?
Privacy and Utility Analysis

‣ Utility result: we proved that \tilde{d}(y, y') is an unbiased estimator of d(z, z').
‣ Privacy guarantee: we proved that our perturbation scheme satisfies both location differential privacy and user differential privacy. Compared to previous work, we improve the privacy level by √R at the same utility level.
Performance Evaluation

‣ Overhead.

[Figure: average running time of the MAP inference (ms) vs. number of users (10 to 50), for R = 50, 75, 100; convergence of the particle distribution, showing the highest particle weight over 5 iterations × 15 timeslots.]
Performance Evaluation

‣ Simulation results using the random waypoint (RWP) model.

[Figure: CDFs of the position error (m) of 20 users, varying R ∈ {50, 75, 100, 125, 150} at σ = 0.5, and varying σ ∈ {0.2, 0.7, 1.0, 1.5, 2.3} at R = 100, compared against the unperturbed case.]
Performance Evaluation

‣ Comparison experiment and real-world experimental results.

[Figure: CDF of position error (m) on the RWP model, comparing Hilbert Curves (n = 64, 512) against private inference (σ = 5.0, 10.0); average position error (m) of 7 users in different settings, for σ = 0.2 (ε = 23.23), σ = 0.7 (ε = 4.09), σ = 1.0 (ε = 2.65), and σ = 2.3 (ε = 1.03), vs. unperturbed.]
Thank you!