PINFER: PRIVACY-PRESERVING INFERENCE
Marc Joye, Fabien Petitcolas
OneSpan Innovation Centre
DPM 2019, Luxembourg, Sep. 26, 2019
MACHINE LEARNING AS A SERVICE — GENERIC MODEL
[Diagram: the client sends its query to the cloud (server) and receives the result, with a single exchange of messages.]
REQUIREMENTS AND SOLUTIONS
Security requirements
• The server learns nothing about the client's input
• The server does not learn the output of the calculation
• The client learns nothing about the ML model
Proposed solutions: private evaluation for
1 Linear regression
2 Logistic regression
3 Binary classification
  • Support Vector Machines (SVM)
  • Requires a private comparison protocol (e.g., DGK+)
4 Neural networks
  • Sign or ReLU activation functions
  • 1 interaction per layer
LINEAR PREDICTION MODEL
• Input
  1 Server's ML model: θ = (θ_0, …, θ_d) ∈ ℝ^(d+1)
  2 User's feature vector: x = (1, x_1, …, x_d) ∈ {1} × ℝ^d
• Output
  h_θ(x) = g(θ⊺x) in many cases
LINEAR PREDICTION MODEL — EVALUATION FUNCTION g
• Linear regression [real-valued output]: g = Id
• Logistic regression [probability]: g = σ, where σ(s) = exp(s) / (1 + exp(s))
• Linear classification [binary decision]: g = sign
• Rectified linear unit (ReLU) [neural networks]: g(s) = 0 if s < 0, and s otherwise
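For reference, a minimal plain-Python sketch (names illustrative, not taken from the PINFER implementation) of these evaluation functions and of the non-private prediction h_θ(x) = g(θ⊺x):

```python
import math

# Plain (non-private) evaluation functions g and the prediction h_theta(x) = g(theta^T x).

def identity(s):          # linear regression
    return s

def sigmoid(s):           # logistic regression
    return math.exp(s) / (1.0 + math.exp(s))

def sign(s):              # linear (SVM-style) classification
    return 1 if s >= 0 else -1

def relu(s):              # neural networks
    return 0.0 if s < 0 else s

def predict(theta, x, g):
    """theta = (theta_0, ..., theta_d); x = (1, x_1, ..., x_d)."""
    s = sum(t * xi for t, xi in zip(theta, x))   # inner product theta^T x
    return g(s)

# Example: logistic regression with d = 2 features
theta = [0.5, -1.2, 2.0]
x = [1.0, 0.3, 0.7]       # leading 1 matches the bias term theta_0
print(predict(theta, x, sigmoid))
```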
LINEAR PREDICTION MODEL WITH ENCRYPTION
Model evaluation: ŷ = g(θ⊺x). The client (holding x and the key pair (pk, sk)) interacts with the server (holding θ):
❶ Client computes ⟦x⟧ and sends ⟦x_1⟧, …, ⟦x_d⟧ together with pk
❷ Server computes ⟦g(θ⊺x)⟧ and returns it
❸ Client decrypts ⟦g(θ⊺x)⟧ and sets ŷ = g(θ⊺x)
LINEARLY HOMOMORPHIC ENCRYPTION
• We only require linearly homomorphic encryption:
  Enc_pk(m_1) ⊞ Enc_pk(m_2) = Enc_pk(m_1 + m_2)
• NOT fully homomorphic encryption:
  Enc_pk(m_1) ⊞ Enc_pk(m_2) = Enc_pk(m_1 + m_2)
  Enc_pk(m_1) ⊡ Enc_pk(m_2) = Enc_pk(m_1 · m_2)
• Benefits
  • Simpler implementation
  • Faster computation
PRIVATE INNER PRODUCT
• Since ⟦·⟧ is homomorphic,
  ⟦θ⊺x⟧ = ⟦θ_0 + Σ_{i=1}^{d} θ_i x_i⟧ = ⟦θ_0⟧ ⊞ ⟦θ_1 x_1⟧ ⊞ ⋯ ⊞ ⟦θ_d x_d⟧
  and, for 1 ≤ i ≤ d,
  ⟦θ_i x_i⟧ = ⟦x_i⟧ ⊞ ⋯ ⊞ ⟦x_i⟧ (θ_i times) := θ_i ⊙ ⟦x_i⟧
• Example (Paillier's cryptosystem)
  • ⟦m⟧ = (1 + N)^m · r^N mod N²
  • ⟦m_1 + m_2⟧ = ⟦m_1⟧ · ⟦m_2⟧ mod N²
  • ⟦m_1 − m_2⟧ = ⟦m_1⟧ / ⟦m_2⟧ mod N²
  • a ⊙ ⟦m⟧ = ⟦m⟧^a mod N²
  ⟹ ⟦θ⊺x⟧ requires d exponentiations modulo N²
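To make the Paillier example concrete, here is a toy sketch (Python 3.9+, tiny primes, no security, and without the fixed-point encoding of real-valued features used in practice) of how the server can assemble ⟦θ⊺x⟧ from the client's encrypted features; all names are illustrative, not the PINFER code.

```python
import math, random

# Toy Paillier cryptosystem for illustration only (NOT secure).

def keygen(p=2011, q=2383):
    N = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, N)                 # valid because we use g = 1 + N
    return N, (N, lam, mu)               # pk = N, sk = (N, lam, mu)

def enc(N, m):
    N2 = N * N
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return pow(1 + N, m % N, N2) * pow(r, N, N2) % N2   # [[m]] = (1+N)^m r^N mod N^2

def dec(sk, c):
    N, lam, mu = sk
    return (pow(c, lam, N * N) - 1) // N * mu % N

def add(N, c1, c2):                      # [[m1]] ⊞ [[m2]] = [[m1 + m2]]
    return c1 * c2 % (N * N)

def scal(N, a, c):                       # a ⊙ [[m]] = [[a·m]] (one exponentiation mod N^2)
    return pow(c, a, N * N)

# Client: encrypts the (integer-encoded) feature vector and sends it with pk.
pk, sk = keygen()
x = [4, 7]                               # x_1, x_2
cx = [enc(pk, xi) for xi in x]

# Server: [[theta^T x]] = [[theta_0]] ⊞ (theta_1 ⊙ [[x_1]]) ⊞ (theta_2 ⊙ [[x_2]]).
theta = [3, 2, 5]                        # theta_0 (bias), theta_1, theta_2
c = enc(pk, theta[0])
for t_i, c_i in zip(theta[1:], cx):
    c = add(pk, c, scal(pk, t_i, c_i))

# Client: decrypts the result.
assert dec(sk, c) == theta[0] + 2 * 4 + 5 * 7    # = 46
print(dec(sk, c))
```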
IF EVALUATION FUNCTION g IS NON-LINEAR
• g is non-linear but injective (e.g., σ)
  • Server computes ⟦θ⊺x⟧
  • Client obtains θ⊺x, simply applies g, and learns no more (by definition: g(a) = g(b) ⟹ a = b)
• g is non-linear and non-injective (e.g., sign, ReLU)
  • Use a set of tools and tricks:
    • DGK+ comparison protocol
    • Simple masking with a random value
    • Masking and scaling of the inner product
    • Variant of oblivious transfer (two possible ciphers sent)
  • Dual setup: the server publishes pk_S and ⦃θ⦄_S
  • Still one round of messages!
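As a toy, plaintext-only illustration of the "masking and scaling of the inner product" trick listed above (a schematic of one building block, not the actual PINFER or DGK+ protocol): scaling by a random positive factor hides the magnitude of θ⊺x while preserving its sign.

```python
import random

# Multiplicative masking/scaling: a random positive factor r hides |s| = |theta^T x|
# but preserves sign(s), so a party seeing only r*s can still evaluate g = sign.
# In the slides this trick appears alongside encryption and the DGK+ comparison
# protocol; this sketch omits all of that.

def mask(s, bound=2**40):
    r = random.randrange(1, bound)       # random positive scaling factor
    return r * s

def sign(s):
    return 1 if s >= 0 else -1

s = -1234                                # the secret inner product
assert sign(mask(s)) == sign(s)
print(sign(mask(s)))
```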
NEURAL NETWORKS
[Diagram of one neuron j in layer l: inputs x^(l−1)_1, …, x^(l−1)_{d_{l−1}} are weighted by θ^(l)_{j,1}, …, θ^(l)_{j,d_{l−1}}, summed (Σ) together with the bias θ^(l)_{j,0}, and passed through the activation function g^(l) to give the output x^(l)_j.]
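In plain (non-private) form, the neuron in the diagram computes x^(l)_j = g(θ^(l)_{j,0} + Σ_i θ^(l)_{j,i} x^(l−1)_i). A minimal sketch (illustrative names; sign activation, as in the FFNN evaluated later):

```python
# Minimal non-private feed-forward network matching the neuron diagram:
# x_j^(l) = g( theta_{j,0}^(l) + sum_i theta_{j,i}^(l) * x_i^(l-1) ).

def sign(s):
    return 1 if s >= 0 else -1

def layer(theta, x_prev, g):
    """theta[j] = [bias theta_{j,0}, weights theta_{j,1..d}]; x_prev = previous layer's outputs."""
    return [g(row[0] + sum(w * xi for w, xi in zip(row[1:], x_prev))) for row in theta]

def ffnn(layers, x, g):
    for theta in layers:
        x = layer(theta, x, g)
    return x

# Example: 2 inputs -> 2 hidden units -> 1 output, sign activation
layers = [
    [[0.5, 1.0, -1.0], [-0.2, 0.3, 0.8]],   # hidden layer: 2 neurons, each [bias, w1, w2]
    [[0.1, 0.7, -0.4]],                      # output layer: 1 neuron
]
print(ffnn(layers, [0.6, -0.9], sign))
```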
NUMERICAL EXPERIMENTS
• Implementation (not much optimised)
  • Python
  • Intel i7-4770, 3.4 GHz
  • GMP library (for the modular exponentiations)
  • Fixed precision (53 bits)
• Parameters
  • Public datasets and randomly generated ones
  • Models with 30 to 7994 features
  • Key sizes: 1388 to 2440 bits
• Message overhead proportional to:
  • Key size
  • Number of features (or number of bits in DGK+)
  • Number of layers (FFNN)
MESSAGE OVERHEAD (kB)¹
• Linear regression (core)
  • Client sends pk_C, ⟦x_i⟧ for 1 ≤ i ≤ d: ℓ_M + d·2ℓ_M bits (≈ 15 kB)
  • Server sends t: ≈ 2ℓ_M bits (< 1 kB)
• SVM classification (core)
  • Client sends t*, ⟦μ_i⟧ for 0 ≤ i ≤ ℓ−1: 2ℓ_M + ℓ·2ℓ_M bits (≈ 29 kB)
  • Server sends ⟦h*_i⟧ for −1 ≤ i ≤ ℓ−1: (ℓ+1)·2ℓ_M bits (≈ 30 kB)
• FFNN, sign activation (core)
  • Server sends t*, ⦃μ_i⦄_S for 0 ≤ i ≤ ℓ−1: L·d·(ℓ+1)·2ℓ_M bits (≈ 2,655 kB, i.e. 885 per layer)
  • Client sends ⟦ŷ*⟧, ⦃h*_i⦄_S for −1 ≤ i ≤ ℓ−1: L·d·(ℓ+2)·2ℓ_M bits (≈ 2,700 kB, i.e. 900 per layer)
¹ Features: d = 30; key size ℓ_M = 2048; κ = 95; layers L = 3; precision P = 53; inner-product bound ℓ = 58
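A quick sanity check of the table: plugging the footnote parameters into the size formulas (in bits, with each ciphertext taking 2ℓ_M bits) and converting to kB reproduces the quoted values.

```python
# Reproduce the message-overhead figures from the formulas and footnote parameters.
d, l_M, L, l = 30, 2048, 3, 58          # features, key size (bits), layers, inner-product bound

def kB(bits):
    return bits / 8 / 1024

print(kB(l_M + d * 2 * l_M))            # linear regression, client  -> ~15.25 kB
print(kB(2 * l_M))                      # linear regression, server  -> ~0.5 kB
print(kB(2 * l_M + l * 2 * l_M))        # SVM, client                -> ~29.5 kB
print(kB((l + 1) * 2 * l_M))            # SVM, server                -> ~29.5 kB
print(kB(L * d * (l + 1) * 2 * l_M))    # FFNN, server               -> ~2655 kB (885 per layer)
print(kB(L * d * (l + 2) * 2 * l_M))    # FFNN, client               -> ~2700 kB (900 per layer)
```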
RESULTS: LINEAR REGRESSION
[Plots: private linear regression (core protocol), average computing time (ms) over 1000 trials for client and server vs. the length of the modulus N (1388 to 2440 bits), on an Intel i7-4770, 3.4 GHz. Left: audiology dataset, 70 features (y-axis up to 16 ms). Right: enron dataset, 7994 features (y-axis up to 600 ms).]
RESULTS: SUPPORT VECTOR MACHINE CLASSIFICATION
[Plots: private SVM classification (core protocol), average computing time (ms) over 100 trials for client and server vs. the length of the modulus N (1388 to 2440 bits), on an Intel i7-4770, 3.4 GHz. Left: audiology dataset, 70 features. Right: enron dataset, 7994 features. Y-axes up to about 1750 ms.]
The DGK+ comparison is the main limiting factor.
RESULTS: NEURAL NETWORKS
[Plots: simple FFNN with sign activation, with and without a heuristic solution; random dataset, 10 features, 3 layers. Average computing time (ms) over 100 trials for client and server vs. the length of the modulus N (1388 to 2440 bits), on an Intel i7-4770, 3.4 GHz. Y-axes up to about 500 ms and about 50,000 ms for the two variants.]
The DGK+ comparison is the main limiting factor.
COMMENTS/QUESTIONS?