Private Stochastic Convex Optimization with Optimal Rates

Private Stochastic Convex Optimization with Optimal Rates - PowerPoint PPT Presentation

  1. Private Stochastic Convex Optimization with Optimal Rates. Raef Bassily (Ohio State University), Vitaly Feldman (Google Brain), Kunal Talwar (Google Brain), Abhradeep Guha Thakurta (UC Santa Cruz)

  2. This work Differentially private (DP) algorithms for stochastic convex optimization with optimal excess population risk

  4. Stochastic Convex Optimization (SCO). Unknown distribution (population) 𝒟 over a data universe 𝒵. Convex parameter space 𝒲 ⊂ ℝ^d; ℓ₂/ℓ₂ setting: 𝒲 and ∇ℓ are bounded in ℓ₂-norm. Convex loss function ℓ: 𝒲 × 𝒵 → ℝ. Dataset S = (z₁, …, z_n) ∼ 𝒟ⁿ. An SCO algorithm, given S, outputs ŵ ∈ 𝒲 such that Excess Pop. Risk ≜ 𝔼_{z∼𝒟}[ℓ(ŵ, z)] − min_{w∈𝒲} 𝔼_{z∼𝒟}[ℓ(w, z)] is as small as possible. Well-studied problem: optimal rate ≈ 1/√n.
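To make the definitions concrete, here is a minimal numerical illustration (not from the slides; the loss, population, and seed are our assumptions): squared loss ℓ(w, z) = (w − z)² with population 𝒟 = N(0, 1), where the population risk w² + 1 is minimized at w* = 0 and the empirical risk minimizer is the sample mean.

```python
import numpy as np

# Hypothetical SCO instance: l(w, z) = (w - z)^2, population D = N(0, 1).
# Population risk F(w) = E[(w - z)^2] = w^2 + 1, minimized at w* = 0.
rng = np.random.default_rng(0)

n = 10_000
sample = rng.normal(size=n)   # dataset S = (z_1, ..., z_n) ~ D^n
w_hat = sample.mean()         # empirical risk minimizer for squared loss
excess_risk = w_hat ** 2      # F(w_hat) - F(w*), available in closed form here
```

For this strongly convex loss the excess risk concentrates around 1/n; the 1/√n rate quoted on the slide is the optimal rate for general Lipschitz convex losses.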

  5. Private Stochastic Convex Optimization (PSCO). Same setting: unknown distribution (population) 𝒟 over a data universe 𝒵; convex parameter space 𝒲 ⊂ ℝ^d; ℓ₂/ℓ₂ setting: 𝒲 and ∇ℓ bounded in ℓ₂-norm; convex loss function ℓ: 𝒲 × 𝒵 → ℝ; dataset S = (z₁, …, z_n) ∼ 𝒟ⁿ. Goal: an (ε, δ)-DP algorithm 𝒜_priv that, given S, outputs ŵ ∈ 𝒲 such that Excess Pop. Risk ≜ 𝔼_{z∼𝒟}[ℓ(ŵ, z)] − min_{w∈𝒲} 𝔼_{z∼𝒟}[ℓ(w, z)] is as small as possible.
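As background not stated on the slide: (ε, δ)-DP guarantees are typically obtained by adding noise calibrated to the ℓ₂-sensitivity of the released quantity. A minimal sketch of the standard Gaussian mechanism applied to a clipped mean; the function name and clipping bound are illustrative choices of ours.

```python
import numpy as np

def private_mean(data, eps, delta, clip=1.0, seed=None):
    """(eps, delta)-DP estimate of the mean of vectors clipped to L2 norm <= clip.

    Standard Gaussian mechanism (valid for eps < 1):
    sigma = sensitivity * sqrt(2 ln(1.25/delta)) / eps.
    """
    rng = np.random.default_rng(seed)
    n, d = data.shape
    norms = np.linalg.norm(data, axis=1, keepdims=True)
    clipped = data * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
    sensitivity = 2.0 * clip / n  # replacing one record moves the mean by <= 2*clip/n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return clipped.mean(axis=0) + rng.normal(0.0, sigma, size=d)

# Demo on synthetic zero-mean data (hypothetical): output stays close to the true mean
demo = private_mean(0.1 * np.random.default_rng(0).normal(size=(100_000, 5)),
                    eps=0.5, delta=1e-6, seed=1)
```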

  6. Main Result. Optimal excess population risk for PSCO is ≈ max(1/√n, √d/(εn)). The first term, 1/√n, is the optimal non-private population risk; the second, √d/(εn), is the optimal private empirical risk [BST14].

  7. Main Result. Optimal excess population risk for PSCO is ≈ max(1/√n, √d/(εn)). When d = Θ(n) (common in modern ML): opt. risk for PSCO ≈ 1/√n = opt. risk for SCO, i.e., asymptotically no cost of privacy.
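A back-of-the-envelope check of the two terms in the paper's rate max(1/√n, √d/(εn)); constants and log factors are suppressed, as on the slide, and the function name is ours:

```python
import math

def psco_rate(n, d, eps):
    """Order-level optimal PSCO risk: max(1/sqrt(n), sqrt(d)/(eps*n))."""
    nonpriv = 1.0 / math.sqrt(n)   # optimal non-private population risk
    priv = math.sqrt(d) / (eps * n)  # optimal private empirical risk
    return max(nonpriv, priv)

# With d = Theta(n) and constant eps, the two terms coincide in order:
rate_hi_dim = psco_rate(n=1_000_000, d=1_000_000, eps=1.0)  # = 1/sqrt(n) = 1e-3
# With d << eps^2 * n, the privacy term (here 1e-5) is dominated outright:
rate_lo_dim = psco_rate(n=1_000_000, d=100, eps=1.0)        # = 1e-3 as well
```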

  10. Algorithms. Two algorithms under a mild smoothness assumption on ℓ:
      • A variant of mini-batch noisy SGD.
      • Objective perturbation (entails a rank assumption on ∇²ℓ).
      The objective function in both algorithms is the empirical risk. Generalization error is bounded via uniform stability:
      o For the first algorithm: uniform stability of SGD [HRS15, FV19].
      o For the second algorithm: uniform stability due to regularization.
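A schematic NumPy sketch of the mini-batch noisy SGD template (clip per-example gradients, average over a batch, add Gaussian noise, project, return the average iterate). This is an illustration under assumed hyperparameters, not the paper's algorithm: the paper's specific step sizes, batch sizes, and noise calibration are what achieve the optimal rate, and `sigma` below is not calibrated to any (ε, δ).

```python
import numpy as np

def noisy_minibatch_sgd(grad_fn, data, w0, steps, batch, lr, clip, sigma, radius, seed=0):
    """Schematic DP-SGD loop; `sigma` must be calibrated externally for privacy."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    iterates = []
    for _ in range(steps):
        idx = rng.choice(len(data), size=batch, replace=False)
        grads = np.stack([grad_fn(w, data[i]) for i in idx])
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads *= np.minimum(1.0, clip / np.maximum(norms, 1e-12))  # per-example clipping
        g = grads.mean(axis=0) + rng.normal(0.0, sigma, size=w.shape)
        w = w - lr * g
        nw = np.linalg.norm(w)
        if nw > radius:                    # projection onto an l2 ball, standing in for W
            w *= radius / nw
        iterates.append(w.copy())
    return np.mean(iterates, axis=0)       # average iterate

# Hypothetical demo: squared loss l(w, z) = 0.5 * ||w - z||^2, so grad = w - z;
# the population minimizer is the data mean (0.3 per coordinate).
rng = np.random.default_rng(1)
data = rng.normal(loc=0.3, scale=0.5, size=(5000, 3))
w_hat = noisy_minibatch_sgd(lambda w, z: w - z, data, w0=np.zeros(3),
                            steps=200, batch=100, lr=0.1, clip=1.0,
                            sigma=0.01, radius=1.0)
```

Returning the average iterate rather than the last one is what lets the uniform-stability argument on the slide control the generalization error.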

  11. Algorithms. General non-smooth loss:
      • A new, efficient, noisy stochastic proximal gradient algorithm:
      o Based on Moreau-Yosida smoothing.
      o A gradient step w.r.t. the smoothed loss is equivalent to a proximal step w.r.t. the original loss.
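The equivalence in the last bullet can be checked directly: for the Moreau envelope M_β(v) = min_w f(w) + (1/2β)||w − v||², one has ∇M_β(v) = (v − prox_{βf}(v))/β, so a gradient step of size β on the smoothed loss lands exactly on prox_{βf}(v). A one-dimensional check with f(w) = |w| (our choice of example; its prox is soft-thresholding):

```python
import numpy as np

def prox_abs(v, beta):
    # Proximal operator of f(w) = |w|: soft-thresholding at level beta
    return np.sign(v) * np.maximum(np.abs(v) - beta, 0.0)

def moreau_grad(v, beta):
    # Gradient of the Moreau envelope M_beta(v) = min_w |w| + (1/(2*beta))*(w - v)^2
    return (v - prox_abs(v, beta)) / beta

beta, v = 0.5, 2.0
grad_step = v - beta * moreau_grad(v, beta)  # gradient step on the smoothed loss
prox_step = prox_abs(v, beta)                # proximal step on the original loss
# both updates land on 1.5: the two steps coincide
```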

  12. Results vs. Prior Work on DP-ERM.
      This work: optimal excess population risk for PSCO is ≈ max(1/√n, √d/(εn)).
      Previous work focused on the empirical version (DP-ERM) [CMS11, KST12, BST14, TTZ15, …]. The optimal empirical risk was previously known [BST14], but not the optimal population risk. The best known population risk via DP-ERM algorithms was ≈ max(d^(1/4)/√n, √d/(εn)) [BST14].

  13. Poster #163 Full version: arXiv:1908.09970
