Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning
Milad Nasr (1), Reza Shokri (2), Amir Houmansadr (1)
(1) University of Massachusetts Amherst, (2) National University of Singapore
Deep Learning Tasks
• Personal history
• Medical
• Location
• Financial
Privacy Threats
• We provide a comprehensive privacy analysis of deep learning algorithms.
• Our objective is to measure the information leakage of deep learning models about their training data.
• In particular, we focus on membership inference attacks:
• Can an adversary infer whether or not a particular data record was part of the training set?
Membership Inference
[Diagram: a trained model is queried with member (training) data and with non-member data drawn from the same distribution; the output vectors show different patterns for members and non-members.]
What is the cause of this behavior?
Training a Model
SGD update: W ← W − α ∇_W L(x)
Model parameters change in the opposite direction of each training data point's loss gradient.
[Plot: training data points and the function fitted by SGD.]
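To make the update rule concrete, here is a minimal sketch of one SGD step in PyTorch; the model, loss function, and learning rate `lr` are illustrative placeholders, not the exact setup from the paper.

```python
import torch

def sgd_step(model, loss_fn, x, y, lr=0.01):
    """One plain SGD step: move parameters against the loss gradient."""
    loss = loss_fn(model(x), y)
    model.zero_grad()
    loss.backward()                      # compute dL/dW for every parameter
    with torch.no_grad():
        for w in model.parameters():
            w -= lr * w.grad             # W <- W - lr * grad_W L(x)
    return loss.item()
```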
Training a Model
[Plot: the training data points and the function fitted by SGD.]
Training a Model
Gradients leak information by behaving differently for non-member data vs. member data.
[Plot: the fitted function with the training (member) points and a non-member point.]
Gradients Leak Information
Gradient Norm Distribution: the member and non-member distributions are separable.
[Histogram: distribution of gradient norms (0-500) for members vs. non-members.]
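A minimal sketch of how such a per-record gradient-norm statistic could be computed in PyTorch; the threshold-based decision is an illustrative simplification of the learned attack model described later in the talk.

```python
import torch

def gradient_norm(model, loss_fn, x, y):
    """L2 norm of the loss gradient w.r.t. all model parameters for one record."""
    model.zero_grad()
    loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
    loss.backward()
    grads = [p.grad.flatten() for p in model.parameters() if p.grad is not None]
    return torch.cat(grads).norm().item()

# Records with small gradient norms are more likely to be members, since the
# model was explicitly optimized to reduce the loss on its training data.
def naive_membership_guess(model, loss_fn, x, y, threshold):
    return gradient_norm(model, loss_fn, x, y) < threshold
```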
Different Learning/Attack Settings
• Fully trained
• Fine-tuning
• Federated learning
• Black-box / white-box
• Central / local attacker
• Passive / active
Federated Model
[Diagram: a central model aggregates local models, each trained on the local data of collaborators 1, 2, 3, ..., n.]
Federated Learning
Multiple observations: the attacker sees the model at epoch 1, epoch 2, ..., epoch n.
Every data point leaves traces on the target function across the training epochs.
Active Attack on Federated Learning
The active attacker changes the parameters in the direction of the target point's gradient.
[Plot: the fitted function with a target member point and a target non-member point.]
Active Attack on Federated Learning
For the data points that are in the training dataset, local training will compensate for the active attacker.
[Plot: the fitted function after honest collaborators retrain on their local data.]
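A sketch of the active attacker's gradient-ascent step, assuming the attacker (as a participant or as the server) can modify the parameters it shares. It increases the loss on the target records so that members are later "repaired" by honest local training while non-members are not; the step size `gamma` is an illustrative name.

```python
import torch

def active_attack_update(model, loss_fn, target_x, target_y, gamma=1.0):
    """Gradient-ascent push on the target records before sharing the parameters."""
    model.zero_grad()
    loss = loss_fn(model(target_x), target_y)
    loss.backward()
    with torch.no_grad():
        for w in model.parameters():
            w += gamma * w.grad   # ascend: increase the loss on the targets
    # Honest collaborators then train on their local data; if a target record
    # is in someone's training set, SGD pulls its loss (and gradient) back down.
```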
Active Attacks in the Federated Model
[Plot: gradient norm over training epochs (1-80) for target member, target non-member, member, and non-member instances; the active attack makes the member and non-member gradient norms clearly separable.]
Scenario 1: Fully Trained Model
The attacker probes a trained model (e.g., a dog/cat classifier) with an input.
Attacker observes: output vector, outputs of all layers, loss, gradients of all layers.
Not observable: the training dataset.
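A minimal sketch of collecting these white-box observations for one record; it assumes a `torch.nn.Sequential` model so intermediate layer outputs can be gathered by iterating over its modules, which is a simplification of the architectures evaluated in the paper.

```python
import torch
import torch.nn.functional as F

def whitebox_features(model, x, y):
    """Collect per-layer outputs, the loss, and per-parameter gradients for one record."""
    activations, h = [], x.unsqueeze(0)
    for layer in model:                    # works for nn.Sequential models
        h = layer(h)
        activations.append(h.detach())
    loss = F.cross_entropy(h, y.unsqueeze(0))
    model.zero_grad()
    loss.backward()
    gradients = [p.grad.detach().clone() for p in model.parameters()]
    return activations, loss.item(), gradients
```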
Scenario 2: Central Attacker in the Federated Model
[Diagram: the central model aggregates local models trained on the local data of collaborators 1, 2, 3, ..., n; the attacker controls the central server.]
Scenario 2: Central Attacker in the Federated Model
In addition to the local attacker's observations, the central attacker can:
• Target individual collaborators
• Isolate any collaborator: send it an isolated model and observe its local updates
Not observable: the collaborators' local data and training.
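A sketch of how a malicious server could isolate a target collaborator: instead of sending everyone the aggregate, it keeps a separate parameter copy for the target so the target's updates are not diluted by the other collaborators. The `IsolatingServer` class and its fields are hypothetical, not part of the paper's code.

```python
import copy

class IsolatingServer:
    """Hypothetical malicious aggregator that isolates one collaborator."""
    def __init__(self, global_params, target_id):
        self.global_params = global_params
        self.isolated_params = copy.deepcopy(global_params)  # target's private branch
        self.target_id = target_id

    def send_model(self, collaborator_id):
        # The target trains on a model only it has touched, so its local updates
        # reflect its own data alone and are easier to analyze.
        if collaborator_id == self.target_id:
            return self.isolated_params
        return self.global_params

    def receive_update(self, collaborator_id, params):
        if collaborator_id == self.target_id:
            self.isolated_params = params   # keep the isolated branch separate
        else:
            self.global_params = params     # (aggregation of honest updates omitted)
```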
Scenario 3: Local Attacker in Federated Learning
A local collaborator (passive or active) attacks the central model it receives each round.
At each epoch (1, 2, ..., N) the attacker observes: outputs of all layers, loss, gradients of all layers.
Not observable: the other collaborators' local data and training.
Score Function
Different observations: at each epoch (1, 2, ..., N) the attacker collects the outputs of all layers, the loss, and the gradients of all layers.
Input → Score → Member / Non-member
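One way to realize the score function is a small neural network over the per-epoch observations. The following sketch assumes the observations have already been flattened into one feature vector per epoch, which is a simplification of the component-wise architecture shown in the backup slides.

```python
import torch
import torch.nn as nn

class MembershipScore(nn.Module):
    """Maps concatenated per-epoch observations to a membership probability."""
    def __init__(self, feature_dim, num_epochs):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim * num_epochs, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),        # score in [0, 1]
        )

    def forward(self, per_epoch_features):          # (batch, num_epochs, feature_dim)
        flat = per_epoch_features.flatten(start_dim=1)
        return self.net(flat)                       # > 0.5 -> predict "member"
```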
Experimental Setup
• Unlike previous works, we used publicly available pretrained models.
• We used all common regularization techniques.
• We implemented our attacks in PyTorch.
• We used the following datasets:
• CIFAR100
• Purchase100
• Texas100
Results
Attacks on Pretrained Models
• Gradients leak significant information.
• The last layer contains the most information.
Federated Attacks
• The global (central) attack is more powerful than the local attack.
• An active attacker can force SGD to leak more information.
Conclusions
• We go beyond the black-box scenario and try to understand why a deep learning model leaks information.
• Gradients leak information about the training dataset.
• An attacker in federated learning can take advantage of multiple observations to extract more information.
• In the federated setting, an attacker can actively force SGD to leak information.
Questions?
Overall Attack Model
[Diagram: for each epoch (1, ..., n), the attack extracts, for every layer (1, ..., n), the layer's outputs and gradients, plus the loss and the one-hot label. Each feature is processed by a dedicated component (output component, gradient component, loss component, label component), and the components' outputs are combined to predict member / non-member.]
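A condensed sketch of the component-wise design implied by the diagram: separate sub-networks embed the layer outputs, layer gradients, loss, and label, and their embeddings are concatenated and fed to a final encoder. The layer sizes are illustrative and not the paper's exact architecture (which, for instance, uses convolutional components over gradient matrices).

```python
import torch
import torch.nn as nn

class WhiteboxAttack(nn.Module):
    """Component-wise membership classifier over white-box observations."""
    def __init__(self, output_dim, grad_dim, num_classes, emb=64):
        super().__init__()
        self.output_comp = nn.Sequential(nn.Linear(output_dim, emb), nn.ReLU())
        self.grad_comp   = nn.Sequential(nn.Linear(grad_dim, emb), nn.ReLU())
        self.loss_comp   = nn.Sequential(nn.Linear(1, emb), nn.ReLU())
        self.label_comp  = nn.Sequential(nn.Linear(num_classes, emb), nn.ReLU())
        self.encoder = nn.Sequential(
            nn.Linear(4 * emb, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),         # membership probability
        )

    def forward(self, layer_output, layer_grad, loss, one_hot_label):
        z = torch.cat([
            self.output_comp(layer_output),
            self.grad_comp(layer_grad.flatten(start_dim=1)),
            self.loss_comp(loss),
            self.label_comp(one_hot_label),
        ], dim=1)
        return self.encoder(z)
```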
Scenario 4: Fine-Tuning Model
[Diagram: a general model is trained on a general dataset and then fine-tuned on a specialized dataset; for the fine-tuned model, the attacker observes the outputs of all layers, the loss, and the gradients of all layers.]
Fine-Tuning Attacks
[Table: per dataset and architecture, attack accuracy for distinguishing specialized vs. non-member records, specialized vs. general records, and general vs. non-member records.]
Both the specialized and the general datasets are vulnerable to membership attacks.
Federated Attacks
Fine-Tuning Model Leakage
[Diagram: same fine-tuning pipeline as in Scenario 4 — a general model trained on the general dataset is fine-tuned on the specialized dataset; the attacker observes layer outputs, loss, and gradients of the fine-tuned model.]