Scalable PATE & The Secret Sharer


  1. Scalable PATE & The Secret Sharer. Work by the Brain Privacy and Security team and collaborators at UC Berkeley, presented by Ian Goodfellow

  2. PATE / PATE-G • Private / Papernot • Aggregation / Abadi • Teacher / Talwar • Ensembles / Erlingsson • Generative / Goodfellow

  3. Threat Model: types of adversaries and our threat model
     • Model querying (black-box adversary): Shokri et al. (2016), Membership Inference Attacks against ML Models; Fredrikson et al. (2015), Model Inversion Attacks
     • Model inspection (white-box adversary): Zhang et al. (2017), Understanding DL Requires Rethinking Generalization
     In our work, the threat model assumes:
     - The adversary can make a potentially unbounded number of queries
     - The adversary has access to model internals

  4. A definition of privacy: differential privacy
     [Diagram: the same randomized algorithm is run on two versions of a dataset, producing Answer 1 ... Answer n in each case; the two sets of answers should be nearly indistinguishable]
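
For reference, the standard (ε, δ) formulation of differential privacy, which the slide conveys only as a diagram:

```latex
% (epsilon, delta)-differential privacy (Dwork et al.); standard definition,
% not spelled out on the slide itself.
A randomized mechanism $M$ is $(\varepsilon, \delta)$-differentially private if,
for all datasets $D, D'$ differing in a single record and all output sets $S$,
\[
  \Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta .
\]
```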

  5. A tangent • Which other fields need their “differential privacy moment”? • Adversarial robustness needs a provable mechanism • Interpretability needs measurable / actionable definitions • Differential privacy is maybe the brightest spot in ML theory, especially in adversarial settings. Real guarantees that hold in practice

  6. Private Aggregation of Teacher Ensembles (PATE): different teachers learn from different subsets
     [Diagram: the sensitive data is split into disjoint partitions 1 ... n, and teacher i is trained only on partition i]
     Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data [ICLR 2017 best paper]
     Nicolas Papernot, Martín Abadi, Úlfar Erlingsson, Ian Goodfellow, and Kunal Talwar
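
A minimal sketch of the partitioning step, assuming numpy arrays and a generic train_model(x, y) callback (a placeholder, not the authors' code):

```python
import numpy as np

def train_teachers(x, y, n_teachers, train_model, rng=None):
    """Split sensitive data into disjoint partitions and train one teacher per partition."""
    rng = rng or np.random.default_rng()
    indices = rng.permutation(len(x))
    partitions = np.array_split(indices, n_teachers)  # disjoint, roughly equal-sized
    return [train_model(x[idx], y[idx]) for idx in partitions]
```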

  7. Aggregation: count votes, take maximum
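
In the PATE paper the "take maximum" step is a noisy argmax: Laplace noise is added to the vote counts before the maximum is taken. A minimal numpy sketch (parameter names are mine):

```python
import numpy as np

def noisy_max(teacher_preds, num_classes, gamma=0.05, rng=None):
    """Laplace noisy-max aggregation of teacher votes (the LNMax mechanism).

    teacher_preds: per-teacher predicted labels for a single query.
    gamma: inverse noise scale; smaller gamma means more noise and stronger privacy.
    """
    rng = rng or np.random.default_rng()
    votes = np.bincount(teacher_preds, minlength=num_classes).astype(float)
    votes += rng.laplace(loc=0.0, scale=1.0 / gamma, size=num_classes)
    return int(np.argmax(votes))
```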

  8. Intuitive privacy analysis
     If most teachers agree on the label, it does not depend on specific partitions, so the privacy cost is small. If two classes have close vote counts, the disagreement may reveal private information.

  9. Student training
     [Diagram: the sensitive data, the individual teachers, and the aggregated teacher are not available to the adversary; the student, trained on public data labeled by querying the aggregated teacher, is the only model exposed to the adversary at inference time]

  10. Why train an additional “student” model? The aggregated teacher violates our threat model:
     1. Each prediction increases total privacy loss. Privacy budgets create a tension between accuracy and the number of predictions.
     2. Inspection of internals may reveal private data. Privacy guarantees should hold in the face of white-box adversaries.

  11. Label-efficient learning • More queries to teacher while training student = more privacy lost • Use semi-supervised GAN (Salimans et al., 2016) to achieve high accuracy with few labels

  12. Supervised discriminator for semi-supervised learning
     [Diagram: a standard real-vs-fake discriminator alongside a discriminator whose output layer also predicts the class (cat, dog, ..., fake)] Learn to read with 100 labels rather than 60,000. (Odena 2016; Salimans et al. 2016) (Goodfellow 2018)
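
A compact PyTorch sketch of the discriminator loss from Salimans et al. (2016), where the extra "fake" class is handled implicitly; the function name and signature are mine:

```python
import torch
import torch.nn.functional as F

def semi_supervised_d_loss(logits_labeled, labels, logits_unlabeled, logits_fake):
    """Discriminator loss for the semi-supervised GAN of Salimans et al. (2016).

    The discriminator outputs K class logits; the implicit "fake" class is handled
    via D(x) = Z(x) / (Z(x) + 1) with Z(x) = sum_k exp(logit_k), so that
    log D(x) = logsumexp(logits) - softplus(logsumexp(logits)) and
    log(1 - D(x)) = -softplus(logsumexp(logits)).
    """
    # Supervised part: ordinary cross-entropy on the few labeled examples.
    loss_supervised = F.cross_entropy(logits_labeled, labels)

    # Unsupervised part: real unlabeled data should look "real", generator samples "fake".
    z_real = torch.logsumexp(logits_unlabeled, dim=1)
    z_fake = torch.logsumexp(logits_fake, dim=1)
    loss_real = -(z_real - F.softplus(z_real)).mean()  # -log D(x_real)
    loss_fake = F.softplus(z_fake).mean()              # -log(1 - D(G(z)))

    return loss_supervised + loss_real + loss_fake
```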

  13. Trade-off between student accuracy and privacy

  14. Scalable PATE • Nicolas Papernot*, Shuang Song*, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, Úlfar Erlingsson

  15. Limitations of first PATE paper • Evaluated only on MNIST / SVHN • Very clean data • Only 10 classes (easier to get consensus) • Scalable PATE targets harder settings: • More classes • Unbalanced classes • Mislabeled training examples

  16. Improvements • Noisy votes use a Gaussian rather than a Laplace distribution • More likely to achieve consensus for a large number of classes (the Gaussian’s lighter tails make a spurious noisy winner less likely) • Selective teacher response

  17. Selective Teacher Response • Check for overwhelming consensus: using high-variance noise, check whether the noisy vote count for the argmax exceeds a threshold T • Consensus? Publish the noisy votes with smaller variance • No consensus? Don’t publish anything; the student skips that query • Note: running the noisy consensus check still spends some of the privacy budget
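
A minimal numpy sketch of this selective procedure (Confident-GNMax in the Scalable PATE paper); the parameter names and structure are illustrative rather than the paper's exact settings, and the privacy accounting is omitted:

```python
import numpy as np

def confident_gnmax(teacher_preds, num_classes, threshold,
                    sigma_check, sigma_answer, rng=None):
    """Selective teacher response: answer only when there is strong consensus.

    1. Add high-variance Gaussian noise to the top vote count and compare to a threshold.
    2. If the check passes, answer with a lower-variance Gaussian noisy argmax.
    3. Otherwise return None: the teachers stay silent and the student skips the query.
    Both steps consume privacy budget (sigma_check >> sigma_answer).
    """
    rng = rng or np.random.default_rng()
    votes = np.bincount(teacher_preds, minlength=num_classes).astype(float)

    # Step 1: private consensus check with large noise.
    if votes.max() + rng.normal(0.0, sigma_check) < threshold:
        return None  # no overwhelming consensus; do not answer

    # Step 2: Gaussian noisy argmax with smaller noise.
    return int(np.argmax(votes + rng.normal(0.0, sigma_answer, size=num_classes)))
```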

  18. Background: adversarial training
     [Figure: an image labeled as bird, perturbed so as to decrease the probability of the bird class, still has the same label (bird) to a human]

  19. Virtual Adversarial Training
     Unlabeled example; the model guesses it’s probably a bird, maybe a plane. Apply an adversarial perturbation intended to change the guess. The new guess should match the old guess (probably bird, maybe plane). (Miyato et al., 2015)
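
A PyTorch sketch of the VAT loss described above; the hyperparameter defaults (xi, eps, number of power iterations) are illustrative, not Miyato et al.'s exact values:

```python
import torch
import torch.nn.functional as F

def vat_loss(model, x, xi=1e-6, eps=8.0, n_power=1):
    """Virtual adversarial training loss on an unlabeled batch x.

    Finds a small perturbation that most changes the model's prediction (via power
    iteration), then penalizes the KL divergence between the prediction on x and on
    the perturbed input, so the new guess matches the old guess.
    """
    with torch.no_grad():
        p = F.softmax(model(x), dim=1)  # the "old guess", treated as fixed

    # Power iteration to approximate the most sensitive perturbation direction.
    d = torch.randn_like(x)
    for _ in range(n_power):
        d = xi * F.normalize(d.flatten(1), dim=1).reshape(x.shape)
        d.requires_grad_(True)
        log_p_hat = F.log_softmax(model(x + d), dim=1)
        adv_distance = F.kl_div(log_p_hat, p, reduction="batchmean")
        d = torch.autograd.grad(adv_distance, d)[0]

    r_adv = eps * F.normalize(d.flatten(1), dim=1).reshape(x.shape)
    log_p_hat = F.log_softmax(model(x + r_adv), dim=1)
    return F.kl_div(log_p_hat, p, reduction="batchmean")
```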

  20. VAT performance (Oliver+Odena+Raffel et al, 2018)

  21. Scalable PATE: improved results. Synergy between utility and privacy:
     1. Check privately for consensus
     2. Run the noisy argmax only when consensus is sufficient
     (LNMax = PATE, Confident-GNMax = Scalable PATE)
     Scalable Private Learning with PATE [ICLR 2018]
     Nicolas Papernot, Shuang Song, Ilya Mironov, Ananth Raghunathan, Kunal Talwar, Úlfar Erlingsson

  22. Scalable PATE: improved trade-off between student accuracy and privacy (Selective PATE)

  23. The Secret Sharer • Nicholas Carlini, Chang Liu, Jernej Kos, Úlfar Erlingsson, Dawn Song

  24. Secret with format known to adversary • “My social security number is ___-__-____” (the blanks are the secret) • Measure memorization with exposure

  25. Definitions • Suppose the model assigns probability p to the actual secret • The rank of the secret is the number of candidate strings assigned probability ≥ p (counting the secret itself) • Minimum value is 1 • Exposure: the negative log probability that a randomly sampled candidate string is at least as likely as the secret • Equivalently: exposure = log2(# possible strings) − log2(rank)

  26. Practical Experiments • Can estimate exposure via sampling • Can approximately find most likely secret value with optimization (beam search)
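
A minimal sketch of the sampling-based exposure estimate, assuming a hypothetical log_prob(model, string) helper that returns the model's log-probability of a candidate string; note that direct sampling cannot resolve exposures above log2(n_samples), which is why the paper also uses a distributional extrapolation:

```python
import math
import random

def estimate_exposure(model, secret, candidate_space, log_prob, n_samples=10_000):
    """Monte Carlo estimate of exposure = log2(|R|) - log2(rank).

    Samples candidates uniformly from the candidate space R and counts how many
    are at least as likely as the secret; rank / |R| is estimated by that fraction.
    """
    target = log_prob(model, secret)
    samples = [random.choice(candidate_space) for _ in range(n_samples)]
    at_least_as_likely = sum(log_prob(model, s) >= target for s in samples)
    rank_fraction = max(at_least_as_likely, 1) / n_samples  # avoid log(0)
    return -math.log2(rank_fraction)
```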

  27. Memorization during learning

  28. Observations • Exposure is high • Exposure rises early during learning • Exposure is not caused by overfitting • Peaks before overfitting occurs

  29. Comparisons • Across architectures: • More accuracy -> more exposure • LSTM / GRU: high accuracy, high exposure • CNN: lower accuracy, lower exposure • Larger batch size -> more memorization • Larger model -> more memorization • Secret memorization happens even when the compressed model is smaller than the compressed dataset • Choice of optimizer: no significant difference

  30. Defenses • Regularization does not work • Weight decay • Dropout • Weight quantization • Differential privacy works, as guaranteed • Even for very large epsilon, which gives little theoretical guarantee, the exposure measured in practice decreases significantly

  31. Questions
