Low-Cost Learning via Active Data Procurement
October 2015
Jacob Abernethy, Yiling Chen, Chien-Ju Ho, Bo Waggoner
Coming soon to a society near you: data-needers (ex: pharmaceutical co.) and data-holders (ex: medical data)
Classic ML problem: a data source sends data points z_1, z_2, … to the data-needer's learning algorithm, which outputs a hypothesis h. Goal: use a small amount of data, output a "good" h.
Example learning task: classification ● Data: (point, label) pairs, where the label is one of two classes (e.g. + or −) ● Hypothesis: a hyperplane separating the two types
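As a concrete illustration (not from the slides; the weight vector w, offset b, and sign convention are assumptions), a hyperplane hypothesis labels a point by which side it falls on:

def hyperplane_classify(w, b, x):
    """Hyperplane hypothesis h for binary classification: label the point x by
    the sign of w.x + b, i.e. by which side of the separating hyperplane it lies on."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return '+' if score >= 0 else '-'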
Twist: data is now held by individuals. The data source is replaced by data-holders, each holding a data point z_t and a cost c_t ("cost of revealing data"; formal model later…). The data-needer runs a mechanism that outputs a hypothesis h. Goal: spend a small budget, output a "good" h.
Why is this difficult? 1. (Relatively) few data are useful. Example: studying the ACTN-3 mutation and endurance running, where only the relatively few people who have the mutation and/or are runners provide useful data.
Why is this difficult? 2. Utility of data may be correlated with cost (causing bias). Example: paying $10 for data to study HIV; HIV-positive and HIV-negative individuals may accept the offer at different rates, so the collected sample is biased. Machine Learning roadblock: how to deal with biases?
Why is this difficult? 3. Utility (ML) and cost (econ) live in different worlds. The learning algorithm thinks in entropies, gradients, loss functions, divergences; the mechanism thinks in auctions, budgets, value distributions, reserve prices. Econ roadblock: how to assign value to data?
Broad research challenge: 1. How to assign value (prices) to pieces of data? 2. How to design mechanisms for procuring and learning from data? 3. Develop a theory of budget-constrained learning: what is (im)possible to learn given budget B and parameters of the problem?
Outline 1. Overview of literature, our contributions 2. Online learning model/results 3. "Statistical learning" result, conclusion
Related work, compared along two axes.
How are agents strategic?
● Agents cannot fabricate data, but have costs: this work; Roth, Schoenebeck 2012; Ligett, Roth 2012
● Principal-agent style, data depends on effort: Horel, Ioannidis, Muthukrishnan 2014; Cummings, Ligett, Roth, Wu, Ziani 2015; Cai, Daskalakis, Papadimitriou 2015
Type of goal?
● Risk/regret bounds: this work
● Minimize variance or a related goal: the other works above
See also Waggoner, Frongillo, Abernethy NIPS 2015: a prediction-market style mechanism.
e.g. Roth-Schoenebeck, EC 2012: "Conducting Truthful Surveys, Cheaply." Data points arrive i.i.d. from a source; each agent has a cost c_t. ● Each data point is a number; the task is to estimate the mean ● Approach: offer each agent a price drawn i.i.d. ● Goal: minimize the estimate's variance
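A minimal sketch of this random-price idea (illustrative only, not the exact Roth-Schoenebeck mechanism; the uniform price distribution and the inverse-probability reweighting are assumptions made here):

import random

def survey_estimate(agents, draw_price, accept_prob):
    """agents: list of (value, cost) pairs.
    draw_price(): samples an i.i.d. price offer.
    accept_prob(c): Pr[draw_price() >= c], known to the mechanism.
    Offers each agent a random price; an agent sells iff price >= cost.
    Reweights accepted values by 1/Pr[accept] to correct cost-correlated bias."""
    n = len(agents)
    total = 0.0
    for value, cost in agents:
        price = draw_price()
        if price >= cost:                        # agent accepts, reveals (value, cost)
            total += value / accept_prob(cost)   # importance-weighted contribution
    return total / n                             # unbiased (in expectation) mean estimate

# Example: uniform price offers on [0, 1], so Pr[price >= c] = 1 - c.
est = survey_estimate(
    agents=[(random.random(), random.random()) for _ in range(1000)],
    draw_price=random.random,
    accept_prob=lambda c: 1 - c,
)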
What we wanted to do differently: 1. Prove ML-style risk or regret bounds. Why: understand the error rate as a function of the budget and problem characteristics. 2. Interface with existing ML algorithms. Why: understand how value derives from the learning algorithm; toward black-box use of learners in mechanisms. 3. Online data arrival. Why: active-learning approach, simpler model.
Overview of our contributions. ● Propose a model of online learning with purchased data: T arriving data points and budget B. ● Convert any "FTRL" algorithm into a mechanism; show regret on the order of T/√B and lower bounds of the same order. ● Extend the model to the case where data is drawn i.i.d. ("statistical learning"); extend the result to a "risk" bound on the order of 1/√B.
Outline 1. Overview of literature, our contributions 2. Online learning model/results 3. "Statistical learning" result, conclusion (next: 2. Online learning model/results)
Online learning with purchased data a. Review of online learning b. Our model: adding $$ c. Deriving our mechanism and results (starting with a)
Standard online learning model. For t = 1, … , T: ● algorithm posts a hypothesis h_t ● data point z_t arrives ● algorithm sees z_t and updates to h_{t+1}. Loss = ∑_t ℓ(h_t, z_t). Regret = Loss − ∑_t ℓ(h*, z_t), where h* minimizes that sum.
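A minimal sketch of this loop (illustrative Python; the `learner` interface and the finite `hypotheses` set used to compute regret are assumptions for illustration):

def run_online_learning(learner, data, loss, hypotheses):
    """Generic online learning loop: the learner posts h_t, the point z_t arrives,
    loss is incurred, and the learner updates. Returns total loss and regret
    against the best fixed hypothesis in `hypotheses`."""
    total_loss = 0.0
    seen = []
    for z in data:                     # t = 1, ..., T
        h = learner.post()             # algorithm posts hypothesis h_t
        total_loss += loss(h, z)       # data point z_t arrives
        learner.update(z)              # algorithm sees z_t, updates to h_{t+1}
        seen.append(z)
    best = min(sum(loss(h, z) for z in seen) for h in hypotheses)
    return total_loss, total_loss - best   # (Loss, Regret)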
Follow-the-Regularized-Leader (FTRL). Assume: the loss function is convex and Lipschitz, the hypothesis space is Hilbert, etc. Algorithm: h_t = argmin_h ∑_{s<t} ℓ(h, z_s) + R(h)/η.
Example 1 (Euclidean norm): R(h) = ǁhǁ₂² ⇒ h_{t+1} = h_t − η ∇ℓ(h_t, z_t) (online gradient descent).
Example 2 (negative entropy): R(h) = ∑_j h^(j) ln h^(j) ⇒ h_{t+1}^(j) ∝ h_t^(j) exp[−η ∇_j ℓ(h_t, z_t)] (multiplicative weights).
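A minimal sketch of the two update rules above (illustrative; the learning rate eta and the gradient at the current hypothesis are passed in, as in the linearized FTRL updates):

import math

def ogd_step(h, grad, eta):
    """Online gradient descent: FTRL with the squared Euclidean norm regularizer.
    h and grad are lists of floats; returns h_{t+1} = h_t - eta * grad."""
    return [hj - eta * gj for hj, gj in zip(h, grad)]

def mw_step(h, grad, eta):
    """Multiplicative weights: FTRL with the negative-entropy regularizer on the
    probability simplex. h_{t+1}^(j) is proportional to h_t^(j) * exp(-eta * grad_j)."""
    w = [hj * math.exp(-eta * gj) for hj, gj in zip(h, grad)]
    total = sum(w)
    return [wj / total for wj in w]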
Regret Bound for FTRL. Fact: the regret of FTRL is bounded by O(1/η + η ∑_t Δ_t²), where Δ_t = ǁ∇ℓ(h_t, z_t)ǁ. We know Δ_t ≤ 1 by assumption, so we can choose η = 1/√T and get Regret ≤ O(√T). "No regret": average regret → 0.
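Spelling out the calculation (a standard step, using the assumption Δ_t ≤ 1):

\mathrm{Regret} \;\le\; O\!\left(\tfrac{1}{\eta} + \eta \textstyle\sum_{t=1}^{T} \Delta_t^2\right)
\;\le\; O\!\left(\tfrac{1}{\eta} + \eta T\right)
\;=\; O\!\left(\sqrt{T}\right) \ \text{at } \eta = \tfrac{1}{\sqrt{T}},
\qquad\text{so}\quad \tfrac{\mathrm{Regret}}{T} \;\le\; O\!\left(\tfrac{1}{\sqrt{T}}\right) \;\to\; 0 .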
Online learning with purchased data a. Review of online learning b. Our model: adding $$ c. Deriving our mechanism and results (next: b. Our model: adding $$)
First: a model of the strategic data-holder. Model of agent: ● holds data z_t and cost c_t ● cost is a threshold price ○ agent agrees to sell data iff price ≥ c_t ○ interpretations: privacy, transaction cost, … ● Assume: all costs ≤ 1.
Model of agent-mechanism interaction. ● Mechanism posts a menu of prices offered, e.g. data points (32,12), (20,18), (32,12) at prices $0.22, $0.41, $0.88. ● Agent t arrives with (z_t, c_t). ● If c_t ≤ price(z_t), the agent accepts: ○ agent reveals (z_t, c_t) ○ mechanism pays the agent price(z_t). ● Otherwise, the agent rejects: ○ mechanism learns that the agent rejected, and pays nothing.
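One round of this interaction as code (a sketch; `menu_price` is a hypothetical function mapping a data point to its posted price):

def interact(menu_price, z_t, c_t):
    """Posted-price interaction with one agent: the agent accepts iff its cost
    is at most the menu price for its data point. Returns (revealed, payment)."""
    price = menu_price(z_t)
    if c_t <= price:
        return (z_t, c_t), price     # agent reveals (z_t, c_t); mechanism pays price(z_t)
    return None, 0.0                 # rejection is observed; mechanism pays nothing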
Recall: standard online learning model. For t = 1, … , T: ● algorithm posts a hypothesis h_t ● data point z_t arrives ● algorithm sees z_t and updates to h_{t+1}.
Our model: online learning with $$. For t = 1, … , T: ● mechanism posts a hypothesis h_t and a menu of prices ● data point z_t arrives with cost c_t ● if c_t ≤ the menu price of z_t: mechanism pays that price and learns z_t ● else: mechanism pays nothing. Loss = ∑_t ℓ(h_t, z_t). Regret = Loss − ∑_t ℓ(h*, z_t), where h* minimizes that sum.
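The full loop as a sketch (the `mechanism` interface and the explicit in-loop budget check are illustrative assumptions; note that loss is incurred on every arriving point, purchased or not):

def learning_with_purchased_data(mechanism, stream, loss, budget):
    """Online learning with purchased data. Each round the mechanism posts a
    hypothesis and a price menu; the arriving agent sells iff c_t <= menu price.
    Returns (total loss, total spent)."""
    total_loss, spent = 0.0, 0.0
    for z, c in stream:                     # agent t holds (z_t, c_t)
        h, menu_price = mechanism.post()    # hypothesis h_t and price menu
        total_loss += loss(h, z)            # loss counted whether or not we buy
        price = menu_price(z)
        if c <= price and spent + price <= budget:
            spent += price                  # mechanism pays, learns z_t
            mechanism.update(z)
        # else: no payment, no data this round
    return total_loss, spent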
Online learning with purchased data a. Review of online learning b. Our model: adding $$ c. Deriving our mechanism and results (next: c. Deriving our mechanism and results)
Start easy: suppose all costs are 1. ⇒ The mechanism just determines which data points to sample, e.g. menu prices $1, $0, $0 for data points (32,12), (20,18), (32,12). Example budgets: ● B = T/2 ● B = √T ● B = log(T).
Key idea #1: randomly sample. Purchase each data point z_t with probability q_t(z_t); the menu is now randomly chosen, e.g. Pr[price = 1] of 0.3, 0.06, 0.41 for data points (32,12), (20,18), (32,12).
Lemma (importance-weighted regret bound): for any q_t's, the regret of (modified) FTRL is O(1/η + η E[∑_t Δ_t²/q_t]).
See also: Importance-Weighted Active Learning, Beygelzimer et al., ICML 2009.
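A sketch of the importance-weighted update behind the lemma (illustrative; shown for the gradient-descent instantiation, with the 1/q scaling making the purchased gradient unbiased in expectation):

import random

def iw_ogd_round(h, z, q, eta, grad):
    """One round of importance-weighted online gradient descent: buy the data
    point z with probability q; if bought, take a gradient step scaled by 1/q.
    Returns (new hypothesis, whether z was purchased)."""
    if random.random() < q:                         # z_t purchased (probability q_t)
        g = grad(h, z)
        return [hj - eta * gj / q for hj, gj in zip(h, g)], True
    return h, False                                 # not purchased: no update this round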
Result for the easy case. Recall the lemma: for any q_t's, the regret of (modified) FTRL is O(1/η + η E[∑_t Δ_t²/q_t]). Corollary: setting all q_t = B/T and choosing η = √B/T yields regret ≤ O(T/√B). "No data, no regret": the average amount of data purchased → 0 and the average regret → 0.
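Plugging in (using Δ_t ≤ 1, q_t = B/T, and η = √B/T):

\frac{1}{\eta} + \eta\, \mathbb{E}\!\left[\sum_{t=1}^{T} \frac{\Delta_t^2}{q_t}\right]
\;\le\; \frac{1}{\eta} + \eta \cdot T \cdot \frac{T}{B}
\;=\; \frac{T}{\sqrt{B}} + \frac{\sqrt{B}}{T}\cdot\frac{T^{2}}{B}
\;=\; \frac{2T}{\sqrt{B}} .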