
Graph-Based Discriminators: Sample Complexity and Expressiveness. Roi Livni and Yishay Mansour - PowerPoint PPT Presentation



  1. Graph-Based Discriminators: Sample Complexity and Expressiveness. Roi Livni and Yishay Mansour

  2. Discrimination
     • A discriminator is provided with two data sets.
     • $S_1 \sim P_1$
     • $S_2 \sim P_2$
     • Decide if $P_1$ and $P_2$ are different.
     • If not, provide a certificate.

  3. Motivation: Synthetic Data Generation (Goodfellow et al. '14) https://thispersondoesnotexist.com/

  4. Discrimination: Learning Lens
     • A learner is defined by a class $H \subseteq \{0,1\}^X$.
     • Given a labelled sample from some distribution $P$ over $X \times \{0,1\}$,
     • the learner returns $h \in H$ such that $P_{(x,y)}[h(x) \neq y] \le \min_{h \in H} P_{(x,y)}[h(x) \neq y] + \epsilon$.
     • If $\sup_{h \in H} \left( E_{x \sim P_1}[h(x)] - E_{x \sim P_2}[h(x)] \right) > \epsilon$, the learner succeeds.

  5. Learning as a discrimination task
     • A discriminator is defined by a class of distinguishers $H \subseteq \{0,1\}^X$.
     • Integral Probability Metric (Müller '97): $\mathrm{IPM}_H(P_1, P_2) = \sup_{h \in H} \left| E_{x \sim P_1}[h(x)] - E_{x \sim P_2}[h(x)] \right|$
     • If $\mathrm{IPM}_H(P_1, P_2) > \epsilon$, return $h \in H$ with $\left| E_{x \sim P_1}[h(x)] - E_{x \sim P_2}[h(x)] \right| > \epsilon/2$.
     • If not, may fail (return EQUIVALENT).
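
The discrimination rule on this slide can be sketched for a finite class of distinguishers. This is a minimal illustration, not the authors' procedure: the threshold class, the helper names `empirical_ipm` and `discriminate`, and the choice to compare the empirical gap against $\epsilon/2$ are all assumptions made for the example.

```python
import numpy as np

def empirical_ipm(hypotheses, sample1, sample2):
    """Largest empirical gap |mean h(S1) - mean h(S2)| over a finite class.

    Returns (gap, index of the witnessing hypothesis), or (0.0, None) if
    no hypothesis shows a positive gap.
    """
    best_gap, best_idx = 0.0, None
    for idx, h in enumerate(hypotheses):
        gap = abs(float(np.mean(h(sample1))) - float(np.mean(h(sample2))))
        if gap > best_gap:
            best_gap, best_idx = gap, idx
    return best_gap, best_idx

def discriminate(hypotheses, sample1, sample2, eps):
    """Return the index of a witnessing hypothesis, or "EQUIVALENT"."""
    gap, idx = empirical_ipm(hypotheses, sample1, sample2)
    return idx if gap > eps / 2 else "EQUIVALENT"

# Hypothetical distinguisher class: thresholds h_t(x) = 1[x <= t] on [0, 1].
thresholds = np.linspace(0.0, 1.0, 11)
hypotheses = [lambda x, t=t: (np.asarray(x) <= t).astype(float) for t in thresholds]

s1 = np.array([0.1, 0.2, 0.3, 0.4])
s2 = np.array([0.6, 0.7, 0.8, 0.9])
print(discriminate(hypotheses, s1, s2, eps=0.2))  # index of a separating threshold
print(discriminate(hypotheses, s1, s1, eps=0.2))  # EQUIVALENT
```

The returned index plays the role of the certificate: it names a distinguisher whose empirical means differ on the two samples.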

  6. Higher order discrimination
     • Instead of considering hypothesis classes, what if we take other types of statistical tests?
     • Example: Collision test
     • Estimate the probability of drawing the same point twice. If different, declare distinct.
     • If not, may fail (return equivalent).
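
A collision test of this kind might look as follows. This is a sketch under assumptions: the function names and the tolerance `eps` are hypothetical, and the collision probability is estimated as the fraction of colliding unordered pairs within each sample.

```python
from collections import Counter

def collision_rate(sample):
    """Estimate P[x1 == x2] for two i.i.d. draws as the fraction of
    unordered pairs of sample positions that hold the same value."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two points to form a pair")
    counts = Counter(sample)
    colliding = sum(c * (c - 1) // 2 for c in counts.values())
    return colliding / (n * (n - 1) // 2)

def collision_test(sample1, sample2, eps):
    """Declare DISTINCT if the two collision rates differ by more than eps."""
    gap = abs(collision_rate(sample1) - collision_rate(sample2))
    return "DISTINCT" if gap > eps else "EQUIVALENT"

print(collision_test([1, 1, 1, 1], [1, 2, 3, 4], eps=0.5))  # DISTINCT
print(collision_test([1, 2], [3, 4], eps=0.1))              # EQUIVALENT
```

The second call shows the "may fail" case: the samples have disjoint supports but the same collision rate, so this test alone cannot separate them.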

  7. Higher order discrimination
     • Instead of considering hypothesis classes, what if we take other types of distinguishers?
     • More generally: take a family $G$ of graphs $g : X^2 \to \{0,1\}$ and define $\mathrm{IPM}_G(P_1, P_2) = \sup_{g \in G} \left| E_{(x_1,x_2) \sim P_1^2}[g(x_1,x_2)] - E_{(x_1,x_2) \sim P_2^2}[g(x_1,x_2)] \right|$
     • Are graph-based distinguishers stronger than classical distinguishers?
     • Sample complexity?
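
For a finite family of graphs, the quantity $\mathrm{IPM}_G$ can be estimated by averaging each $g$ over pairs drawn from the same sample. The sketch below is illustrative only: `pair_mean` and `graph_ipm` are hypothetical names, and averaging over all ordered pairs of distinct sample positions is one assumed choice of estimator.

```python
import itertools

def pair_mean(g, sample):
    """Average of g(x1, x2) over all ordered pairs of distinct positions."""
    pairs = list(itertools.permutations(sample, 2))
    if not pairs:
        raise ValueError("need at least two points to form a pair")
    return sum(g(a, b) for a, b in pairs) / len(pairs)

def graph_ipm(graphs, sample1, sample2):
    """Largest empirical gap in pair means over a finite family of graphs."""
    return max(abs(pair_mean(g, sample1) - pair_mean(g, sample2)) for g in graphs)

# The collision test as a single graph: an edge exactly when both points agree.
def collision(a, b):
    return int(a == b)

print(graph_ipm([collision], [1, 1, 1], [1, 2, 3]))  # 1.0
```

With the single graph `collision`, the estimator reduces to comparing collision rates, which is how the collision test fits the general definition.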

  8. Expressive power of graph-based discriminators
     THEOREM: Let $X$ be an infinite domain. There exists a graph $g$ such that: for every hypothesis class $H$ with finite VC dimension and $\epsilon > 0$, there are two distributions $p_{syn}$, $p_{real}$ such that $\mathrm{IPM}_H(p_{syn}, p_{real}) < \epsilon$ and $\left| E_{(x_1,x_2) \sim p_{syn}^2}[g(x_1,x_2)] - E_{(x_1,x_2) \sim p_{real}^2}[g(x_1,x_2)] \right| > \frac{1}{4}$ (L, Mansour '19)

  9. Finite Version
     ● If $|X| = N$, there is a graph $g$ such that for every class $H$ there are two distributions that are $H$-indistinguishable and $g$-distinguishable unless:
       ○ $\mathrm{VC}(H) = \Omega(\epsilon^2 \log N)$ (L, Mansour '19)
     ● Optimal: for every graph-based class $G$ with finite capacity there is a hypothesis class $H$ with VC dimension $O(\epsilon^2 \log N)$ such that $\mathrm{IPM}_G(p_{syn}, p_{real}) > \frac{1}{4} \Rightarrow \mathrm{IPM}_H(p_{syn}, p_{real}) > \epsilon$ (Alon, L, Mansour)
       ○ Given a graph $g$, how many sets are needed to separate every dense set from every sparse set?

  10. Sample complexity of graph-based discriminators
     ● For a family of graphs $G$:
     ● Given samples from two unknown distributions $P_1$, $P_2$, decide if $\mathrm{IPM}_G(P_1, P_2) > \epsilon$.
     ● How many examples are needed?
     ● Recall:
       ○ For a hypothesis class, a discriminator can decide if $\mathrm{IPM}_H(P_1, P_2) > \epsilon$ if and only if $H$ has finite VC dimension.
       ○ $\Theta(\mathrm{VC}(H)/\epsilon^2)$ examples are needed.

  11. The graph-VC dimension
     ● The graph VC dimension is obtained by considering the projections of the graph by fixing a vertex. Namely, for every $x$ consider the hypothesis class $H_x = \{ g(x, \cdot) : X \to \{0,1\} : g \in G \}$. Then: $\mathrm{gVC}(G) = \sup_{x \in X} \mathrm{VC}(H_x)$
     ● $O(\mathrm{gVC}(G))$ examples are sufficient.
     ● $\Omega(\mathrm{gVC}(G))$ examples are necessary.
     (L, Mansour '19)
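
On a small finite domain the definition can be checked by brute force. The sketch below is an illustration under assumptions: the domain and the family of graphs are finite (so the supremum is a maximum), the search is exponential in the domain size, and the names `vc_dimension` and `graph_vc_dimension` are hypothetical.

```python
from itertools import combinations

def vc_dimension(hypotheses, domain):
    """Brute-force VC dimension of a finite class of {0,1}-valued functions."""
    best = 0
    for d in range(1, len(domain) + 1):
        # A set of size d is shattered if all 2^d labelings are realized.
        shattered = any(
            len({tuple(h(x) for x in subset) for h in hypotheses}) == 2 ** d
            for subset in combinations(domain, d)
        )
        if not shattered:
            break  # no larger set can be shattered either
        best = d
    return best

def graph_vc_dimension(graphs, domain):
    """Maximum over vertices x of the VC dimension of {g(x, .) : g in G}."""
    return max(
        vc_dimension([lambda y, g=g, x=x: g(x, y) for g in graphs], domain)
        for x in domain
    )

# Hypothetical family: g_t(x, y) = 1 when |x - y| <= t, for t = 0..3.
domain = list(range(4))
graphs = [lambda x, y, t=t: int(abs(x - y) <= t) for t in range(4)]
print(graph_vc_dimension(graphs, domain))  # 1
```

In this example each projection $H_x$ is a nested chain of neighborhoods around $x$, so no two points can be shattered and the graph VC dimension is 1.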
