 
Robust Learning from Untrusted Sources
Nikola Konstantinov, Christoph H. Lampert
ICML, June 2019
Konstantinov, Lampert; IST Austria. Robust Learning from Untrusted Sources. Poster 156.

Motivation: Collecting data for machine learning applications

Motivation: Using multiple data sources
- Crowdsourcing
- Web crawling
- Data from personal devices
- Data from different labs
How can we learn robustly from such data?

Motivation: Learning from untrusted sources
Motivation
- Untrusted sources can provide valuable data for training.
- Some of these data batches might be corrupted or irrelevant.
Goal
Naive approaches are to:
- Simply train on all data.
- Train only on the trusted subset.
Can we do better?

Theory: Setup
Learning task
- Unknown target distribution $D_T$ on $\mathcal{X} \times \mathcal{Y}$.
- Loss function $L : \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}_+$.
- Want to learn a predictor $h : \mathcal{X} \to \mathcal{Y}$ from a hypothesis class $\mathcal{H}$.
Given
- A small reference dataset:
  \[ S_T = \{ (x^T_1, y^T_1), \dots, (x^T_{m_T}, y^T_{m_T}) \} \sim D_T \]
- Also $m_i$ data points from each source $i = 1, \dots, N$:
  \[ S_i = \{ (x^i_1, y^i_1), \dots, (x^i_{m_i}, y^i_{m_i}) \} \sim D_i \]

Theory: Approach
- Assign weights $\alpha = (\alpha_1, \dots, \alpha_N)$ to the sources, with $\sum_{i=1}^{N} \alpha_i = 1$.
- Minimize the $\alpha$-weighted empirical loss:
  \[ \hat{h}_\alpha = \operatorname*{argmin}_{h \in \mathcal{H}} \hat{\epsilon}_\alpha(h) = \operatorname*{argmin}_{h \in \mathcal{H}} \sum_{i=1}^{N} \alpha_i \left( \frac{1}{m_i} \sum_{j=1}^{m_i} L\big(h(x^i_j), y^i_j\big) \right) \]
- Want a small expected loss on the target distribution:
  \[ \epsilon_T(\hat{h}_\alpha) = \mathbb{E}_{D_T}\big[ L(\hat{h}_\alpha(x), y) \big] \]
How to decide which sources are trustworthy?

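The weighted objective above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the hypothesis class is a small finite set of threshold classifiers, the loss is the 0-1 loss, and the names `weighted_empirical_loss`, `zero_one`, `sources`, and `thresholds` are hypothetical and do not come from the slides.

```python
import numpy as np

def weighted_empirical_loss(h, sources, alphas, loss):
    """Return sum_i alpha_i * (1/m_i) * sum_j loss(h(x_j^i), y_j^i)."""
    total = 0.0
    for alpha, (X, y) in zip(alphas, sources):
        total += alpha * np.mean(loss(h(X), y))
    return total

def zero_one(pred, y):
    """0-1 loss per example (an assumed choice of L)."""
    return (pred != y).astype(float)

# Toy data (made up for illustration): source 0 is clean, source 1 has flipped labels.
X = np.array([0.0, 1.0, 2.0, 3.0])
sources = [(X, np.array([0, 0, 1, 1])), (X, np.array([1, 1, 0, 0]))]

# Assumed finite hypothesis class: threshold classifiers h_t(x) = 1[x > t].
thresholds = [0.5, 1.5, 2.5]
hypotheses = [lambda X, t=t: (X > t).astype(int) for t in thresholds]

# Weights sum to 1 and put most of the mass on the clean source.
alphas = np.array([0.9, 0.1])
losses = [weighted_empirical_loss(h, sources, alphas, zero_one) for h in hypotheses]
best_threshold = thresholds[int(np.argmin(losses))]
print(best_threshold, losses)
```

With most of the weight on the clean source, the minimizer is the threshold that fits the clean labels, even though the second source disagrees with it on every point.
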
Theory: Approach
Discrepancies between the sources (Kifer et al., VLDB 2004; Mohri et al., ALT 2012):
\[ \mathrm{disc}_{\mathcal{H}}(D_i, D_T) = \sup_{h \in \mathcal{H}} \left| \epsilon_i(h) - \epsilon_T(h) \right| \]
- Small if $\mathcal{H}$ does not distinguish between the two learning tasks.
- Popular in the domain adaptation literature.

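To make the definition concrete, the sketch below estimates a discrepancy from samples by replacing the expected losses with sample averages and the supremum with a maximum over a finite set of hypotheses. Both replacements are assumptions made for illustration, as are the names `empirical_discrepancy`, `zero_one`, and the toy data; the slides do not specify how the supremum is computed.

```python
import numpy as np

def zero_one(pred, y):
    """0-1 loss per example (an assumed choice of L)."""
    return (pred != y).astype(float)

def empirical_discrepancy(hypotheses, loss, source, reference):
    """Max over a finite hypothesis set of |mean loss on source - mean loss on reference|."""
    Xs, ys = source
    Xt, yt = reference
    gaps = [
        abs(np.mean(loss(h(Xs), ys)) - np.mean(loss(h(Xt), yt)))
        for h in hypotheses
    ]
    return max(gaps)

# Assumed finite hypothesis class: threshold classifiers h_t(x) = 1[x > t].
hypotheses = [lambda X, t=t: (X > t).astype(int) for t in (0.5, 1.5, 2.5)]

# Toy data (made up): the reference set, a clean source, and a label-flipped source.
X = np.array([0.0, 1.0, 2.0, 3.0])
reference = (X, np.array([0, 0, 1, 1]))
clean = (X, np.array([0, 0, 1, 1]))
flipped = (X, np.array([1, 1, 0, 0]))

disc_clean = empirical_discrepancy(hypotheses, zero_one, clean, reference)
disc_flipped = empirical_discrepancy(hypotheses, zero_one, flipped, reference)
print(disc_clean, disc_flipped)
```

The clean source is indistinguishable from the reference under every hypothesis, so its estimated discrepancy is zero, while the label-flipped source attains the largest possible gap for the 0-1 loss.
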
Theory: Bound on the expected loss
Given a hypothesis set $\mathcal{H}$, let:
\[ \hat{h}_\alpha = \operatorname*{argmin}_{h \in \mathcal{H}} \hat{\epsilon}_\alpha(h), \qquad h^*_T = \operatorname*{argmin}_{h \in \mathcal{H}} \epsilon_T(h) \]
For any $\delta > 0$, with probability at least $1 - \delta$:
\[ \left| \epsilon_T(\hat{h}_\alpha) - \epsilon_T(h^*_T) \right| \le 2 \sum_{i=1}^{N} \alpha_i \, \mathrm{disc}_{\mathcal{H}}(D_i, D_T) + C(\delta) \sqrt{\sum_{i=1}^{N} \frac{\alpha_i^2}{m_i}} + 4 \sum_{i=1}^{N} \alpha_i R_i(\mathcal{H}, L) \]
Similar bounds in Ben-David et al., ML 2010; Zhang et al., NIPS 2013.

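As a worked illustration of how the weights enter the bound, consider two extreme choices of $\alpha$. The notation $m = \sum_i m_i$ and the index $k$ of a single chosen source are introduced here for illustration; they are not on the slides.

```latex
% Case 1: weight every data point equally, \alpha_i = m_i / m with m = \sum_{i=1}^N m_i.
\sqrt{\sum_{i=1}^{N} \frac{\alpha_i^2}{m_i}}
  = \sqrt{\sum_{i=1}^{N} \frac{m_i^2}{m^2 \, m_i}}
  = \sqrt{\frac{1}{m^2} \sum_{i=1}^{N} m_i}
  = \frac{1}{\sqrt{m}}

% This is the smallest value the square-root term can take, since by Cauchy-Schwarz
% 1 = \Big( \sum_i \alpha_i \Big)^2 \le \Big( \sum_i \frac{\alpha_i^2}{m_i} \Big) \Big( \sum_i m_i \Big).

% Case 2: put all weight on a single source k, \alpha_k = 1 and \alpha_i = 0 for i \ne k.
\sqrt{\sum_{i=1}^{N} \frac{\alpha_i^2}{m_i}} = \frac{1}{\sqrt{m_k}},
\qquad
2 \sum_{i=1}^{N} \alpha_i \, \mathrm{disc}_{\mathcal{H}}(D_i, D_T) = 2 \, \mathrm{disc}_{\mathcal{H}}(D_k, D_T)
```

Spreading the weight over all data makes the sample-size term as small as possible but pays the average discrepancy of all sources, while concentrating on one source pays only that source's discrepancy at the cost of a larger sample-size term.
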
Theory: Algorithm
Theory suggests:
- Select $\alpha$ by minimizing:
  \[ \sum_{i=1}^{N} \alpha_i \, \mathrm{disc}_{\mathcal{H}}(D_i, D_T) + \lambda \sqrt{\sum_{i=1}^{N} \frac{\alpha_i^2}{m_i}} \]
- Find $\hat{h}_\alpha$ by minimizing the $\alpha$-weighted empirical risk.
- Choose $\lambda$ by cross-validation on the reference dataset.
Trade-off between exploiting trusted sources and using all data.
In practice, work with the empirical discrepancies:
\[ \mathrm{disc}_{\mathcal{H}}(S_i, S_T) = \sup_{h \in \mathcal{H}} \left| \frac{1}{m_i} \sum_{j=1}^{m_i} L\big(h(x^i_j), y^i_j\big) - \frac{1}{m_T} \sum_{j=1}^{m_T} L\big(h(x^T_j), y^T_j\big) \right| \]

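The weight-selection step can be sketched as a small convex optimization. The version below is a minimal sketch, not the authors' implementation: it assumes the weights are additionally constrained to be nonnegative (the slides only state that they sum to 1), takes precomputed discrepancy estimates as input, and uses projected subgradient descent as a swapped-in generic solver. The names `project_to_simplex` and `select_alpha` are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]                       # sort in descending order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def select_alpha(disc, m, lam, steps=2000, lr=0.05):
    """Minimize sum_i alpha_i*disc_i + lam*sqrt(sum_i alpha_i^2/m_i) over the simplex."""
    disc = np.asarray(disc, dtype=float)
    m = np.asarray(m, dtype=float)

    def objective(a):
        return a @ disc + lam * np.sqrt(np.sum(a ** 2 / m))

    alpha = m / m.sum()                        # start from equal weight per data point
    best, best_val = alpha, objective(alpha)
    for t in range(1, steps + 1):
        # Gradient of the objective; the square root is positive on the simplex.
        grad = disc + lam * (alpha / m) / np.sqrt(np.sum(alpha ** 2 / m))
        alpha = project_to_simplex(alpha - lr / np.sqrt(t) * grad)
        val = objective(alpha)
        if val < best_val:                     # keep the best iterate seen so far
            best, best_val = alpha, val
    return best

# Toy input (made up): two sources match the reference, one is far from it.
alpha = select_alpha(disc=[0.0, 0.0, 1.0], m=[100, 100, 100], lam=1.0)
print(np.round(alpha, 3))
```

In this toy case the source with the large discrepancy is driven to zero weight and the remaining weight is shared between the two sources that agree with the reference. In the other direction, when $\lambda$ is very large the second term dominates, and its minimizer is $\alpha_i \propto m_i$, which corresponds to training on all data with equal weight per data point.
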
Theory: Experiments
Evaluate empirically on:
- Multitask Dataset of Product Reviews [1].
- Animals with Attributes 2 [2].
Some clean reference data for a target task is available.
Other subsets are also available, some of which are corrupted.
Experimented with various manipulations/problems with the data.
[1] Pentina et al., ICML 2017; McAuley et al., 2015
[2] Xian et al., TPAMI 2018

Theory: Results
Figure: Animals with Attributes 2: RGB channels swapped. Average classification error (vertical axis, ticks from 0.20 to 0.40) against the number of corrupted sources (horizontal axis, 0 to 60). Methods compared: Ours, Reference only, All data, Pregibon et al., Median of probs, Feng et al., Yin et al., Batch norm.

Theory: Summary
- Data from different sources is naturally heterogeneous.
- Our method suppresses the effect of corrupted/irrelevant data.
- The approach is theoretically justified and shows good empirical performance.
- The algorithm can be applied even when the data is private and/or distributed.

Thank you for your attention! Poster 156