Iterative Hybrid Algorithm for Semi-supervised Classification
Martin SAVESKI
Supervised by Professor Thierry Artières
University Pierre and Marie Curie
June 19, 2012
Outline
- Intro to Semi-supervised Learning
- The Iterative Hybrid Algorithm
- Other methods
- Experiments
- Performance comparison and observations
Classical Supervised Learning Scenario
[Diagram: a labeled dataset {(x₁, c₁), (x₂, c₂), …, (xₙ, cₙ)} is fed to a learning algorithm as (X, C); the algorithm produces model parameters, and the model is used to label new data]
Semi-Supervised Learning
[Diagram: in addition to the labeled dataset {(x₁, c₁), (x₂, c₂), …, (xₙ, cₙ)} = (X_L, C_L), unlabeled data {x₁, x₂, …, xₙ} = X_U is available to the learning algorithm]

How to use the unlabeled data to build better classifiers?
Generative vs. Discriminative Models

Generative Models
- Model how samples from a particular class are generated, i.e. model inputs, hidden variables, and outputs jointly
- Strong modeling power; can easily handle missing values

  L_G(θ) = p(X, C, θ) = p(θ) ∏_{n=1}^{N} p(x_n, c_n | θ_{c_n})

Discriminative Models
- Concerned with defining the boundaries between the classes
- Directly optimize the boundary
- Tend to achieve better accuracy

  L_D(θ) = p(C | X, θ) = ∏_{n=1}^{N} p(c_n | x_n, θ)

No easy way to combine them!
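The two objectives can be contrasted on toy data: the generative fit estimates p(x | c) per class and classifies via Bayes' rule, while the discriminative fit optimizes p(c | x) directly. A minimal sketch in Python (the data, learning rate, and unit-variance assumption are all illustrative, not taken from the report):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D data with two classes (illustrative values)
x = np.concatenate([rng.normal(-1.0, 1.0, 100), rng.normal(1.0, 1.0, 100)])
c = np.concatenate([np.zeros(100), np.ones(100)]).astype(int)

# Generative: model p(x, c) = p(c) p(x | c) with one Gaussian per class,
# then classify via Bayes' rule
mu = np.array([x[c == k].mean() for k in (0, 1)])

def generative_posterior(xs):
    # p(c | x) under equal priors and unit-variance class Gaussians
    logp = -0.5 * (xs[:, None] - mu[None, :]) ** 2
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)

# Discriminative: model p(c | x) directly -- logistic regression trained
# by gradient ascent on the conditional log-likelihood
w, b = 0.0, 0.0
for _ in range(500):
    z = 1.0 / (1.0 + np.exp(-(w * x + b)))  # predicted p(c = 1 | x)
    grad = c - z                            # per-sample log-likelihood gradient
    w += 0.1 * (grad * x).mean()
    b += 0.1 * grad.mean()
```

The generative model spends capacity on modeling the inputs themselves, which is exactly what lets it absorb unlabeled data; the discriminative model only shapes the boundary.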
Iterative Hybrid Algorithm

Input: labeled data L and unlabeled data U

[Diagram: a generative model is first learned on L, then on L ∪ U; a discriminative model fine-tuned from it labels part of U; the generative model is then retrained on L plus the labeled part of U, and the process repeats]
Iterative Hybrid Algorithm (more formally)

1. Learn θ̃ on L → θ̃^(0), by maximizing the following objective function:

   ∑_{x∈L} log p(x | c, θ̃)

2. Learn θ̃ on L ∪ U → θ̃^(1), starting from θ̃^(0), maximizing:

   ∑_{x∈L} log p(x | c, θ̃) + λ ∑_{x∈U} log ∑_{c'} p(x | c', θ̃)
Iterative Hybrid Algorithm (more formally, cont.)

Loop for n iterations, or until convergence:

1. Learn θ on L → θ^(i), starting from θ̃^(i), maximizing:

   −½ ‖θ − θ̃^(i)‖² + ∑_{x∈L} log p(c | x, θ)

2. Use θ^(i) to label part of U → U_Labeled, where the labels are assigned as:

   x → c = argmax_c p(c | x, θ^(i))

3. Learn θ̃ on L ∪ U_Labeled → θ̃^(i+1), maximizing:

   ∑_{x∈L} log p(x | c, θ̃) + λ ∑_{x∈U_Labeled} log p(x | c, θ̃)
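The skeleton of the loop can be sketched in a few lines. This is a much-simplified illustration: the discriminative fine-tuning of step 1 is replaced by labeling U directly from the generative posteriors, "part of U" is taken to mean the confidently labeled points, and all data and thresholds are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two labeled points per class plus unlabeled data (illustrative setup)
XL = np.array([[-2.0, 0.0], [-1.5, 0.5], [2.0, 0.0], [1.5, -0.5]])
yL = np.array([0, 0, 1, 1])
XU = np.vstack([rng.normal([-2.0, 0.0], 1.0, (50, 2)),
                rng.normal([2.0, 0.0], 1.0, (50, 2))])

def fit_generative(X, y):
    # Isotropic unit-variance Gaussian per class: the parameters are the means
    return np.stack([X[y == k].mean(axis=0) for k in (0, 1)])

def posteriors(X, means):
    # p(c | x) under equal priors and isotropic unit-variance Gaussians
    logp = -0.5 * np.stack([((X - m) ** 2).sum(axis=1) for m in means], axis=1)
    p = np.exp(logp - logp.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)

means = fit_generative(XL, yL)           # initial generative fit on L
for _ in range(5):
    # Label the confident part of U (stand-in for the discriminative step)
    pU = posteriors(XU, means)
    confident = pU.max(axis=1) > 0.9
    yU = pU.argmax(axis=1)
    # Retrain the generative model on L plus the labeled part of U
    means = fit_generative(np.vstack([XL, XU[confident]]),
                           np.concatenate([yL, yU[confident]]))
```

The ‖θ − θ̃^(i)‖² penalty in step 1 is what keeps the discriminative parameters anchored to the generative solution; dropping it, as in this sketch, reduces the algorithm to plain self-training.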
Other methods

Hybrid Model (Bishop and Lasserre, 2007)
- Multi-criteria objective function
- Combines generative and discriminative models with specific priors
- Optimizes:

   p(θ, θ̃) ∏_{n∈L} p(C_n | X_n, θ) ∏_{m∈L∪U} p(X_m | θ̃)

Entropy Minimization (Grandvalet and Bengio, 2005)
- Uses the label entropy on unlabeled data as a regularizer
- Assumes a prior which prefers minimal class overlap
- Optimizes:

   ∑_{x∈L} log p(c | x, θ) + λ ∑_{x∈U} ∑_{c'∈C} p(c' | x, θ) log p(c' | x, θ)
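The entropy regularizer is easy to compute: for each unlabeled point it adds ∑_{c'} p(c' | x, θ) log p(c' | x, θ), the negative Shannon entropy of the predicted label distribution. A small sketch (the posteriors are made-up examples):

```python
import numpy as np

def entropy_penalty(p):
    # λ-weighted term of the objective: Σ_x Σ_c p(c|x) log p(c|x),
    # i.e. the negative conditional entropy; larger = more confident
    p = np.clip(p, 1e-12, 1.0)  # guard against log(0)
    return float((p * np.log(p)).sum())

# When maximizing the objective, confident posteriors are preferred
# over uncertain ones
sharp = np.array([[0.99, 0.01]])
flat = np.array([[0.5, 0.5]])
```

Since the term is largest (closest to zero) for near-deterministic posteriors, maximizing it pushes the decision boundary away from dense regions of unlabeled data.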
Experiments

Data Set
- Synthetic data (2 dimensions, 2 classes)
- Generated by elongated Gaussian distributions
- 2 labeled points per class
- 200 unlabeled points per class
- 200 test samples per class

Model
- p(x | c) → isotropic Gaussian distribution
- Symmetric distribution (model misspecification)

Setup
- Generate random data and label random points
- Run all algorithms for all hyper-parameter values
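A data set of this shape can be generated as follows. The means and covariance below are illustrative choices, not the report's exact values; the point is that the classes are elongated while the fitted model is isotropic, so the model is deliberately misspecified:

```python
import numpy as np

rng = np.random.default_rng(42)

def make_data(n_per_class=200):
    # Elongated (anisotropic) Gaussian per class, so an isotropic model
    # is misspecified. Parameters are illustrative.
    cov = np.array([[4.0, 0.0], [0.0, 0.25]])  # elongated along the x axis
    X0 = rng.multivariate_normal([-2.0, 0.0], cov, n_per_class)
    X1 = rng.multivariate_normal([2.0, 0.0], cov, n_per_class)
    X = np.vstack([X0, X1])
    y = np.concatenate([np.zeros(n_per_class),
                        np.ones(n_per_class)]).astype(int)
    return X, y
```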
Example Data Set
[Figure: scatter plot of one synthetic data set]
Results with Two Labeled Points

[Figure: test performance (0.65–0.85) versus hyper-parameter value (0.0–1.0) for the Iterative Hybrid Algorithm, the Hybrid Model, and Entropy Minimization]

- Parameters have different semantics, so they are not directly comparable
- Hybrid Model > Iterative Hybrid Algorithm > Entropy Minimization
Results with Two Labeled Points (cont.)

[Figure: grid of decision boundaries across hyper-parameter values for the Iterative Hybrid Algorithm, the Hybrid Model, and Entropy Minimization]

- Hard to fix the hyper-parameters
- Unstable behavior of the Entropy Minimization method
- IHA and HM have stable behavior (an iterative process is possible)
Particular Cases

- Manually fixed labeled points
- Boundary induced by the labeled points is far from the real one
- Important feature: overlap on the x axis between the labeled points
  - If NO overlap → both perform well
  - If overlap → Hybrid Model superior
Particular Cases (HM superior scenario)

[Figure 5: A case with overlap between the labeled points of each class on the x axis. Top: the Iterative Hybrid Algorithm correctly classifies the labeled points but fails to converge to the real boundary between the classes. Bottom: the Hybrid Model with α = 0.8 converges to a satisfactory solution.]
Increasing the number of labeled examples

[Figure: performance versus hyper-parameter value for (a) two, (b) four, and (c) six labeled points, for the Iterative Hybrid Algorithm, the Hybrid Model, and Entropy Minimization]

As the number of labeled examples increases:
- The difference between IHA and HM diminishes
- Entropy Minimization improves, but still lags behind
To sum up

- Iterative algorithm for combining generative and discriminative models
- Compared with two other methods (HM and EM)
- Experiments on synthetic data
- IHA dominates Entropy Minimization, but is outperformed by the Hybrid Model
- The difference vanishes as |L| increases
It is your turn now... Questions?
Hybrid Model (details)
Entropy Minimization

Entropy Minimization (Grandvalet and Bengio, 2005)
- Uses the label entropy on unlabeled data as a regularizer
- Assumes a prior which prefers minimal class overlap
- Optimizes:

   ∑_{x∈L} log p(c | x, θ) + λ ∑_{x∈U} ∑_{c'∈C} p(c' | x, θ) log p(c' | x, θ)

- Uses U to estimate the conditional entropy H(Y | X), a measure of class overlap
Why Discriminative?