Semi-Supervised Learning. Barnabas Poczos. Slides courtesy: Jerry Zhu, Aarti Singh

Supervised Learning. Feature space, label space. Goal: the optimal predictor (Bayes rule) depends on the unknown distribution P_XY, so instead a learning algorithm learns a good prediction rule from labeled training data.

Labeled and Unlabeled Data. Unlabeled data are cheap and abundant. Labels such as “Crystal”, “Needle”, “Empty”, or “0”, “1”, “2”, …, or “Sports”, “News”, “Science”, … require a human expert, special equipment, or an experiment, so labeled data are expensive and scarce.

Free-of-cost labels? Luis von Ahn: Games with a purpose (reCAPTCHA). The user is shown a word that is challenging for OCR (Optical Character Recognition); by typing it in, you provide a free label!

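Semi-Supervised Learning. Supervised learning (SL): the learning algorithm sees only labeled data (e.g. “Crystal”). Semi-supervised learning (SSL): the learning algorithm sees both labeled and unlabeled data. Goal: learn a better prediction rule than one based on labeled data alone.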

Semi-Supervised Learning in Humans

Can unlabeled data help? [Figure: positive labeled data, negative labeled data, and unlabeled data, with the supervised and the semi-supervised decision boundaries.] Assume each class is a coherent group (e.g. Gaussian). Then unlabeled data can help identify the boundary more accurately.

Can unlabeled data help? [Figure: embedding of handwritten digit images (“0”, “1”, “2”, …), with points marked 7, 1, 2, 9, 4, 8, 3, 5.] This embedding can be done by manifold learning algorithms. “Similar” data points have “similar” labels.

Some SSL Algorithms ▪ Self-Training ▪ Generative methods, mixture models ▪ Graph-based methods ▪ Co-Training ▪ Semi-supervised SVM ▪ Many others

Notation

Self-training

Self-training Example: Propagating 1-NN
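The propagating 1-NN example can be sketched as follows. This is a minimal illustration, not the slide author's code: the function name `self_train_1nn`, the Euclidean distance, and the rule "label one point per step, the unlabeled point nearest to the current labeled set" are assumptions chosen for clarity. Inputs are assumed to be 2-D arrays of shape (n, d).

```python
import numpy as np

def self_train_1nn(X_l, y_l, X_u):
    """Propagating 1-NN (illustrative sketch).

    Repeatedly take the unlabeled point closest to any labeled point,
    give it that neighbour's label, and move it into the labeled set.
    """
    X_u = np.asarray(X_u, dtype=float)
    labeled_X = [np.asarray(x, dtype=float) for x in X_l]
    labeled_y = list(y_l)
    remaining = list(range(len(X_u)))   # indices of still-unlabeled points
    y_u = [None] * len(X_u)
    while remaining:
        L = np.array(labeled_X)         # (n_labeled, d)
        U = X_u[remaining]              # (n_remaining, d)
        # Pairwise Euclidean distances, unlabeled x labeled.
        d = np.linalg.norm(U[:, None, :] - L[None, :, :], axis=2)
        u_pos, l_pos = np.unravel_index(np.argmin(d), d.shape)
        i = remaining.pop(u_pos)
        y_u[i] = labeled_y[l_pos]       # copy the nearest neighbour's label
        labeled_X.append(X_u[i])        # the pseudo-labeled point now teaches others
        labeled_y.append(y_u[i])
    return y_u
```

Because each pseudo-labeled point immediately joins the labeled set, labels spread along chains of nearby points; an early mistake is propagated in the same way, which is the usual caveat for self-training.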

Mixture Models for Labeled Data

Mixture Models for Labeled Data. Estimate the parameters from the labeled data. Decision for any test point not in the labeled dataset: compare the class posterior with 1/2 (> 1/2 predicts one class, < 1/2 the other).
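As a worked instance of this slide, the block below writes out the estimates and the decision rule under the assumption of two classes with Gaussian class-conditional densities. The Gaussian form and the symbols \pi_k, \mu_k, \Sigma_k, n_k are illustrative assumptions; the slide's own formulas did not survive extraction.

```latex
% Maximum-likelihood estimates from the n labeled points (n_k of them in class k)
\hat{\pi}_k = \frac{n_k}{n}, \qquad
\hat{\mu}_k = \frac{1}{n_k} \sum_{i:\, y_i = k} x_i, \qquad
\hat{\Sigma}_k = \frac{1}{n_k} \sum_{i:\, y_i = k} (x_i - \hat{\mu}_k)(x_i - \hat{\mu}_k)^\top

% Decision for a test point x: predict class 1 if the posterior exceeds 1/2
p(y = 1 \mid x, \hat{\theta})
  = \frac{\hat{\pi}_1 \, \mathcal{N}(x; \hat{\mu}_1, \hat{\Sigma}_1)}
         {\sum_{k \in \{0, 1\}} \hat{\pi}_k \, \mathcal{N}(x; \hat{\mu}_k, \hat{\Sigma}_k)}
  \;\gtrless\; \frac{1}{2}
```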

Mixture Models for Labeled Data

Mixture Models for SSL Data

Mixture Models

Mixture Models: SL vs SSL

Mixture Models

Gaussian Mixture Models

EM for Gaussian Mixture Models
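A minimal sketch of EM for a semi-supervised Gaussian mixture, assuming one-dimensional features and one Gaussian per class. The function names `normal_pdf` and `ssl_gmm_em`, the initialisation from the labeled data, and the small variance floor are assumptions for illustration. Labeled points keep fixed one-hot responsibilities; only the unlabeled points get soft class assignments in the E-step.

```python
import numpy as np

def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian, broadcast over x, mu and var."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

def ssl_gmm_em(x_l, y_l, x_u, n_classes=2, n_iter=50):
    """EM on labeled + unlabeled 1-D data, one Gaussian per class (sketch)."""
    x_l = np.asarray(x_l, dtype=float)
    y_l = np.asarray(y_l, dtype=int)
    x_u = np.asarray(x_u, dtype=float)
    x_all = np.concatenate([x_l, x_u])
    # Labeled points: responsibilities are fixed one-hot vectors.
    r_l = np.eye(n_classes)[y_l]
    # Initialise with the estimates from the labeled data alone.
    pi = r_l.mean(axis=0)
    mu = np.array([x_l[y_l == k].mean() for k in range(n_classes)])
    var = np.array([x_l[y_l == k].var() + 1e-6 for k in range(n_classes)])
    for _ in range(n_iter):
        # E-step: class posteriors for the unlabeled points only.
        joint = pi * normal_pdf(x_u[:, None], mu, var)          # (n_u, K)
        r_u = joint / np.maximum(joint.sum(axis=1, keepdims=True), 1e-300)
        # M-step: weighted estimates over labeled and unlabeled points.
        r = np.vstack([r_l, r_u])                               # (n, K)
        n_k = r.sum(axis=0)
        pi = n_k / len(x_all)
        mu = (r * x_all[:, None]).sum(axis=0) / n_k
        var = (r * (x_all[:, None] - mu) ** 2).sum(axis=0) / n_k + 1e-6
    return pi, mu, var
```

The sketch needs at least two distinct labeled points per class so that the initial variances are not degenerate.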

Assumption for GMMs

Related: Cluster and Label

Graph-Based Methods. Assumption: similar unlabeled data have similar labels.

Graph Regularization. Similarity graphs: model local neighborhood relations between data points. Assumption: nodes connected by heavy edges tend to have similar labels.

Graph Regularization. If data points i and j are similar (i.e. the weight w_ij is large), then their labels are similar: f_i = f_j. The objective combines a loss on the labeled data (mean square, 0-1) with a graph-based smoothness prior on labeled and unlabeled data.
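One concrete instance of this objective can be sketched in a few lines. The choices below are assumptions for illustration, not taken from the slides: a fully connected graph with Gaussian (RBF) weights, squared loss on the labeled points, labels coded as -1/+1, and the hypothetical function name `graph_regularized_labels`. With the graph Laplacian L = D - W, the smoothness term (1/2) Σ_ij w_ij (f_i - f_j)^2 equals f^T L f, so the minimiser solves a linear system.

```python
import numpy as np

def graph_regularized_labels(X, y, labeled_mask, lam=1.0, sigma=1.0):
    """Minimise  sum_{i labeled} (f_i - y_i)^2 + lam * f^T L f  (sketch).

    X: (n, d) features; y: length-n targets (ignored where unlabeled);
    labeled_mask: length-n booleans marking the labeled points.
    """
    X = np.asarray(X, dtype=float)
    mask = np.asarray(labeled_mask, dtype=bool)
    y = np.where(mask, np.asarray(y, dtype=float), 0.0)
    # RBF similarity graph: heavy edges between nearby points.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W          # graph Laplacian D - W
    J = np.diag(mask.astype(float))         # selects the labeled points
    # Zero gradient: J (f - y) + lam * L f = 0  =>  (J + lam L) f = J y.
    return np.linalg.solve(J + lam * L, J @ y)
```

The sign of each entry of the returned vector gives the predicted class; the system is solvable here because the graph is connected and at least one point is labeled.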

Co-training

Co-training Algorithm. Co-training (Blum & Mitchell, 1998) (Mitchell, 1999) assumes that (i) features can be split into two sets; (ii) each sub-feature set is sufficient to train a good classifier.
• Initially two separate classifiers are trained with the labeled data, on the two sub-feature sets respectively.
• Each classifier then classifies the unlabeled data, and ‘teaches’ the other classifier with the few unlabeled examples (and their predicted labels) about which it is most confident.
• Each classifier is retrained with the additional training examples given by the other classifier, and the process repeats.
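The three steps above can be sketched as follows. Everything beyond the steps themselves is an assumption for illustration: the tiny `NearestCentroid` base classifier (binary labels 0/1, confidence measured by the distance gap between centroids), the function name `co_train`, and the convention that unlabeled entries of `y` hold a placeholder such as -1.

```python
import numpy as np

class NearestCentroid:
    """Stand-in base classifier for labels 0/1 (illustrative assumption)."""
    def fit(self, X, y):
        self.c0 = X[y == 0].mean(axis=0)
        self.c1 = X[y == 1].mean(axis=0)
        return self

    def score(self, X):
        # Positive score favours class 1; the magnitude acts as confidence.
        return (np.linalg.norm(X - self.c0, axis=1)
                - np.linalg.norm(X - self.c1, axis=1))

def co_train(X1, X2, y, labeled_mask, n_rounds=10, k=1):
    """Co-training sketch: X1 and X2 are the two feature views, shape (n, d)."""
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    # Each classifier keeps its own growing training set (labels + mask).
    y1 = np.array(y, dtype=int)
    m1 = np.array(labeled_mask, dtype=bool)
    y2, m2 = y1.copy(), m1.copy()
    for _ in range(n_rounds):
        h1 = NearestCentroid().fit(X1[m1], y1[m1])
        h2 = NearestCentroid().fit(X2[m2], y2[m2])
        # Each teacher hands its k most confident predictions to the other
        # classifier's training set: h1 teaches h2, then h2 teaches h1.
        for teacher, X, y_s, m_s in ((h1, X1, y2, m2), (h2, X2, y1, m1)):
            cand = np.flatnonzero(~m_s)
            if cand.size == 0:
                continue
            s = teacher.score(X[cand])
            top = np.argsort(-np.abs(s))[:k]
            y_s[cand[top]] = (s[top] > 0).astype(int)
            m_s[cand[top]] = True
    h1 = NearestCentroid().fit(X1[m1], y1[m1])
    h2 = NearestCentroid().fit(X2[m2], y2[m2])
    return h1, h2
```

The initial labeled set must contain both classes, since the stand-in classifier needs one centroid per class.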

Co-training Algorithm (Blum & Mitchell, 1998)

Semi-Supervised SVMs

Semi-Supervised Learning ▪ Generative methods ▪ Graph-based methods ▪ Co-Training ▪ Semi-Supervised SVMs ▪ Many other methods. SSL algorithms can use unlabeled data to help improve prediction accuracy if the data satisfy the appropriate assumptions.
