Geodesic Flow Kernel for Unsupervised Domain Adaptation
Boqing Gong, University of Southern California
Joint work with Yuan Shi, Fei Sha, and Kristen Grauman
Motivation
Mismatch between different domains/datasets: a model trained on one dataset (TRAIN) often degrades significantly when tested on another (TEST).
– Object recognition: ex. [Torralba & Efros'11, Perronnin et al.'10]
– Video analysis: ex. [Duan et al.'09, '10]
– Pedestrian detection: ex. [Dollár et al.'09]
– Other vision tasks
Images from [Saenko et al.'10].
Unsupervised domain adaptation
• Source domain (labeled): D_S = {(x_i, y_i), i = 1, 2, …, N} ~ P_S(X, Y)
• Target domain (unlabeled): D_T = {(x_i, ?), i = 1, 2, …, M} ~ P_T(X, Y)
The two distributions are not the same!
• Objective: train a classification model to work well on the target.
Challenges
• How to define a discriminative loss function, select a model, and tune parameters optimally w.r.t. the target domain, which has no labels?
• How to solve this ill-posed problem? Impose additional structure.
Examples of existing approaches
• Correcting sample bias
  – Ex. [Shimodaira'00, Huang et al.'06, Bickel et al.'07]
  – Assumption: marginal distributions are the only difference.
• Learning transductively
  – Ex. [Bergamo & Torresani'10, Bruzzone & Marconcini'10]
  – Assumption: classifiers have high-confidence predictions across domains.
• Learning a shared representation
  – Ex. [Daumé III'07, Pan et al.'09, Gopalan et al.'11]
  – Assumption: a latent feature space exists in which classification hypotheses fit both domains.
Our approach: learning a shared representation
Key insight: bridging the gap between the source subspace Φ(0) and the target subspace Φ(1) with a flow Φ(t)
– Fantasize an infinite number of intermediate domains
– Integrate out idiosyncrasies in domains analytically: z^∞ = [Φ(0)^T x, …, Φ(t)^T x, …, Φ(1)^T x]^T
– Learn invariant features by constructing the kernel ⟨z_i^∞, z_j^∞⟩
Main idea: geodesic flow kernel
1. Model data with linear subspaces
2. Model domain shift with a geodesic flow Φ(t) from source to target
3. Derive domain-invariant features z^∞ = [Φ(0)^T x, …, Φ(t)^T x, …, Φ(1)^T x]^T with the kernel ⟨z_i^∞, z_j^∞⟩
4. Classify target data with the new features
Modeling data with linear subspaces
Assume the data in each domain have low-dimensional structure.
Ex. PCA, or Partial Least Squares (source only, since it uses labels). A minimal sketch of this step follows.
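The following is a minimal numpy sketch of the subspace step, assuming plain PCA for both domains; the function name pca_basis and its API are ours for illustration, not from the paper's code.

import numpy as np

def pca_basis(X, d):
    """Return a D x d orthonormal basis for the top-d principal
    directions of X (n samples x D features)."""
    Xc = X - X.mean(axis=0)                 # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T                         # columns span the PCA subspace

# Ps = pca_basis(X_source, d); Pt = pca_basis(X_target, d)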
Characterizing domains geometrically
Grassmann manifold G(d, D)
– Collection of d-dimensional subspaces of the vector space R^D (d < D)
– Each point on the manifold corresponds to a subspace; the source and target subspaces are two such points.
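How far apart the two points are is captured by the principal angles between the subspaces. A small sketch, reusing the numpy import and the pca_basis outputs Ps, Pt from above:

def principal_angles(Ps, Pt):
    """Principal angles (radians) between span(Ps) and span(Pt):
    the singular values of Ps^T Pt are their cosines."""
    s = np.linalg.svd(Ps.T @ Pt, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))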
Modeling domain shift with geodesic flow
Geodesic flow Φ(t), 0 ≤ t ≤ 1, on the manifold
– starting at the source subspace Φ(0) and arriving at the target subspace Φ(1) in unit time
– flow parameterized by the single parameter t
– closed-form, easy to compute with SVD
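A hedged sketch of that closed form via the standard CS-decomposition construction: with Rs the orthogonal complement of Ps, write Ps^T Pt = U1 diag(cos θ) V^T and Rs^T Pt = -U2 diag(sin θ) V^T, giving Φ(t) = Ps U1 diag(cos tθ) − Rs U2 diag(sin tθ). The code is our reimplementation of this construction and assumes all principal angles are nonzero:

from scipy.linalg import null_space

def geodesic_flow(Ps, Pt):
    """Return t -> Phi(t), the D x d basis of the subspace at time t
    on the geodesic from span(Ps) (t = 0) to span(Pt) (t = 1)."""
    Rs = null_space(Ps.T)                        # orthogonal complement of Ps
    U1, c, Vh = np.linalg.svd(Ps.T @ Pt)         # Ps^T Pt = U1 diag(cos θ) V^T
    theta = np.arccos(np.clip(c, -1.0, 1.0))     # principal angles
    # Recover U2 from Rs^T Pt = -U2 diag(sin θ) V^T (assumes sin θ > 0).
    U2 = -(Rs.T @ Pt) @ Vh.T / np.maximum(np.sin(theta), 1e-12)
    def Phi(t):
        return (Ps @ U1 @ np.diag(np.cos(t * theta))
                - Rs @ U2 @ np.diag(np.sin(t * theta)))
    return Phi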
Modeling domain shift with geodesic flow
Subspaces: source Φ(0) → Φ(t), 0 ≤ t ≤ 1 → target Φ(1)
Along this flow, points (subspaces) represent intermediate domains between source and target.
Domain-invariant features
z^∞ = [Φ(0)^T x, …, Φ(t)^T x, …, Φ(1)^T x]^T
Projections at t near 0 are more similar to the source; projections at t near 1 are more similar to the target; the full feature blends the two.
Measuring feature similarities with inner products
z_i^∞ = [Φ(0)^T x_i, …, Φ(t)^T x_i, …, Φ(1)^T x_i]^T
z_j^∞ = [Φ(0)^T x_j, …, Φ(t)^T x_j, …, Φ(1)^T x_j]^T
Components at t near 0 are more similar to the source; components at t near 1 are more similar to the target.
⟨z_i^∞, z_j^∞⟩: invariant, biased toward neither source nor target.
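To make z^∞ concrete before the closed form: one can approximate it by sampling the flow on a grid of t values and stacking the projections. This is a didactic sketch of ours, not part of the method itself:

def sampled_features(Phi, X, ts):
    """Finite approximation of z^infinity: stack Phi(t)^T x over a grid
    of t's, giving an n x (len(ts) * d) feature matrix."""
    return np.hstack([X @ Phi(t) for t in ts])

# Z_src = sampled_features(Phi, X_source, np.linspace(0, 1, 20))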
Learning domain-invariant features with kernels
We define the geodesic flow kernel (GFK):
⟨z_i^∞, z_j^∞⟩ = ∫₀¹ (Φ(t)^T x_i)^T (Φ(t)^T x_j) dt = x_i^T G x_j
• Advantages
– Analytically computable
– Robust to variations toward either source or target
– Broadly applicable: many classifiers can be kernelized
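A hedged sketch of the closed form for G = ∫₀¹ Φ(t) Φ(t)^T dt, obtained by integrating cos²(tθ), sin²(tθ), and sin(tθ)cos(tθ) entrywise over [0, 1]. This is our reimplementation using the exact integral values (any constant rescaling of G would leave kernel-based decisions unchanged), reusing numpy, null_space, and the quantities from the flow construction above:

def gfk_kernel_matrix(Ps, Pt):
    """Closed-form G with <z_i, z_j> = x_i^T G x_j,
    i.e. G = integral over t in [0, 1] of Phi(t) Phi(t)^T."""
    Rs = null_space(Ps.T)
    U1, c, Vh = np.linalg.svd(Ps.T @ Pt)
    theta = np.arccos(np.clip(c, -1.0, 1.0))
    U2 = -(Rs.T @ Pt) @ Vh.T / np.maximum(np.sin(theta), 1e-12)
    t2 = 2.0 * np.maximum(theta, 1e-12)          # safe division as θ -> 0
    l1 = 0.5 + np.sin(t2) / (2.0 * t2)           # ∫ cos²(tθ) dt
    l2 = (np.cos(t2) - 1.0) / (2.0 * t2)         # -∫ sin(tθ) cos(tθ) dt
    l3 = 0.5 - np.sin(t2) / (2.0 * t2)           # ∫ sin²(tθ) dt
    PU = np.hstack([Ps @ U1, Rs @ U2])           # D x 2d
    L = np.block([[np.diag(l1), np.diag(l2)],
                  [np.diag(l2), np.diag(l3)]])
    return PU @ L @ PU.T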
Contrast to discretely sampling
[Gopalan et al. ICCV 2011]: sample a finite number of subspaces along Φ(t), 0 ≤ t ≤ 1, project onto them, then apply dimensionality reduction. Free parameters: number of subspaces, dimensionality of subspaces, dimensionality after reduction.
GFK (ours): ⟨z_i^∞, z_j^∞⟩ = ∫₀¹ (Φ(t)^T x_i)^T (Φ(t)^T x_j) dt = x_i^T G x_j. No free parameters.
GFK is conceptually cleaner and computationally more tractable.
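A quick sanity check of ours tying the two views together: the averaged inner products of the sampled features from the earlier sketch converge to the closed-form kernel as the grid gets finer:

# ts = np.linspace(0, 1, 200)
# Zi = sampled_features(Phi, X_source, ts)
# Zj = sampled_features(Phi, X_target, ts)
# Riemann-sum approximation of the integral vs. closed form:
# np.allclose(Zi @ Zj.T / len(ts), X_source @ G @ X_target.T, atol=1e-3)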
Recap of key steps
1. Compute the source and target subspaces
2. Construct the geodesic flow Φ(t) between them
3. Compute the kernel ⟨z_i^∞, z_j^∞⟩ = x_i^T G x_j
4. Classify target data with the new features
Experimental setup
• Four domains: Caltech-256, Amazon, DSLR, Webcam
• Features: bag-of-SURF
• Classifier: 1-NN
• Results averaged over 20 random trials
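Putting the sketches together, here is a hedged end-to-end version of this protocol: 1-NN over the GFK-induced distance (x_i − x_j)^T G (x_i − x_j). The function name, the plain argmin loop, and the default d are ours for illustration:

def gfk_1nn(X_src, y_src, X_tgt, d=20):
    """Label each target point by its nearest source point under the
    GFK metric d(x, x')^2 = (x - x')^T G (x - x')."""
    Ps, Pt = pca_basis(X_src, d), pca_basis(X_tgt, d)
    G = gfk_kernel_matrix(Ps, Pt)
    K = X_tgt @ G @ X_src.T                                # cross terms
    sq_s = np.einsum('ij,jk,ik->i', X_src, G, X_src)       # x^T G x, source
    sq_t = np.einsum('ij,jk,ik->i', X_tgt, G, X_tgt)       # x^T G x, target
    dist2 = sq_t[:, None] + sq_s[None, :] - 2.0 * K
    return y_src[np.argmin(dist2, axis=1)]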
Classification accuracy on target
[Bar chart: accuracy (%) on each Source → Target pair (W→C, W→A, C→D, C→A, A→W, A→C, D→A), comparing no adaptation, [Gopalan et al.'11], and GFK (ours). GFK obtains the best accuracy overall.]
Which domain should be used as the source?
Candidate domains: Caltech-256, Amazon, DSLR, Webcam
Automatically selecting the best source
We introduce the Rank of Domains (ROD) measure. Intuition:
– Geometrically, how much the subspaces disagree
– Statistically, how much the distributions disagree
(A sketch of this intuition follows.)
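The exact definition is in the paper; below is our hedged sketch of one plausible instantiation of the stated intuition: weight each principal angle (geometric disagreement) by the symmetrized KL divergence between 1-D Gaussian fits of the two domains' data projected onto the paired principal directions (statistical disagreement). The function names and the specific weighting are assumptions for illustration:

def sym_kl_gauss(m1, v1, m2, v2):
    """Symmetrized KL divergence between two 1-D Gaussians."""
    kl = lambda ma, va, mb, vb: 0.5 * (va / vb + (ma - mb) ** 2 / vb
                                       - 1.0 + np.log(vb / va))
    return kl(m1, v1, m2, v2) + kl(m2, v2, m1, v1)

def rank_of_domains(Ps, Pt, X_src, X_tgt):
    """ROD-style score (illustrative, not the paper's exact formula):
    lower = source and target agree more."""
    U1, c, Vh = np.linalg.svd(Ps.T @ Pt)
    theta = np.arccos(np.clip(c, -1.0, 1.0))
    A, B = Ps @ U1, Pt @ Vh.T                    # paired principal directions
    score = 0.0
    for i in range(len(theta)):
        s, t = X_src @ A[:, i], X_tgt @ B[:, i]  # 1-D projections
        score += theta[i] * sym_kl_gauss(s.mean(), s.var(), t.mean(), t.var())
    return score / len(theta)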
Automatically selecting the best source
ROD for each possible source, with Amazon as the target:
  Amazon 0 (the target itself), Caltech-256 0.003, Webcam 0.05, DSLR 0.26
[Bar chart: accuracy (%) for W→A, C256→A, D→A, comparing no adaptation, [Gopalan et al.'11], and GFK (ours).]
Caltech-256, the candidate source with the lowest ROD, adapts the best to Amazon.
Semi-supervised domain adaptation
Label three instances per category in the target.
[Bar chart: accuracy (%) on each Source → Target pair (W→C, W→A, C→D, C→A, A→W, A→C, D→A), comparing no adaptation, [Saenko et al.'10], [Gopalan et al.'11], and GFK (ours).]
Analyzing datasets in light of domain adaptation
Cross-dataset generalization [Torralba & Efros'11]
[Bar chart: accuracy (%) for PASCAL, ImageNet, and Caltech-101 under three conditions: self, cross without adaptation, and cross with adaptation.]
Without adaptation, performance drops when crossing datasets, and the drop for ImageNet is big. With adaptation, the drops become smaller, and ImageNet shows nearly no drop.
Caltech-101 generalizes the worst, with or without adaptation.
Summary
• Unsupervised domain adaptation
  – Important in visual recognition
  – Challenge: no labeled data from the target
• Geodesic flow kernel (GFK)
  – Conceptually clean formulation: no free parameters
  – Computationally tractable: closed-form solution
  – Empirically successful: state-of-the-art results
• New insight on vision datasets
  – Cross-dataset generalization with domain adaptation
  – Leveraging existing datasets despite their idiosyncrasies
Future work
• Beyond subspaces: other techniques to model domain shift
• From GFK to a statistical flow kernel: add more statistical properties to the flow
• Applications of GFK: e.g., face recognition, video analysis