Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation



  1. Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation. Xiang Jiang 1,2, Qicheng Lao 1,4, Stan Matwin 1,3, Mohammad Havaei 1 (1 Imagia, 2 Dalhousie University, 3 Polish Academy of Sciences, 4 Mila, Université de Montréal). June 13, 2020.

  2. Introduction: Unsupervised Domain Adaptation (UDA). The setup of UDA involves an observed variable $X$, a labeling function $f$ with labels $Y = f(X)$, and a domain variable $D$. Given a labeled source domain $\mathcal{D}_S = \{(x_i, f_S(x_i))\}_{i=1}^{n}$ and an unlabeled target domain $\mathcal{D}_T = \{x_j\}_{j=1}^{m}$, where the labeling functions agree ($f_S = f_T$), the goal is to learn $p(y \mid x)$. [Figure: medical-imaging example in which the image is the observed variable, the disease to predict is the label, and the scanner is the domain variable.]
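
A minimal sketch of this data setup, with made-up shapes and names for illustration only: the source domain carries labels, the target domain does not, and the target marginal is shifted.

```python
# Minimal sketch of the UDA data setup; all names and shapes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, m, num_classes, dim = 1000, 800, 10, 64

# Labeled source domain: D_S = {(x_i, f_S(x_i))}, i = 1..n
X_source = rng.normal(size=(n, dim))
y_source = rng.integers(0, num_classes, size=n)   # labels produced by f_S

# Unlabeled target domain: D_T = {x_j}, j = 1..m; f_T = f_S but is never observed.
X_target = rng.normal(loc=0.5, size=(m, dim))     # shifted marginal p_T(x)

# Goal: learn p(y | x) that predicts well on X_target using only y_source supervision.
```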

  3. Related Work. Adversarial domain-discriminator based approaches [Ganin et al., 2016]:
$$\min_{\theta}\; \mathcal{L}(\mathcal{D}_S) + \lambda\, \mathrm{dis}(\mathcal{D}_S, \mathcal{D}_T) \quad (1)$$
$$\max_{f}\; \mathrm{dis}(\mathcal{D}_S, \mathcal{D}_T) \quad (2)$$
Limitation: aligning the marginals does not align the class-conditionals, i.e., $p_S(x) = p_T(x) \nRightarrow p_S(x \mid y) = p_T(x \mid y)$.
Prototype-based class-conditioned explicit alignment [Luo et al., 2017, Xie et al., 2018]:
$$\min_{\theta}\; \mathcal{L}(\mathcal{D}_S) + \lambda_1\, \mathrm{dis}(\mathcal{D}_S, \mathcal{D}_T) + \lambda_2\, \mathcal{L}_{\text{explicit}} \quad (3)$$
$$\max_{f}\; \mathrm{dis}(\mathcal{D}_S, \mathcal{D}_T) \quad (4)$$
where
$$\mathcal{L}_{\text{explicit}} = \mathbb{E}\big[\,\lVert c^j_S - c^j_T \rVert\,\big], \quad (5)$$
$$c^j_S = \frac{1}{N_j} \sum_{(x_i, y_i) \in \mathcal{D}_S} \mathbb{1}\{y_i = j\}\, f_{\phi}(x_i). \quad (6)$$
Limitation: error accumulation from explicitly optimizing on pseudo-labels.
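
A minimal PyTorch sketch of the explicit prototype-alignment loss of Eqs. (5)-(6); the function names and tensor arguments are illustrative, not the cited authors' code.

```python
# Sketch of explicit prototype alignment (Eqs. 5-6), assuming PyTorch feature
# tensors `feat_s`, `feat_t` with source labels `y_s` and target pseudo-labels
# `y_t_hat`. Empty classes keep a zero centroid, a simplification.
import torch

def class_centroids(features, labels, num_classes):
    """c^j = mean of the features whose label equals j (Eq. 6)."""
    centroids = torch.zeros(num_classes, features.size(1), device=features.device)
    for j in range(num_classes):
        mask = labels == j
        if mask.any():
            centroids[j] = features[mask].mean(dim=0)
    return centroids

def explicit_alignment_loss(feat_s, y_s, feat_t, y_t_hat, num_classes):
    c_s = class_centroids(feat_s, y_s, num_classes)
    c_t = class_centroids(feat_t, y_t_hat, num_classes)  # built from pseudo-labels:
    # labeling errors feed straight into this loss and its gradient, which is the
    # error-accumulation limitation noted on the slide.
    return (c_s - c_t).norm(dim=1).mean()                # Eq. (5)
```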

  4. Motivations. Two motivations for the proposed approach: an applied motivation and a theoretical motivation.

  5. Applied Motivation. Challenges for applying UDA in real-world applications [Tan et al., 2019]: within-domain class imbalance; between-domain class-distribution shift, a.k.a. prior probability shift. [Figure: illustration of within-domain class imbalance and between-domain class-distribution shift.]

  6. Theoretical Motivation: Empirical Domain Divergence.
Definition ([Ben-David et al., 2010]). The $\mathcal{H}\Delta\mathcal{H}$ divergence between two domains is
$$d_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{D}_S, \mathcal{D}_T) = 2 \sup_{h, h' \in \mathcal{H}} \big|\, \mathbb{E}_{\mathcal{D}_T}[h \neq h'] - \mathbb{E}_{\mathcal{D}_S}[h \neq h'] \,\big|. \quad (7)$$
Definition (mini-batch based empirical domain discrepancy). Let $\mathcal{B}_S \subseteq \mathcal{U}_S$ and $\mathcal{B}_T \subseteq \mathcal{U}_T$ be minibatches from $\mathcal{U}_S$ and $\mathcal{U}_T$, respectively, with $|\mathcal{B}_S| = |\mathcal{B}_T|$. The empirical estimation of $d_{\mathcal{H}\Delta\mathcal{H}}$ over the minibatches $\mathcal{B}_S, \mathcal{B}_T$ is defined as
$$\hat{d}_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{B}_S, \mathcal{B}_T) = \sup_{h, h' \in \mathcal{H}} \Big|\, \hat{\mathbb{E}}_{\mathcal{B}_T}[h \neq h'] - \hat{\mathbb{E}}_{\mathcal{B}_S}[h \neq h'] \,\Big|. \quad (8)$$
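
The inner term of Eq. (8) is easy to compute for one fixed hypothesis pair; the supremum over $\mathcal{H} \times \mathcal{H}$ is what adversarial training approximates with a discriminator. A sketch for a single pair, with illustrative tensor names:

```python
# Empirical disagreement gap of Eq. (8) for one fixed hypothesis pair (h, h').
# Each `*_logits_*` tensor holds that hypothesis's logits on one minibatch.
import torch

def empirical_disagreement_gap(h_logits_s, hp_logits_s, h_logits_t, hp_logits_t):
    """| E_hat_{B_T}[h != h'] - E_hat_{B_S}[h != h'] | for one (h, h') pair."""
    dis_s = (h_logits_s.argmax(1) != hp_logits_s.argmax(1)).float().mean()
    dis_t = (h_logits_t.argmax(1) != hp_logits_t.argmax(1)).float().mean()
    return (dis_t - dis_s).abs()
```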

  7. Theoretical Motivation: The Decomposition.
Theorem (the decomposition of $\hat{d}_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{B}_S, \mathcal{B}_T)$). Define three disjoint sets on the label space: $\mathcal{Y}^C := \mathcal{Y}_S \cap \mathcal{Y}_T$, $\mathcal{Y}^{\bar{C}}_S := \mathcal{Y}_S - \mathcal{Y}^C$, and $\mathcal{Y}^{\bar{C}}_T := \mathcal{Y}_T - \mathcal{Y}^C$. Define the corresponding disjoint sets on the input space: $\mathcal{B}^C_S := \{x \in \mathcal{B}_S \mid y \in \mathcal{Y}^C\}$, $\mathcal{B}^{\bar{C}}_S := \{x \in \mathcal{B}_S \mid y \notin \mathcal{Y}^C\}$, $\mathcal{B}^C_T := \{x \in \mathcal{B}_T \mid y \in \mathcal{Y}^C\}$, and $\mathcal{B}^{\bar{C}}_T := \{x \in \mathcal{B}_T \mid y \notin \mathcal{Y}^C\}$. The empirical divergence $\hat{d}_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{B}_S, \mathcal{B}_T)$ can be decomposed as
$$\hat{d}_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{B}_S, \mathcal{B}_T) = \sup_{h, h' \in \mathcal{H}} \big|\, \xi_C(h, h') + \xi_{\bar{C}}(h, h') \,\big|, \quad (9)$$
where
$$\xi_C(h, h') = \frac{1}{|\mathcal{B}_T|} \sum_{x \in \mathcal{B}^C_T} \mathbb{1}[h \neq h'] - \frac{1}{|\mathcal{B}_S|} \sum_{x \in \mathcal{B}^C_S} \mathbb{1}[h \neq h'], \quad (10)$$
$$\xi_{\bar{C}}(h, h') = \frac{1}{|\mathcal{B}_T|} \sum_{x \in \mathcal{B}^{\bar{C}}_T} \mathbb{1}[h \neq h'] - \frac{1}{|\mathcal{B}_S|} \sum_{x \in \mathcal{B}^{\bar{C}}_S} \mathbb{1}[h \neq h'], \quad (11)$$
i.e., a class-aligned term $\xi_C$ over the shared classes and a class-misaligned term $\xi_{\bar{C}}$ over the domain-specific classes.
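
A sketch of this decomposition for a fixed $(h, h')$ pair: split each batch by shared versus domain-specific classes and normalize by the full batch sizes, so the two terms add up to the gap of Eq. (8). The set `shared` stands for $\mathcal{Y}^C$; names are illustrative.

```python
# Decomposition of Eqs. (9)-(11) for one fixed (h, h') pair. `dis_s`, `dis_t`
# are boolean tensors 1[h(x) != h'(x)] over each batch; `y_s`, `y_t` are labels.
import torch

def xi_terms(dis_s, y_s, dis_t, y_t, shared):
    in_c_s = torch.tensor([int(y) in shared for y in y_s])
    in_c_t = torch.tensor([int(y) in shared for y in y_t])
    n_s, n_t = len(y_s), len(y_t)  # full batch sizes, so xi_c + xi_c_bar
    # exactly recovers the empirical gap inside Eq. (8).
    xi_c = dis_t[in_c_t].float().sum() / n_t - dis_s[in_c_s].float().sum() / n_s
    xi_c_bar = dis_t[~in_c_t].float().sum() / n_t - dis_s[~in_c_s].float().sum() / n_s
    return xi_c, xi_c_bar
```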

  8. Theoretical Motivation: Domain-Discriminator Shortcut.
[Figure: with label space {3, 4, 5, 6}, a class-misaligned pair such as (3, 6) lets the domain discriminator predict the (source, target) labels from the class labels alone (the shortcut), whereas a class-aligned pair such as (4, 4) forces it toward the actual goal of domain alignment.]
Remark (the domain discriminator shortcut). Let $f_c$ be a classifier that maps $x$ to a class label $y_c$, and let $f_d$ be a domain discriminator that maps $x$ to a binary domain label $y_d$. For the empirical class-misaligned divergence $\xi_{\bar{C}}(h, h')$ with samples $x \in \mathcal{B}^{\bar{C}}_S \cup \mathcal{B}^{\bar{C}}_T$, there exists a domain-discriminator shortcut function
$$f_d(x) = \begin{cases} 1 & f_c(x) \in \mathcal{Y}^{\bar{C}}_S \\ 0 & f_c(x) \in \mathcal{Y}^{\bar{C}}_T, \end{cases} \quad (12)$$
such that the domain label can be solely determined by the domain-specific class labels. The shortcut is more pronounced under class imbalance and class-distribution shift.
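
A toy illustration of the shortcut of Eq. (12); the class sets are made up to mirror the slide's figure.

```python
# Shortcut of Eq. (12): when minibatches contain domain-specific classes, class
# identity alone predicts the domain label, so the discriminator never needs to
# compare feature distributions. Class sets are illustrative.
Y_S_ONLY = {3}   # classes appearing only in the source minibatch
Y_T_ONLY = {6}   # classes appearing only in the target minibatch

def shortcut_discriminator(predicted_class):
    """'Perfect' domain discriminator that never looks at domain style."""
    if predicted_class in Y_S_ONLY:
        return 1     # source
    if predicted_class in Y_T_ONLY:
        return 0     # target
    return None      # shared classes: the shortcut gives no answer

# On class-misaligned samples the discriminator loss can be driven to zero via
# this shortcut, without ever aligning class-conditional representations.
```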

  9. Proposed Approach.
[Figure: (a) pseudo-labels on the target data, (b) class-conditioned sampling, (c) implicit alignment of the two domains, (d) domain-invariant representations fed to the classifier.]
For $p_S(x)$, we sample $x \sim p_S(x \mid y)\, p(y)$ based on the alignment distribution $p(y)$. For $p_T(x)$, we sample a class-aligned minibatch $x \sim p_T(x \mid \hat{y})\, p(y)$ using the identical $p(y)$, with the help of pseudo-labels $\hat{y}_T$.
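
A minimal sketch of such a class-conditioned sampler, assuming a uniform alignment distribution $p(y)$ over a given class list; all names are illustrative, not the authors' code.

```python
# Sketch of the implicit-alignment sampler: draw the SAME classes for both
# domains, indexing the source by ground-truth labels and the target by
# pseudo-labels, so each minibatch is class-aligned by construction.
import random
from collections import defaultdict

def index_by_class(labels):
    idx = defaultdict(list)
    for i, y in enumerate(labels):
        idx[int(y)].append(i)
    return idx

def aligned_minibatch(src_idx, tgt_idx, class_list, n_classes, k):
    """Sample n_classes classes from a uniform p(y), then k examples per class per domain."""
    classes = random.sample(class_list, n_classes)
    batch_s, batch_t = [], []
    for c in classes:
        if not src_idx[c] or not tgt_idx[c]:
            continue  # skip classes the pseudo-labeler never predicted
        batch_s += random.choices(src_idx[c], k=k)  # x ~ p_S(x | y = c)
        batch_t += random.choices(tgt_idx[c], k=k)  # x ~ p_T(x | y_hat = c)
    return batch_s, batch_t
```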

  10. Proposed Approach: Algorithm.
Input: dataset $S = \{(x_i, y_i)\}_{i=1}^{N}$, $T = \{x_i\}_{i=1}^{M}$, label space $\mathcal{Y}$, label alignment distribution $p(y)$, classifier $f_c(\,\cdot\,; \theta)$.
while not converged do
    # predict pseudo-labels for T
    $\hat{T} \leftarrow \{(x_i, \hat{y}_i)\}_{i=1}^{M}$ where $x_i \in T$ and $\hat{y}_i = f_c(x_i; \theta)$
    # sample N unique classes in the label space
    $\mathcal{Y}' \leftarrow$ draw $N$ samples in $\mathcal{Y}$ from $p(y)$
    # sample K examples conditioned on each $y_j \in \mathcal{Y}'$
    for $y_j$ in $\mathcal{Y}'$ do
        $(X'_S, Y'_S) \leftarrow$ draw $K$ samples in $S$ from $p_S(x \mid y = y_j)$
        $X'_T \leftarrow$ draw $K$ samples in $\hat{T}$ from $p_T(x \mid \hat{y} = y_j)$
    end for
    # domain adaptation training on this minibatch
    train on minibatch $(X'_S, Y'_S, X'_T)$
end while
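
A sketch of this loop in PyTorch, reusing `index_by_class` and `aligned_minibatch` from the sampler sketch above; `adapt_step` is a hypothetical stand-in for any discrepancy-based UDA update (e.g., DANN or MDD) applied to the minibatch.

```python
# Sketch of the training loop: pseudo-labels are refreshed each iteration and
# used only to SELECT target examples; no loss is computed on them directly,
# which is what makes the alignment implicit.
import torch

@torch.no_grad()
def pseudo_label(classifier, X_t):
    return classifier(X_t).argmax(dim=1).tolist()

def train(classifier, adapt_step, X_s, y_s, X_t, class_list,
          n_classes=16, k=4, iters=1000):
    src_idx = index_by_class(y_s)
    for _ in range(iters):
        tgt_idx = index_by_class(pseudo_label(classifier, X_t))
        bs, bt = aligned_minibatch(src_idx, tgt_idx, class_list, n_classes, k)
        adapt_step(X_s[bs], y_s[bs], X_t[bt])  # class-aligned minibatch
```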

  11. Advantages of the Proposed Approach.
1. Minimizes the class-misaligned divergence $\xi_{\bar{C}}(h, h')$, providing a more reliable empirical estimation of domain divergence.
2. Provides balanced training across all classes.
3. Removes the need to optimize model parameters from pseudo-labels explicitly.
4. Simple to implement and orthogonal to different domain discrepancy measures (e.g., DANN and MDD).
