
Classification from Pairwise Similarity and Unlabeled Data (presentation by Han Bao)

Han Bao (1,2), Gang Niu (2), Masashi Sugiyama (2,1). (1) The University of Tokyo, Japan / (2) RIKEN, Japan. July 13th, 2018.


  1. Classification from Pairwise Similarity and Unlabeled Data. Han Bao (1,2), Gang Niu (2), Masashi Sugiyama (2,1). (1) The University of Tokyo, Japan / (2) RIKEN, Japan. July 13th, 2018.

  2. Gentle Start: Binary Classification
     Training data: positive data ($y = +1$) and negative data ($y = -1$), separated by the boundary $f(x) = 0$.
     Data is labeled: $\{(x_i, y_i)\}_{i=1}^{n} \overset{\text{i.i.d.}}{\sim} p(x, y)$, where $x_i \in \mathbb{R}^d$ and $y_i \in \{+1, -1\}$.
     Goal: find a classifier $f : \mathbb{R}^d \to \mathbb{R}$.
     Method: empirical risk minimization (ERM), i.e., minimize the classification error.
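A minimal sketch of the ERM setup on this slide may help make it concrete. It assumes a linear model f(x) = w·x + b and uses the logistic loss as a differentiable surrogate for the classification error. Neither choice is stated on the slide, and the toy data and the names `make_data`, `train_erm`, and `empirical_error` are hypothetical.

```python
import math
import random

def make_data(n, seed=0):
    """Toy labeled sample: y = +1 points near (+2, +2), y = -1 points near (-2, -2)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.choice([+1, -1])
        x = [2.0 * y + rng.gauss(0, 1), 2.0 * y + rng.gauss(0, 1)]
        data.append((x, y))
    return data

def f(w, b, x):
    """Linear classifier f: R^d -> R; the decision boundary is f(x) = 0."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def empirical_error(w, b, data):
    """Empirical 0-1 risk: fraction of points with y * f(x) <= 0."""
    return sum(1 for x, y in data if y * f(w, b, x) <= 0) / len(data)

def train_erm(data, lr=0.1, epochs=200):
    """Gradient descent on the empirical logistic risk (1/n) * sum log(1 + exp(-y f(x)))."""
    d = len(data[0][0])
    n = len(data)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for x, y in data:
            m = y * f(w, b, x)
            # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m)); clamp m to avoid overflow
            g = -y / (1.0 + math.exp(min(m, 50.0)))
            for j in range(d):
                gw[j] += g * x[j]
            gb += g
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

if __name__ == "__main__":
    data = make_data(200)
    w, b = train_erm(data)
    print("training error:", empirical_error(w, b, data))
```

The 0-1 error is only used for evaluation here, since it is not differentiable. Any other surrogate loss could be substituted in `train_erm` without changing the overall structure.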

  3. Motivation: Pairwise Information in Classification
     Classification of sensitive matters (e.g., politics, religion, opinions on racial issues): explicit labels are hard to obtain.
     Instead, ask "Which person do you share the same belief as?" The answer reveals only that two people share the same property.
     cf. randomized response technique [Warner 1965]
     Image: http://leanintokyo.org/wp-content/uploads/2017/12/MeToo.jpg
     [Warner 1965] Warner, S. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60(309):63-69, 1965.

  4. Related: Semi-supervised Clustering
     Clustering from similar pairs (same class) $S = \{(x_i, x'_i)\}$, dissimilar pairs (different class) $D = \{(x_i, x'_i)\}$, and unlabeled data $U = \{x_i\}$ [Wagstaff+ ICML2001; many other papers].
     Offspring of unsupervised clustering.
     Problem: the cluster assumption (manifold assumption, low-density separation) does not hold for many datasets.
     [Wagstaff+ ICML2001] Wagstaff, K., Cardie, C., Rogers, S., and Schrödl, S. Constrained k-means clustering with background knowledge. In ICML, pp. 577-584, 2001.
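To illustrate the three kinds of data named on this slide, here is a small sketch that builds S, D, and U from a toy pool and counts how many pairwise constraints a cluster assignment breaks. The helper names `make_sdu` and `count_violations`, the toy points, and the sampling scheme are all hypothetical. The hidden labels are used only to simulate the pairwise answers, and this is not the constrained k-means algorithm of Wagstaff et al.

```python
import random

def make_sdu(points, labels, n_pairs, n_unlabeled, seed=0):
    """Build similar pairs S, dissimilar pairs D, and unlabeled data U.

    Labels only decide whether a sampled pair is same-class or
    different-class; they are never stored in S, D, or U.
    Assumes both classes occur and at least one class has two points.
    """
    rng = random.Random(seed)
    idx = list(range(len(points)))
    S, D = [], []
    while len(S) < n_pairs or len(D) < n_pairs:
        i, j = rng.sample(idx, 2)  # two distinct indices
        pair = (points[i], points[j])
        if labels[i] == labels[j] and len(S) < n_pairs:
            S.append(pair)
        elif labels[i] != labels[j] and len(D) < n_pairs:
            D.append(pair)
    U = [points[i] for i in rng.sample(idx, n_unlabeled)]
    return S, D, U

def count_violations(assign, S, D):
    """Count S pairs split across clusters plus D pairs placed in the same cluster."""
    broken_s = sum(1 for a, b in S if assign(a) != assign(b))
    broken_d = sum(1 for a, b in D if assign(a) == assign(b))
    return broken_s + broken_d

if __name__ == "__main__":
    points = [-3.0, -2.5, -2.0, 2.0, 2.5, 3.0]
    labels = [-1, -1, -1, +1, +1, +1]
    S, D, U = make_sdu(points, labels, n_pairs=4, n_unlabeled=3)
    print("violations of a sign-based clustering:",
          count_violations(lambda x: int(x > 0), S, D))
```

A clustering method in this setting would look for an assignment with few such violations, while relying on the cluster assumption to generalize to the points in U.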
