
Sinkhorn Algorithm as a Special Case of Stochastic Mirror Descent

Konstantin Mishchenko, KAUST


  1. Sinkhorn Algorithm as a Special Case of Stochastic Mirror Descent Konstantin Mishchenko, KAUST

  2. Problem 1. Let $X \in \mathbb{R}^{n\times n}_{++}$. Find vectors $u, v \in \mathbb{R}^n_+$ such that $W = \mathrm{diag}(u)\, X\, \mathrm{diag}(v)$ is doubly stochastic.

  3. Problem 1. Let $X \in \mathbb{R}^{n\times n}_{++}$. Find vectors $u, v \in \mathbb{R}^n_+$ such that $W = \mathrm{diag}(u)\, X\, \mathrm{diag}(v)$ is doubly stochastic, that is:
     $\sum_{i=1}^n W_{ij} = 1$ for any $j$,
     $\sum_{j=1}^n W_{ij} = 1$ for any $i$,
     $W_{ij} \ge 0$.

  4. Problem 2.
     $\min_{X\in\mathbb{R}^{n\times n}} \sum_{i,j=1}^n C_{ij} X_{ij} \quad \text{s.t. } X\mathbf{1}=\mathbf{1},\ X^\top \mathbf{1}=\mathbf{1},\ X\ge 0$
     The constraints require $X$ to be doubly stochastic.

  5. Problem 2.
     Original problem:
     $\min_{X\in\mathbb{R}^{n\times n}} \sum_{i,j=1}^n C_{ij} X_{ij} \quad \text{s.t. } X\mathbf{1}=\mathbf{1},\ X^\top \mathbf{1}=\mathbf{1},\ X\ge 0$
     Entropy-regularized problem:
     $\min_{X\in\mathbb{R}^{n\times n}} \sum_{i,j=1}^n \left(C_{ij} X_{ij} + \gamma X_{ij}\log X_{ij}\right) \quad \text{s.t. } X\mathbf{1}=\mathbf{1},\ X^\top \mathbf{1}=\mathbf{1}$
     Equivalent KL form:
     $\min_{X\in\mathbb{R}^{n\times n}} \mathcal{KL}(X\|X^0) \quad \text{s.t. } X\mathbf{1}=\mathbf{1},\ X^\top \mathbf{1}=\mathbf{1}$
     where $X^0 \overset{\text{def}}{=} \exp\left(-\frac{C}{\gamma}\right)$ coordinate-wise and
     $\mathcal{KL}(X\|X^0) \overset{\text{def}}{=} \sum_{i,j=1}^n \Bigl(X_{ij}\log \frac{X_{ij}}{X^0_{ij}} - X_{ij} + X^0_{ij}\Bigr)$
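
Problem 1 is classically handled by alternating row and column normalization, which is the Sinkhorn iteration named in the title. The following is a minimal sketch of that textbook iteration, assuming NumPy; the function name `sinkhorn_scale` and the parameters `num_iters` and `tol` are illustrative choices and do not come from the slides. The demo feeds in $X^0 = \exp(-C/\gamma)$ from Problem 2 with a randomly generated cost matrix, which is also an assumption made only for illustration.

```python
import numpy as np


def sinkhorn_scale(X, num_iters=1000, tol=1e-10):
    """Find positive u, v so that diag(u) @ X @ diag(v) is nearly doubly stochastic.

    X must be a square matrix with strictly positive entries (Problem 1).
    """
    X = np.asarray(X, dtype=float)
    if X.ndim != 2 or X.shape[0] != X.shape[1] or np.any(X <= 0):
        raise ValueError("X must be a square matrix with positive entries")

    n = X.shape[0]
    u = np.ones(n)
    v = np.ones(n)
    for _ in range(num_iters):
        u = 1.0 / (X @ v)    # rescale rows so every row of W sums to 1
        v = 1.0 / (X.T @ u)  # rescale columns so every column of W sums to 1
        W = u[:, None] * X * v[None, :]
        # Columns are exact after the v-update, so only the rows need checking.
        if np.max(np.abs(W.sum(axis=1) - 1.0)) < tol:
            break
    return u, v


# Illustrative data: a random cost matrix C and regularization gamma.
rng = np.random.default_rng(0)
C = rng.random((4, 4))
gamma = 0.5
X0 = np.exp(-C / gamma)  # coordinate-wise, as in Problem 2

u, v = sinkhorn_scale(X0)
W = u[:, None] * X0 * v[None, :]  # same as diag(u) @ X0 @ diag(v)
print("row sums:   ", W.sum(axis=1))
print("column sums:", W.sum(axis=0))
```

Broadcasting with `u[:, None]` and `v[None, :]` avoids building the two diagonal matrices explicitly. For small $\gamma$ the entries of $X^0$ can underflow, in which case a log-domain variant would be the safer choice.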

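The step from the entropy-regularized objective to the KL objective in Problem 2 can be checked term by term. The short derivation below is a sketch that uses only the definitions of $X^0$ and $\mathcal{KL}$ given on the slides, plus the observation (not stated there) that the constraint $X\mathbf{1}=\mathbf{1}$ forces $\sum_{i,j} X_{ij} = n$.

```latex
\begin{align*}
\log X^0_{ij} &= -\frac{C_{ij}}{\gamma}, \\
\gamma\,\mathcal{KL}(X\|X^0)
  &= \sum_{i,j=1}^n \Bigl(\gamma X_{ij}\log X_{ij} - \gamma X_{ij}\log X^0_{ij}
     - \gamma X_{ij} + \gamma X^0_{ij}\Bigr) \\
  &= \sum_{i,j=1}^n \bigl(C_{ij}X_{ij} + \gamma X_{ij}\log X_{ij}\bigr)
     - \gamma n + \gamma \sum_{i,j=1}^n X^0_{ij}.
\end{align*}
```

The last two terms do not depend on $X$, so for $\gamma > 0$ both objectives have the same minimizer over the constraint set.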