Learning Deep Kernels for Exponential Family Densities
Li K. Wenliang, D. J. Sutherland, H. Strathmann, A. Gretton
Gatsby unit, University College London
Poster #221

Kernel exponential families
Classic exponential family: $p_\eta(x) = \exp\big(\langle \eta, T(x) \rangle - A(\eta)\big)\, q_0(x)$
Gaussian: $T(x) = (x, x^2)$
Fit depends only on the empirical mean of $T(x)$ (and on $q_0$)
Kernel exponential family: $T(x) = k(x, \cdot)$, with natural parameter $f$ in the RKHS $\mathcal{H}$ of $k$
Reproducing property: $\langle f, k(x, \cdot) \rangle_{\mathcal{H}} = f(x)$
So $p_f(x) = \exp\big(f(x) - A(f)\big)\, q_0(x)$
Learning deep kernels for exponential family densities, Poster #221
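
As a minimal sketch of what the reproducing property buys, the snippet below evaluates a one-dimensional density of the form log p_f(x) = f(x) + log q0(x) - A(f). The Gaussian kernel, the standard normal base density q0, the finite expansion f = sum_i alpha_i k(z_i, .), and all function names are assumptions made for illustration; none of them is taken from the poster. The normalizer A(f) is approximated with a trapezoid rule, which is only practical in very low dimensions.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """k(x, y) = exp(-(x - y)^2 / (2 sigma^2)) for scalars x and y."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))

def f(x, centers, alphas, sigma=1.0):
    """Evaluate f = sum_i alpha_i k(z_i, .) at x.

    By the reproducing property, <f, k(x, .)> equals this plain evaluation.
    """
    return sum(a * gaussian_kernel(z, x, sigma) for z, a in zip(centers, alphas))

def log_q0(x):
    """Log of a standard normal base density (an assumption of this sketch)."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def unnormalized_log_density(x, centers, alphas, sigma=1.0):
    """f(x) + log q0(x), i.e. log p_f(x) without the -A(f) term."""
    return f(x, centers, alphas, sigma) + log_q0(x)

def log_normalizer(centers, alphas, sigma=1.0, lo=-10.0, hi=10.0, n=20001):
    """A(f) = log of the integral of exp(f(x)) q0(x), by the trapezoid rule."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * math.exp(unnormalized_log_density(x, centers, alphas, sigma))
    return math.log(total * h)
```

For example, `centers = [-2.0, 2.0]` with `alphas = [1.5, 1.5]` places extra mass around both centers, giving a bimodal density that no single Gaussian can represent.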

Why kernel exponential families
Any density $p_f(x) = \exp\big(f(x) - A(f)\big)\, q_0(x)$ with $f \in \mathcal{H}$
Much richer class; e.g. with Gaussian $k$, dense in all continuous distributions on compact domains
Fit with score matching
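
Score matching suits this model because it never needs the normalizer. A sketch of why, in standard notation (the Hyvärinen score matching objective, valid under mild boundary conditions). The symbols $p_0$ for the data density, $q_0$ for the base density, and $D$ for the data dimension are assumptions of this sketch, not notation recovered from the poster.

```latex
\begin{align*}
J(f) &= \tfrac12\, \mathbb{E}_{x \sim p_0}
        \big\| \nabla_x \log p_f(x) - \nabla_x \log p_0(x) \big\|^2 \\
     &= \mathbb{E}_{x \sim p_0} \sum_{d=1}^{D}
        \Big[ \partial_d^2 \log p_f(x)
            + \tfrac12 \big( \partial_d \log p_f(x) \big)^2 \Big]
        + \text{const}, \\
\partial_d \log p_f(x) &= \partial_d f(x) + \partial_d \log q_0(x)
        \qquad \text{(the term } A(f) \text{ has no } x\text{-dependence).}
\end{align*}
```

If $f$ is additionally restricted to a finite expansion $f = \sum_j \alpha_j k(z_j, \cdot)$, both derivatives are linear in $\alpha$, so the empirical objective is quadratic in $\alpha$ and a regularized fit reduces to solving a linear system.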

Choosing a kernel with meta-learning
Fit quality depends a lot on kernel choice
Also on the regularization weight
Need to fit these parameters
… but need to use held-out data to avoid trivially overfitting
Meta-learning: take the gradient of the whole fit on a minibatch
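
The idea of differentiating a held-out loss through the whole fit can be sketched in one dimension. Everything below is an illustrative assumption rather than the authors' implementation: the Gaussian kernel, the standard normal q0, the use of training points as centers z_j, the simple ridge penalty on the coefficients, and all function names. Central finite differences stand in for backpropagation, and only the bandwidth is tuned; the regularization weight could be handled the same way.

```python
import math

def k(z, x, s):
    """Gaussian kernel with bandwidth s."""
    return math.exp(-((x - z) ** 2) / (2.0 * s * s))

def dk(z, x, s):
    """First derivative of k(z, x) with respect to x."""
    return -(x - z) / (s * s) * k(z, x, s)

def d2k(z, x, s):
    """Second derivative of k(z, x) with respect to x."""
    return ((x - z) ** 2 / s ** 4 - 1.0 / (s * s)) * k(z, x, s)

def sm_loss(alphas, xs, centers, s):
    """Score-matching loss of log p(x) = f(x) + log q0(x) - A(f),
    with f = sum_j alpha_j k(z_j, .) and q0 a standard normal."""
    total = 0.0
    for x in xs:
        score = -x + sum(a * dk(z, x, s) for a, z in zip(alphas, centers))
        curv = -1.0 + sum(a * d2k(z, x, s) for a, z in zip(alphas, centers))
        total += 0.5 * score * score + curv
    return total / len(xs)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= factor * M[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def fit(xs, centers, s, lam):
    """Closed-form minimizer of sm_loss + 0.5 * lam * ||alpha||^2,
    which is the quadratic 0.5 a^T G a + b^T a + const in alpha."""
    n, m = len(xs), len(centers)
    G = [[lam if i == j else 0.0 for j in range(m)] for i in range(m)]
    b = [0.0] * m
    for x in xs:
        d = [dk(z, x, s) for z in centers]
        for i in range(m):
            b[i] += (-x * d[i] + d2k(centers[i], x, s)) / n
            for j in range(m):
                G[i][j] += d[i] * d[j] / n
    return solve(G, [-v for v in b])

def heldout_loss(log_s, train, val, lam):
    """Fit the coefficients on `train`, then score that fit on `val`."""
    s = math.exp(log_s)
    alphas = fit(train, train, s, lam)  # centers z_j = training points
    return sm_loss(alphas, val, train, s)

def tune_log_sigma(train, val, lam, log_s=0.0, step=0.5, iters=20, eps=1e-4):
    """Descend the held-out loss in log(sigma), differentiating through the fit."""
    best = heldout_loss(log_s, train, val, lam)
    for _ in range(iters):
        grad = (heldout_loss(log_s + eps, train, val, lam)
                - heldout_loss(log_s - eps, train, val, lam)) / (2.0 * eps)
        cand = log_s - step * max(-1.0, min(1.0, grad))  # clipped step
        cand_loss = heldout_loss(cand, train, val, lam)
        if cand_loss < best:
            log_s, best = cand, cand_loss
        else:
            step *= 0.5  # reject the move and shrink the step
    return log_s, best
```

Scoring on `val` rather than `train` is the point of the design: the bandwidth is judged on points the coefficients never saw, so it cannot win by simply shrinking around the training data.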

Deep kernels
Simple kernels aren't enough.
“Combining a deep architecture with a kernel machine that takes the higher-level learned representation as input can be quite powerful.” (Y. Bengio & Y. LeCun, “Scaling Learning Algorithms towards AI”, 2007)
But we can learn lots of parameters with gradient descent: $k(x, y) = \kappa(\phi(x), \phi(y))$, with $\phi$ a neural net, $\kappa$ something simple
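
A deep kernel of this kind, a simple kernel applied to neural-net features, can be sketched as follows. The network shape, the tanh activation, the random fixed weights, the Gaussian outer kernel, and the names `make_mlp`, `phi`, and `deep_kernel` are all hypothetical choices for illustration, not the architecture used by the authors. A smooth activation is used because score matching differentiates the log-density twice with respect to x.

```python
import math
import random

def make_mlp(sizes, seed=0):
    """Build fixed random weights for a small fully connected network."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
             for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((W, b))
    return layers

def phi(x, layers):
    """Feature map: tanh on hidden layers, linear output layer."""
    h = list(x)
    for i, (W, b) in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) + bi for row, bi in zip(W, b)]
        if i < len(layers) - 1:
            h = [math.tanh(v) for v in h]
    return h

def deep_kernel(x, y, layers, sigma=1.0):
    """Gaussian kernel evaluated on the learned features phi(x), phi(y)."""
    fx, fy = phi(x, layers), phi(y, layers)
    sq_dist = sum((a - b) ** 2 for a, b in zip(fx, fy))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

Because the outer kernel is positive definite and is applied to a fixed feature map, the composite is still a valid kernel; in a trained model the weights in `layers` would be updated by gradient descent rather than left at their random initial values.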

Results
Learns local dataset geometry: better fits
On real data: slightly worse likelihoods, maybe better “shapes” than deep likelihood models