Empirical observations
• Annealing from the data base rates model typically gives better AIS estimates than annealing from the uniform distribution
• The RAISE model approximates the original MRF reasonably well with 1,000 – 100,000 intermediate distributions
• For models that don't fit the data distribution well (overfit, undertrained, etc.), the RAISE model can be substantially better than the original MRF
• It's really hard to know when AIS is or isn't working, and RAISE can give a clue about that
• It's likely that most, but not all, published results based on AIS estimates with enough intermediate distributions are reliable
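To make the AIS setting concrete, here is a minimal sketch of annealed importance sampling on a toy fully visible Boltzmann machine, annealing from an independent-Bernoulli base distribution (standing in for the "data base rates" model) to the unnormalized target. All weights and sizes are arbitrary illustrative values, not from any real model; the model is small enough that the true log partition function can be checked by enumeration.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)
d = 5  # tiny model so exact log Z is checkable by enumeration

# Toy fully visible Boltzmann machine (illustrative random parameters).
W = rng.normal(0, 0.5, size=(d, d))
W = (W + W.T) / 2
np.fill_diagonal(W, 0)
b = rng.normal(0, 0.5, size=d)

def log_f(x):
    """Unnormalized target log-probability."""
    return 0.5 * x @ W @ x + b @ x

# Base distribution: independent Bernoulli(p0) per unit ("base rates").
p0 = np.full(d, 0.4)
def log_p0(x):
    return np.sum(x * np.log(p0) + (1 - x) * np.log(1 - p0))

def ais_log_Z(n_chains=100, n_betas=1000):
    """Estimate log Z of the target via AIS along the geometric path
    f_beta = p0^(1-beta) * f^beta; the base is normalized, so Z_base = 1."""
    betas = np.linspace(0.0, 1.0, n_betas)
    log_w = np.zeros(n_chains)
    x = (rng.random((n_chains, d)) < p0).astype(float)  # exact base samples
    for b0, b1 in zip(betas[:-1], betas[1:]):
        for i in range(n_chains):
            # Importance-weight increment at the current state.
            log_w[i] += (b1 - b0) * (log_f(x[i]) - log_p0(x[i]))
            # One single-flip Metropolis step targeting f_{b1}.
            j = rng.integers(d)
            y = x[i].copy()
            y[j] = 1 - y[j]
            cur = (1 - b1) * log_p0(x[i]) + b1 * log_f(x[i])
            prop = (1 - b1) * log_p0(y) + b1 * log_f(y)
            if np.log(rng.random()) < prop - cur:
                x[i] = y
    m = log_w.max()                      # log-mean-exp of the weights
    return m + np.log(np.mean(np.exp(log_w - m)))

# Exact log Z by enumerating all 2^d states (feasible only for tiny d).
states = np.array(list(product([0.0, 1.0], repeat=d)))
exact = np.log(np.sum(np.exp([log_f(s) for s in states])))

est = ais_log_Z()
print(f"AIS estimate: {est:.3f}   exact: {exact:.3f}")
```

With this many intermediate distributions the estimate lands close to the exact value; shrinking `n_betas` or starting from a poorly matched base distribution is an easy way to see the estimator degrade, which is exactly the failure mode that is hard to detect at realistic model sizes.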
Computational Tricks
• RAISE requires estimating a large sum for each test sample, which is computationally expensive
• The method of control variates gives a way of using few test samples to achieve reasonably reliable estimates
• RAISE estimates and MRF unnormalized probabilities tend to be tightly correlated
• Hence (1/N) Σ_{i=1..N} log f(v_i) + (1/k) Σ_{j=1..k} [log p_RAISE(v_j) − log f(v_j)] is a low-variance estimator of (1/N) Σ_{i=1..N} log p_RAISE(v_i)
• Here v_1, …, v_k are random test set samples and k is small
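The control-variate trick above can be sketched with synthetic numbers. Here `cheap` stands in for the unnormalized MRF log-probabilities log f(v_i) (available for every test point) and `costly` for the RAISE estimates (expensive, computed on only k points); the values are fabricated to mimic tight correlation, not taken from any real model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (hypothetical numbers, not real model outputs):
# cheap[i]  ~ unnormalized MRF log-probability log f(v_i), all N test points
# costly[i] ~ RAISE estimate of log p(v_i), tightly correlated with cheap[i]
N = 10000
cheap = rng.normal(-100.0, 5.0, size=N)
costly = cheap + rng.normal(-2.0, 0.5, size=N)   # small residual noise

k = 50                                   # only k expensive RAISE runs
idx = rng.choice(N, size=k, replace=False)

# Naive estimator: average the k expensive values directly.
naive = costly[idx].mean()

# Control-variate estimator: full-test-set average of the cheap quantity,
# corrected by the average residual (costly - cheap) on the small subset.
cv = cheap.mean() + (costly[idx] - cheap[idx]).mean()

truth = costly.mean()
print(f"truth = {truth:.3f}  naive = {naive:.3f}  cv = {cv:.3f}")
```

Because the residual `costly - cheap` has far lower variance than `costly` itself, averaging it over just k points gives a much tighter estimate than averaging the k expensive values directly.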
Pretraining Very Very Deep Models
• Train an RBM or a DBN
• Unroll the model using RAISE to create a sigmoid belief network with 100 or 1,000 layers
• Use p and q to fine-tune the model with an appropriate algorithm:
  - wake-sleep (Hinton et al., 1995)
  - reweighted wake-sleep (Bornschein & Bengio, 2014)
  - neural variational inference (Mnih & Gregor, 2013)
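A rough sketch of the unrolling idea, assuming a small RBM with hypothetical random parameters (a trained model would supply the real weights): the alternating Gibbs chain v → h → v → … is read as ancestral sampling in a deep sigmoid belief network whose layers share the tied RBM weights. Fine-tuning would then untie these weights and train each layer separately.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical small RBM (a trained model would supply these parameters).
n_v, n_h = 8, 6
W = rng.normal(0, 0.1, size=(n_v, n_h))
b_v = np.zeros(n_v)
b_h = np.zeros(n_h)

def sample_unrolled(n_layers):
    """Ancestral pass through the Gibbs chain viewed as a deep sigmoid
    belief network with tied weights: each pair of layers is one v->h->v
    Gibbs step, so n_layers=100 gives a 200-layer directed network."""
    v = (rng.random(n_v) < sigmoid(b_v)).astype(float)  # crude top-layer prior
    for _ in range(n_layers):
        h = (rng.random(n_h) < sigmoid(v @ W + b_h)).astype(float)
        v = (rng.random(n_v) < sigmoid(W @ h + b_v)).astype(float)
    return v

v = sample_unrolled(100)
print(v)
```

The directed view is what makes algorithms like wake-sleep applicable: each unrolled layer is an ordinary sigmoid belief network layer with its own (initially tied) weights.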