Learning From Data: Lecture 15
Reflecting on Our Path - Epilogue to Part I
  What We Did
  The Machine Learning Zoo
  Moving Forward
M. Magdon-Ismail, CSCI 4100/6100
Recap: Three Learning Principles

Occam's razor: simpler is better; a simpler hypothesis is falsifiable.
[Figure: three scientists fit resistivity ρ versus temperature T; the overly complex fit is not falsifiable, the simple one is.]

Sampling bias: ensure that the training and test distributions are the same, or else acknowledge and account for the mismatch. You cannot sample from one bin and use your estimates for another bin.

Data snooping: h ∈ H is influenced by D. Choose the learning process (usually H) before looking at D. You are charged for every choice along the path from D to g; we know the price of choosing g from H.
Zen Moment
Our Plan

1. What is learning? Output g ≈ f after looking at the data (x_n, y_n).
2. Can we do it? E_in ≈ E_out: simple H, finite d_vc, large N. E_in ≈ 0: good H and good algorithms.
3. How to do it? Linear models, nonlinear transforms. Algorithms: PLA, pseudoinverse, gradient descent (a sketch follows this list).
4. How to do it well? Overfitting: stochastic and deterministic noise. Cures: regularization, validation.
5. General principles? Occam's razor, sampling bias, data snooping.
6. Advanced techniques.
7. Other learning paradigms.

(Items 1-3 cover concepts and theory; item 4 is where practice comes in.)
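Item 3 names the pseudoinverse and PLA. Below is a minimal sketch, not the lecture's own experiment: the toy 2-D dataset, sample size, and target weights are illustrative assumptions. It fits a linear classifier by the pseudoinverse (one-step least squares) and then runs PLA updates until E_in = 0.

```python
import numpy as np

# Minimal sketch of two workhorses from item 3: linear regression via the
# pseudoinverse, then the perceptron learning algorithm (PLA).
# The dataset and target weights below are illustrative assumptions.

rng = np.random.default_rng(0)
N = 100
X = np.column_stack([np.ones(N), rng.uniform(-1, 1, size=(N, 2))])  # prepend x0 = 1
w_true = np.array([0.2, 1.0, -1.0])                                  # hypothetical target
y = np.sign(X @ w_true)                                              # noiseless ±1 labels

# Pseudoinverse (least-squares) solution: w_lin = pinv(X) y.
w_lin = np.linalg.pinv(X) @ y

# PLA: repeatedly pick a misclassified point and apply the update w <- w + y_i x_i.
w = w_lin.copy()
for _ in range(1000):
    misclassified = np.flatnonzero(np.sign(X @ w) != y)
    if misclassified.size == 0:
        break                                   # E_in = 0 on separable data
    i = rng.choice(misclassified)
    w = w + y[i] * X[i]

print("in-sample error:", np.mean(np.sign(X @ w) != y))
```

Starting PLA from the least-squares weights usually shortens the run; starting from zero works as well, just more slowly.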
Learning From Data: It's A Jungle Out There

stochastic noise, K-means, stochastic gradient descent, exploration, overfitting, Lloyd's algorithm, reinforcement, Gaussian processes, augmented error, bootstrapping, ill-posed problems, deterministic noise, exploitation, data snooping, unlabelled data, expectation-maximization, distribution-free learning, Q-learning, logistic regression, linear regression, Rademacher complexity, learning curves, GANs, transfer learning, CART, bagging, Bayesian learning, VC dimension, Gibbs sampling, decision trees, nonlinear transformation, sampling bias, neural networks, Markov Chain Monte Carlo (MCMC), support vectors, Mercer's theorem, AdaBoost, training versus testing, extrapolation, SVM, linear models, no free lunch, graphical models, bioinformatics, ordinal regression, cross validation, HMMs, bias-variance tradeoff, RBF, deep learning, PAC learning, biometrics, error measures, active learning, data contamination, types of learning, multiclass classification, MDL, perceptron learning, random forests, unsupervised learning, one versus all, weak learning, conjugate gradients, is learning feasible?, momentum, RKHS, online learning, Levenberg-Marquardt, Occam's razor, kernel methods, mixture of experts, noisy targets, boosting, weight decay, ranking, multi-agent systems, ensemble methods, AIC, classification, PCA, LLE, kernel PCA, big data, permutation complexity, regularization, primal-dual methods, Boltzmann machines, collaborative filtering, semi-supervised learning, clustering
Navigating the Jungle: Theory

THEORY: VC analysis, bias-variance, complexity, Bayesian, Rademacher, SRM, ...
Navigating the Jungle: Techniques

THEORY: VC analysis, bias-variance, complexity, Bayesian, Rademacher, SRM, ...
TECHNIQUES: Models; Methods
Navigating the Jungle: Models

THEORY: VC analysis, bias-variance, complexity, Bayesian, Rademacher, SRM, ...
TECHNIQUES:
  Models: linear, neural networks, SVM, similarity, Gaussian processes, graphical models, bilinear/SVD, ...
  Methods
Navigating the Jungle: Methods

THEORY: VC analysis, bias-variance, complexity, Bayesian, Rademacher, SRM, ...
TECHNIQUES:
  Models: linear, neural networks, SVM, similarity, Gaussian processes, graphical models, bilinear/SVD, ...
  Methods: regularization, validation, aggregation, preprocessing, ...
Navigating the Jungle: Paradigms

THEORY: VC analysis, bias-variance, complexity, Bayesian, Rademacher, SRM, ...
TECHNIQUES:
  Models: linear, neural networks, SVM, similarity, Gaussian processes, graphical models, bilinear/SVD, ...
  Methods: regularization, validation, aggregation, preprocessing, ...
PARADIGMS: supervised, unsupervised, reinforcement, active, online, unlabeled, transfer learning, big data, ...
Moving Forward

1. What is learning? Output g ≈ f after looking at the data (x_n, y_n).
2. Can we do it? E_in ≈ E_out: simple H, finite d_vc, large N. E_in ≈ 0: good H and good algorithms.
3. How to do it? Linear models, nonlinear transforms. Algorithms: PLA, pseudoinverse, gradient descent.
4. How to do it well? Overfitting: stochastic and deterministic noise. Cures: regularization, validation (a sketch follows this list).
5. General principles? Occam's razor, sampling bias, data snooping.
6. Advanced techniques: similarity, neural networks, SVMs, preprocessing and aggregation.
7. Other learning paradigms: unsupervised, reinforcement.

(Items 1-3 cover concepts and theory; item 4 is where practice comes in.)
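Item 4's two cures, regularization and validation, fit in a few lines. The sketch below is a minimal illustration under assumed settings (a sin(πx) target with added noise, a 10th-order polynomial transform, and an arbitrary grid of λ values), not the lecture's exact experiment: weight decay constrains the fit, and a held-out validation set picks λ.

```python
import numpy as np

# Minimal sketch of weight-decay regularization with validation-based model
# selection: w_reg = (Z^T Z + lambda*I)^{-1} Z^T y, lambda chosen on a held-out set.
# The target, transform order, and lambda grid are illustrative assumptions.

rng = np.random.default_rng(1)

def poly_transform(x, Q=10):
    """Nonlinear transform: x -> (1, x, x^2, ..., x^Q)."""
    return np.vander(x, Q + 1, increasing=True)

# Noisy target: y = sin(pi*x) + stochastic noise.
x = rng.uniform(-1, 1, 40)
y = np.sin(np.pi * x) + 0.3 * rng.normal(size=x.size)

# Split into training and validation sets.
x_train, y_train = x[:25], y[:25]
x_val, y_val = x[25:], y[25:]
Z_train, Z_val = poly_transform(x_train), poly_transform(x_val)

def fit_weight_decay(Z, y, lam):
    """Regularized least squares: minimize ||Zw - y||^2 + lam * ||w||^2."""
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + lam * np.eye(d), Z.T @ y)

# Pick the regularization strength with the smallest validation error.
best = None
for lam in [0.0, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]:
    w = fit_weight_decay(Z_train, y_train, lam)
    E_val = np.mean((Z_val @ w - y_val) ** 2)
    if best is None or E_val < best[0]:
        best = (E_val, lam, w)

print("chosen lambda:", best[1], "validation error:", round(best[0], 3))
```

Validation error is an estimate of E_out for each candidate λ, so selecting the minimizer is itself a (small) learning choice charged to the validation set.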