  1. Statistical Geometry Processing, Winter Semester 2011/2012: Machine Learning

  2. Topics
  • Machine Learning Intro
    – Learning is density estimation
    – The curse of dimensionality
  • Bayesian inference and estimation
    – Bayes rule in action
    – Discriminative and generative learning
  • Markov random fields (MRFs) and graphical models
  • Learning Theory
    – Bias and variance / no free lunch
    – Significance

  3. Machine Learning & Bayesian Statistics

  4. Statistics
  How does machine learning work?
  • Learning: learn a probability distribution
  • Classification: assign probabilities to data
  We will look only at classification problems:
  • Distinguish two classes of objects
  • From ambiguous data

  5. Application
  Scenario:
  • Automatic scales at the supermarket
  • Detect the type of fruit using a camera
  (figure: scale display reading "Banana 1.25 kg, Total 13.15 €")

  6. Learning Probabilities
  Toy Example:
  • We want to distinguish pictures of oranges and bananas
  • We have 100 training pictures for each fruit category
  • From this, we want to derive a rule to distinguish the pictures automatically

  7. Learning Probabilities
  Very simple algorithm:
  • Compute the average color
  • Learn the distribution
  (figure: feature space with red and green axes)

  8. Learning Probabilities
  (figure: training samples in the red/green feature plane)

  9. Simple Learning
  Simple Learning Algorithms:
  • Histograms
  • Fitting Gaussians
  • We will see more
  (figure: red/green feature space, dim = 2..3)
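
  A minimal sketch of the "fitting Gaussians" option: estimate one Gaussian per class from 2D average-color features. The feature values, class means, and spreads below are invented for illustration.

      import numpy as np

      # Made-up training data: average (red, green) color per picture.
      rng = np.random.default_rng(0)
      bananas = rng.normal([0.8, 0.8], 0.05, (100, 2))   # yellowish averages
      oranges = rng.normal([0.9, 0.5], 0.05, (100, 2))   # orange-ish averages

      def fit_gaussian(samples):
          """Density estimation by fitting a Gaussian: mean and covariance."""
          return samples.mean(axis=0), np.cov(samples, rowvar=False)

      def gaussian_density(x, mean, cov):
          """Evaluate the fitted multivariate normal density at x."""
          diff = x - mean
          norm = np.sqrt((2 * np.pi) ** len(mean) * np.linalg.det(cov))
          return np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff) / norm

      banana_mean, banana_cov = fit_gaussian(bananas)
      orange_mean, orange_cov = fit_gaussian(oranges)

      x = np.array([0.85, 0.75])   # average color of a new, unseen picture
      print(gaussian_density(x, banana_mean, banana_cov),
            gaussian_density(x, orange_mean, orange_cov))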

  10. Learning Probabilities
  (figure: the red/green feature plane)

  11. Learning Probabilities
  (figure: the red/green feature plane with the banana/orange decision boundary; query points are labeled "orange" (p = 95%), "banana" (p = 90%), and "banana" (p = 51%))

  12. Machine Learning
  Very simple idea:
  • Collect data
  • Estimate the probability distribution
  • Use the learned probabilities for classification (etc.)
  • We always decide for the most likely case (largest probability)
  Easy to see:
  • If the probability distributions are known exactly, this decision is optimal (in expectation)
  • "Minimal Bayesian risk classifier"
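
  A minimal sketch of this decision rule, assuming the class-conditional densities have already been estimated as Gaussians; the means, covariances, and class priors below are invented for illustration.

      import numpy as np
      from scipy.stats import multivariate_normal

      # Toy class-conditional densities over (red, green) average color.
      class_models = {
          "banana": multivariate_normal(mean=[0.8, 0.8], cov=0.05**2 * np.eye(2)),
          "orange": multivariate_normal(mean=[0.9, 0.5], cov=0.05**2 * np.eye(2)),
      }
      priors = {"banana": 0.5, "orange": 0.5}  # assumed equal class frequencies

      def classify(x):
          """Minimal Bayes risk decision: pick the class with the largest
          posterior, i.e. the largest p(class) * p(x | class)."""
          scores = {c: priors[c] * m.pdf(x) for c, m in class_models.items()}
          return max(scores, key=scores.get)

      print(classify([0.85, 0.75]))   # -> "banana" for these made-up models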

  13. What is the problem?
  Why is machine learning difficult?
  • We need to learn the probabilities
  • Typical problem: high-dimensional input data

  14. High Dimensional Spaces
  • Average color: 3D (RGB)
  • Full image: 100 × 100 pixels, i.e. 30 000 dimensions

  15. High Dimensional Spaces
  (figure: learning from the average color, dim = 2..3, vs. learning from the full image, 30 000 dimensions)

  16. High Dimensional Spaces
  High-dimensional probability spaces:
  • Too much space to fill
  • We can never get a sufficient number of examples
  • Learning is almost impossible
  What can we do?
  • We need additional assumptions
  • Simplify the probability space
  • Model statistical dependencies
  This makes machine learning a hard problem.

  17. Learn From High Dimensional Input
  Learning Strategies:
  • Features to reduce the dimension
    – Average color
    – Boundary shape
    – Other heuristics
    Usually chosen manually (black magic?)
  • High-dimensional learning techniques
    – Neural networks (old school)
    – Support vector machines (the current "standard" technique)
    – AdaBoost, decision trees, ... (many other techniques)
  • Usually used in combination

  18. Basic Idea: Neural Networks
  Classic Solution: Neural Networks
  • Non-linear functions
    – Features as input
    – Combine basic functions with weights w1, w2, ...
  • Optimize to yield
    – (1, 0) on bananas
    – (0, 1) on oranges
  • Fit a non-linear decision boundary to the data
  (figure: network diagram with inputs, weights w1, w2, ..., and outputs)
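
  A rough sketch of this classic setup, not the lecture's implementation: a tiny network with one hidden layer, trained by plain gradient descent on made-up 2D features so that it outputs roughly (1, 0) for banana-like points and (0, 1) for orange-like points. Architecture, data, and learning rate are all invented illustration values.

      import numpy as np

      rng = np.random.default_rng(1)

      # Made-up 2D features; targets (1,0) = banana, (0,1) = orange.
      X = np.vstack([rng.normal([0.8, 0.8], 0.05, (100, 2)),
                     rng.normal([0.9, 0.5], 0.05, (100, 2))])
      Y = np.vstack([np.tile([1.0, 0.0], (100, 1)),
                     np.tile([0.0, 1.0], (100, 1))])

      def sigmoid(z):
          return 1.0 / (1.0 + np.exp(-z))

      # One hidden layer with 8 units, random initial weights.
      W1, b1 = rng.normal(0, 1, (2, 8)), np.zeros(8)
      W2, b2 = rng.normal(0, 1, (8, 2)), np.zeros(2)

      lr = 0.5
      for step in range(2000):
          # Forward pass: combine basic (sigmoid) functions with weights.
          H = sigmoid(X @ W1 + b1)
          out = sigmoid(H @ W2 + b2)
          # Backward pass for a squared-error loss, averaged over the data.
          d_out = (out - Y) * out * (1 - out)
          d_H = (d_out @ W2.T) * H * (1 - H)
          W2 -= lr * (H.T @ d_out) / len(X)
          b2 -= lr * d_out.mean(axis=0)
          W1 -= lr * (X.T @ d_H) / len(X)
          b1 -= lr * d_H.mean(axis=0)

      # A banana-like query point; the output should be roughly (1, 0).
      q = np.array([0.85, 0.75])
      print(sigmoid(sigmoid(q @ W1 + b1) @ W2 + b2))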

  19. Neural Networks
  (figure: network diagram with an input layer, layers l1, l2, ..., a bottleneck, and outputs)

  20. Support Vector Machines
  (figure: a training set and the best separating hyperplane)

  21. Kernel Support Vector Machine
  (figure: the original space mapped to a "feature space")
  Example mapping: (x, y) → (x², xy, y²)
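
  A small sketch of the example mapping (x, y) → (x², xy, y²): two classes that no line separates in the original 2D space become separable by a plane in the mapped space. The ring-shaped data and the separating plane below are invented for illustration.

      import numpy as np

      def feature_map(p):
          """The example mapping from the slide: (x, y) -> (x^2, x*y, y^2)."""
          x, y = p
          return np.array([x * x, x * y, y * y])

      # Inner class: points on a small circle; outer class: on a larger circle.
      rng = np.random.default_rng(2)
      angles = rng.uniform(0, 2 * np.pi, 50)
      inner = np.c_[0.5 * np.cos(angles), 0.5 * np.sin(angles)]
      outer = np.c_[2.0 * np.cos(angles), 2.0 * np.sin(angles)]

      # In feature space, x^2 + y^2 is a linear function of the new coordinates,
      # so a plane separates the two rings even though no line does in 2D.
      phi_inner = np.array([feature_map(p) for p in inner])
      phi_outer = np.array([feature_map(p) for p in outer])
      w = np.array([1.0, 0.0, 1.0])          # plane: w . phi(p) = 1.5
      print((phi_inner @ w < 1.5).all(), (phi_outer @ w > 1.5).all())  # True True

  A kernel SVM exploits the same effect without ever computing the map explicitly: it only evaluates inner products ⟨φ(a), φ(b)⟩ through a kernel function.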

  22. Other Learning Algorithms
  Popular Learning Algorithms:
  • Fitting Gaussians
  • Linear discriminant functions
  • AdaBoost
  • Decision trees
  • ...

  23. More Complex Learning Tasks

  24. Learning Tasks
  Examples of Machine Learning Problems:
  • Pattern recognition
    – Single class (banana / non-banana)
    – Multi-class (banana, orange, apple, pear)
    – How to: density estimation; the highest density minimizes the risk
  • Regression
    – Fit a curve to sparse data
    – How to: curve with parameters, density estimation for the parameters (a sketch follows below)
  • Latent variable regression
    – Regression between observables and hidden variables
    – How to: parametrize, density estimation
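
  A small sketch of the regression recipe under one common assumption (i.i.d. Gaussian noise on the observations, so the maximum-likelihood parameters coincide with the least-squares fit); the data and the cubic model are made up.

      import numpy as np

      rng = np.random.default_rng(5)
      t = np.linspace(0, 1, 15)                               # sparse sample positions
      y = np.sin(2 * np.pi * t) + rng.normal(0, 0.1, t.size)  # noisy observations

      # Curve with parameters: a cubic polynomial. Under i.i.d. Gaussian noise the
      # maximum-likelihood parameters are exactly the least-squares solution.
      A = np.vander(t, 4)                                     # design matrix [t^3, t^2, t, 1]
      coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
      print(coeffs)                                           # fitted polynomial coefficients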

  25. Supervision
  Supervised learning
  • The training set is labeled
  Semi-supervised learning
  • Part of the training set is labeled
  Unsupervised learning
  • No labels; find structure on your own ("clustering")
  Reinforcement learning
  • Learn from experience (losses/gains; robotics)

  26. Principle
  (figure: a training set x1, x2, ..., xk is fed into a model / hypothesis with parameters)

  27. Two Types of Learning
  Estimation:
  • Output the most likely parameters
    – Maximum of the density: "maximum likelihood", "maximum a posteriori"
    – Mean of the distribution
  Inference:
  • Output a probability density
    – A distribution over the parameters
    – More information
  • Marginalize to reduce the dimension
  (figure: a density p(x) with its maximum and mean marked)

  28. Bayesian Models
  Scenario
  • A customer picks a banana (X = 0) or an orange (X = 1)
  • Object X creates image D
  Modeling
  • Given image D (observed), what was X (latent)?
  P(X | D) = P(D | X) · P(X) / P(D)
  P(X | D) ∝ P(D | X) · P(X)
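
  A tiny worked example of the rule above, with invented numbers for the prior P(X) and for the likelihoods P(D | X) of one observed image:

      # Hypothetical numbers, only to illustrate P(X | D) ~ P(D | X) * P(X).
      prior = {0: 0.7, 1: 0.3}           # X = 0: banana, X = 1: orange (assumed frequencies)
      likelihood = {0: 0.02, 1: 0.10}    # P(D | X): how well each class explains image D

      unnormalized = {x: likelihood[x] * prior[x] for x in prior}
      evidence = sum(unnormalized.values())             # P(D), the normalization
      posterior = {x: unnormalized[x] / evidence for x in prior}
      print(posterior)   # {0: 0.318..., 1: 0.681...}: the orange is more likely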

  29. Bayesian Models
  Model for Estimating X:
  P(X | D) ∝ P(D | X) · P(X)
  posterior ∝ data term (likelihood) · prior

  30. Generative vs. Discriminative
  Generative Model:
  P(X | D) ∝ P(D | X) · P(X)
  • Learn P(D | X) (fruit → image) and the prior P(X) (frequency of fruits), then compute the posterior P(X | D)
  Properties
  • Comprehensive model: a full description of how the data is created
  • Might be complex (how to create images of fruit?)

  31. Generative vs. Discriminative
  Discriminative Model:
  P(X | D) ∝ P(D | X) · P(X)
  • Learn P(X | D) (image → fruit) directly; ignore P(D | X) and the prior P(X)
  Properties
  • Easier:
    – Learn a mapping from the phenomenon to the explanation
    – Not trying to explain / understand the whole phenomenon
  • Often easier, but less powerful
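
  One way to make the contrast concrete (not from the slides): fit a generative model, class-conditional Gaussians plus class priors, and a discriminative model, logistic regression for P(X | D), to the same made-up data. The scikit-learn estimators are just convenient stand-ins for the two approaches.

      import numpy as np
      from sklearn.naive_bayes import GaussianNB            # generative: p(D|X) per class + priors
      from sklearn.linear_model import LogisticRegression   # discriminative: models p(X|D) directly

      rng = np.random.default_rng(3)
      X = np.vstack([rng.normal([0.8, 0.8], 0.05, (100, 2)),   # made-up banana features
                     rng.normal([0.9, 0.5], 0.05, (100, 2))])  # made-up orange features
      y = np.array([0] * 100 + [1] * 100)

      generative = GaussianNB().fit(X, y)
      discriminative = LogisticRegression().fit(X, y)
      query = [[0.85, 0.75]]
      print(generative.predict_proba(query), discriminative.predict_proba(query))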

  32. Statistical Dependencies Markov Random Fields and Graphical Models

  33. Problem
  Estimation Problem:
  P(X | D) ∝ P(D | X) · P(X)
  posterior ∝ data term (likelihood) · prior
  • X = a 3D mesh (10K vertices)
  • D = a noisy scan (or the like)
  • Assume P(D | X) is known
  • But: the model P(X) cannot be built
    – Not even enough training data
    – In this part of the universe :-)
  (figure: the unknown mesh, 30 000 dimensions)

  34. Reducing Dependencies
  Problem:
  • p(x1, x2, ..., x10000) is too high-dimensional
  • k states, n variables: O(k^n) density entries
  • General dependencies kill the model
  Idea
  • Hand-craft the dependencies
  • We might know or guess what actually depends on each other and what does not
  • This is the art of machine learning

  35. Graphical Models
  Factorize Models
  • Pairwise models:
    p(x_1, ..., x_n) = (1/Z) · ∏_{i=1..n} p^1_i(x_i) · ∏_{(i,j) ∈ E} p^2_{i,j}(x_i, x_j)
  • Model complexity: O(n k²) parameters
  • Higher order models:
    – Triplets, quadruples as factors
    – Local neighborhoods
  (figure: a 3 × 4 grid of variables x1, ..., x12 with unary factors p^1_i(x_i) and pairwise factors p^2_{i,j}(x_i, x_j) on the edges e_{1,2}, e_{2,3}, ...)
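
  A small sketch of such a pairwise model on a tiny grid; the grid size, the number of states, and the factor tables are invented. It shows the O(n k²) parameter count and why the normalization Z is the expensive part.

      import itertools
      import numpy as np

      rng = np.random.default_rng(4)
      rows, cols, k = 2, 3, 2                      # tiny 2x3 grid, k = 2 states per variable
      n = rows * cols

      # Edges of the grid graph: horizontal and vertical neighbors.
      edges = [(r * cols + c, r * cols + c + 1) for r in range(rows) for c in range(cols - 1)]
      edges += [(r * cols + c, (r + 1) * cols + c) for r in range(rows - 1) for c in range(cols)]

      unary = rng.uniform(0.5, 1.5, (n, k))                  # p^1_i(x_i): k values per node
      pairwise = rng.uniform(0.5, 1.5, (len(edges), k, k))   # p^2_{i,j}(x_i, x_j): k*k per edge

      def unnormalized_p(x):
          """Product of all factors for one joint assignment x (length n)."""
          value = np.prod([unary[i, x[i]] for i in range(n)])
          for e, (i, j) in enumerate(edges):
              value *= pairwise[e, x[i], x[j]]
          return value

      # O(n k^2) parameters instead of k^n density entries; the partition function Z
      # still sums over all k^n assignments (only feasible for tiny examples).
      Z = sum(unnormalized_p(x) for x in itertools.product(range(k), repeat=n))
      x = (0, 1, 1, 0, 1, 0)
      print(unnormalized_p(x) / Z)                 # p(x) under the pairwise model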

  36. Graphical Models
  Markov Random Fields
  • Factorize the density into local "cliques"
  Graphical model
  • Connect variables that are directly dependent
  • Formal model: conditional independence
  (figure: the grid of variables x1, ..., x12 with unary and pairwise factors)

  37. Graphical Models
  Conditional Independence
  • A node is conditionally independent of all others given the values of its direct neighbors
  • I.e., if these neighbor values are set to constants, x7 is independent of all other variables
  Theorem (Hammersley-Clifford):
  • Given the conditional independences as a graph, a (positive) probability density factors over the cliques of the graph
  (figure: the grid of variables; the direct neighbors of x7 separate it from the rest)

  38. Example: Texture Synthesis

  39. (figure: an input image with the selected completion region marked)

  40. Texture Synthesis
  Idea
  • One or more images as example data
  • Learn the image statistics
  • Use the knowledge:
    – Specify boundary conditions
    – Fill in the texture
  (figure: example data and boundary conditions for the completion)

  41. The Basic Idea
  Markov Random Field Model
  • Image statistics
  • How a pixel is colored depends only on its local neighborhood (Markov Random Field)
  • Predict the color from the neighborhood
  (figure: a pixel and its local neighborhood)
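
  A rough sketch of one way to act on "predict the color from the neighborhood", in the spirit of non-parametric, example-based texture synthesis; this particular procedure and the stripe texture are illustration choices, not taken from the slides. For a pixel with unknown color, find the example patch whose known pixels best match the pixel's neighborhood and copy the patch center.

      import numpy as np

      def best_match_color(example, neighborhood, mask, half=2):
          """Return the center color of the example patch that best matches the
          known pixels (mask == True) of the given (2*half+1)^2 neighborhood."""
          size = 2 * half + 1
          best, best_cost = None, np.inf
          h, w = example.shape
          for r in range(h - size + 1):
              for c in range(w - size + 1):
                  patch = example[r:r + size, c:c + size]
                  cost = np.sum((patch[mask] - neighborhood[mask]) ** 2)
                  if cost < best_cost:
                      best, best_cost = patch[half, half], cost
          return best

      # Made-up grayscale example texture (vertical stripes) and a query neighborhood
      # whose center pixel is unknown.
      example = np.tile([0.0, 1.0], (8, 4))            # 8x8 striped example image
      neighborhood = np.tile([0.0, 1.0], (5, 3))[:, :5]
      mask = np.ones((5, 5), dtype=bool)
      mask[2, 2] = False                               # the pixel to be synthesized
      print(best_match_color(example, neighborhood, mask))   # -> 0.0, consistent with the stripes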

  42. A Little Bit of Theory...
  Image statistics:
  • An image of n × m pixels
  • Random variable: x = [x_11, ..., x_nm] ∈ {0, 1, ..., 255}^(n × m)
  • Probability distribution: p(x) = p(x_11, ..., x_nm)
    – 256 choices per pixel, so 256^(n · m) probability values
  It is impossible to learn full images from examples!
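
  The arithmetic behind the last bullet, using Python's arbitrary-precision integers:

      n, m = 100, 100                    # the 100 x 100 image from the earlier slide
      values = 256 ** (n * m)            # one probability entry per possible image
      print(len(str(values)))            # 24083 digits: an astronomically large table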
