Statistical Geometry Processing, Winter Semester 2011/2012: Machine Learning
Topics
• Machine Learning Intro: learning is density estimation; the curse of dimensionality
• Bayesian inference and estimation: Bayes rule in action; discriminative and generative learning
• Markov random fields (MRFs) and graphical models
• Learning Theory: bias and variance / no free lunch; significance
Machine Learning & Bayesian Statistics
Statistics
How does machine learning work?
• Learning: learn a probability distribution
• Classification: assign probabilities to data
We will look only at classification problems:
• Distinguish two classes of objects
• From ambiguous data
Application
Scenario:
• Automatic scales at the supermarket
• Detect the type of fruit using a camera
(figure: checkout display showing "Banana 1.25 kg, Total 13.15 €")
Learning Probabilities
Toy example:
• We want to distinguish pictures of oranges and bananas
• We have 100 training pictures for each fruit category
• From this, we want to derive a rule to distinguish the pictures automatically
Learning Probabilities
Very simple algorithm:
• Compute the average color of each picture
• Learn a distribution over the red–green color plane
(figure: training samples plotted against red and green axes)
Learning Probabilities
(figure: average colors of the training pictures in the red–green plane)
Simple Learning
Simple learning algorithms:
• Histograms
• Fitting Gaussians
• We will see more
(figure: density estimates in the red–green color space, dim = 2..3)
Learning Probabilities
(figure: learned distributions in the red–green plane)
Learning Probabilities
(figure: banana–orange decision boundary in the red–green plane; query points are classified as "orange" (p = 95%), "banana" (p = 90%), and "banana" (p = 51%))
Machine Learning
Very simple idea:
• Collect data
• Estimate the probability distribution
• Use the learned probabilities for classification (etc.)
• We always decide for the most likely case (largest probability)
Easy to see:
• If the probability distributions are known exactly, this decision is optimal (in expectation)
• "Minimum Bayes risk classifier"
(see the sketch below)
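The decision rule on this slide can be written down in a few lines. The following is a minimal sketch, not the lecture's code: it assumes 2-D average-color features, one fitted Gaussian per class (as on the "Simple Learning" slide), and synthetic data in place of the real training pictures.

```python
# Minimal sketch of a "decide for the most likely class" classifier.
# Assumptions (not from the slides): 2-D average-color features,
# one Gaussian per class, synthetic data.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
bananas = rng.normal([0.8, 0.7], 0.05, size=(100, 2))   # (red, green) averages
oranges = rng.normal([0.9, 0.5], 0.05, size=(100, 2))

def fit_gaussian(samples):
    return multivariate_normal(mean=samples.mean(axis=0),
                               cov=np.cov(samples, rowvar=False))

models = {"banana": fit_gaussian(bananas), "orange": fit_gaussian(oranges)}
priors = {"banana": 0.5, "orange": 0.5}   # equal class frequencies assumed

def classify(x):
    # p(class | x) ~ p(x | class) * p(class); pick the largest
    scores = {c: m.pdf(x) * priors[c] for c, m in models.items()}
    return max(scores, key=scores.get)

print(classify([0.82, 0.68]))   # -> "banana"
```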
What is the problem?
Why is machine learning difficult?
• We need to learn the probabilities
• Typical problem: high-dimensional input data
High Dimensional Spaces
• Color: 3 dimensions (RGB)
• Full image: 100 × 100 pixels, 30,000 dimensions
High Dimensional Spaces
(figure: learning from the average color (dim = 2..3) vs. learning from the full image (30,000 dimensions))
High Dimensional Spaces
High-dimensional probability spaces:
• Too much space to fill
• We can never get a sufficient number of examples
• Learning is almost impossible
What can we do?
• We need additional assumptions
• Simplify the probability space
• Model statistical dependencies
This makes machine learning a hard problem.
(see the back-of-the-envelope count below)
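To see why "too much space to fill" bites, here is a back-of-the-envelope count; the 10-bins-per-dimension histogram is my assumption, not a number from the slides.

```python
# Back-of-the-envelope illustration of "too much space to fill".
# Assumption (mine, not from the slides): a histogram with 10 bins per dimension.
bins_per_dim = 10

cells_avg_color = bins_per_dim ** 3        # average color, 3 dimensions
cells_full_image = bins_per_dim ** 30_000  # full 100x100 RGB image

print(cells_avg_color)                 # 1,000 cells: 200 examples say something
print(len(str(cells_full_image)))      # a 30,001-digit number of cells
```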
Learn From High Dimensional Input
Learning strategies:
• Features to reduce the dimension: average color, boundary shape, other heuristics; usually chosen manually (black magic?)
• High-dimensional learning techniques: neural networks (old school), support vector machines (the current "standard" technique), AdaBoost, decision trees, ... (many other techniques)
• Usually used in combination
Basic Idea: Neural Networks
Classic solution: neural networks
• Non-linear functions: features as inputs, basic functions combined with weights w1, w2, ...
• Optimize the weights to yield the outputs (1,0) on bananas and (0,1) on oranges
• Fit a non-linear decision boundary to the data
(figure: network diagram from inputs through weighted units to outputs; a small sketch follows)
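As a hedged illustration only (not the lecture's implementation), the sketch below fits a small feed-forward network to the synthetic two-dimensional color data from before using scikit-learn's MLPClassifier; the hidden-layer size and iteration budget are arbitrary choices of mine.

```python
# Minimal neural-network sketch: fit a non-linear decision boundary
# to 2-D average-color features. Data is synthetic (assumption).
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.8, 0.7], 0.05, size=(100, 2)),   # bananas
               rng.normal([0.9, 0.5], 0.05, size=(100, 2))])  # oranges
y = np.array([0] * 100 + [1] * 100)                           # 0 = banana, 1 = orange

net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
net.fit(X, y)

print(net.predict([[0.82, 0.68]]))        # -> [0], i.e. "banana"
print(net.predict_proba([[0.82, 0.68]]))  # class probabilities
```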
Neural Networks
(figure: layered network diagram from the inputs through hidden layers l1, l2, ... with a bottleneck to the outputs)
Support Vector Machines
(figure: a two-class training set and the best separating hyperplane)
Kernel Support Vector Machines
(figure: the data in the original space vs. in the "feature space" after the mapping)
Example mapping: (x, y) ↦ (x², xy, y²)
(see the sketch below)
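The point of the mapping can be checked numerically. The sketch below uses the variant (x, y) ↦ (x², √2·xy, y²); the extra √2 on the cross term is my addition to the slide's map, and it makes the inner product in feature space equal the squared inner product (x·x')² in the original space, i.e. the degree-2 polynomial kernel that a kernel SVM evaluates without ever building the feature space.

```python
# Kernel-trick sketch: an explicit degree-2 feature map vs. the
# equivalent kernel evaluation in the original space.
# The sqrt(2) factor (my addition to the slide's (x^2, xy, y^2) map)
# makes the two computations match exactly.
import numpy as np
from sklearn.svm import SVC

def feature_map(p):
    x, y = p
    return np.array([x * x, np.sqrt(2) * x * y, y * y])

a, b = np.array([1.0, 2.0]), np.array([3.0, 0.5])
print(np.dot(feature_map(a), feature_map(b)))  # inner product in feature space: 16.0
print(np.dot(a, b) ** 2)                       # same value, computed via the kernel

# An SVM with a polynomial kernel uses this identity implicitly, e.g.:
# svm = SVC(kernel="poly", degree=2, gamma=1.0, coef0=0.0)
```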
Other Learning Algorithms
Popular learning algorithms:
• Fitting Gaussians
• Linear discriminant functions
• AdaBoost
• Decision trees
• ...
More Complex Learning Tasks
Learning Tasks
Examples of machine learning problems:
• Pattern recognition: single class (banana / non-banana) or multi-class (banana, orange, apple, pear). How-to: density estimation; deciding for the highest density minimizes the risk
• Regression: fit a curve to sparse data. How-to: parametrize the curve, density estimation for the parameters
• Latent variable regression: regression between observables and hidden variables. How-to: parametrize, density estimation
Supervision
Supervised learning
• The training set is labeled
Semi-supervised learning
• Part of the training set is labeled
Unsupervised learning
• No labels; find structure on your own ("clustering")
Reinforcement learning
• Learn from experience (losses/gains; robotics)
Principle
(figure: a training set x_1, x_2, ..., x_k is used to fit the parameters of a model / hypothesis)
Two Types of Learning
Estimation:
• Output the most likely parameters
  Maximum of the density: "maximum likelihood", "maximum a posteriori"
  Or the mean of the distribution
Inference:
• Output a probability density
  A distribution over the parameters
  More information
• Marginalize to reduce the dimension
(figure: a density p(x) with its maximum and its mean marked)
(a small numeric contrast of maximum vs. mean follows)
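As a concrete contrast between the two summaries, consider a Beta posterior over a single parameter; the example is mine, not from the slides.

```python
# Illustration (my example, not from the lecture): for a Beta(a, b)
# posterior over a parameter theta, the maximum (MAP estimate) and the
# mean of the distribution are different summaries of the same density.
a, b = 3.0, 9.0            # e.g. posterior after 2 "successes" and 8 "failures"
                           # under a uniform Beta(1, 1) prior
map_estimate = (a - 1) / (a + b - 2)   # mode of Beta(a, b), valid for a, b > 1
posterior_mean = a / (a + b)

print(map_estimate)    # 0.2
print(posterior_mean)  # 0.25
```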
Bayesian Models
Scenario:
• The customer picks a banana (X = 0) or an orange (X = 1)
• Object X creates image D
Modeling:
• Given the image D (observed), what was X (latent)?
P(X | D) = P(D | X) P(X) / P(D)
P(X | D) ∝ P(D | X) P(X)
Bayesian Models
Model for estimating X:
P(X | D) ∝ P(D | X) P(X)
posterior ∝ data term (likelihood) · prior
(a small numeric example follows)
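A tiny numeric example of the posterior ∝ likelihood × prior rule; all numbers are invented for illustration.

```python
# Hedged numeric example of posterior ~ likelihood * prior;
# the numbers are invented for illustration.
prior = {"banana": 0.7, "orange": 0.3}          # P(X): how often each fruit is bought
likelihood = {"banana": 0.2, "orange": 0.6}     # P(D | X) for one observed image D

unnormalized = {fruit: likelihood[fruit] * prior[fruit] for fruit in prior}
evidence = sum(unnormalized.values())            # P(D)
posterior = {fruit: v / evidence for fruit, v in unnormalized.items()}

print(posterior)  # {'banana': ~0.44, 'orange': ~0.56} -> decide "orange"
```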
Generative vs. Discriminative
Generative model:
P(X | D) ∝ P(D | X) P(X)
• Learn P(D | X) (image given fruit) and P(X) (frequency of fruits), then compute P(X | D) (fruit given image)
Properties:
• Comprehensive model: a full description of how the data is created
• Might be complex (how to create images of fruit?)
Generative vs. Discriminative
Discriminative model:
P(X | D) ∝ P(D | X) P(X)
• Ignore P(D | X) and P(X); learn P(X | D) (fruit given image) directly
Properties:
• Easier: learn the mapping from phenomenon to explanation, without trying to explain / understand the whole phenomenon
• Often easier, but less powerful
(the sketch below contrasts the two)
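A hedged side-by-side of the two modeling styles on the synthetic color data: a generative classifier that learns class-conditional densities and class frequencies, versus a discriminative classifier that learns the posterior directly. The choice of GaussianNB and LogisticRegression is mine, not the lecture's.

```python
# Sketch of the generative / discriminative contrast on synthetic
# 2-D color data (my example): GaussianNB models p(D | X) and p(X),
# LogisticRegression models p(X | D) directly.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0.8, 0.7], 0.05, size=(100, 2)),   # bananas
               rng.normal([0.9, 0.5], 0.05, size=(100, 2))])  # oranges
y = np.array([0] * 100 + [1] * 100)

generative = GaussianNB().fit(X, y)              # class-conditional densities + priors
discriminative = LogisticRegression().fit(X, y)  # decision boundary only

query = [[0.82, 0.68]]
print(generative.predict_proba(query))      # posterior via Bayes' rule
print(discriminative.predict_proba(query))  # posterior modeled directly
```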
Statistical Dependencies: Markov Random Fields and Graphical Models
Problem
Estimation problem:
P(X | D) ∝ P(D | X) P(X)
posterior ∝ data term (likelihood) · prior
• X = 3D mesh (10K vertices)
• D = noisy scan (or the like)
• Assume P(D | X) is known
• But: the model P(X) cannot be built
  Not even enough training data in this part of the universe :-)
(figure: a 3D mesh; 30,000 dimensions)
Reducing Dependencies
Problem:
• p(x_1, x_2, ..., x_10000) is too high-dimensional
• k states, n variables: O(k^n) density entries
• General dependencies kill the model
Idea:
• Hand-craft the dependencies
• We might know or guess what actually depends on each other and what does not
• This is the art of machine learning
Graphical Models
Factorized models
• Pairwise models:
  p(x_1, ..., x_n) = (1/Z) ∏_i p_i(x_i) ∏_{(i,j)∈E} p_{i,j}(x_i, x_j)
• Model complexity: O(n k²) parameters
• Higher-order models: triplets or quadruples as factors; local neighborhoods
(figure: a 3 × 4 grid of variables x_1, ..., x_12 with unary factors p_i(x_i) at the nodes and pairwise factors p_{i,j}(x_i, x_j) on the edges between neighbors)
Graphical Models
Markov random fields
• Factorize the density into local "cliques"
Graphical model
• Connect variables that are directly dependent
• Formal model: conditional independence
(figure: the same 3 × 4 grid; edges connect directly dependent neighbors)
Graphical Models
Conditional independence
• A node is conditionally independent of all other nodes given the values of its direct neighbors
• I.e., if these neighbor values are fixed to constants, x_7 is independent of all other variables
Theorem (Hammersley–Clifford):
• Given the conditional independence structure as a graph, a (positive) probability density factorizes over the cliques of the graph
(figure: the grid of variables; the neighbors of x_7 separate it from the rest)
(a small pairwise-MRF sketch follows)
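To make the factorization concrete, the sketch below evaluates the unnormalized pairwise-MRF density on a tiny binary grid and computes the partition function Z by brute force; the grid size, the unary table, and the "neighbors like to agree" pairwise table are all assumptions for illustration.

```python
# Minimal pairwise-MRF sketch (my illustration): evaluate the
# unnormalized density  prod_i p_i(x_i) * prod_{(i,j) in E} p_ij(x_i, x_j)
# on a small 2 x 3 grid of binary variables.
import itertools
import numpy as np

rows, cols, k = 2, 3, 2                       # grid size, k states per variable
nodes = [(r, c) for r in range(rows) for c in range(cols)]
edges = [((r, c), (r, c + 1)) for r in range(rows) for c in range(cols - 1)] + \
        [((r, c), (r + 1, c)) for r in range(rows - 1) for c in range(cols)]

unary = np.array([0.6, 0.4])                  # p_i(x_i), the same for every node
pairwise = np.array([[0.9, 0.1],              # p_ij(x_i, x_j): neighbors prefer
                     [0.1, 0.9]])             # to take equal values

def unnormalized(assignment):                 # assignment: dict node -> state
    value = np.prod([unary[assignment[n]] for n in nodes])
    value *= np.prod([pairwise[assignment[a], assignment[b]] for a, b in edges])
    return value

# Partition function Z by brute force -- feasible only because k**6 = 64 here.
Z = sum(unnormalized(dict(zip(nodes, states)))
        for states in itertools.product(range(k), repeat=len(nodes)))
print(Z)
```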
Example: Texture Synthesis
(figure: an image with a completion region selected for texture synthesis)
Texture Synthesis
Idea
• One or more images as example data
• Learn the image statistics
• Use the knowledge: specify boundary conditions, fill in the texture
(figure: example data and boundary conditions)
The Basic Idea
Markov random field model
• Image statistics
• How a pixel is colored depends only on its local neighborhood (Markov random field)
• Predict the pixel color from its neighborhood
(figure: a pixel and its local neighborhood; a sketch of this prediction follows)
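One classic, non-parametric way to realize this prediction, not necessarily the method used in the lecture, is to scan the example image for the neighborhood that best matches the known pixels and copy its center; the sketch below does exactly that on a toy checkerboard texture.

```python
# Hedged sketch of neighborhood-based pixel prediction (in the spirit of
# non-parametric texture synthesis; not necessarily the lecture's method):
# predict a pixel by finding the best-matching neighborhood in the example.
import numpy as np

def predict_pixel(example, neighborhood, half=2):
    # example: 2-D grayscale example image
    # neighborhood: (2*half+1) x (2*half+1) patch; its center is treated as unknown
    h, w = example.shape
    size = 2 * half + 1
    mask = np.ones((size, size), dtype=bool)
    mask[half, half] = False                       # ignore the unknown center
    best_value, best_cost = None, np.inf
    for r in range(h - size + 1):
        for c in range(w - size + 1):
            patch = example[r:r + size, c:c + size]
            cost = np.sum((patch[mask] - neighborhood[mask]) ** 2)
            if cost < best_cost:
                best_cost, best_value = cost, patch[half, half]
    return best_value

example = np.tile(np.array([[0.0, 1.0], [1.0, 0.0]]), (8, 8))  # checkerboard texture
query = example[3:8, 3:8].copy()      # a 5x5 neighborhood taken from the texture
print(predict_pixel(example, query))  # reproduces the missing center value
```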
A Little Bit of Theory...
Image statistics:
• An image of n × m pixels
• Random variable: x = [x_11, ..., x_nm] ∈ {0, 1, ..., 255}^(n × m)
• Probability distribution: p(x) = p(x_11, ..., x_nm), i.e. 256 choices per pixel and 256^(n × m) probability values
It is impossible to learn full images from examples!
(the numbers below make this concrete)
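Plugging in the 100 × 100 image from the earlier slides makes the claim concrete; the comparison with a pairwise-MRF parameter count is my addition.

```python
# Concrete numbers for the "impossible to learn" claim (assuming the
# 100 x 100 image from the earlier slides, 256 values per pixel).
n, m = 100, 100
values_per_pixel = 256

full_table_entries = values_per_pixel ** (n * m)     # 256^(n*m) probability values
print(len(str(full_table_entries)))                  # a number with 24,083 digits

# A pairwise MRF over the same grid needs only on the order of n*m*k^2 parameters:
pairwise_parameters = n * m * values_per_pixel ** 2
print(pairwise_parameters)                           # 655,360,000 -- large but finite
```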