Slide from Pieter Abbeel
• Gaussian with mean (µ) and standard deviation (σ)
$$X \sim N(\mu, \sigma^2),\quad Y = aX + b \;\;\Rightarrow\;\; Y \sim N(a\mu + b,\; a^2\sigma^2)$$
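As a quick illustrative check (numbers are my own, not from the slide): if $X \sim N(1, 4)$ and $Y = 3X + 2$, then $Y \sim N(3\cdot 1 + 2,\; 3^2\cdot 4) = N(5, 36)$.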
$$\left.\begin{aligned} X_1 &\sim N(\mu_1, \sigma_1^2)\\ X_2 &\sim N(\mu_2, \sigma_2^2)\end{aligned}\right\}
\;\Rightarrow\; p(X_1)\cdot p(X_2) \sim N\!\left(\frac{\sigma_2^2}{\sigma_1^2+\sigma_2^2}\,\mu_1 + \frac{\sigma_1^2}{\sigma_1^2+\sigma_2^2}\,\mu_2,\;\; \frac{1}{\sigma_1^{-2}+\sigma_2^{-2}}\right)$$
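For instance (again with illustrative numbers), fusing $X_1 \sim N(0, 1)$ with $X_2 \sim N(2, 1)$ gives mean $\tfrac{1}{2}\cdot 0 + \tfrac{1}{2}\cdot 2 = 1$ and variance $\tfrac{1}{1+1} = \tfrac{1}{2}$: the product is more certain than either factor alone.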
Picture from [Bishop: Pattern Recognition and Machine Learning, 2006]
$$p(\mathbf{x}) = N(\boldsymbol{\mu}, \Sigma), \qquad
\mathbf{x} = \begin{pmatrix} x_a \\ x_b \end{pmatrix},\;\;
\boldsymbol{\mu} = \begin{pmatrix} \mu_a \\ \mu_b \end{pmatrix},\;\;
\Sigma = \begin{pmatrix} \Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb} \end{pmatrix}$$
$$p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\; e^{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})}$$
Slide from Pieter Abbeel " µ = [1; 0] " µ = [-.5; 0] " µ = [-1; -1.5] " Σ = [1 0; 0 1] " Σ = [1 0; 0 1] " Σ = [1 0; 0 1] 10/6/16 CSE-571: Robotics 6
Slide from Pieter Abbeel
• µ = [0; 0], Σ = [1 0; 0 1]
• µ = [0; 0], Σ = [.6 0; 0 .6]
• µ = [0; 0], Σ = [2 0; 0 2]
Slide from Pieter Abbeel " µ = [0; 0] " µ = [0; 0] " µ = [0; 0] " Σ = [1 0; 0 1] " Σ = [1 0.5; 0.5 1] " Σ = [1 0.8; 0.8 1] 10/6/16 CSE-571: Robotics 8
Slide from Pieter Abbeel " µ = [0; 0] " µ = [0; 0] " µ = [0; 0] " Σ = [1 -0.5 ; -0.5 1] " Σ = [1 -0.8 ; -0.8 1] " Σ = [3 0.8 ; 0.8 1] 1 3 10/6/16 CSE-571: Robotics 9
Pictures from [Bishop: PRML, 2006]
• Marginalizing the joint distribution results in a Gaussian:
$$p(x_a) = \int p(x_a, x_b)\, dx_b, \qquad
p\!\left(\begin{bmatrix} x_a \\ x_b \end{bmatrix}\right) = N\!\left(\begin{bmatrix} \mu_a \\ \mu_b \end{bmatrix}, \begin{bmatrix} \Sigma_{aa} & \Sigma_{ab} \\ \Sigma_{ba} & \Sigma_{bb} \end{bmatrix}\right)
\;\;\Rightarrow\;\; p(x_a) = N(\mu_a, \Sigma_{aa})$$
• Conditioning also leads to a Gaussian:
$$p(x_a \mid x_b) = N(\mu_{a|b}, \Sigma_{a|b})$$
$$\mu_{a|b} = \underbrace{\mu_a}_{\text{prior mean}} + \underbrace{\Sigma_{ab}}_{\text{cross-covariance}}\, \underbrace{\Sigma_{bb}^{-1}}_{\text{prior variance (b)}}\, (\underbrace{x_b}_{\text{observed value}} - \mu_b)$$
$$\Sigma_{a|b} = \underbrace{\Sigma_{aa}}_{\text{prior variance (a)}} - \underbrace{\Sigma_{ab}\Sigma_{bb}^{-1}\Sigma_{ba}}_{\text{shrink term } (\geq 0)}$$
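A minimal NumPy sketch of these two operations, following the block formulas above; the joint covariance and the observed value of x_b are illustrative numbers, not from the slides.

```python
import numpy as np

# Joint Gaussian over (x_a, x_b), each 1-D here for simplicity.
mu = np.array([1.0, 2.0])                  # [mu_a, mu_b]
Sigma = np.array([[1.0, 0.6],
                  [0.6, 2.0]])             # [[S_aa, S_ab], [S_ba, S_bb]]

mu_a, mu_b = mu[:1], mu[1:]
S_aa, S_ab = Sigma[:1, :1], Sigma[:1, 1:]
S_ba, S_bb = Sigma[1:, :1], Sigma[1:, 1:]

# Marginal: p(x_a) = N(mu_a, S_aa) -- simply drop the other block.
# Conditional: p(x_a | x_b) = N(mu_a|b, S_a|b).
x_b_obs = np.array([3.0])                  # observed value of x_b
mu_a_given_b = mu_a + S_ab @ np.linalg.solve(S_bb, x_b_obs - mu_b)
S_a_given_b = S_aa - S_ab @ np.linalg.solve(S_bb, S_ba)

print(mu_a_given_b, S_a_given_b)           # conditioning shrinks the variance
```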
• Modeling the relationship between real-valued variables in data
  – Sensor models, dynamics models, stock market, etc.
• Two broad classes of models:
  – Parametric:
    – Learn a model of the data, use the model to make new predictions
    – E.g., linear, non-linear, neural networks, etc.
  – Non-parametric:
    – Keep the data around and use it to make new predictions
    – E.g., nearest-neighbor methods, locally weighted regression, Gaussian processes, etc.
• Idea: Summarize the data using a learned model
  – Linear, polynomial
  – Neural networks, etc.
• Computationally efficient; trade off complexity vs. generalization
[Figure: "Parametric models" — training set fit by linear and polynomial models]
• Idea: Use the nearest neighbor's prediction (with some interpolation)
  – Non-parametric; keeps all the data
  – E.g., 1-NN, NN with linear interpolation
• Easy, but needs a lot of data
  – Best you can do in the limit of infinite data
• Computationally expensive in high dimensions
[Figure: "Non-parametric models" — training set with 1-NN and NN-linear predictions]
• Idea: Interpolate based on "close" training data
  – Closeness defined using a "kernel" function
  – Test output is a weighted interpolation of training outputs
  – Locally weighted regression, Gaussian processes
• Can model arbitrary (smooth) functions
  – Need to keep around some (maybe all) training data
[Figure: "Smooth non-parametric models" — training set with LWR-NN, GP, and GP-variance predictions]
• Non-parametric regression model
• Distribution over functions
• Fully specified by the training data, a mean function, and a covariance function
• Covariance given by a "kernel", which measures the distance between inputs in kernel space
• Given inputs (x) and targets (y):
$$D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\} = (X, \mathbf{y})$$
• GPs model the targets as a noisy function of the inputs:
$$y_i = f(x_i) + \varepsilon; \qquad \varepsilon \sim N(0, \sigma_n^2)$$
• Formally, a GP is a collection of random variables, any finite number of which have a joint Gaussian distribution:
$$f(x) \sim GP(m(x), k(x, x'))$$
$$m(x) = E[f(x)], \qquad k(x, x') = E[(f(x) - m(x))(f(x') - m(x'))]$$
• Given a (finite) set of inputs X, a GP models the outputs y as jointly Gaussian (with noise variance σₙ²):
$$P(\mathbf{y} \mid X) = N\!\left(m(X),\; K(X, X) + \sigma_n^2 I\right)$$
$$m = \begin{pmatrix} m(x_1) \\ m(x_2) \\ \vdots \\ m(x_n) \end{pmatrix}, \qquad
K = \begin{pmatrix} k(x_1, x_1) & \cdots & k(x_1, x_n) \\ \vdots & \ddots & \vdots \\ k(x_n, x_1) & \cdots & k(x_n, x_n) \end{pmatrix}$$
• Usually, we assume a zero-mean prior
  – Can define other mean functions (constant, polynomials, etc.)
• The covariance matrix K is defined through the "kernel" function:
  – Specifies the covariance of the outputs as a function of the inputs
• Example: squared-exponential (SE) kernel
  – Covariance decreases with distance in input space
  – Similar input points will have similar outputs
$$k(x, x') = \sigma_f^2\; e^{-\frac{1}{2}(x - x')^T W (x - x')}$$
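A minimal sketch of the SE kernel and the resulting covariance matrix K(X, X). It assumes a scalar length scale (i.e., W = I / ℓ²); the function and variable names are my own, not from the slides.

```python
import numpy as np

def se_kernel(x1, x2, sigma_f=1.0, length_scale=1.0):
    """k(x, x') = sigma_f^2 * exp(-0.5 * ||x - x'||^2 / length_scale^2)."""
    d2 = np.sum((x1 - x2) ** 2)
    return sigma_f ** 2 * np.exp(-0.5 * d2 / length_scale ** 2)

def kernel_matrix(X1, X2, **kw):
    """Pairwise kernel evaluations between two sets of inputs."""
    return np.array([[se_kernel(a, b, **kw) for b in X2] for a in X1])

X = np.linspace(0, 10, 5).reshape(-1, 1)   # 5 one-dimensional inputs
K = kernel_matrix(X, X)                    # 5x5 covariance of the outputs
```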
Pictures from [Bishop: PRML, 2006]
• GP prior: outputs are jointly zero-mean Gaussian:
$$P(\mathbf{y} \mid X) = N(\mathbf{0},\; K + \sigma_n^2 I)$$
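A short sketch of what this prior means in practice: each draw from the zero-mean joint Gaussian is one "function" evaluated at the inputs. It reuses kernel_matrix from the earlier sketch; the noise level is illustrative.

```python
import numpy as np

X = np.linspace(0, 10, 100).reshape(-1, 1)
sigma_n = 0.1
K = kernel_matrix(X, X) + sigma_n ** 2 * np.eye(len(X))

# Three sample functions from the GP prior, evaluated at the inputs X.
samples = np.random.multivariate_normal(np.zeros(len(X)), K, size=3)
```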
• Training data:
$$D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\} = (X, \mathbf{y})$$
• Test pair (y* unknown): $\{x_*, y_*\}$
• GP outputs are jointly Gaussian:
$$P(\mathbf{y}, y_* \mid X, x_*) = N(\mu, \Sigma); \qquad P(\mathbf{y} \mid X) = N(\mathbf{0},\; K + \sigma_n^2 I)$$
• Conditioning on y (recall the Gaussian conditional: $p(x_a \mid x_b) = N(\mu_{a|b}, \Sigma_{a|b})$ with $\mu_{a|b} = \mu_a + \Sigma_{ab}\Sigma_{bb}^{-1}(x_b - \mu_b)$ and $\Sigma_{a|b} = \Sigma_{aa} - \Sigma_{ab}\Sigma_{bb}^{-1}\Sigma_{ba}$):
$$P(y_* \mid x_*, \mathbf{y}, X) = N(\mu_*, \sigma_*^2)$$
$$\mu_* = k_*^T \left(K + \sigma_n^2 I\right)^{-1} \mathbf{y}$$
$$\sigma_*^2 = k_{**} - k_*^T \left(K + \sigma_n^2 I\right)^{-1} k_*$$
$$k_*[i] = k(x_*, x_i); \qquad k_{**} = k(x_*, x_*)$$
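A minimal sketch of GP prediction by conditioning, implementing the equations above with a zero-mean prior; it reuses se_kernel and kernel_matrix from the earlier sketch, and the toy data at the end is illustrative.

```python
import numpy as np

def gp_predict(X, y, x_star, sigma_n=0.1, **kernel_kw):
    """Return predictive mean and variance at a single test input x_star."""
    K = kernel_matrix(X, X, **kernel_kw) + sigma_n ** 2 * np.eye(len(X))
    k_star = kernel_matrix(X, np.atleast_2d(x_star), **kernel_kw)[:, 0]  # k(x*, x_i)
    k_ss = se_kernel(x_star, x_star, **kernel_kw)                        # k(x*, x*)

    alpha = np.linalg.solve(K, y)                       # (K + sigma_n^2 I)^{-1} y
    mu_star = k_star @ alpha                            # predictive mean
    var_star = k_ss - k_star @ np.linalg.solve(K, k_star)  # predictive variance
    return mu_star, var_star

# Example usage with toy data:
X_train = np.linspace(0, 10, 20).reshape(-1, 1)
y_train = np.sin(X_train[:, 0]) + 0.1 * np.random.randn(20)
mu, var = gp_predict(X_train, y_train, np.array([5.0]))
```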
• Noise standard deviation (σₙ)
  – Affects how strongly a new observation changes the predictions (and the covariance)
• Kernel (choose based on the data)
  – SE, exponential, Matérn, etc.
• Kernel hyperparameters, e.g., for the SE kernel $k(x, x') = \sigma_f^2\, e^{-\frac{1}{2}(x - x')^T W (x - x')}$:
  – Length scale (how fast the function changes)
  – Scale factor (how large the function variance is)
Pictures from [Bishop: PRML, 2006]
$$k(x, x') = \theta_0 \exp\!\left(-\frac{\theta_1}{2}\, \|x - x'\|^2\right) + \theta_2 + \theta_3\, x^T x'$$
• Maximize the data log likelihood:
$$\theta^* = \arg\max_{\theta}\; p(\mathbf{y} \mid X, \theta)$$
$$\log p(\mathbf{y} \mid X, \theta) = -\frac{1}{2}\mathbf{y}^T \left(K + \sigma_n^2 I\right)^{-1} \mathbf{y} \;-\; \frac{1}{2}\log\left|K + \sigma_n^2 I\right| \;-\; \frac{n}{2}\log 2\pi$$
• Compute derivatives w.r.t. the parameters $\theta = \langle \sigma_n^2, \sigma_f^2, \ell \rangle$
• Optimize using conjugate gradient descent
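A minimal sketch of the log marginal likelihood as a function of the hyperparameters, reusing kernel_matrix from earlier. In practice the gradient is derived analytically and fed to a conjugate-gradient optimizer; here the negative of this value could simply be handed to a generic optimizer.

```python
import numpy as np

def log_marginal_likelihood(X, y, sigma_n, sigma_f, length_scale):
    n = len(X)
    K = kernel_matrix(X, X, sigma_f=sigma_f, length_scale=length_scale)
    K_y = K + sigma_n ** 2 * np.eye(n)
    L = np.linalg.cholesky(K_y)                         # stable solve and log-det
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # (K_y)^{-1} y
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))                # = -0.5 * log |K_y|
            - 0.5 * n * np.log(2 * np.pi))
```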
• Learn hyperparameters via numerical methods
• Learn the noise model at the same time
• System:
  – Commercial blimp envelope with custom gondola
  – XScale-based computer with Bluetooth connectivity
  – Two main motors plus a tail motor (3D control)
  – Ground truth obtained via a VICON motion capture system
$$\dot{s} = \frac{d}{dt}\begin{bmatrix} \xi \\ v \\ \omega \end{bmatrix}
= \begin{bmatrix} H(\xi)\, v \\ M^{-1}\left(\sum \text{Forces} - \omega \times Mv\right) \\ J^{-1}\left(\sum \text{Torques} - \omega \times J\omega\right) \end{bmatrix}$$
• 12-D state = [pos, rot, transvel, rotvel]
• Describes the evolution of the state as an ODE
• Forces / torques considered: buoyancy, gravity, drag, thrust
• 16 parameters are learned by optimization on ground-truth motion capture data
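A schematic, runnable sketch of the rigid-body ODE structure above. The net body-frame force and torque (buoyancy, gravity, drag, thrust, computed from the 16 learned parameters) are assumed to be supplied by the parametric model and passed in; the pose-rate map H(ξ) is only stubbed with the identity here, so the names and that simplification are mine, not the slides'.

```python
import numpy as np

def blimp_state_derivative(xi, v, omega, forces, torques, M, J):
    """d/dt [xi, v, omega] for a 12-D state [pos, rot, transvel, rotvel]."""
    H = np.eye(6)                                   # placeholder for the true H(xi)
    xi_dot = H @ np.concatenate([v, omega])         # pose kinematics
    v_dot = np.linalg.solve(M, forces - np.cross(omega, M @ v))       # translational dynamics
    omega_dot = np.linalg.solve(J, torques - np.cross(omega, J @ omega))  # rotational dynamics
    return np.concatenate([xi_dot, v_dot, omega_dot])
```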
[Diagram: trajectory of states s₁, s₂, s₃, … with controls c₁, c₂, … and state changes Δs]
• Use the ground-truth state to extract dynamics data:
$$D_S = \left\langle [s_1, c_1], \Delta s_1 \right\rangle, \left\langle [s_2, c_2], \Delta s_2 \right\rangle, \ldots$$
• Learn the model using Gaussian process regression
  – Learns the process noise inherent in the system
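A minimal sketch of turning a ground-truth trajectory into this training set: inputs are (state, control) pairs and targets are the observed state changes. One GP is typically trained per output dimension (e.g., by calling gp_predict from earlier on each column of the targets); the array shapes are my assumption.

```python
import numpy as np

def make_dynamics_dataset(states, controls):
    """states: (T, d_s), controls: (T-1, d_c) -> inputs (T-1, d_s+d_c), targets (T-1, d_s)."""
    inputs = np.hstack([states[:-1], controls])   # [s_t, c_t]
    targets = states[1:] - states[:-1]            # Delta s_t = s_{t+1} - s_t
    return inputs, targets
```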
[Diagram: parametric prediction f([s₁, c₁]) vs. observed state change Δs₁]
• Combine the GP model with the parametric model: the GP is trained on the residuals
$$D_X = \left\langle [s_1, c_1],\; \Delta s_1 - f([s_1, c_1]) \right\rangle, \ldots$$
• Advantages
  – Captures aspects of the system not considered by the parametric model
  – Learns a noise model in the same way as GP-only models
  – Higher accuracy for the same amount of training data
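A minimal sketch of building the Enhanced-GP training targets: the GP only has to capture what the parametric physics model misses. Here `parametric_delta` is a hypothetical stand-in for the learned ODE model's one-step prediction, not a name from the slides.

```python
import numpy as np

def make_residual_dataset(states, controls, parametric_delta):
    """Targets are observed state changes minus the parametric model's prediction."""
    inputs = np.hstack([states[:-1], controls])
    predicted = np.array([parametric_delta(s, c) for s, c in zip(states[:-1], controls)])
    targets = (states[1:] - states[:-1]) - predicted   # Delta s - f([s, c])
    return inputs, targets
```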