TensorFlow Probability
Joshua V. Dillon, Software Engineer, Google Research
What is TensorFlow Probability?
An open source Python library built on TensorFlow which makes it easy to combine deep learning with probabilistic models on modern hardware. It is for:
● Statisticians/data scientists: R-like capabilities that run out-of-the-box on TPUs + GPUs.
● ML researchers/practitioners: Build deep models which capture uncertainty.
Why use TensorFlow Probability?
A deep network predicting binary outcomes is "just" a fancy parametrization of a Bernoulli distribution. Great! Now what?
Encode knowledge through richer distributional assumptions!
● control prediction variance
● prior knowledge
● ask (and answer) tougher questions
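For instance, a regression network can output the parameters of a predictive distribution rather than a point estimate. A minimal sketch in the deck's TF 1.x style (the layer sizes and placeholder shapes below are illustrative assumptions, not from the slides):

import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions

x = tf.placeholder(tf.float32, [None, 10])  # Features (hypothetical shape).
y = tf.placeholder(tf.float32, [None])      # Targets.

hidden = tf.layers.dense(x, 32, activation=tf.nn.relu)
loc = tf.layers.dense(hidden, 1)[..., 0]
scale = tf.nn.softplus(tf.layers.dense(hidden, 1))[..., 0]

# The network parametrizes a full Normal, so prediction variance is explicit.
predictive = tfd.Normal(loc=loc, scale=scale)
loss = -tf.reduce_mean(predictive.log_prob(y))  # Negative log-likelihood.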
Take Home Message
Express your domain knowledge as a probabilistic model. Use TFP to execute it.
How do I use TensorFlow Probability?
Build model. Do inference.
How do I use TensorFlow Probability?
Build model. Do inference.
Canned approach: GLMs
Generalized Linear Models

# Build model.
model = tfp.glm.Bernoulli()

# Fit model.
coeffs, linear_response, is_converged, num_iter = \
    tfp.glm.fit_sparse(
        model_matrix=x,
        response=y,
        l1_regularizer=0.5,  # Induces sparse weights.
        l2_regularizer=1.,   # Also prevents over-fitting.
        model=model)
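A minimal end-to-end sketch with synthetic data (the data-generating code below is illustrative, not from the slides; it uses tfp.glm.fit, the dense counterpart of fit_sparse):

import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

# Synthetic design matrix and binary response (hypothetical).
x = np.random.randn(1000, 5).astype(np.float32)
true_coeffs = np.array([1., -2., 0., 0.5, 0.], dtype=np.float32)
y = np.random.binomial(
    1, 1. / (1. + np.exp(-x.dot(true_coeffs)))).astype(np.float32)

coeffs, linear_response, is_converged, num_iter = tfp.glm.fit(
    model_matrix=x,
    response=y,
    model=tfp.glm.Bernoulli())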
How do I use TensorFlow Probability?
Build model: Distributions, Bijectors, Layers / Losses, Edward2.
Do inference: MCMC, Variational Inference, Optimizers.
class Distribution(object):
  # Monte Carlo
  def sample(self, sample_shape=(), seed=None): pass
  # Evaluate
  def prob(self, value): pass
  def cdf(self, value): pass
  def survival_function(self, value): pass
  # Summarize
  def mean(self): pass
  def variance(self): pass
  def stddev(self): pass
  def mode(self): pass
  def quantile(self, p): pass
  def entropy(self): pass
  # Compare
  def cross_entropy(self, other): pass
  # Shape
  def event_shape(self): pass
  def batch_shape(self): pass
"Hello, World!" import tensorflow_probability as tfp tfd = tfp.distributions d = tfd.Normal(loc=0., scale=1.) x = d.sample() # Draw random point. px = d.prob(x) # Compute density/mass. Confidential + Proprietary
Distributions are Expressive

factorial_mog = tfd.Independent(
    tfd.MixtureSameFamily(
        # Uniform weight on each component.
        mixture_distribution=tfd.Categorical(
            logits=tf.zeros([num_vars, num_components])),
        components_distribution=tfd.MultivariateNormalDiag(
            loc=mu, scale_diag=[sigma])),
    reinterpreted_batch_ndims=1)

samples = factorial_mog.sample(1000)
How do I use TensorFlow Probability?
Build model: Distributions, Bijectors, Layers / Losses, Edward2.
Do inference: MCMC, Variational Inference, Optimizers.
class Bijector(object):
  # Compute samples
  def forward(self, x): pass
  def forward_log_det_jacobian(self, x): pass
  # Compute probabilities
  def inverse(self, x): pass
  def inverse_log_det_jacobian(self, x, event_ndims): pass
  # Shape
  def forward_event_shape(self, x): pass
  def forward_min_event_ndims(self, x): pass
  def inverse_event_shape(self, x): pass
  def inverse_min_event_ndims(self, x): pass
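A minimal sketch of how a Bijector pairs with a Distribution (a standard TransformedDistribution example, not from the slides): pushing a standard Normal through tfb.Exp yields a log-normal.

lognormal = tfd.TransformedDistribution(
    distribution=tfd.Normal(loc=0., scale=1.),
    bijector=tfp.bijectors.Exp())

x = lognormal.sample(10)  # Sample the base Normal, then apply forward().
px = lognormal.prob(x)    # Uses inverse() and inverse_log_det_jacobian().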
Bijectors Transform Distributions

# Masked Autoregressive Flow for Density Estimation.
# Papamakarios et al., NIPS 2017.
maf = tfp.distributions.TransformedDistribution(
    distribution=tfp.distributions.Normal(loc=0., scale=1.),
    bijector=tfp.bijectors.MaskedAutoregressiveFlow(  # Or your own DNN.
        shift_and_log_scale_fn=tfb.masked_autoregressive_default_template(
            hidden_layers=[512, 512])),
    event_shape=[dims])

loss = -maf.log_prob(x)  # DNN-powered PDF. Wow!
Bijectors Transform Distributions

# Improved Variational Inference with Inverse Autoregressive Flow.
# Kingma et al., NIPS 2016. Different paper, but easy in TFP.
iaf = tfp.distributions.TransformedDistribution(
    distribution=tfp.distributions.Normal(loc=0., scale=1.),
    bijector=tfp.bijectors.Invert(
        tfp.bijectors.MaskedAutoregressiveFlow(
            shift_and_log_scale_fn=tfb.masked_autoregressive_default_template(
                hidden_layers=[512, 512]))),
    event_shape=[dims])

loss = -iaf.log_prob(x)  # DNN-powered PDF. Wow!
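In both cases the flow is fit by minimizing the negative log-likelihood. A hedged training sketch in the deck's TF 1.x style (x is a batch of training data and num_steps is a hypothetical step count):

loss = -tf.reduce_mean(iaf.log_prob(x))
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  for _ in range(num_steps):
    sess.run(train_op)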
Use Case: Anomaly Detection
("Bayesian Methods for Hackers" by Cameron Davidson-Pilon)
Code this up in TFP

def joint_log_prob(count_data, lambda_1, lambda_2, tau):
  alpha = 1. / count_data.mean()
  rv_lambda = tfd.Exponential(rate=alpha)
  rv_tau = tfd.Uniform()
  indices = tf.to_int32(
      tau * count_data.size <= tf.range(count_data.size, dtype=tf.float32))
  lambda_ = tf.gather([lambda_1, lambda_2], indices)
  rv_x = tfd.Poisson(rate=lambda_)
  return (rv_lambda.log_prob(lambda_1)
          + rv_lambda.log_prob(lambda_2)
          + rv_tau.log_prob(tau)
          + tf.reduce_sum(rv_x.log_prob(count_data)))
Code this up in TFP

Just add up the log density, and return! (Same code as the previous slide.)
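As a quick sanity check, the function can be evaluated at any point in parameter space (the values below are illustrative; count_data is the observed NumPy array of daily message counts):

lp = joint_log_prob(count_data,
                    lambda_1=tf.constant(15.),
                    lambda_2=tf.constant(25.),
                    tau=tf.constant(0.5))
# lp is a scalar Tensor: the unnormalized log posterior at that point.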
What are the posterior distributions?
How do I use TensorFlow Probability?
Build model: Distributions, Bijectors, Layers / Losses, Edward2.
Do inference: MCMC, Variational Inference, Optimizers.
Sampling Posterior

Setup: We'll use transformed HMC to draw 10K samples from our posterior.

[lambda_1, lambda_2, tau], _ = tfp.mcmc.sample_chain(
    num_results=int(10e3),
    num_burnin_steps=int(1e3),
    current_state=initial_chain_state,
    kernel=tfp.mcmc.TransformedTransitionKernel(
        inner_kernel=tfp.mcmc.HamiltonianMonteCarlo(
            target_log_prob_fn=lambda *s: joint_log_prob(count_data, *s),
            num_leapfrog_steps=2,
            step_size=tf.Variable(1.),
            step_size_update_fn=tfp.mcmc.make_simple_step_size_update_policy()),
        bijector=[
            tfp.bijectors.Exp(),        # lambda_1
            tfp.bijectors.Exp(),        # lambda_2
            tfp.bijectors.Sigmoid()]))  # tau
Sampling Posterior

The bijectors map each random variable's support to the unconstrained reals. This ensures HMC proposals always have >0 probability and the chain doesn't get stuck. (Same code as the previous slide.)
Sampling Posterior

The unnormalized posterior log-density is passed in via a closure over count_data. So easy! (Same code as the previous slide.)
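Once the chain is built, a session run materializes the samples. A sketch (TF 1.x session style) of summarizing them; the summary statistics chosen here are an assumption, not from the slides:

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  lambda_1_, lambda_2_, tau_ = sess.run([lambda_1, lambda_2, tau])

print('posterior mean of lambda_1:', lambda_1_.mean())
print('posterior mean of lambda_2:', lambda_2_.mean())
print('posterior mean switchpoint (day):', (tau_ * count_data.size).mean())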
And the answer is?!
More complicated model. Same story.
("Multilevel Bayesian Models of Categorical Data Annotation" by Bob Carpenter)
Code this up in TFP

def joint_log_prob(x, annotators, items,
                   pi, rho, c, delta, mu, sigma, gamma):
  # Items plate. (I)
  rv_pi = tfd.Uniform(low=0., high=1.)
  rv_rho = tfd.Uniform(low=0., high=50.)
  rv_c = tfd.Uniform(low=0., high=1.)
  rv_delta = tfd.Normal(
      loc=0., scale=tf.gather(rho, tf.to_int32(c < pi)))
  # Annotators plate. (J)
  rv_mu = tfd.Normal(loc=0., scale=10.)
  rv_sigma = tfd.Uniform(low=0., high=[50., 100.])
  rv_gamma = tfd.Normal(loc=mu, scale=sigma)
  # Observations plate. (K)
  d = tf.gather(delta, items)
  g = tf.gather(gamma, annotators, axis=0)
  rv_x = tfd.Bernoulli(
      logits=tf.where(tf.gather(c < pi, items),
                      g[:, 1] - d, -g[:, 0] + d))
  # Compute the actual log prob.
  return sum(map(tf.reduce_sum, [
      rv_pi.log_prob(pi), rv_rho.log_prob(rho),
      rv_c.log_prob(c), rv_delta.log_prob(delta),
      rv_mu.log_prob(mu), rv_sigma.log_prob(sigma),
      rv_gamma.log_prob(gamma), rv_x.log_prob(x)]))
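Inference then reuses the same recipe as before: close joint_log_prob over the data and hand it to tfp.mcmc.sample_chain. A sketch only; the initial state and the per-variable bijectors are assumptions, chosen to match each latent's support:

states, _ = tfp.mcmc.sample_chain(
    num_results=int(10e3),
    num_burnin_steps=int(1e3),
    current_state=initial_state,  # One tensor per latent variable (hypothetical).
    kernel=tfp.mcmc.TransformedTransitionKernel(
        inner_kernel=tfp.mcmc.HamiltonianMonteCarlo(
            target_log_prob_fn=lambda *latents: joint_log_prob(
                x, annotators, items, *latents),
            num_leapfrog_steps=2,
            step_size=0.1),
        bijector=unconstraining_bijectors))  # e.g. Sigmoid for pi/c, Exp for rho/sigma.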