Probability rules: FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R

  1. Probability rules
     FUNDAMENTALS OF BAYESIAN DATA ANALYSIS IN R
     Rasmus Bååth, Data Scientist

  4. Bad and good news
     Bad news: The computation method we've used scales horribly.
     Good news: Bayesian computation is a hot research topic. There are many methods to fit Bayesian models more efficiently. The result will be the same, you'll just get it faster.

  5. Probability theory
     Probability: a number between 0 and 1; a statement of certainty/uncertainty.
     Mathematical notation:
       P(n_visitors = 13) is a probability
       P(n_visitors) is a probability distribution
       P(n_visitors = 13 | prop_clicks = 10%) is a conditional probability
       P(n_visitors | prop_clicks = 10%) is a conditional probability distribution

  6. P(n_visitors | prop_clicks = 10%)

         n_visitors <- rbinom(n = 10000, size = 100, prob = 0.1)
         hist(n_visitors)
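
The histogram above is a picture of a conditional probability distribution. As a rough sketch of the same idea in numbers, the share of draws that land on each value can be tabulated with base R's `table` and `prop.table`. The seed and the name `est_dist` are assumptions made for this illustration only.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Draw from P(n_visitors | prop_clicks = 10%), as on the slide
n_visitors <- rbinom(n = 10000, size = 100, prob = 0.1)

# Share of draws landing on each value: an estimate of the distribution
est_dist <- prop.table(table(n_visitors))

sum(est_dist)    # the estimated probabilities sum to 1
est_dist["10"]   # estimated P(n_visitors = 10 | prop_clicks = 10%)
```

Each entry of `est_dist` is an estimate of one conditional probability; together they estimate the whole conditional distribution.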

  14. Manipulating probability
      The sum rule: p(1 or 2 or 3) = 1/6 + 1/6 + 1/6 = 0.5
      The product rule: p(6 and 6) = 1/6 * 1/6 = 1/36 ≈ 2.8%
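
Both rules can be checked by simulation. The sketch below assumes, as the 1/6 terms suggest, a fair six-sided die: the sum rule adds probabilities of mutually exclusive outcomes of one roll, and the product rule multiplies probabilities of two independent rolls. The names `die_1`, `die_2` and `n_rolls`, and the seed, are hypothetical choices for this illustration.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
n_rolls <- 100000

# Two independent fair dice (assumed setting for the 1/6 terms)
die_1 <- sample(1:6, size = n_rolls, replace = TRUE)
die_2 <- sample(1:6, size = n_rolls, replace = TRUE)

# Sum rule: 1, 2 and 3 are mutually exclusive, so their probabilities add
p_sum_rule <- 1/6 + 1/6 + 1/6
p_sum_sim  <- mean(die_1 %in% c(1, 2, 3))

# Product rule: the two rolls are independent, so their probabilities multiply
p_prod_rule <- 1/6 * 1/6
p_prod_sim  <- mean(die_1 == 6 & die_2 == 6)

c(rule = p_sum_rule, simulated = p_sum_sim)
c(rule = p_prod_rule, simulated = p_prod_sim)
```

The simulated shares should land close to 0.5 and 1/36, with small differences due to random sampling.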

  15. Let's try out these rules!

  16. We can calculate!

  17. Simulation vs calculation
      Simulation using 'r'-functions, for example, rbinom and rpois.
      Simulating P(n_visitors = 13 | prob_success = 10%):

          n_visitors <- rbinom(n = 100000, size = 100, prob = 0.1)
          sum(n_visitors == 13) / length(n_visitors)
          # 0.074

      Calculation using the 'd'-functions, for example, dbinom and dpois.
      Calculating P(n_visitors = 13 | prob_success = 10%):

          dbinom(13, size = 100, prob = 0.1)
          # 0.074
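
The slide also names `rpois` and `dpois`. The same simulation-versus-calculation comparison might look like this for a Poisson distribution. The mean `lambda <- 10` is a hypothetical value picked for illustration and does not come from the slides.

```r
set.seed(7)   # arbitrary seed, only so the sketch is reproducible
lambda <- 10  # hypothetical Poisson mean, chosen only for illustration

# Simulation: share of Poisson draws equal to 13
draws <- rpois(n = 100000, lambda = lambda)
p_sim <- mean(draws == 13)

# Calculation: the exact probability from the 'd'-function
p_calc <- dpois(13, lambda = lambda)

c(simulated = p_sim, calculated = p_calc)
```

As with `rbinom` and `dbinom`, the simulated share only approximates the calculated value and gets closer as the number of draws grows.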

  18. Calculating P(n_visitors = 13 or n_visitors = 14 | prob_success = 10%):

          dbinom(13, size = 100, prob = 0.1) + dbinom(14, size = 100, prob = 0.1)
          # 0.126

      Calculating P(n_visitors | prob_success = 10%):

          n_visitors <- seq(0, 100, by = 1)
          probability <- dbinom(n_visitors, size = 100, prob = 0.1)

          n_visitors   0      1      2      3      4      5      6      7      ...
          probability  0.000  0.000  0.002  0.006  0.016  0.034  0.060  0.089  ...
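
Adding two `dbinom` values is the sum rule applied to two outcomes. The sketch below extends it to a whole range and checks that the full calculated distribution sums to 1. The cumulative function `pbinom` is part of base R but does not appear in the slides, and the cutoff of 14 is an arbitrary choice for illustration.

```r
n_visitors <- seq(0, 100, by = 1)
probability <- dbinom(n_visitors, size = 100, prob = 0.1)

# A complete probability distribution sums to 1
sum(probability)

# Sum rule over a range: P(n_visitors <= 14 | prob_success = 10%)
p_at_most_14 <- sum(probability[n_visitors <= 14])

# Base R's cumulative 'p'-function should give the same number
all.equal(p_at_most_14, pbinom(14, size = 100, prob = 0.1))
```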

  19. Plotting a calculated distribution

          plot(n_visitors, probability, type = "h")

  20. Continuous distributions
      The Uniform distribution

          x <- runif(n = 100000, min = 0.0, max = 0.2)
          hist(x)

  21. Continuous distributions
      The Uniform distribution
      The d-version of runif is dunif:

          dunif(x = 0.12, min = 0.0, max = 0.2)
          # 5

      Probability density: kind of a relative probability

          x <- seq(0, 0.2, by = 0.01)
          dunif(x, min = 0.0, max = 0.2)
          # 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5
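
A density of 5 cannot be a probability, since probabilities never exceed 1. One way to see what the number means is to note that areas under the density curve are probabilities. The sketch below uses base R's `integrate` and `punif`, neither of which appears in the slides. The helper `dens` and the interval 0.05 to 0.10 are hypothetical choices for illustration.

```r
# Density of Uniform(0, 0.2): 1 / (0.2 - 0) = 5 inside the range, 0 outside
dens <- function(x) dunif(x, min = 0.0, max = 0.2)
dens(c(-0.1, 0.12, 0.3))

# Areas under the density curve are probabilities
area_total <- integrate(dens, lower = 0, upper = 0.2)$value
area_part  <- integrate(dens, lower = 0.05, upper = 0.10)$value

area_total  # area over the whole range
area_part   # P(0.05 < x < 0.10): height 5 times width 0.05

# The cumulative 'p'-function gives the same interval probability
punif(0.10, min = 0.0, max = 0.2) - punif(0.05, min = 0.0, max = 0.2)
```

The density is "relative" in the sense that a value twice as high means nearby values are twice as probable, while actual probabilities come from integrating over an interval.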

  22. Try this out!

  23. Bayesian calculation

  36. Bayesian inference by calculation

          n_ads_shown <- 100
          n_visitors <- seq(0, 100, by = 1)
          proportion_clicks <- seq(0, 1, by = 0.01)
          pars <- expand.grid(proportion_clicks = proportion_clicks,
                              n_visitors = n_visitors)
          # Simulation counterpart: proportion_clicks <- runif(n_samples, min = 0.0, max = 0.2)
          pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
          # Simulation counterpart: n_visitors <- rbinom(n = n_samples, size = n_ads_shown, prob = proportion_clicks)
          pars$likelihood <- dbinom(pars$n_visitors,
                                    size = n_ads_shown, prob = pars$proportion_clicks)
          pars$probability <- pars$likelihood * pars$prior

      Rows of pars (excerpt):

          proportion_clicks  n_visitors  prior  likelihood     probability
          0.04               38          5      3.409439e-27   1.704720e-26
          0.11               93          5      5.006969e-80   2.503485e-79
          0.16               100         5      2.582250e-80   1.291125e-79
          0.67               98          0      4.863666e-15   0.000000e+00
          0.96               3           0      3.592054e-131  0.000000e+00
          0.48               73          0      2.215148e-07   0.000000e+00
          0.14               13          5      1.129620e-01   5.648101e-01
          ...                ...         ...    ...            ...
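
The product of likelihood and prior above is not yet a probability distribution, since the column does not sum to 1. The slides in this section stop at that product, so what follows is an assumed continuation: a minimal sketch of the usual grid-approximation steps of normalizing and then conditioning on an observation. The observed value of 13 visitors and the names `pars_13` and `best_guess` are hypothetical.

```r
n_ads_shown <- 100
n_visitors <- seq(0, 100, by = 1)
proportion_clicks <- seq(0, 1, by = 0.01)
pars <- expand.grid(proportion_clicks = proportion_clicks,
                    n_visitors = n_visitors)
pars$prior <- dunif(pars$proportion_clicks, min = 0, max = 0.2)
pars$likelihood <- dbinom(pars$n_visitors,
                          size = n_ads_shown, prob = pars$proportion_clicks)
pars$probability <- pars$likelihood * pars$prior

# Assumed next step: rescale so the whole grid sums to 1
pars$probability <- pars$probability / sum(pars$probability)

# Assumed observation (hypothetical): 13 visitors out of 100 ads shown.
# Keep only the matching rows and renormalize to condition on the data.
pars_13 <- pars[pars$n_visitors == 13, ]
pars_13$probability <- pars_13$probability / sum(pars_13$probability)

# Most probable proportion of clicks on the grid, given the observation
best_guess <- pars_13$proportion_clicks[which.max(pars_13$probability)]
best_guess
```

Rows where the prior is 0 keep a probability of 0 after normalizing, so the resulting distribution stays inside the 0 to 0.2 range that the prior allows.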
