
Review DS GA 1002 Statistical and Mathematical Models - PowerPoint PPT Presentation

Review DS GA 1002 Statistical and Mathematical Models, http://www.cims.nyu.edu/~cfgranda/pages/DSGA1002_fall16, Carlos Fernandez-Granda. Probability and statistics. Probability: framework for dealing with uncertainty. Statistics: framework ...


  1–4. Markov chain: We have
  \[
  \vec{q}_1 + \vec{q}_2
  = \begin{pmatrix} \frac{1}{1-a} \\ 1 \end{pmatrix} + \begin{pmatrix} 1 \\ -1 \end{pmatrix}
  = \begin{pmatrix} \frac{2-a}{1-a} \\ 0 \end{pmatrix},
  \]
  so the initial distribution can be written as
  \[
  \vec{p}_{\widetilde{X}(0)} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1-a}{2-a}\,(\vec{q}_1 + \vec{q}_2).
  \]

  5–10. Markov chain:
  \[
  \begin{aligned}
  \vec{p}_{\widetilde{X}(i)}
  &= T^i \vec{p}_{\widetilde{X}(0)} \\
  &= \frac{1-a}{2-a}\, T^i (\vec{q}_1 + \vec{q}_2) \\
  &= \frac{1-a}{2-a} \left( \lambda_1^i \vec{q}_1 + \lambda_2^i \vec{q}_2 \right) \\
  &= \frac{1-a}{2-a} \left( \begin{pmatrix} \frac{1}{1-a} \\ 1 \end{pmatrix} + (a-1)^i \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right) \\
  &= \frac{1}{2-a} \begin{pmatrix} 1 - (a-1)^{i+1} \\ (1-a)\left(1 - (a-1)^i\right) \end{pmatrix}
  \end{aligned}
  \]

  11. Markov chain: For a = 1 we have
  \[
  \vec{p}_{\widetilde{X}(i)} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.
  \]

  12–13. Markov chain: For a = 0 we have
  \[
  \vec{p}_{\widetilde{X}(i)}
  = \frac{1}{2} \begin{pmatrix} 1 - (-1)^{i+1} \\ 1 - (-1)^i \end{pmatrix}
  = \begin{cases}
      \begin{pmatrix} 0 \\ 1 \end{pmatrix} & \text{if } i \text{ is odd}, \\[4pt]
      \begin{pmatrix} 1 \\ 0 \end{pmatrix} & \text{if } i \text{ is even}.
    \end{cases}
  \]
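The closed form can be checked numerically. A minimal sketch, assuming the two-state transition matrix T = [[a, 1], [1 − a, 0]] (not printed in this extract, but consistent with the eigenvalues 1 and a − 1 and the eigenvectors used above):

```python
import numpy as np

def state_distribution(a, i):
    """p_X(i) = T^i p_X(0) computed directly with matrix powers.

    The slides do not print the transition matrix, so this assumes
    T = [[a, 1], [1 - a, 0]], which has eigenvalues 1 and a - 1 with
    eigenvectors proportional to (1/(1-a), 1) and (1, -1), matching
    the eigendecomposition used above.
    """
    T = np.array([[a, 1.0],
                  [1.0 - a, 0.0]])
    p0 = np.array([1.0, 0.0])                # initial distribution p_X(0)
    return np.linalg.matrix_power(T, i) @ p0

def closed_form(a, i):
    """Closed form from the slides."""
    return np.array([1.0 - (a - 1.0) ** (i + 1),
                     (1.0 - a) * (1.0 - (a - 1.0) ** i)]) / (2.0 - a)

for a in [0.0, 0.3, 0.7]:
    for i in [0, 1, 2, 5, 10]:
        assert np.allclose(state_distribution(a, i), closed_form(a, i))
```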

  14. Sampling from multivariate distributions: We are interested in generating samples from the joint distribution of two random variables X and Y. If we generate a sample x according to the pdf f_X and a sample y according to the pdf f_Y, are these samples a realization of the joint distribution of X and Y? Explain your answer with a simple example.

  15. Sampling from multivariate distributions: Now assume that X is discrete and Y is continuous. Propose a method that generates a sample from the joint distribution from two independent samples of a distribution that is uniform between 0 and 1, using the pmf of X and the conditional cdf of Y given X. Assume that the conditional cdf is invertible.

  16–18. Sampling from multivariate distributions:
  1. Obtain two independent samples u_1 and u_2 from the uniform distribution.
  2. Set x equal to the smallest value a such that p_X(a) ≠ 0 and u_1 ≤ F_X(a).
  3. Define F_x(·) := F_{Y|X}(·|x) and set y := F_x^{-1}(u_2).
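A minimal sketch of this procedure, assuming a concrete pmf for X and, purely for illustration, an exponential conditional distribution of Y given X (neither is specified on the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pmf of X over the values 0, 1, 2 (not from the slides).
values = np.array([0, 1, 2])
pmf = np.array([0.2, 0.5, 0.3])
cdf = np.cumsum(pmf)

def sample_joint():
    u1, u2 = rng.uniform(size=2)               # 1. two independent uniform samples
    x = values[np.searchsorted(cdf, u1)]       # 2. smallest a with u1 <= F_X(a)
    # 3. Invert the conditional cdf of Y given X = x.  Illustration only:
    #    Y | X = x ~ Exponential(rate x + 1), so F_x(y) = 1 - exp(-(x + 1) y)
    #    and F_x^{-1}(u) = -log(1 - u) / (x + 1).
    y = -np.log(1.0 - u2) / (x + 1)
    return x, y

samples = [sample_joint() for _ in range(5)]
```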

  19. Sampling from multivariate distributions: Explain how to generate samples from a random variable with pdf
  \[
  f_W(w) = 0.1\,\lambda_1 \exp(-\lambda_1 w) + 0.9\,\lambda_2 \exp(-\lambda_2 w), \qquad w \ge 0,
  \]
  where λ_1 and λ_2 are positive constants, using two iid uniform samples between 0 and 1.

  20–22. Sampling from multivariate distributions: Let us define a Bernoulli random variable X with parameter 0.9, such that if X = 0 then Y is exponential with parameter λ_1 and if X = 1 then Y is exponential with parameter λ_2. The marginal distribution of Y is
  \[
  f_Y(w) = p_X(0)\, f_{Y|X}(w \mid 0) + p_X(1)\, f_{Y|X}(w \mid 1)
         = 0.1\,\lambda_1 \exp(-\lambda_1 w) + 0.9\,\lambda_2 \exp(-\lambda_2 w).
  \]

  23–24. Sampling from multivariate distributions:
  1. We obtain two independent samples u_1 and u_2 from the uniform distribution.
  2. If u_1 ≤ 0.1 we set
  \[
  w := \frac{1}{\lambda_1} \log\!\left( \frac{1}{1-u_2} \right),
  \]
  otherwise we set
  \[
  w := \frac{1}{\lambda_2} \log\!\left( \frac{1}{1-u_2} \right).
  \]
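A minimal sketch of this two-step sampler; the rates λ_1 and λ_2 below are arbitrary placeholders:

```python
import numpy as np

rng = np.random.default_rng(1)
lam1, lam2 = 2.0, 0.5      # arbitrary positive rates (placeholders)

def sample_w():
    u1, u2 = rng.uniform(size=2)
    lam = lam1 if u1 <= 0.1 else lam2          # X = 0 w.p. 0.1, X = 1 w.p. 0.9
    return np.log(1.0 / (1.0 - u2)) / lam      # w := (1/lam) log(1 / (1 - u2))

w = np.array([sample_w() for _ in range(100_000)])
print(w.mean(), 0.1 / lam1 + 0.9 / lam2)       # empirical mean vs. 0.1/lam1 + 0.9/lam2
```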

  25. Convergence: Let U be a random variable uniformly distributed between 0 and 1. If we define the discrete random process X̃ by X̃(i) = U for all i, does X̃ converge to 1 − U in probability?

  26. Convergence: Does X̃ converge to 1 − U in distribution?

  27. Convergence: You draw some iid samples x_1, x_2, ... from a Cauchy random variable. Will the empirical mean (1/n) ∑_{i=1}^n x_i converge in probability as n grows large? Explain why briefly, and if the answer is yes, state what it converges to.

  28. Convergence: You draw m iid samples x_1, x_2, ..., x_m from a Cauchy random variable. Then you draw iid samples y_1, y_2, ... uniformly from {x_1, x_2, ..., x_m} (each y_i is equal to each element of {x_1, x_2, ..., x_m} with probability 1/m). Will the empirical mean (1/n) ∑_{i=1}^n y_i converge in probability as n grows large? Explain why very briefly, and if the answer is yes, state what it converges to.
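A quick simulation (not part of the slides) contrasting the two situations: the running mean of iid Cauchy samples never settles down, while samples redrawn uniformly from a fixed finite set {x_1, ..., x_m} have a finite mean, so their running mean converges to the average of the set:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

x = rng.standard_cauchy(n)                           # iid Cauchy samples
running_mean_x = np.cumsum(x) / np.arange(1, n + 1)  # keeps fluctuating as n grows

m = 50
pool = rng.standard_cauchy(m)                        # fixed pool x_1, ..., x_m
y = rng.choice(pool, size=n, replace=True)           # each y_i uniform over the pool
running_mean_y = np.cumsum(y) / np.arange(1, n + 1)  # settles near pool.mean()

print(running_mean_x[-3:])
print(running_mean_y[-3:], pool.mean())
```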

  29. Earthquake: We are interested in learning a model for the occurrence of earthquakes. We decide to model the time between earthquakes as an exponential random variable with parameter λ. Compute the maximum-likelihood estimate of λ given t_1, t_2, ..., t_n, which are interarrival times for past earthquakes. Assume that the data are iid.

  30–35. Earthquake:
  \[
  \begin{aligned}
  L(\lambda) &:= f_{\widetilde{T}(1),\ldots,\widetilde{T}(n)}(t_1,\ldots,t_n) \\
  &= \prod_{i=1}^{n} \lambda \exp(-\lambda t_i) \\
  &= \lambda^n \exp\Big( -\lambda \sum_{i=1}^{n} t_i \Big), \\
  \log L(\lambda) &= n \log \lambda - \lambda \sum_{i=1}^{n} t_i.
  \end{aligned}
  \]

  36–41. Earthquake:
  \[
  \begin{aligned}
  \frac{d \log L_{t_1,\ldots,t_n}(\lambda)}{d\lambda} &= \frac{n}{\lambda} - \sum_{i=1}^{n} t_i, \\
  \frac{d^2 \log L_{t_1,\ldots,t_n}(\lambda)}{d\lambda^2} &= -\frac{n}{\lambda^2} < 0,
  \end{aligned}
  \]
  so the log-likelihood is concave and setting the first derivative to zero yields the maximum,
  \[
  \lambda_{\mathrm{ML}} = \frac{1}{\frac{1}{n}\sum_{i=1}^{n} t_i}.
  \]
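A minimal numerical check of the estimator on synthetic interarrival times; the true rate below is an arbitrary placeholder:

```python
import numpy as np

rng = np.random.default_rng(3)

true_lam = 0.2                                            # placeholder rate
t = rng.exponential(scale=1.0 / true_lam, size=10_000)    # synthetic interarrival times

lam_ml = 1.0 / t.mean()       # lambda_ML = 1 / ((1/n) sum_i t_i)
print(lam_ml)                 # should be close to true_lam
```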

  42. Earthquake: Find an approximate 0.95 confidence interval, based on the central limit theorem, for the value of λ. Assume that you know a bound b on the standard deviation (i.e. the variance of the exponential, 1/λ², is bounded by b²) and express your answer using the Q function. (Hint: express the ML estimate in terms of the empirical mean.) (See solutions.)
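One way the interval can be set up (a sketch under the stated assumptions, not the referenced solution): write λ_ML = 1/m_n with m_n := (1/n) ∑_{i=1}^n t_i. The CLT applied to m_n, whose standard deviation is at most b/√n, gives
\[
P\left( \left| m_n - \frac{1}{\lambda} \right| \ge \frac{b\, Q^{-1}(0.025)}{\sqrt{n}} \right) \lesssim 0.05,
\]
so
\[
\left[ \frac{1}{m_n + b\,Q^{-1}(0.025)/\sqrt{n}},\;\; \frac{1}{m_n - b\,Q^{-1}(0.025)/\sqrt{n}} \right]
\]
is an approximate 0.95 confidence interval for λ (when the second denominator is positive).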

  43. Earthquake: What is the posterior distribution of the parameter Λ if we model it as a random variable with a uniform distribution between 0 and u? Express your answer in terms of the sum ∑_{i=1}^n t_i, u, and the marginal pdf of the data evaluated at t_1, t_2, ..., t_n, c := f_{T̃(1),...,T̃(n)}(t_1, ..., t_n).

  44–47. Earthquake:
  \[
  \begin{aligned}
  f_{\Lambda \mid \widetilde{T}(1),\ldots,\widetilde{T}(n)}(\lambda \mid t_1,\ldots,t_n)
  &= \frac{f_\Lambda(\lambda)\, \lambda^n \exp\!\big( -\lambda \sum_{i=1}^{n} t_i \big)}
          {f_{\widetilde{T}(1),\ldots,\widetilde{T}(n)}(t_1,\ldots,t_n)} \\
  &= \frac{1}{u\,c}\, \lambda^n \exp\Big( -\lambda \sum_{i=1}^{n} t_i \Big)
  \end{aligned}
  \]
  for 0 ≤ λ ≤ u and zero otherwise.

  48. Earthquake: [Plot of the posterior pdf f_{Λ|T̃(1),...,T̃(n)}(λ | t_1, ..., t_n) as a function of λ.]
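A minimal sketch that evaluates this posterior on a grid and normalizes it numerically (the data, u, and the grid are placeholders); the normalized values can then be used for plots like the one above or for interval estimates:

```python
import numpy as np

rng = np.random.default_rng(4)

t = rng.exponential(scale=5.0, size=20)    # placeholder interarrival times t_1, ..., t_n
u = 2.0                                    # upper limit of the uniform prior on Lambda

lam = np.linspace(0.0, u, 1_000)
unnorm = lam ** len(t) * np.exp(-lam * t.sum())   # lambda^n exp(-lambda sum_i t_i) on [0, u]
posterior = unnorm / np.trapz(unnorm, lam)        # numerical stand-in for the 1/(u c) factor

# `posterior` now integrates to 1 over [0, u], e.g. np.trapz(posterior, lam) is ~1.
```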

  49. Earthquake: Explain how you would use the answer in the previous question to construct a confidence interval for the parameter.

  50. Chad: You hate a coworker and want to predict when he is in the office from the temperature.

  Chad:    61 65 59 61 61 65 61 63 63 59
  No Chad: 68 70 68 64 64

  You model his presence using a random variable C which is equal to 1 if he is there and 0 if he is not. Estimate p_C.

  51. Chad: The empirical pmf is p_C(0) = 5/15 = 1/3 and p_C(1) = 10/15 = 2/3.

  52. Chad: You model the temperature using a random variable T. Sketch the kernel density estimator of the conditional distribution of T given C using a rectangular kernel with width equal to 2.
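A minimal sketch of this estimate; the evaluation grid below is an arbitrary choice:

```python
import numpy as np

chad    = np.array([61, 65, 59, 61, 61, 65, 61, 63, 63, 59])  # temperatures when C = 1
no_chad = np.array([68, 70, 68, 64, 64])                      # temperatures when C = 0

def rect_kde(samples, grid, width=2.0):
    """Kernel density estimate with a rectangular kernel of the given width."""
    # Each sample contributes a box of height 1 / (n * width) centered on it.
    inside = np.abs(grid[:, None] - samples[None, :]) <= width / 2
    return inside.sum(axis=1) / (len(samples) * width)

grid = np.linspace(55, 75, 401)
kde_c1 = rect_kde(chad, grid)       # estimate of f_{T|C}(. | 1)
kde_c0 = rect_kde(no_chad, grid)    # estimate of f_{T|C}(. | 0)
```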
