Continuous Distributions
1.8-1.9: Continuous Random Variables
1.10.1: Uniform Distribution (Continuous)
1.10.4-5: Exponential and Gamma Distributions: Distance between crossovers

Prof. Tesler, Math 283, Fall 2015
Continuous distributions

Example: Pick a real number $x$ between 20 and 30, with all real values in $[20, 30]$ equally likely.

Sample space: $S = [20, 30]$
Number of outcomes: $|S| = \infty$
Probability of each outcome: $P(X = x) = \frac{1}{\infty} = 0$
Yet, $P(X \le 21.5) = 15\%$
Continuous distributions

The sample space $S$ is often a subset of $\mathbb{R}^n$. We'll do the 1-dimensional case $S \subset \mathbb{R}$.

The probability density function (pdf) $f_X(x)$ is defined differently than in the discrete case:
$f_X(x)$ is a real-valued function on $S$ with $f_X(x) \ge 0$ for all $x \in S$.
$\int_S f_X(x)\,dx = 1$ (vs. $\sum_{x \in S} P_X(x) = 1$ for discrete).
The probability of an event $A \subset S$ is $P(A) = \int_A f_X(x)\,dx$ (vs. $\sum_{x \in A} P_X(x)$).
In $n$ dimensions, use $n$-dimensional integrals instead.

Uniform distribution

Let $a < b$ be real numbers. The Uniform Distribution on $[a, b]$ is that all numbers in $[a, b]$ are "equally likely." More precisely,
$$f_X(x) = \begin{cases} \dfrac{1}{b-a} & \text{if } a \le x \le b; \\ 0 & \text{otherwise.} \end{cases}$$
Uniform distribution (real case)

The uniform distribution on $[20, 30]$

We could regard the sample space as $[20, 30]$, or as all reals.
$$f_X(x) = \begin{cases} 1/10 & \text{for } 20 \le x \le 30; \\ 0 & \text{otherwise.} \end{cases}$$
$$P(X \le 21.5) = \int_{-\infty}^{20} 0\,dx + \int_{20}^{21.5} \frac{1}{10}\,dx = 0 + \left.\frac{x}{10}\right|_{20}^{21.5} = \frac{21.5 - 20}{10} = .15 = 15\%$$
[Figures: two plots of the pdf $f_X(x)$ against $x$ for $0 \le x \le 40$.]
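As a numerical cross-check of the 15% figure, the following minimal sketch approximates the integral of the pdf with a midpoint Riemann sum. The helper names `uniform_pdf` and `integrate` are illustrative choices, not part of the course material, and the lower limit 0 simply stands in for $-\infty$ because the density is 0 below 20.

```python
def uniform_pdf(x, a=20.0, b=30.0):
    """Density of the uniform distribution on [a, b]."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def integrate(f, lo, hi, n=100_000):
    """Midpoint Riemann sum of f over [lo, hi] using n subintervals."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# The density vanishes below 20, so starting at 0 loses nothing.
p = integrate(uniform_pdf, 0.0, 21.5)
print(p)  # close to 0.15, up to discretization error
```

The approximation is not exact because one subinterval straddles the jump of the density at $x = 20$; the error shrinks as `n` grows.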
Cumulative distribution function (cdf)

The Cumulative Distribution Function (cdf) of a random variable $X$ is
$$F_X(x) = P(X \le x)$$
For a continuous random variable,
$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt \quad\text{and}\quad f_X(x) = F_X'(x)$$
The integral cannot have "$x$" as the name of the variable in both of $F_X(x)$ and $f_X(x)$ because one is the upper limit of the integral and the other is the integration variable. So we use two variables $x$, $t$. We can either write
$$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f_X(t)\,dt$$
or
$$F_X(t) = P(X \le t) = \int_{-\infty}^{t} f_X(x)\,dx$$
CDF of uniform distribution

Uniform distribution on $[20, 30]$

For $x < 20$: $F_X(x) = \int_{-\infty}^{x} 0\,dt = 0$
For $20 \le x < 30$: $F_X(x) = \int_{-\infty}^{20} 0\,dt + \int_{20}^{x} \frac{1}{10}\,dt = \frac{x - 20}{10}$
For $30 \le x$: $F_X(x) = \int_{-\infty}^{20} 0\,dt + \int_{20}^{30} \frac{1}{10}\,dt + \int_{30}^{x} 0\,dt = 1$

Together:
$$F_X(x) = \begin{cases} 0 & \text{if } x < 20 \\ \dfrac{x - 20}{10} & \text{if } 20 \le x \le 30 \\ 1 & \text{if } x \ge 30 \end{cases} \qquad f_X(x) = F_X'(x) = \begin{cases} 0 & \text{if } x < 20 \\ \dfrac{1}{10} & \text{if } 20 \le x \le 30 \\ 0 & \text{if } x > 30 \end{cases}$$
PDF vs. CDF

Probability density function:
$$f_X(x) = \begin{cases} .1 & \text{if } 20 \le x \le 30; \\ 0 & \text{otherwise.} \end{cases}$$
It's discontinuous at $x = 20$ and $30$.

Cumulative distribution function:
$$F_X(x) = \begin{cases} 0 & \text{if } x < 20; \\ (x - 20)/10 & \text{if } 20 \le x \le 30; \\ 1 & \text{if } x \ge 30. \end{cases}$$

PDF is derivative of CDF: $f_X(x) = F_X'(x)$
CDF is integral of PDF: $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$

[Figure: plots of the pdf $f_X(x)$ and the cdf $F_X(x)$ against $x$ for $0 \le x \le 40$.]
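The relationship "pdf is the derivative of the cdf" can be checked numerically. The sketch below, with illustrative function names that are not from the slides, codes the cdf of the uniform distribution on $[20, 30]$ and estimates its slope with a central difference at points away from the corners $x = 20$ and $x = 30$, where the derivative does not exist.

```python
def uniform_cdf(x, a=20.0, b=30.0):
    """CDF of the uniform distribution on [a, b]."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


def numerical_derivative(F, x, h=1e-6):
    """Central-difference estimate of F'(x)."""
    return (F(x + h) - F(x - h)) / (2 * h)


# Slopes should be about 0 outside [20, 30] and about 0.1 inside.
slopes = {x: numerical_derivative(uniform_cdf, x) for x in (10.0, 25.0, 35.0)}
for x, slope in slopes.items():
    print(x, uniform_cdf(x), slope)
```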
PDF vs. CDF: Second example

Probability density function:
$$f_R(r) = \begin{cases} 2r/9 & \text{if } 0 \le r < 3; \\ 0 & \text{if } r < 0 \text{ or } r \ge 3. \end{cases}$$
It's discontinuous at $r = 3$.

Cumulative distribution function:
$$F_R(r) = \begin{cases} 0 & \text{if } r < 0; \\ r^2/9 & \text{if } 0 \le r \le 3; \\ 1 & \text{if } r \ge 3. \end{cases}$$

PDF is derivative of CDF: $f_R(r) = F_R'(r)$
CDF is integral of PDF: $F_R(r) = \int_{-\infty}^{r} f_R(t)\,dt$

[Figure: plots of the pdf $f_R(r)$ and the cdf $F_R(r)$ against $r$ for $0 \le r \le 3$.]
Probability of an interval

Compute $P(-1 \le R \le 2)$ from the PDF and also from the CDF.

Computation from the PDF:
$$P(-1 \le R \le 2) = \int_{-1}^{2} f_R(r)\,dr = \int_{-1}^{0} f_R(r)\,dr + \int_{0}^{2} f_R(r)\,dr$$
$$= \int_{-1}^{0} 0\,dr + \int_{0}^{2} \frac{2r}{9}\,dr = 0 + \left.\frac{r^2}{9}\right|_{r=0}^{2} = \frac{2^2 - 0^2}{9} = \frac{4}{9}$$

Computation from the CDF:
$$P(-1 \le R \le 2) = P(-1^- < R \le 2) = F_R(2) - F_R(-1^-) = \frac{2^2}{9} - 0 = \frac{4}{9}$$
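The two computations can be compared numerically. This is a minimal sketch, assuming the pdf $f_R(r) = 2r/9$ on $[0, 3)$ and the cdf $F_R(r) = r^2/9$ on $[0, 3]$ from the second example; the names `pdf_r`, `cdf_r`, and `integrate` are illustrative only.

```python
def pdf_r(r):
    """Density f_R(r) = 2r/9 on [0, 3), zero elsewhere."""
    return 2.0 * r / 9.0 if 0.0 <= r < 3.0 else 0.0


def cdf_r(r):
    """CDF F_R(r) = r^2/9 on [0, 3], 0 below and 1 above."""
    if r < 0.0:
        return 0.0
    if r >= 3.0:
        return 1.0
    return r * r / 9.0


def integrate(f, lo, hi, n=30_000):
    """Midpoint Riemann sum of f over [lo, hi] using n subintervals."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


from_pdf = integrate(pdf_r, -1.0, 2.0)   # integral of the density
from_cdf = cdf_r(2.0) - cdf_r(-1.0)      # F_R(2) - F_R(-1)
print(from_pdf, from_cdf)                # both close to 4/9
```

Because the distribution is continuous, $F_R(-1^-) = F_R(-1)$, so the code can evaluate the cdf at $-1$ directly.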
Continuous vs. discrete random variables

[Figure: two cumulative distribution function plots, a continuous cdf $F_R(r)$ for $0 \le r \le 3$ and a stair-step discrete cdf $F_X(x)$ for $-1 \le x \le 2$.]

In a continuous distribution:
The probability of an individual point is 0: $P(R = r) = 0$.
So, $P(R \le r) = P(R < r)$, i.e., $F_R(r) = F_R(r^-)$.
The CDF is continuous. (In a discrete distribution, the CDF is discontinuous due to jumps at the points with nonzero probability.)
$$P(a < R < b) = P(a \le R < b) = P(a < R \le b) = P(a \le R \le b) = F_R(b) - F_R(a)$$
Cumulative distribution function (cdf)

The Cumulative Distribution Function (cdf) of a random variable $X$ is
$$F_X(x) = P(X \le x)$$

Continuous case
$F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
Weakly increasing.
Varies smoothly from 0 to 1 as $x$ varies from $-\infty$ to $\infty$.
To get the pdf from the cdf, use $f_X(x) = F_X'(x)$.

Discrete case
$F_X(x) = \sum_{t \le x} P_X(t)$
Weakly increasing.
Stair-steps from 0 to 1 as $x$ goes from $-\infty$ to $\infty$.
The cdf jumps where $P_X(x) \ne 0$ and is constant in-between.
To get the pdf from the cdf, use $P_X(x) = F_X(x) - F_X(x^-)$ (which is positive at the jumps, 0 otherwise).
CDF, percentiles, and median

The $k$th percentile of a distribution $X$ is the point $x$ where $k\%$ of the probability is up to that point:
$$F_X(x) = P(X \le x) = k\% = k/100$$

Example: $F_R(r) = P(R \le r) = r^2/9$ (for $0 \le r \le 3$)
$$r^2/9 = k/100 \quad\Rightarrow\quad r = \sqrt{9(k/100)}$$
75th percentile: $r = \sqrt{9(.75)} \approx 2.60$
Median (50th percentile): $r = \sqrt{9(.50)} \approx 2.12$
0th and 100th percentiles: $r = 0$ and $r = 3$ if we restrict to the range $0 \le r \le 3$. But they are not uniquely defined, since $F_R(r) = 0$ for all $r \le 0$ and $F_R(r) = 1$ for all $r \ge 3$.
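The percentile formula for this example is the inverse of the cdf on $[0, 3]$. A minimal sketch follows; the function name `percentile_r` is an illustrative choice, and the formula is specific to $F_R(r) = r^2/9$.

```python
import math


def percentile_r(k):
    """k-th percentile of R, solving r^2 / 9 = k / 100 for r in [0, 3]."""
    if not 0 <= k <= 100:
        raise ValueError("k must be between 0 and 100")
    return math.sqrt(9.0 * k / 100.0)


p75 = percentile_r(75)
median = percentile_r(50)
print(round(p75, 2))     # 2.6
print(round(median, 2))  # 2.12
```

For $k = 0$ and $k = 100$ the function returns the endpoints 0 and 3, which matches restricting to the range $0 \le r \le 3$.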
Expected value and variance (continuous r.v.)

Replace sums by integrals. The definitions in terms of "$E(\cdot)$" are the same:
$$\mu = E(X) = \int_{-\infty}^{\infty} x \cdot f_X(x)\,dx \qquad \sigma^2 = \mathrm{Var}(X) = E((X - \mu)^2) = E(X^2) - (E(X))^2$$
$$E(g(X)) = \int_{-\infty}^{\infty} g(x)\,f_X(x)\,dx$$

$\mu$ and $\sigma$ for the uniform distribution on $[a, b]$ (with $a < b$)
$$\mu = E(X) = \int_a^b x \cdot \frac{1}{b-a}\,dx = \left.\frac{x^2/2}{b-a}\right|_{x=a}^{b} = \frac{(b^2 - a^2)/2}{b-a} = \frac{b+a}{2}$$
$$E(X^2) = \int_a^b x^2 \cdot \frac{1}{b-a}\,dx = \left.\frac{x^3/3}{b-a}\right|_{x=a}^{b} = \frac{(b^3 - a^3)/3}{b-a} = \frac{b^2 + ab + a^2}{3}$$
$$\sigma^2 = \mathrm{Var}(X) = E(X^2) - (E(X))^2 = \frac{b^2 + ab + a^2}{3} - \left(\frac{b+a}{2}\right)^2 = \frac{(b-a)^2}{12}$$
$$\sigma = \mathrm{SD}(X) = (b-a)/\sqrt{12}$$
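The closed forms for the uniform distribution can be compared against a simulation. This sketch uses the interval $[20, 30]$ from the earlier example; the sample size and the seed are arbitrary choices made only so the run is repeatable, and `uniform_mean_sd` is an illustrative helper name.

```python
import random
import statistics


def uniform_mean_sd(a, b):
    """Closed-form mean (a+b)/2 and standard deviation (b-a)/sqrt(12)."""
    return (a + b) / 2, (b - a) / 12 ** 0.5


a, b = 20.0, 30.0
mean, sd = uniform_mean_sd(a, b)

rng = random.Random(283)  # arbitrary fixed seed
sample = [rng.uniform(a, b) for _ in range(100_000)]
sample_mean = statistics.fmean(sample)
sample_sd = statistics.pstdev(sample)

print(mean, sd)                # exact values from the formulas
print(sample_mean, sample_sd)  # simulation estimates, close to the above
```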
Exponential distribution

How far is it from the start of a chromosome to the first crossover?
How far is it from one crossover to the next?

Let $D$ be the random variable giving either of those. It is a real number $> 0$, with the exponential distribution
$$f_D(d) = \begin{cases} \lambda e^{-\lambda d} & \text{if } d \ge 0; \\ 0 & \text{if } d < 0. \end{cases}$$
where crossovers happen at a rate $\lambda = 1\ \mathrm{M}^{-1} = 0.01\ \mathrm{cM}^{-1}$.

                     General case                      Crossovers
Mean                 $E(D) = 1/\lambda$                $= 100$ cM $= 1$ M
Variance             $\mathrm{Var}(D) = 1/\lambda^2$   $= 10000$ cM$^2$ $= 1$ M$^2$
Standard Dev.        $\mathrm{SD}(D) = 1/\lambda$      $= 100$ cM $= 1$ M
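The mean and standard deviation of 100 cM can be illustrated by simulation. This is a sketch only: it draws distances with the standard-library `random.expovariate`, whose argument is the rate $\lambda$, and the sample size and seed are arbitrary choices.

```python
import random
import statistics

RATE_PER_CM = 0.01  # lambda = 0.01 per cM, i.e. 1 per Morgan

rng = random.Random(2015)  # arbitrary fixed seed
distances = [rng.expovariate(RATE_PER_CM) for _ in range(100_000)]

sample_mean = statistics.fmean(distances)
sample_sd = statistics.pstdev(distances)

# Theory: mean = SD = 1 / lambda = 100 cM.
print(1 / RATE_PER_CM)
print(sample_mean, sample_sd)
```

For the exponential distribution the mean and the standard deviation coincide, so both sample statistics should land near 100.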
Exponential distribution

[Figure: pdf of the exponential distribution with $\lambda = 0.01$, plotted against $d$ for $0 \le d \le 400$, with $\mu$ and $\mu \pm \sigma$ marked.]
Exponential distribution

In general, if events occur on the real number line $x \ge 0$ in such a way that the expected number of events in all intervals $[x, x + d]$ is $\lambda d$ (for $x > 0$), then the exponential distribution with parameter $\lambda$ models the time/distance/etc. until the first event.

It also models the time/distance/etc. between consecutive events.

Chromosomes are finite; to make this model work, treat "there is no next crossover" as though there is one but it happens somewhere past the end of the chromosome.
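One way to picture the finite-chromosome convention is to generate exponential gaps until the running position passes the end, and discard that last point. The sketch below does this under stated assumptions: the chromosome length of 150 cM is hypothetical, the function name `crossover_positions` is illustrative, and the seed and number of trials are arbitrary. If the model holds, the average number of crossovers per chromosome should be near $\lambda \cdot \text{length}$.

```python
import random


def crossover_positions(rate, length, rng):
    """Positions of crossovers on [0, length], built from exponential gaps.

    The first gap that lands past the end plays the role of
    "there is no next crossover" and is not recorded.
    """
    positions = []
    position = rng.expovariate(rate)
    while position <= length:
        positions.append(position)
        position += rng.expovariate(rate)
    return positions


RATE_PER_CM = 0.01     # lambda from the crossover example
CHROMOSOME_CM = 150.0  # hypothetical length, for illustration only

rng = random.Random(283)  # arbitrary fixed seed
trials = 50_000
counts = [len(crossover_positions(RATE_PER_CM, CHROMOSOME_CM, rng))
          for _ in range(trials)]
mean_count = sum(counts) / trials

print(RATE_PER_CM * CHROMOSOME_CM)  # expected count, lambda * length
print(mean_count)                   # simulated average, close to the above
```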