The Matrix Geometric/Analytic Methods for Structured Markov Chains

William J. Stewart
Department of Computer Science, N.C. State University

Markov chains whose transition matrices have a special block structure.
2. We check that the system is stable by verifying Equation (8). The infinitesimal generator matrix

$$A = A_0 + A_1 + A_2 = \begin{pmatrix} -5 & 5 & 0 \\ 3 & -8 & 5 \\ 0 & 3 & -3 \end{pmatrix}$$

has stationary probability vector $\pi_A = (.1837, .3061, .5102)$, and

$$.4388 = \pi_A A_0 e < \pi_A A_2 e = 1.2245.$$

(SMF-07:PE, Bertinoro, Italy)
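To make this stability check concrete, here is a small pure-Python sketch. The blocks $A_0$ (upward/arrival transitions) and $A_2$ (downward/service transitions) are my reading of the example's generator, an assumption, since the slides show only $A = A_0 + A_1 + A_2$ and the derived drifts. It finds $\pi_A$ by power iteration on the uniformized chain $P = I + A/u$ and compares the two drifts.

```python
# Stability check for the QBD example (sketch; A0 and A2 are assumed
# reconstructions of the example's blocks, consistent with A = A0 + A1 + A2).
A0 = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.5]]   # up (arrivals)
A2 = [[0.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]]   # down (services)
A  = [[-5.0, 5.0, 0.0], [3.0, -8.0, 5.0], [0.0, 3.0, -3.0]]  # A0 + A1 + A2

# Stationary vector of the generator A via power iteration on P = I + A/u.
u = 10.0
P = [[(1.0 if i == j else 0.0) + A[i][j] / u for j in range(3)] for i in range(3)]
pi = [1.0 / 3.0] * 3
for _ in range(2000):
    pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]

drift_up = sum(pi[i] * sum(A0[i]) for i in range(3))    # pi_A A0 e
drift_down = sum(pi[i] * sum(A2[i]) for i in range(3))  # pi_A A2 e
print(pi)                    # ~ (.1837, .3061, .5102)
print(drift_up, drift_down)  # ~ .4388 < 1.2245: upward drift < downward drift
```

Since the upward drift is smaller than the downward drift, the chain is stable.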

3. We now initiate the iterative procedure to compute the rate matrix $R$. The inverse of $A_1$ is

$$A_1^{-1} = \begin{pmatrix} -.2466 & -.1598 & -.2283 \\ -.0959 & -.1918 & -.2740 \\ -.0822 & -.1644 & -.5205 \end{pmatrix}$$

which allows us to compute

$$V = A_0 A_1^{-1} = \begin{pmatrix} -.2466 & -.1598 & -.2283 \\ 0 & 0 & 0 \\ -.0411 & -.0822 & -.2603 \end{pmatrix}, \qquad
W = A_2 A_1^{-1} = \begin{pmatrix} 0 & 0 & 0 \\ -.3836 & -.7671 & -1.0959 \\ 0 & 0 & 0 \end{pmatrix}.$$

Equation (5) becomes

$$R_{(k+1)} = \begin{pmatrix} .2466 & .1598 & .2283 \\ 0 & 0 & 0 \\ .0411 & .0822 & .2603 \end{pmatrix}
+ R_{(k)}^2 \begin{pmatrix} 0 & 0 & 0 \\ .3836 & .7671 & 1.0959 \\ 0 & 0 & 0 \end{pmatrix}$$

and iterating successively, beginning with $R_{(0)} = 0$, we find

$$R_{(1)} = \begin{pmatrix} .2466 & .1598 & .2283 \\ 0 & 0 & 0 \\ .0411 & .0822 & .2603 \end{pmatrix}, \quad
R_{(2)} = \begin{pmatrix} .2689 & .2044 & .2921 \\ 0 & 0 & 0 \\ .0518 & .1036 & .2909 \end{pmatrix}, \quad
R_{(3)} = \begin{pmatrix} .2793 & .2252 & .2921 \\ 0 & 0 & 0 \\ .0567 & .1134 & .3049 \end{pmatrix}, \quad \cdots$$

Observe that the elements are non-decreasing.

After 48 iterations, successive differences are less than $10^{-12}$, at which point

$$R_{(48)} = \begin{pmatrix} .2917 & .2500 & .3571 \\ 0 & 0 & 0 \\ .0625 & .1250 & .3214 \end{pmatrix}.$$

4. Proceeding to the boundary conditions:

$$(\pi_0, \pi_1) \begin{pmatrix} B_{00} & B_{01} \\ B_{10} & A_1 + RA_2 \end{pmatrix}
= (\pi_0, \pi_1) \begin{pmatrix}
-6 & 5.0 & 1 & 0 & 0 \\
3 & -3.5 & 0 & 0 & .5 \\
0 & 0 & -6 & 6.0 & 0 \\
2 & 2 & 3 & -12.0 & 5.0 \\
0 & 0 & 0 & 3.5 & -3.5
\end{pmatrix} = (0, 0).$$
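The fixed-point iteration of Equation (5) is easy to reproduce. Below is a pure-Python sketch; the blocks $A_0$, $A_1$, $A_2$ are my reconstruction of the example's generator (an assumption — the slides show only the derived matrices, but these blocks reproduce $A_1^{-1}$, $V$, and $W$ above).

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_inv(M):
    # Gauss-Jordan elimination with partial pivoting.
    n = len(M)
    aug = [list(M[i]) + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0.0:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

# Assumed blocks of the QBD example (A0 = up, A1 = local, A2 = down).
A0 = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.5]]
A1 = [[-6.0, 5.0, 0.0], [3.0, -12.0, 5.0], [0.0, 3.0, -3.5]]
A2 = [[0.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]]
A1inv = mat_inv(A1)

# Iterate R(k+1) = -(A0 + R(k)^2 A2) A1^{-1}, starting from R(0) = 0.
R = [[0.0] * 3 for _ in range(3)]
for _ in range(100):
    S = matadd(A0, matmul(matmul(R, R), A2))
    R = [[-v for v in row] for row in matmul(S, A1inv)]

print(R)  # ~ [.2917 .2500 .3571; 0 0 0; .0625 .1250 .3214]
```

The iterates increase monotonically toward the minimal non-negative solution, matching $R_{(48)}$ above.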

Solve this by replacing the last equation with $\pi_0(1) = 1$, i.e., set the first component of the subvector $\pi_0$ to 1:

$$(\pi_0, \pi_1) \begin{pmatrix}
-6 & 5.0 & 1 & 0 & 1 \\
3 & -3.5 & 0 & 0 & 0 \\
0 & 0 & -6 & 6.0 & 0 \\
2 & 2 & 3 & -12.0 & 0 \\
0 & 0 & 0 & 3.5 & 0
\end{pmatrix} = (0, 0 \,|\, 0, 0, 1)$$

with solution

$$(\pi_0, \pi_1) = (1.0, 1.6923 \,|\, .3974, .4615, .9011).$$

Now on to the normalization stage.

5. The normalization constant is

$$\alpha = \pi_0 e + \pi_1 (I - R)^{-1} e
= (1.0, 1.6923)\,e + (.3974, .4615, .9011) \begin{pmatrix} 1.4805 & .4675 & .7792 \\ 0 & 1 & 0 \\ .1364 & .2273 & 1.5455 \end{pmatrix} e
= 2.6923 + 3.2657 = 5.9580$$

which allows us to compute

$$\pi_0/\alpha = (.1678, .2840) \quad \text{and} \quad \pi_1/\alpha = (.0667, .0775, .1512).$$

6. Subcomponents of the stationary distribution are computed from $\pi_k = \pi_{k-1} R$:

$$\pi_2 = \pi_1 R = (.0667, .0775, .1512) \begin{pmatrix} .2917 & .2500 & .3571 \\ 0 & 0 & 0 \\ .0625 & .1250 & .3214 \end{pmatrix} = (.0289, .0356, .0724)$$

and

$$\pi_3 = \pi_2 R = (.0289, .0356, .0724) \begin{pmatrix} .2917 & .2500 & .3571 \\ 0 & 0 & 0 \\ .0625 & .1250 & .3214 \end{pmatrix} = (.0130, .0163, .0336)$$

and so on.
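Once $R$ and $\pi_1$ are in hand, each further subvector is a single vector-matrix product. A minimal sketch, using the rounded values printed above:

```python
# Successive subvectors via pi_k = pi_{k-1} R (values taken from the example).
R = [[0.2917, 0.2500, 0.3571],
     [0.0,    0.0,    0.0   ],
     [0.0625, 0.1250, 0.3214]]
pi1 = [0.0667, 0.0775, 0.1512]

def vecmat(v, M):
    # Row vector times matrix.
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

pi2 = vecmat(pi1, R)   # ~ (.0289, .0356, .0724)
pi3 = vecmat(pi2, R)   # ~ (.0130, .0163, .0336)
print(pi2, pi3)
```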

Block Lower-Hessenberg Markov Chains

$$Q = \begin{pmatrix}
B_{00} & B_{01} & 0 & 0 & 0 & 0 & \cdots \\
B_{10} & B_{11} & A_0 & 0 & 0 & 0 & \cdots \\
B_{20} & B_{21} & A_1 & A_0 & 0 & 0 & \cdots \\
B_{30} & B_{31} & A_2 & A_1 & A_0 & 0 & \cdots \\
B_{40} & B_{41} & A_3 & A_2 & A_1 & A_0 & \cdots \\
\vdots & \vdots & \ddots & \ddots & \ddots & \ddots &
\end{pmatrix}$$

Transitions are now permitted from any level to any lower level.

Objective: compute the stationary probability vector $\pi$ from $\pi Q = 0$. Here $\pi$ is partitioned conformally with $Q$, i.e., $\pi = (\pi_0, \pi_1, \pi_2, \cdots)$ with $\pi_i = (\pi(i,1), \pi(i,2), \cdots, \pi(i,K))$.

A matrix geometric solution exists which mirrors that of a QBD process. There exists a positive matrix $R$ such that

$$\pi_i = \pi_{i-1} R, \quad i = 2, 3, \ldots, \qquad \text{i.e.,} \qquad \pi_i = \pi_1 R^{i-1}, \quad i = 2, 3, \ldots$$

From $\pi Q = 0$,

$$\sum_{k=0}^{\infty} \pi_{k+j} A_k = 0, \quad j = 1, 2, \ldots$$

and in particular,

$$\pi_1 A_0 + \pi_2 A_1 + \sum_{k=2}^{\infty} \pi_{k+1} A_k = 0.$$

Substituting $\pi_i = \pi_1 R^{i-1}$ into

$$\pi_1 A_0 + \pi_1 R A_1 + \sum_{k=2}^{\infty} \pi_1 R^k A_k = 0$$

gives

$$\pi_1 \left( A_0 + R A_1 + \sum_{k=2}^{\infty} R^k A_k \right) = 0.$$

So find $R$ from

$$A_0 + R A_1 + \sum_{k=2}^{\infty} R^k A_k = 0. \tag{9}$$

Equation (9) reduces to Equation (4) when $A_k = 0$ for $k > 2$.

Rearranging Equation (9), we find

$$R = -A_0 A_1^{-1} - \sum_{k=2}^{\infty} R^k A_k A_1^{-1}$$

which suggests the iteration

$$R_{(0)} = 0; \qquad R_{(l+1)} = -A_0 A_1^{-1} - \sum_{k=2}^{\infty} R_{(l)}^k A_k A_1^{-1}, \quad l = 0, 1, 2, \ldots$$

In many cases, the structure of the infinitesimal generator is such that the blocks $A_i$ are zero for all but relatively small values of $i$, which limits the computational effort needed in each iteration.

Derivation of the initial subvectors $\pi_0$ and $\pi_1$. From the first equation of $\pi Q = 0$,

$$\sum_{i=0}^{\infty} \pi_i B_{i0} = 0,$$

and we may write

$$\pi_0 B_{00} + \sum_{i=1}^{\infty} \pi_i B_{i0}
= \pi_0 B_{00} + \sum_{i=1}^{\infty} \pi_1 R^{i-1} B_{i0}
= \pi_0 B_{00} + \pi_1 \sum_{i=1}^{\infty} R^{i-1} B_{i0} = 0. \tag{10}$$

From the second equation of $\pi Q = 0$,

$$\pi_0 B_{01} + \sum_{i=1}^{\infty} \pi_i B_{i1} = 0, \quad \text{i.e.,} \quad \pi_0 B_{01} + \pi_1 \sum_{i=1}^{\infty} R^{i-1} B_{i1} = 0. \tag{11}$$

In matrix form, we can compute $\pi_0$ and $\pi_1$ from

$$(\pi_0, \pi_1) \begin{pmatrix} B_{00} & B_{01} \\ \sum_{i=1}^{\infty} R^{i-1} B_{i0} & \sum_{i=1}^{\infty} R^{i-1} B_{i1} \end{pmatrix} = (0, 0).$$

Once found, normalize by dividing by

$$\alpha = \pi_0 e + \pi_1 \left( \sum_{k=1}^{\infty} R^{k-1} \right) e = \pi_0 e + \pi_1 (I - R)^{-1} e.$$

For discrete-time Markov chains, replace $-A_1^{-1}$ with $(I - A_1)^{-1}$.

Same example as before, but with additional transitions ($\xi_1 = .25$ and $\xi_2 = .75$) to lower non-neighboring states.

[Figure 2: State transition diagram for a GI/M/1-type process.]

The infinitesimal generator (columns grouped by level: two boundary states, then three states per level) is

$$Q = \begin{pmatrix}
-6 & 5.0 & 1 & & & & & & & & \\
3 & -3.5 & & & .5 & & & & & & \\
 & & -6 & 5 & & 1 & & & & & \\
2.00 & 2.00 & 3 & -12 & 5 & & & & & & \\
 & & & 3 & -3.5 & & & .5 & & & \\
.25 & & & & & -6.25 & 5 & & 1 & & \\
 & & & 4 & & 3 & -12 & 5.00 & & & \\
 & .75 & & & & & 3 & -4.25 & & & .5 \\
 & & .25 & & & & & & -6.25 & 5 & \\
 & & & & & 4 & & & 3 & -12 & 5.00 \\
 & & & & .75 & & & & & 3 & -4.25 \\
 & & & & & & \ddots & & \ddots & \ddots & \ddots
\end{pmatrix}$$

The computation of the matrix $R$ proceeds as previously:

$$A_1^{-1} = \begin{pmatrix} -.2233 & -.1318 & -.1550 \\ -.0791 & -.1647 & -.1938 \\ -.0558 & -.1163 & -.3721 \end{pmatrix}$$

which allows us to compute

$$A_0 A_1^{-1} = \begin{pmatrix} -.2233 & -.1318 & -.1550 \\ 0 & 0 & 0 \\ -.0279 & -.0581 & -.1860 \end{pmatrix}, \qquad
A_2 A_1^{-1} = \begin{pmatrix} 0 & 0 & 0 \\ -.3163 & -.6589 & -.7752 \\ 0 & 0 & 0 \end{pmatrix},$$

$$A_3 A_1^{-1} = \begin{pmatrix} -.0558 & -.0329 & -.0388 \\ 0 & 0 & 0 \\ -.0419 & -.0872 & -.2791 \end{pmatrix}.$$

The iterative process is

$$R_{(k+1)} = \begin{pmatrix} .2233 & .1318 & .1550 \\ 0 & 0 & 0 \\ .0279 & .0581 & .1860 \end{pmatrix}
+ R_{(k)}^2 \begin{pmatrix} 0 & 0 & 0 \\ .3163 & .6589 & .7752 \\ 0 & 0 & 0 \end{pmatrix}
+ R_{(k)}^3 \begin{pmatrix} .0558 & .0329 & .0388 \\ 0 & 0 & 0 \\ .0419 & .0872 & .2791 \end{pmatrix}$$

Iterating successively, beginning with $R_{(0)} = 0$, we find

$$R_{(1)} = \begin{pmatrix} .2233 & .1318 & .1550 \\ 0 & 0 & 0 \\ .0279 & .0581 & .1860 \end{pmatrix}, \quad
R_{(2)} = \begin{pmatrix} .2370 & .1593 & .1910 \\ 0 & 0 & 0 \\ .0331 & .0686 & .1999 \end{pmatrix},$$

$$R_{(3)} = \begin{pmatrix} .2415 & .1684 & .2031 \\ 0 & 0 & 0 \\ .0347 & .0719 & .2043 \end{pmatrix}, \quad \cdots$$

After 27 iterations, successive differences are less than $10^{-12}$, at which point

$$R_{(27)} = \begin{pmatrix} .2440 & .1734 & .2100 \\ 0 & 0 & 0 \\ .0356 & .0736 & .2067 \end{pmatrix}.$$

The boundary conditions are now

$$(\pi_0, \pi_1) \begin{pmatrix} B_{00} & B_{01} \\ B_{10} + RB_{20} & B_{11} + RB_{21} + R^2 B_{31} \end{pmatrix} = (0, 0)$$
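The lower-Hessenberg iteration differs from the QBD one only by the extra $R^3 A_3$ term. A pure-Python sketch follows; the blocks $A_0$–$A_3$ are my reconstruction from the generator (an assumption, but one that reproduces the $A_k A_1^{-1}$ matrices shown above).

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_inv(M):
    # Gauss-Jordan elimination with partial pivoting.
    n = len(M)
    aug = [list(M[i]) + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0.0:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

# Assumed blocks of the GI/M/1-type example.
A0 = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.5]]
A1 = [[-6.25, 5.0, 0.0], [3.0, -12.0, 5.0], [0.0, 3.0, -4.25]]
A2 = [[0.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 0.0]]
A3 = [[0.25, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.75]]
A1inv = mat_inv(A1)

# Iterate R(l+1) = -(A0 + R^2 A2 + R^3 A3) A1^{-1}, starting from R(0) = 0.
R = [[0.0] * 3 for _ in range(3)]
for _ in range(100):
    R2 = matmul(R, R)
    R3 = matmul(R2, R)
    S = matadd(A0, matadd(matmul(R2, A2), matmul(R3, A3)))
    R = [[-v for v in row] for row in matmul(S, A1inv)]

print(R)  # first row ~ (.2440, .1734, .2100)
```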

$$(\pi_0, \pi_1) \begin{pmatrix}
-6.0 & 5.0 & 1 & 0 & 0 \\
3.0 & -3.5 & 0 & 0 & .5 \\
.0610 & .1575 & -5.9832 & 5.6938 & .0710 \\
2.0000 & 2.000 & 3.000 & -12.0000 & 5.0000 \\
.0089 & .1555 & .0040 & 3.2945 & -3.4624
\end{pmatrix} = (0, 0).$$

Solve this by replacing the last equation with $\pi_0(1) = 1$:

$$(\pi_0, \pi_1) \begin{pmatrix}
-6.0 & 5.0 & 1 & 0 & 1 \\
3.0 & -3.5 & 0 & 0 & 0 \\
.0610 & .1575 & -5.9832 & 5.6938 & 0 \\
2.0000 & 2.000 & 3.000 & -12.0000 & 0 \\
.0089 & .1555 & .0040 & 3.2945 & 0
\end{pmatrix} = (0, 0 \,|\, 0, 0, 1)$$

Solution: $(\pi_0, \pi_1) = (1.0, 1.7169 \,|\, .3730, .4095, .8470)$.

The normalization constant is

$$\alpha = \pi_0 e + \pi_1 (I - R)^{-1} e
= (1.0, 1.7169)\,e + (.3730, .4095, .8470) \begin{pmatrix} 1.3395 & .2584 & .3546 \\ 0 & 1 & 0 \\ .0600 & .1044 & 1.2764 \end{pmatrix} e
= 2.7169 + 2.3582 = 5.0751$$

Thus $\pi_0/\alpha = (.1970, .3383)$ and $\pi_1/\alpha = (.0735, .0807, .1669)$. Successive subcomponents are now computed from $\pi_k = \pi_{k-1} R$.

$$\pi_2 = \pi_1 R = (.0735, .0807, .1669) \begin{pmatrix} .2440 & .1734 & .2100 \\ 0 & 0 & 0 \\ .0356 & .0736 & .2067 \end{pmatrix} = (.0239, .0250, .0499)$$

and

$$\pi_3 = \pi_2 R = (.0239, .0250, .0499) \begin{pmatrix} .2440 & .1734 & .2100 \\ 0 & 0 & 0 \\ .0356 & .0736 & .2067 \end{pmatrix} = (.0076, .0078, .0153)$$

and so on.

Simplifications occur when the initial B blocks have the same dimensions as the A blocks and when

$$Q = \begin{pmatrix}
B_{00} & A_0 & 0 & 0 & 0 & \cdots \\
B_{10} & A_1 & A_0 & 0 & 0 & \cdots \\
B_{20} & A_2 & A_1 & A_0 & 0 & \cdots \\
B_{30} & A_3 & A_2 & A_1 & A_0 & \cdots \\
B_{40} & A_4 & A_3 & A_2 & A_1 & \ddots \\
\vdots & \vdots & \ddots & \ddots & \ddots & \ddots
\end{pmatrix}$$

In this case $\pi_i = \pi_0 R^i$ for $i = 1, 2, \ldots$; the matrix $\sum_{i=0}^{\infty} R^i B_{i0}$ is an infinitesimal generator matrix; and $\pi_0$ is the stationary probability vector of $\sum_{i=0}^{\infty} R^i B_{i0}$, normalized so that $\pi_0 (I - R)^{-1} e = 1$.

Also, in some applications more than two boundary columns can occur:

$$Q = \begin{pmatrix}
B_{00} & B_{01} & B_{02} & A_0 & & & & \\
B_{10} & B_{11} & B_{12} & A_1 & A_0 & & & \\
B_{20} & B_{21} & B_{22} & A_2 & A_1 & A_0 & & \\
B_{30} & B_{31} & B_{32} & A_3 & A_2 & A_1 & A_0 & \\
B_{40} & B_{41} & B_{42} & A_4 & A_3 & A_2 & A_1 & A_0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \ddots & \ddots & \ddots
\end{pmatrix}$$

At present, this matrix is not block lower-Hessenberg.

Restructured, by grouping levels three at a time, it takes the block lower-Hessenberg form

$$Q = \begin{pmatrix}
\mathcal{B}_{00} & \mathcal{A}_0 & & \\
\mathcal{B}_{10} & \mathcal{A}_1 & \mathcal{A}_0 & \\
\mathcal{B}_{20} & \mathcal{A}_2 & \mathcal{A}_1 & \mathcal{A}_0 \\
\vdots & \ddots & \ddots & \ddots
\end{pmatrix}$$

where

$$\mathcal{A}_0 = \begin{pmatrix} A_0 & & \\ A_1 & A_0 & \\ A_2 & A_1 & A_0 \end{pmatrix}, \quad
\mathcal{A}_1 = \begin{pmatrix} A_3 & A_2 & A_1 \\ A_4 & A_3 & A_2 \\ A_5 & A_4 & A_3 \end{pmatrix}, \quad
\mathcal{B}_{00} = \begin{pmatrix} B_{00} & B_{01} & B_{02} \\ B_{10} & B_{11} & B_{12} \\ B_{20} & B_{21} & B_{22} \end{pmatrix}, \quad \cdots$$

Block Upper-Hessenberg Markov Chains

For QBD and GI/M/1-type processes, we posed the problem in terms of continuous-time Markov chains. Discrete-time Markov chains can be treated if the matrix inverse $A_1^{-1}$ is replaced with the inverse $(I - A_1)^{-1}$. This time we shall consider the discrete-time case.

$$P = \begin{pmatrix}
B_{00} & B_{01} & B_{02} & B_{03} & \cdots & B_{0j} & \cdots \\
B_{10} & A_1 & A_2 & A_3 & \cdots & A_j & \cdots \\
0 & A_0 & A_1 & A_2 & \cdots & A_{j-1} & \cdots \\
0 & 0 & A_0 & A_1 & \cdots & A_{j-2} & \cdots \\
0 & 0 & 0 & A_0 & \cdots & A_{j-3} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \ddots
\end{pmatrix}$$

$A = \sum_{i=0}^{\infty} A_i$ is a stochastic matrix assumed to be irreducible, with $\pi_A A = \pi_A$ and $\pi_A e = 1$. $P$ is known to be positive-recurrent if

$$\pi_A \left( \sum_{i=1}^{\infty} i A_i \right) e \equiv \pi_A b < 1. \tag{12}$$

We seek to compute $\pi$ from $\pi P = \pi$. As before, we partition $\pi$ conformally with $P$, i.e., $\pi = (\pi_0, \pi_1, \pi_2, \cdots)$ where $\pi_i = (\pi(i,1), \pi(i,2), \cdots, \pi(i,K))$.

The analysis of M/G/1-type processes is more complicated than that of QBD or GI/M/1-type processes because the subvectors $\pi_i$ no longer have a matrix geometric relationship with one another.

The key to solving upper block-Hessenberg structured Markov chains is the computation of a certain stochastic matrix $G$. Its element $G_{ij}$ is the conditional probability that, starting in state $i$ of any level $n \geq 2$, the process enters level $n-1$ for the first time by arriving at state $j$ of that level.

This matrix satisfies the fixed-point equation

$$G = \sum_{i=0}^{\infty} A_i G^i$$

and is indeed the minimal non-negative solution of

$$X = \sum_{i=0}^{\infty} A_i X^i.$$

It can be found by means of the iteration

$$G_{(0)} = 0; \qquad G_{(k+1)} = \sum_{i=0}^{\infty} A_i G_{(k)}^i, \quad k = 0, 1, \ldots$$

Once the matrix $G$ has been computed, successive components of $\pi$ can be found. From $\pi P = \pi$, i.e., $\pi(I - P) = 0$:

$$(\pi_0, \pi_1, \cdots, \pi_j, \cdots) \begin{pmatrix}
I - B_{00} & -B_{01} & -B_{02} & -B_{03} & \cdots & -B_{0j} & \cdots \\
-B_{10} & I - A_1 & -A_2 & -A_3 & \cdots & -A_j & \cdots \\
0 & -A_0 & I - A_1 & -A_2 & \cdots & -A_{j-1} & \cdots \\
0 & 0 & -A_0 & I - A_1 & \cdots & -A_{j-2} & \cdots \\
0 & 0 & 0 & -A_0 & \cdots & -A_{j-3} & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \ddots
\end{pmatrix} = (0, 0, \cdots, 0, \cdots). \tag{13}$$

The submatrix in the lower right block is block Toeplitz. There is a decomposition of this Toeplitz matrix into a block upper-triangular matrix $U$ and a block lower-triangular matrix $L$.

$$U = \begin{pmatrix}
A_1^* & A_2^* & A_3^* & A_4^* & \cdots \\
0 & A_1^* & A_2^* & A_3^* & \cdots \\
0 & 0 & A_1^* & A_2^* & \cdots \\
0 & 0 & 0 & A_1^* & \cdots \\
\vdots & & & \ddots & \ddots
\end{pmatrix} \quad \text{and} \quad
L = \begin{pmatrix}
I & 0 & 0 & 0 & \cdots \\
-G & I & 0 & 0 & \cdots \\
0 & -G & I & 0 & \cdots \\
0 & 0 & -G & I & \cdots \\
\vdots & & \ddots & \ddots & \ddots
\end{pmatrix}.$$

Once the matrix $G$ has been formed, $L$ is known. The inverse of $L$ can be written in terms of the powers of $G$:

$$\begin{pmatrix}
I & 0 & 0 & 0 & \cdots \\
-G & I & 0 & 0 & \cdots \\
0 & -G & I & 0 & \cdots \\
0 & 0 & -G & I & \cdots \\
\vdots & & \ddots & \ddots & \ddots
\end{pmatrix}^{-1} = \begin{pmatrix}
I & 0 & 0 & 0 & \cdots \\
G & I & 0 & 0 & \cdots \\
G^2 & G & I & 0 & \cdots \\
G^3 & G^2 & G & I & \cdots \\
\vdots & & \ddots & \ddots & \ddots
\end{pmatrix}.$$

From Equation (13),

$$(\pi_0, \pi_1, \cdots, \pi_j, \cdots) \begin{pmatrix}
I - B_{00} & -B_{01} & -B_{02} & -B_{03} & \cdots \\
-B_{10} & & & & \\
0 & & \multicolumn{3}{c}{UL} \\
0 & & & & \\
\vdots & & & &
\end{pmatrix} = (0, 0, \cdots, 0, \cdots)$$

which allows us to write

$$\pi_0 (-B_{01}, -B_{02}, \cdots) + (\pi_1, \pi_2, \cdots) UL = 0$$

or

$$\pi_0 (B_{01}, B_{02}, \cdots) L^{-1} = (\pi_1, \pi_2, \cdots) U.$$

$$\pi_0 (B_{01}, B_{02}, \cdots) \begin{pmatrix}
I & 0 & 0 & 0 & \cdots \\
G & I & 0 & 0 & \cdots \\
G^2 & G & I & 0 & \cdots \\
G^3 & G^2 & G & I & \cdots \\
\vdots & & \ddots & \ddots & \ddots
\end{pmatrix} = (\pi_1, \pi_2, \cdots) U$$

i.e.,

$$\pi_0 (B_{01}^*, B_{02}^*, \cdots) = (\pi_1, \pi_2, \cdots) U \tag{14}$$

where

$$B_{01}^* = B_{01} + B_{02} G + B_{03} G^2 + \cdots = \sum_{k=1}^{\infty} B_{0k} G^{k-1}$$
$$B_{02}^* = B_{02} + B_{03} G + B_{04} G^2 + \cdots = \sum_{k=2}^{\infty} B_{0k} G^{k-2}$$
$$\vdots$$
$$B_{0i}^* = B_{0i} + B_{0,i+1} G + B_{0,i+2} G^2 + \cdots = \sum_{k=i}^{\infty} B_{0k} G^{k-i}$$

We can compute the successive components of $\pi$ once $\pi_0$ and $U$ are known:

$$\pi_0 (B_{01}^*, B_{02}^*, \cdots) = (\pi_1, \pi_2, \cdots) \begin{pmatrix}
A_1^* & A_2^* & A_3^* & A_4^* & \cdots \\
0 & A_1^* & A_2^* & A_3^* & \cdots \\
0 & 0 & A_1^* & A_2^* & \cdots \\
0 & 0 & 0 & A_1^* & \cdots \\
\vdots & & & \ddots & \ddots
\end{pmatrix}$$

Observe that

$$\pi_0 B_{01}^* = \pi_1 A_1^* \implies \pi_1 = \pi_0 B_{01}^* \, {A_1^*}^{-1}$$
$$\pi_0 B_{02}^* = \pi_1 A_2^* + \pi_2 A_1^* \implies \pi_2 = \left( \pi_0 B_{02}^* - \pi_1 A_2^* \right) {A_1^*}^{-1}$$
$$\pi_0 B_{03}^* = \pi_1 A_3^* + \pi_2 A_2^* + \pi_3 A_1^* \implies \pi_3 = \left( \pi_0 B_{03}^* - \pi_1 A_3^* - \pi_2 A_2^* \right) {A_1^*}^{-1}$$
$$\vdots$$

In general:

$$\pi_i = \left( \pi_0 B_{0i}^* - \pi_1 A_i^* - \pi_2 A_{i-1}^* - \cdots - \pi_{i-1} A_2^* \right) {A_1^*}^{-1}
= \left( \pi_0 B_{0i}^* - \sum_{k=1}^{i-1} \pi_k A_{i-k+1}^* \right) {A_1^*}^{-1}, \quad i = 1, 2, \ldots$$

First subvector $\pi_0$: from $\pi_0 (B_{01}^*, B_{02}^*, \cdots) = (\pi_1, \pi_2, \cdots) U$,

$$(\pi_0, \pi_1, \cdots, \pi_j, \cdots) \begin{pmatrix}
I - B_{00} & -B_{01}^* & -B_{02}^* & -B_{03}^* & \cdots & -B_{0j}^* & \cdots \\
-B_{10} & A_1^* & A_2^* & A_3^* & \cdots & A_j^* & \cdots \\
0 & 0 & A_1^* & A_2^* & \cdots & A_{j-1}^* & \cdots \\
0 & 0 & 0 & A_1^* & \cdots & A_{j-2}^* & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \ddots
\end{pmatrix} = (0, 0, \cdots, 0, \cdots)$$

First two equations:

$$\pi_0 (I - B_{00}) - \pi_1 B_{10} = 0, \qquad -\pi_0 B_{01}^* + \pi_1 A_1^* = 0.$$

The second gives

$$\pi_1 = \pi_0 B_{01}^* \, {A_1^*}^{-1}.$$

Substituting into the first,

$$\pi_0 (I - B_{00}) - \pi_0 B_{01}^* \, {A_1^*}^{-1} B_{10} = 0$$

or

$$\pi_0 \left( I - B_{00} - B_{01}^* \, {A_1^*}^{-1} B_{10} \right) = 0.$$

We can now compute $\pi_0$ to a multiplicative constant. To normalize, enforce the condition

$$\pi_0 e + \pi_0 \left( \sum_{i=1}^{\infty} B_{0i}^* \right) \left( \sum_{i=1}^{\infty} A_i^* \right)^{-1} e = 1. \tag{15}$$

Computation of the matrix $U$ from

$$UL = \begin{pmatrix}
I - A_1 & -A_2 & -A_3 & \cdots & -A_j & \cdots \\
-A_0 & I - A_1 & -A_2 & \cdots & -A_{j-1} & \cdots \\
0 & -A_0 & I - A_1 & \cdots & -A_{j-2} & \cdots \\
0 & 0 & -A_0 & \cdots & -A_{j-3} & \cdots \\
\vdots & \vdots & \vdots & \ddots & \vdots & \ddots
\end{pmatrix}.$$

$$\begin{pmatrix}
A_1^* & A_2^* & A_3^* & A_4^* & \cdots \\
0 & A_1^* & A_2^* & A_3^* & \cdots \\
0 & 0 & A_1^* & A_2^* & \cdots \\
\vdots & & \ddots & \ddots & \ddots
\end{pmatrix} = \begin{pmatrix}
I - A_1 & -A_2 & -A_3 & \cdots \\
-A_0 & I - A_1 & -A_2 & \cdots \\
0 & -A_0 & I - A_1 & \cdots \\
\vdots & \ddots & \ddots & \ddots
\end{pmatrix} \begin{pmatrix}
I & 0 & 0 & \cdots \\
G & I & 0 & \cdots \\
G^2 & G & I & \cdots \\
\vdots & \ddots & \ddots & \ddots
\end{pmatrix}$$

so that

$$A_1^* = I - A_1 - A_2 G - A_3 G^2 - A_4 G^3 - \cdots = I - \sum_{k=1}^{\infty} A_k G^{k-1}$$
$$A_2^* = -A_2 - A_3 G - A_4 G^2 - A_5 G^3 - \cdots = -\sum_{k=2}^{\infty} A_k G^{k-2}$$

$$A_1^* = I - A_1 - A_2 G - A_3 G^2 - A_4 G^3 - \cdots = I - \sum_{k=1}^{\infty} A_k G^{k-1}$$
$$A_2^* = -A_2 - A_3 G - A_4 G^2 - A_5 G^3 - \cdots = -\sum_{k=2}^{\infty} A_k G^{k-2}$$
$$A_3^* = -A_3 - A_4 G - A_5 G^2 - A_6 G^3 - \cdots = -\sum_{k=3}^{\infty} A_k G^{k-3}$$
$$\vdots$$
$$A_i^* = -A_i - A_{i+1} G - A_{i+2} G^2 - A_{i+3} G^3 - \cdots = -\sum_{k=i}^{\infty} A_k G^{k-i}, \quad i \geq 2.$$

We now have all the results we need.

The basic algorithm is:

• Construct the matrix $G$.
• Obtain $\pi_0$ by solving the system of equations
$$\pi_0 \left( I - B_{00} - B_{01}^* \, {A_1^*}^{-1} B_{10} \right) = 0,$$
subject to the normalizing condition, Equation (15).
• Compute $\pi_1$ from $\pi_1 = \pi_0 B_{01}^* \, {A_1^*}^{-1}$.
• Find all other required $\pi_i$ from
$$\pi_i = \left( \pi_0 B_{0i}^* - \sum_{k=1}^{i-1} \pi_k A_{i-k+1}^* \right) {A_1^*}^{-1},$$
where
$$B_{0i}^* = \sum_{k=i}^{\infty} B_{0k} G^{k-i}, \quad i \geq 1; \qquad
A_1^* = I - \sum_{k=1}^{\infty} A_k G^{k-1}; \qquad
A_i^* = -\sum_{k=i}^{\infty} A_k G^{k-i}, \quad i \geq 2.$$

Computational questions:

(1) The matrix $G$. The iterative procedure suggested is very slow:

$$G_{(0)} = 0; \qquad G_{(k+1)} = \sum_{i=0}^{\infty} A_i G_{(k)}^i, \quad k = 0, 1, \ldots$$

A faster variant is due to Neuts:

$$G_{(0)} = 0; \qquad G_{(k+1)} = (I - A_1)^{-1} \left( A_0 + \sum_{i=2}^{\infty} A_i G_{(k)}^i \right), \quad k = 0, 1, \ldots$$

Among fixed-point iterations, that of Bini and Meini has the fastest convergence:

$$G_{(0)} = 0; \qquad G_{(k+1)} = \left( I - \sum_{i=1}^{\infty} A_i G_{(k)}^{i-1} \right)^{-1} A_0, \quad k = 0, 1, \ldots$$

More advanced techniques based on cyclic reduction have been developed and converge much faster.

(2) Computation of infinite summations. Frequently the structure of the matrix is such that $A_k$ and $B_k$ are zero for all but relatively small values of $k$. Since $\sum_{k=0}^{\infty} A_k$ and $\sum_{k=0}^{\infty} B_k$ are stochastic, $A_k$ and $B_k$ are negligibly small for large values of $k$ and can be set to zero once $k$ exceeds some threshold $k_M$.

When $k_M$ is not small, finite summations of the type $\sum_{k=i}^{k_M} Z_k G^{k-i}$ should be evaluated using Horner's rule. For example, if $k_M = 5$,

$$Z_1^* = \sum_{k=1}^{5} Z_k G^{k-1} = Z_1 + Z_2 G + Z_3 G^2 + Z_4 G^3 + Z_5 G^4$$

should be evaluated from the innermost parenthesis outwards as

$$Z_1^* = Z_1 + \left( Z_2 + \left( Z_3 + \left( Z_4 + Z_5 G \right) G \right) G \right) G.$$
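The nesting trick replaces $O(k_M^2)$ matrix products with $k_M - 1$ of them. A small sketch (with arbitrary illustrative $2\times 2$ matrices, not taken from the example) that checks the Horner arrangement against the naive power sum:

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

G = [[0.0, 1.0], [0.5, 0.5]]
Zs = [[[1.0, 0.0], [0.0, 1.0]],   # Z1 ... Z5 (hypothetical coefficients)
      [[0.5, 0.2], [0.1, 0.3]],
      [[0.2, 0.0], [0.0, 0.2]],
      [[0.1, 0.1], [0.1, 0.1]],
      [[0.3, 0.0], [0.2, 0.1]]]

# Horner: start from Z5 and repeatedly right-multiply by G and add.
acc = Zs[-1]
for Z in reversed(Zs[:-1]):
    acc = matadd(Z, matmul(acc, G))

# Naive evaluation of Z1 + Z2 G + Z3 G^2 + Z4 G^3 + Z5 G^4 for comparison.
naive = Zs[0]
Gp = [[1.0, 0.0], [0.0, 1.0]]  # running power of G
for Z in Zs[1:]:
    Gp = matmul(Gp, G)
    naive = matadd(naive, matmul(Z, Gp))

print(acc == naive or acc)  # the two evaluations agree (up to round-off)
```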

Example: Same as before, but incorporating additional transitions ($\zeta_1 = 1/48$ and $\zeta_2 = 1/16$) to higher-numbered non-neighboring states.

[Figure 3: State transition diagram for an M/G/1-type process.]

The transition probability matrix (columns grouped by level: two boundary states, then three states per level) is

$$P = \begin{pmatrix}
23/48 & 5/12 & 1/12 & & & 1/48 & & & & & \\
1/4 & 31/48 & & & 1/24 & & & 1/16 & & & \\
 & & 23/48 & 5/12 & & 1/12 & & & 1/48 & & \\
1/3 & 1/3 & 1/4 & & 1/12 & & & & & & \\
 & & & 1/4 & 31/48 & & & 1/24 & & & 1/16 \\
 & & & & & 23/48 & 5/12 & & 1/12 & & \\
 & & & 2/3 & & 1/4 & & 1/12 & & & \\
 & & & & & & 1/4 & 31/48 & & & 1/24 \\
 & & & & & & & & \ddots & \ddots & \ddots
\end{pmatrix}$$

$$A_0 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 2/3 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad
A_1 = \begin{pmatrix} 23/48 & 5/12 & 0 \\ 1/4 & 0 & 1/12 \\ 0 & 1/4 & 31/48 \end{pmatrix}, \quad
A_2 = \begin{pmatrix} 1/12 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1/24 \end{pmatrix},$$

$$A_3 = \begin{pmatrix} 1/48 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1/16 \end{pmatrix}, \quad
B_{00} = \begin{pmatrix} 23/48 & 5/12 \\ 1/4 & 31/48 \end{pmatrix}, \quad
B_{01} = \begin{pmatrix} 1/12 & 0 & 0 \\ 0 & 0 & 1/24 \end{pmatrix},$$

$$B_{02} = \begin{pmatrix} 1/48 & 0 & 0 \\ 0 & 0 & 1/16 \end{pmatrix} \quad \text{and} \quad
B_{10} = \begin{pmatrix} 0 & 0 \\ 1/3 & 1/3 \\ 0 & 0 \end{pmatrix}.$$

First, using Equation (12), we verify that the Markov chain with transition probability matrix $P$ is positive-recurrent.

$$A = A_0 + A_1 + A_2 + A_3 = \begin{pmatrix} .583333 & .416667 & 0 \\ .250000 & .666667 & .083333 \\ 0 & .250000 & .750000 \end{pmatrix}$$

with $\pi_A = (.310345, .517241, .172414)$. Also,

$$b = (A_1 + 2A_2 + 3A_3)\,e = \begin{pmatrix} .708333 & .416667 & 0 \\ .250000 & 0 & .083333 \\ 0 & .250000 & .916667 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1.125000 \\ 0.333333 \\ 1.166667 \end{pmatrix}.$$

The Markov chain is positive-recurrent since

$$\pi_A b = (.310345, .517241, .172414) \begin{pmatrix} 1.125000 \\ 0.333333 \\ 1.166667 \end{pmatrix} = .722701 < 1.$$
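Equation (12)'s drift test is straightforward to verify numerically. A sketch using power iteration for $\pi_A$ (valid here because $A$ is stochastic, irreducible and aperiodic):

```python
# Positive-recurrence check for the M/G/1-type example.
A = [[0.583333, 0.416667, 0.0],
     [0.25, 0.666667, 0.083333],
     [0.0, 0.25, 0.75]]
b = [1.125, 1.0 / 3.0, 7.0 / 6.0]  # b = (A1 + 2 A2 + 3 A3) e

# Stationary vector of the stochastic matrix A by power iteration.
pi = [1.0 / 3.0] * 3
for _ in range(2000):
    pi = [sum(pi[i] * A[i][j] for i in range(3)) for j in range(3)]

drift = sum(pi[i] * b[i] for i in range(3))
print(pi)    # ~ (.310345, .517241, .172414)
print(drift) # ~ .722701 < 1, so the chain is positive-recurrent
```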

Computation of the matrix $G$: the $ij$ element of $G$ is the conditional probability that, starting in state $i$ of any level $n \geq 2$, the process enters level $n-1$ for the first time by arriving at state $j$ of that level.

For this particular example, this means that the elements in column 2 of $G$ must all be equal to 1 and all other elements must be zero: the only transitions from any level $n$ to level $n-1$ are from and to the second element.

Nevertheless, let us see how each of the three different fixed-point formulas actually performs.

We take the initial value, $G_{(0)}$, to be zero.

Formula #1: $G_{(k+1)} = \sum_{i=0}^{\infty} A_i G_{(k)}^i$, $k = 0, 1, \ldots$, i.e.,

$$G_{(k+1)} = A_0 + A_1 G_{(k)} + A_2 G_{(k)}^2 + A_3 G_{(k)}^3.$$

After 10 iterations, the computed matrix is

$$G_{(10)} = \begin{pmatrix} 0 & .867394 & 0 \\ 0 & .937152 & 0 \\ 0 & .766886 & 0 \end{pmatrix}.$$

Formula #2: $G_{(k+1)} = (I - A_1)^{-1} \left( A_0 + \sum_{i=2}^{\infty} A_i G_{(k)}^i \right)$, $k = 0, 1, \ldots$, i.e.,

$$G_{(k+1)} = (I - A_1)^{-1} \left( A_0 + A_2 G_{(k)}^2 + A_3 G_{(k)}^3 \right).$$

After 10 iterations:

$$G_{(10)} = \begin{pmatrix} 0 & .999844 & 0 \\ 0 & .999934 & 0 \\ 0 & .999677 & 0 \end{pmatrix}.$$

Formula #3: $G_{(k+1)} = \left( I - \sum_{i=1}^{\infty} A_i G_{(k)}^{i-1} \right)^{-1} A_0$, $k = 0, 1, \ldots$, i.e.,

$$G_{(k+1)} = \left( I - A_1 - A_2 G_{(k)} - A_3 G_{(k)}^2 \right)^{-1} A_0.$$

This is the fastest of the three. After 10 iterations:

$$G_{(10)} = \begin{pmatrix} 0 & .999954 & 0 \\ 0 & .999979 & 0 \\ 0 & .999889 & 0 \end{pmatrix}.$$
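The three fixed-point schemes can be compared side by side in a few dozen lines. A sketch using the example's $A_0, \ldots, A_3$ blocks:

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matsub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_inv(M):
    # Gauss-Jordan elimination with partial pivoting.
    n = len(M)
    aug = [list(M[i]) + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0.0:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

A0 = [[0.0, 0.0, 0.0], [0.0, 2 / 3, 0.0], [0.0, 0.0, 0.0]]
A1 = [[23 / 48, 5 / 12, 0.0], [0.25, 0.0, 1 / 12], [0.0, 0.25, 31 / 48]]
A2 = [[1 / 12, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1 / 24]]
A3 = [[1 / 48, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1 / 16]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
IA1inv = mat_inv(matsub(I3, A1))

def step1(G):  # G <- A0 + A1 G + A2 G^2 + A3 G^3
    G2 = matmul(G, G); G3 = matmul(G2, G)
    return matadd(matadd(A0, matmul(A1, G)), matadd(matmul(A2, G2), matmul(A3, G3)))

def step2(G):  # G <- (I - A1)^{-1} (A0 + A2 G^2 + A3 G^3)
    G2 = matmul(G, G); G3 = matmul(G2, G)
    return matmul(IA1inv, matadd(A0, matadd(matmul(A2, G2), matmul(A3, G3))))

def step3(G):  # G <- (I - A1 - A2 G - A3 G^2)^{-1} A0
    K = matsub(matsub(matsub(I3, A1), matmul(A2, G)), matmul(A3, matmul(G, G)))
    return matmul(mat_inv(K), A0)

def run(step, n=10):
    G = [[0.0] * 3 for _ in range(3)]
    for _ in range(n):
        G = step(G)
    return G

Ga, Gb, Gc = run(step1), run(step2), run(step3)
print(Ga[0][1], Gb[0][1], Gc[0][1])  # formula #1 lags well behind #2 and #3
```

All three iterates have non-zeros only in column 2, as the structure of $A_0$ dictates, and approach the exact $G$ (column 2 all ones) from below.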

We continue with the algorithm using the exact value of $G$. In preparation, we compute the following quantities, using the fact that $A_k = 0$ for $k > 3$ and $B_{0k} = 0$ for $k > 2$:

$$A_1^* = I - \sum_{k=1}^{\infty} A_k G^{k-1} = I - A_1 - A_2 G - A_3 G^2
= \begin{pmatrix} .520833 & -.520833 & 0 \\ -.250000 & 1 & -.083333 \\ 0 & -.354167 & .354167 \end{pmatrix}$$

$$A_2^* = -\sum_{k=2}^{\infty} A_k G^{k-2} = -(A_2 + A_3 G)
= \begin{pmatrix} -.083333 & -.020833 & 0 \\ 0 & 0 & 0 \\ 0 & -.062500 & -.041667 \end{pmatrix}$$

$$A_3^* = -\sum_{k=3}^{\infty} A_k G^{k-3} = -A_3
= \begin{pmatrix} -.020833 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -.062500 \end{pmatrix}$$

$$B_{01}^* = \sum_{k=1}^{\infty} B_{0k} G^{k-1} = B_{01} + B_{02} G
= \begin{pmatrix} .083333 & .020833 & 0 \\ 0 & .062500 & .041667 \end{pmatrix}$$

$$B_{02}^* = \sum_{k=2}^{\infty} B_{0k} G^{k-2} = B_{02}
= \begin{pmatrix} .020833 & 0 & 0 \\ 0 & 0 & .062500 \end{pmatrix}$$

$${A_1^*}^{-1} = \begin{pmatrix} 2.640 & 1.50 & .352941 \\ .720 & 1.50 & .352941 \\ .720 & 1.50 & 3.176470 \end{pmatrix}$$

Now compute the initial subvector, $\pi_0$, from

$$\pi_0 \left( I - B_{00} - B_{01}^* \, {A_1^*}^{-1} B_{10} \right)
= \pi_0 \begin{pmatrix} .468750 & -.468750 \\ -.302083 & .302083 \end{pmatrix} = 0,$$

which gives the (un-normalized) solution $\pi_0 = (.541701, .840571)$.

Normalization:

$$\pi_0 e + \pi_0 \left( \sum_{i=1}^{\infty} B_{0i}^* \right) \left( \sum_{i=1}^{\infty} A_i^* \right)^{-1} e = 1,$$

i.e.,

$$\pi_0 e + \pi_0 (B_{01}^* + B_{02}^*)(A_1^* + A_2^* + A_3^*)^{-1} e = 1.$$

Evaluating,

$$(B_{01}^* + B_{02}^*)(A_1^* + A_2^* + A_3^*)^{-1}
= \begin{pmatrix} .104167 & .020833 & 0 \\ 0 & .062500 & .104167 \end{pmatrix}
\begin{pmatrix} .416667 & -.541667 & 0 \\ -.250000 & 1 & -.083333 \\ 0 & -.416667 & .250000 \end{pmatrix}^{-1}
= \begin{pmatrix} .424870 & .291451 & .097150 \\ .264249 & .440415 & .563472 \end{pmatrix}$$

$$(.541701, .840571) \begin{pmatrix} 1 \\ 1 \end{pmatrix}
+ (.541701, .840571) \begin{pmatrix} .424870 & .291451 & .097150 \\ .264249 & .440415 & .563472 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = 2.888888$$

Finally, the initial subvector is

$$\pi_0 = (.541701, .840571)/2.888888 = (.187512, .290967).$$

We can now find $\pi_1$ from the relationship $\pi_1 = \pi_0 B_{01}^* \, {A_1^*}^{-1}$:

$$\pi_1 = (.187512, .290967) \begin{pmatrix} .083333 & .020833 & 0 \\ 0 & .062500 & .041667 \end{pmatrix}
\begin{pmatrix} 2.640 & 1.50 & .352941 \\ .720 & 1.50 & .352941 \\ .720 & 1.50 & 3.176470 \end{pmatrix}
= (.065888, .074762, .051823).$$

Finally, all needed remaining subcomponents of $\pi$ can be found from

$$\pi_i = \left( \pi_0 B_{0i}^* - \sum_{k=1}^{i-1} \pi_k A_{i-k+1}^* \right) {A_1^*}^{-1}:$$

$$\pi_2 = \left( \pi_0 B_{02}^* - \pi_1 A_2^* \right) {A_1^*}^{-1} = (.042777, .051530, .069569)$$
$$\pi_3 = \left( \pi_0 B_{03}^* - \pi_1 A_3^* - \pi_2 A_2^* \right) {A_1^*}^{-1} = \left( -\pi_1 A_3^* - \pi_2 A_2^* \right) {A_1^*}^{-1} = (.021226, .024471, .023088)$$
$$\pi_4 = \left( \pi_0 B_{04}^* - \pi_1 A_4^* - \pi_2 A_3^* - \pi_3 A_2^* \right) {A_1^*}^{-1} = \left( -\pi_2 A_3^* - \pi_3 A_2^* \right) {A_1^*}^{-1} = (.012203, .014783, .018471)$$
$$\vdots$$

(Here $B_{03}^* = B_{04}^* = 0$ and $A_4^* = 0$, since $B_{0k} = 0$ for $k > 2$ and $A_k = 0$ for $k > 3$.)
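The recurrence above is just a handful of vector-matrix products. A sketch that recovers $\pi_1$ and $\pi_2$ from the starred matrices printed on the preceding slides:

```python
# Recovering pi_1 and pi_2 for the M/G/1-type example from pi_0 and the
# starred matrices (values copied from the worked example above).
def vecmat(v, M):
    # Row vector times matrix.
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

pi0 = [0.187512, 0.290967]
B01s = [[0.083333, 0.020833, 0.0], [0.0, 0.0625, 0.041667]]
B02s = [[0.020833, 0.0, 0.0], [0.0, 0.0, 0.0625]]
A2s = [[-0.083333, -0.020833, 0.0], [0.0, 0.0, 0.0], [0.0, -0.0625, -0.041667]]
A1s_inv = [[2.640, 1.50, 0.352941],
           [0.720, 1.50, 0.352941],
           [0.720, 1.50, 3.176470]]

# pi_1 = pi_0 B01* (A1*)^{-1}
pi1 = vecmat(vecmat(pi0, B01s), A1s_inv)      # ~ (.065888, .074762, .051823)

# pi_2 = (pi_0 B02* - pi_1 A2*) (A1*)^{-1}
t = [a - b for a, b in zip(vecmat(pi0, B02s), vecmat(pi1, A2s))]
pi2 = vecmat(t, A1s_inv)                      # ~ (.042777, .051530, .069569)
print(pi1, pi2)
```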

The probability that the Markov chain is in any level $i$ is given by $\| \pi_i \|_1$. Thus the probabilities of this Markov chain being in the first 5 levels are

$$\| \pi_0 \|_1 = .478479, \quad \| \pi_1 \|_1 = .192473, \quad \| \pi_2 \|_1 = .163876, \quad \| \pi_3 \|_1 = .068785, \quad \| \pi_4 \|_1 = .045457.$$

The sum of these five probabilities is 0.949070.

Phase-Type Distributions

Goals: (1) Find ways to model general distributions while maintaining the tractability of the exponential. (2) Find a way to form a distribution having some given expectation and variance.

Phase-type distributions are represented as the passage through a succession of exponential phases or stages (hence the name).

The Exponential Distribution consists of a single exponential phase. A random variable $X$ is exponentially distributed with parameter $\mu > 0$.

[Diagram: a single phase with rate $\mu$.] The diagram represents customers entering the phase from the left, spending an amount of time that is exponentially distributed with parameter $\mu$ within the phase, and then exiting to the right.

Exponential density function:

$$f_X(x) \equiv \frac{dF(x)}{dx} = \mu e^{-\mu x}, \quad x \geq 0.$$

Expectation and variance: $E[X] = 1/\mu$; $\sigma_X^2 = 1/\mu^2$.

The Erlang-2 Distribution. Service provided to a customer is expressed as one exponential phase followed by a second exponential phase.

[Diagram: two phases in series, each with rate $\mu$.]

Although both service phases are exponentially distributed with the same parameter, they are completely independent; the servicing process does not contain two independent servers.

Probability density function of each of the phases:

$$f_Y(y) = \mu e^{-\mu y}, \quad y \geq 0.$$

Expectation and variance: $E[Y] = 1/\mu$; $\sigma_Y^2 = 1/\mu^2$.

58. Total time in service is the sum of two independent and identically distributed exponential random variables, $X = Y + Y$:

$f_X(x) = \int_{-\infty}^{\infty} f_Y(y)\, f_Y(x-y)\, dy = \int_0^x \mu e^{-\mu y}\, \mu e^{-\mu(x-y)}\, dy = \int_0^x \mu^2 e^{-\mu x}\, dy = \mu^2 x e^{-\mu x}, \quad x \ge 0,$

and is equal to zero for $x \le 0$ — the Erlang-2 distribution, $E_2$. The corresponding cumulative distribution function is given by

$F_X(x) = 1 - e^{-\mu x} - \mu x e^{-\mu x} = 1 - e^{-\mu x}(1 + \mu x), \quad x \ge 0.$
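The convolution result can be checked numerically: integrating the density $\mu^2 x e^{-\mu x}$ from $0$ to $x$ should reproduce the closed-form CDF. A minimal pure-Python sketch (function names and the quadrature rule are illustrative):

```python
import math

def erlang2_pdf(x, mu):
    """Erlang-2 density: f(x) = mu^2 * x * exp(-mu*x)."""
    return mu * mu * x * math.exp(-mu * x)

def erlang2_cdf(x, mu):
    """Erlang-2 CDF: F(x) = 1 - exp(-mu*x) * (1 + mu*x)."""
    return 1.0 - math.exp(-mu * x) * (1.0 + mu * x)

def integrate(f, a, b, n=10_000):
    """Composite trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

mu, x = 2.0, 1.5
numeric = integrate(lambda t: erlang2_pdf(t, mu), 0.0, x)
closed = erlang2_cdf(x, mu)
print(abs(numeric - closed) < 1e-6)  # the two agree
```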

59. Laplace transform of the overall service time distribution:

$L_X(s) \equiv \int_0^\infty e^{-sx} f_X(x)\, dx.$

Laplace transform of each of the exponential phases:

$L_Y(s) \equiv \int_0^\infty e^{-sy} f_Y(y)\, dy.$

Then

$L_X(s) = E[e^{-sX}] = E[e^{-s(Y_1+Y_2)}] = E[e^{-sY_1}]\,E[e^{-sY_2}] = L_Y(s)\,L_Y(s) = \left(\frac{\mu}{s+\mu}\right)^2.$

To invert, look up tables of transform pairs.

60. Transform pair:

$\frac{x^r}{r!}\, e^{-ax} \;\Longleftrightarrow\; \frac{1}{(s+a)^{r+1}}. \quad (16)$

Setting $a = \mu$ and $r = 1$ allows us to invert $L_X(s)$ to obtain

$f_X(x) = \mu^2 x e^{-\mu x} = \mu(\mu x)e^{-\mu x}, \quad x \ge 0.$

Moments may be found from the Laplace transform as

$E[X^k] = (-1)^k \frac{d^k}{ds^k} L_X(s)\Big|_{s=0} \quad \text{for } k = 1, 2, \ldots$

$E[X] = -\frac{d}{ds} L_X(s)\Big|_{s=0} = -\mu^2 \frac{d}{ds}(s+\mu)^{-2}\Big|_{s=0} = 2\mu^2(s+\mu)^{-3}\Big|_{s=0} = \frac{2}{\mu}.$

Time spent in service is the sum of two iid random variables:

$E[X] = E[Y] + E[Y] = 1/\mu + 1/\mu = 2/\mu,$
$\sigma_X^2 = \sigma_Y^2 + \sigma_Y^2 = \left(\frac{1}{\mu}\right)^2 + \left(\frac{1}{\mu}\right)^2 = \frac{2}{\mu^2}.$
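The moment rule $E[X^k] = (-1)^k L_X^{(k)}(0)$ can be illustrated numerically by differentiating $L_X(s) = (\mu/(s+\mu))^2$ at $s = 0$ with central differences (a sketch; the function names and step size are illustrative choices):

```python
def laplace_erlang2(s, mu):
    """L_X(s) = (mu/(s+mu))^2 for the Erlang-2 distribution."""
    return (mu / (s + mu)) ** 2

def moment_from_transform(L, k, h=1e-3):
    """E[X^k] = (-1)^k d^k/ds^k L(s) at s = 0, via central differences."""
    if k == 1:
        return -(L(h) - L(-h)) / (2 * h)
    if k == 2:
        return (L(h) - 2 * L(0.0) + L(-h)) / (h * h)
    raise ValueError("only k = 1, 2 implemented")

mu = 2.0
m1 = moment_from_transform(lambda s: laplace_erlang2(s, mu), 1)
m2 = moment_from_transform(lambda s: laplace_erlang2(s, mu), 2)
# closed forms: E[X] = 2/mu = 1.0 and E[X^2] = 2/mu^2 + (2/mu)^2 = 1.5
print(m1, m2)
```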

61. Example: an exponential random variable with parameter $\mu$, and a two-phase Erlang-2 random variable, each phase having parameter $2\mu$.

              Mean       Variance
Exponential   $1/\mu$    $1/\mu^2$
Erlang-2      $1/\mu$    $1/(2\mu^2)$

An Erlang-2 random variable has less variability than an exponentially distributed random variable with the same mean.

62. The Erlang-r Distribution

A succession of $r$ identical, but independent, exponential phases with parameter $\mu$.

[Diagram: $r$ exponential phases in series, each with parameter $\mu$.]

Probability density function at phase $i$: $f_Y(y) = \mu e^{-\mu y}, \quad y \ge 0.$

Expectation and variance per phase: $E[Y] = 1/\mu$ and $\sigma_Y^2 = 1/\mu^2$, respectively.

63. Distribution of total time spent is the sum of $r$ iid random variables.

$E[X] = r\left(\frac{1}{\mu}\right) = \frac{r}{\mu}; \qquad \sigma_X^2 = r\left(\frac{1}{\mu}\right)^2 = \frac{r}{\mu^2}.$

Laplace transform of the service time: $L_X(s) = \left(\frac{\mu}{s+\mu}\right)^r.$

Using the transform pair $\frac{x^r}{r!}e^{-ax} \Longleftrightarrow \frac{1}{(s+a)^{r+1}}$ with $a = \mu$:

$f_X(x) = \frac{\mu(\mu x)^{r-1}}{(r-1)!}\, e^{-\mu x}, \quad x \ge 0. \quad (17)$

This is the Erlang-r probability density function. The corresponding cumulative distribution function is given by

$F_X(x) = 1 - e^{-\mu x} \sum_{i=0}^{r-1} \frac{(\mu x)^i}{i!}, \quad x \ge 0 \text{ and } r = 1, 2, \ldots \quad (18)$
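That the Erlang-r arises as the sum of $r$ iid exponential phases can be checked by simulation: sampled sums should show mean $r/\mu$, variance $r/\mu^2$, and $C^2 = 1/r$. A sketch using only the standard library (sample size and seed are arbitrary choices):

```python
import random

random.seed(1)
r, mu, n = 4, 3.0, 100_000

# each sample is the sum of r independent exponential phases, each with rate mu
samples = [sum(random.expovariate(mu) for _ in range(r)) for _ in range(n)]

mean = sum(samples) / n
var = sum((s - mean) ** 2 for s in samples) / n
c2 = var / mean ** 2

print(mean, var, c2)  # close to r/mu = 1.333..., r/mu^2 = 0.444..., 1/r = 0.25
```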

64. Differentiating $F_X(x)$ with respect to $x$ shows that (18) is the distribution function with corresponding density function (17):

$f_X(x) = \frac{d}{dx}F_X(x) = \mu e^{-\mu x}\sum_{k=0}^{r-1}\frac{(\mu x)^k}{k!} - e^{-\mu x}\sum_{k=1}^{r-1}\frac{\mu k(\mu x)^{k-1}}{k!}$

$= \mu e^{-\mu x} + \mu e^{-\mu x}\sum_{k=1}^{r-1}\frac{(\mu x)^k}{k!} - \mu e^{-\mu x}\sum_{k=1}^{r-1}\frac{(\mu x)^{k-1}}{(k-1)!}$

$= \mu e^{-\mu x}\left[1 + \sum_{k=1}^{r-1}\left(\frac{(\mu x)^k}{k!} - \frac{(\mu x)^{k-1}}{(k-1)!}\right)\right]$

$= \mu e^{-\mu x}\left[1 - \left(1 - \frac{(\mu x)^{r-1}}{(r-1)!}\right)\right] = \frac{\mu(\mu x)^{r-1}}{(r-1)!}\, e^{-\mu x},$

since the sum telescopes.
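A quick numerical cross-check of this derivation: the central-difference derivative of $F_X$ in (18) should match $f_X$ in (17) at any point (an illustrative sketch; the test point and step size are arbitrary):

```python
import math

def erlang_pdf(x, r, mu):
    """Erlang-r density (17)."""
    return mu * (mu * x) ** (r - 1) * math.exp(-mu * x) / math.factorial(r - 1)

def erlang_cdf(x, r, mu):
    """Erlang-r distribution function (18)."""
    return 1.0 - math.exp(-mu * x) * sum(
        (mu * x) ** i / math.factorial(i) for i in range(r)
    )

r, mu, x, h = 5, 2.0, 1.3, 1e-5
derivative = (erlang_cdf(x + h, r, mu) - erlang_cdf(x - h, r, mu)) / (2 * h)
print(abs(derivative - erlang_pdf(x, r, mu)) < 1e-8)  # the derivative matches the density
```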

65. The area under this density curve is equal to one. Let

$I_r = \int_0^\infty \frac{\mu^r x^{r-1} e^{-\mu x}}{(r-1)!}\, dx, \quad r = 1, 2, \ldots$

$I_1 = 1$ is the area under the exponential density curve. Using integration by parts ($\int u\, dv = uv - \int v\, du$ with $u = \mu^{r-1}x^{r-1}/(r-1)!$ and $dv = \mu e^{-\mu x}dx$):

$I_r = \int_0^\infty \frac{\mu^{r-1}x^{r-1}}{(r-1)!}\, \mu e^{-\mu x}\, dx = \left[-\frac{\mu^{r-1}x^{r-1}}{(r-1)!}\, e^{-\mu x}\right]_0^\infty + \int_0^\infty \frac{\mu^{r-1}x^{r-2}}{(r-2)!}\, e^{-\mu x}\, dx = 0 + I_{r-1}.$

It follows that $I_r = 1$ for all $r \ge 1$.

66. Squared coefficient of variation, $C_X^2$, for the family of Erlang-r distributions:

$C_X^2 = \frac{r/\mu^2}{(r/\mu)^2} = \frac{1}{r} < 1, \quad \text{for } r \ge 2.$

Erlang-r random variables are "more regular" than exponential random variables. Possible values: $\frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \cdots$ This allows us to approximate a constant distribution.

67. Mixing an Erlang-$(r-1)$ distribution with an Erlang-$r$ distribution gives a distribution with $1/r \le C_X^2 \le 1/(r-1)$.

[Diagram: with probability $\alpha$, an Erlang-$(r-1)$; with probability $1-\alpha$, an Erlang-$r$; all phases with parameter $\mu$.]

$\alpha = 1 \Rightarrow C_X^2 = 1/(r-1)$; $\alpha = 0 \Rightarrow C_X^2 = 1/r$. For a given $E[X]$ and $C_X^2 \in [1/r, 1/(r-1)]$, choose

$\alpha = \frac{1}{1+C_X^2}\left[rC_X^2 - \sqrt{r(1+C_X^2) - r^2C_X^2}\,\right] \quad \text{and} \quad \mu = \frac{r-\alpha}{E[X]}. \quad (19)$
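Equation (19) can be sketched as a small fitting routine: given a target mean and $C_X^2$, compute $r$, $\alpha$ and $\mu$, then verify by recovering the mixture's moments (the Erlang-$k$ second moment $k(k+1)/\mu^2$ is a standard fact; function names are illustrative):

```python
import math

def fit_erlang_mixture(mean, c2):
    """Given E[X] and C2 in [1/r, 1/(r-1)], return (r, alpha, mu) per (19)
    for the mixture of an Erlang-(r-1) (prob. alpha) and an Erlang-r."""
    r = math.ceil(1.0 / c2)  # smallest r with 1/r <= c2
    alpha = (r * c2 - math.sqrt(r * (1 + c2) - r * r * c2)) / (1 + c2)
    mu = (r - alpha) / mean
    return r, alpha, mu

def mixture_moments(r, alpha, mu):
    """Mean and squared coefficient of variation of the mixture.
    An Erlang-k has E[X] = k/mu and E[X^2] = k*(k+1)/mu^2."""
    m1 = (alpha * (r - 1) + (1 - alpha) * r) / mu
    m2 = (alpha * (r - 1) * r + (1 - alpha) * r * (r + 1)) / mu ** 2
    return m1, m2 / m1 ** 2 - 1

r, alpha, mu = fit_erlang_mixture(mean=1.0, c2=0.75)
m1, c2 = mixture_moments(r, alpha, mu)
print(r, m1, c2)  # recovers mean 1.0 and C2 0.75 with r = 2
```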

68. The Hypoexponential Distribution

[Diagram: $r$ exponential phases in series with distinct parameters $\mu_1, \mu_2, \ldots, \mu_r$.]

Two phases: exponentially distributed RVs $Y_1$ and $Y_2$, with $X = Y_1 + Y_2$:

$f_X(x) = \int_{-\infty}^{\infty} f_{Y_1}(y)\, f_{Y_2}(x-y)\, dy = \int_0^x \mu_1 e^{-\mu_1 y}\, \mu_2 e^{-\mu_2(x-y)}\, dy = \mu_1\mu_2 e^{-\mu_2 x}\int_0^x e^{-(\mu_1-\mu_2)y}\, dy = \frac{\mu_1\mu_2}{\mu_1-\mu_2}\left(e^{-\mu_2 x} - e^{-\mu_1 x}\right), \quad x \ge 0.$

69. The corresponding cumulative distribution function is given by

$F_X(x) = 1 - \frac{\mu_2}{\mu_2-\mu_1}\, e^{-\mu_1 x} + \frac{\mu_1}{\mu_2-\mu_1}\, e^{-\mu_2 x}, \quad x \ge 0.$

Expectation, variance and squared coefficient of variation:

$E[X] = \frac{1}{\mu_1} + \frac{1}{\mu_2}, \qquad Var[X] = \frac{1}{\mu_1^2} + \frac{1}{\mu_2^2}, \qquad C_X^2 = \frac{\mu_1^2 + \mu_2^2}{(\mu_1+\mu_2)^2} < 1.$

Laplace transform:

$L_X(s) = \left(\frac{\mu_1}{s+\mu_1}\right)\left(\frac{\mu_2}{s+\mu_2}\right).$

The Laplace transform for an $r$-phase hypoexponential random variable:

$L_X(s) = \left(\frac{\mu_1}{s+\mu_1}\right)\left(\frac{\mu_2}{s+\mu_2}\right)\cdots\left(\frac{\mu_r}{s+\mu_r}\right).$

70. The density function $f_X(x)$ is the convolution of $r$ exponential densities, each with its own parameter $\mu_i$ (assumed distinct), and is given by

$f_X(x) = \sum_{i=1}^r \alpha_i \mu_i e^{-\mu_i x}, \quad x > 0, \quad \text{where } \alpha_i = \prod_{j=1,\, j\ne i}^r \frac{\mu_j}{\mu_j - \mu_i}.$

Expectation, variance and squared coefficient of variation:

$E[X] = \sum_{i=1}^r \frac{1}{\mu_i}, \qquad Var[X] = \sum_{i=1}^r \frac{1}{\mu_i^2}, \qquad C_X^2 = \frac{\sum_i 1/\mu_i^2}{\left(\sum_i 1/\mu_i\right)^2} \le 1.$

Observe that $C_X^2$ cannot exceed 1.
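The coefficients $\alpha_i = \prod_{j\ne i}\mu_j/(\mu_j-\mu_i)$ and the resulting moments can be computed directly. A sketch assuming distinct rates (for $\mu = (1, 2, 3)$ the $\alpha_i$ sum to 1, so the density integrates to 1, and the mean equals $\sum_i 1/\mu_i$):

```python
import math

def hypo_coefficients(mus):
    """alpha_i = prod over j != i of mu_j / (mu_j - mu_i); rates must be distinct."""
    alphas = []
    for i, mi in enumerate(mus):
        a = 1.0
        for j, mj in enumerate(mus):
            if j != i:
                a *= mj / (mj - mi)
        alphas.append(a)
    return alphas

def hypo_pdf(x, mus):
    """Hypoexponential density f(x) = sum_i alpha_i * mu_i * exp(-mu_i * x)."""
    return sum(a * m * math.exp(-m * x)
               for a, m in zip(hypo_coefficients(mus), mus))

mus = [1.0, 2.0, 3.0]
alphas = hypo_coefficients(mus)
print(alphas)       # [3.0, -3.0, 1.0]
print(sum(alphas))  # 1.0 -- so the density integrates to one
mean = sum(a / m for a, m in zip(alphas, mus))
print(mean)         # equals 1/1 + 1/2 + 1/3 = 11/6
```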

71. Example: three exponential phases with parameters $\mu_1 = 1$, $\mu_2 = 2$ and $\mu_3 = 3$.

$E[X] = \sum_{i=1}^3 \frac{1}{\mu_i} = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} = \frac{11}{6}$

$Var[X] = \sum_{i=1}^3 \frac{1}{\mu_i^2} = \frac{1}{1} + \frac{1}{4} + \frac{1}{9} = \frac{49}{36}$

$C_X^2 = \frac{49/36}{121/36} = \frac{49}{121} = 0.4050.$

72. Probability density function of $X$:

$f_X(x) = \sum_{i=1}^r \alpha_i \mu_i e^{-\mu_i x}, \quad x > 0, \quad \text{where } \alpha_i = \prod_{j=1,\, j\ne i}^r \frac{\mu_j}{\mu_j - \mu_i}.$

$\alpha_1 = \frac{\mu_2}{\mu_2-\mu_1} \times \frac{\mu_3}{\mu_3-\mu_1} = \frac{2}{1} \times \frac{3}{2} = 3$

$\alpha_2 = \frac{\mu_1}{\mu_1-\mu_2} \times \frac{\mu_3}{\mu_3-\mu_2} = \frac{1}{-1} \times \frac{3}{1} = -3$

$\alpha_3 = \frac{\mu_1}{\mu_1-\mu_3} \times \frac{\mu_2}{\mu_2-\mu_3} = \frac{1}{-2} \times \frac{2}{-1} = 1$

It follows then that

$f_X(x) = \sum_{i=1}^3 \alpha_i \mu_i e^{-\mu_i x} = 3e^{-x} - 6e^{-2x} + 3e^{-3x}, \quad x > 0.$

73. The Hyperexponential Distribution

Our goal now is to find a phase-type arrangement that gives larger coefficients of variation than the exponential.

[Diagram: with probability $\alpha_1$ the customer enters a phase with parameter $\mu_1$; with probability $1-\alpha_1$, a phase with parameter $\mu_2$.]

The density function: $f_X(x) = \alpha_1\mu_1 e^{-\mu_1 x} + \alpha_2\mu_2 e^{-\mu_2 x}, \quad x \ge 0.$

Cumulative distribution function: $F_X(x) = \alpha_1(1 - e^{-\mu_1 x}) + \alpha_2(1 - e^{-\mu_2 x}), \quad x \ge 0.$

74. Laplace transform:

$L_X(s) = \alpha_1\frac{\mu_1}{s+\mu_1} + \alpha_2\frac{\mu_2}{s+\mu_2}.$

First and second moments:

$E[X] = \frac{\alpha_1}{\mu_1} + \frac{\alpha_2}{\mu_2} \quad \text{and} \quad E[X^2] = \frac{2\alpha_1}{\mu_1^2} + \frac{2\alpha_2}{\mu_2^2}.$

Variance: $Var[X] = E[X^2] - (E[X])^2$. Squared coefficient of variation:

$C_X^2 = \frac{E[X^2] - (E[X])^2}{(E[X])^2} = \frac{E[X^2]}{(E[X])^2} - 1 = \frac{2\alpha_1/\mu_1^2 + 2\alpha_2/\mu_2^2}{(\alpha_1/\mu_1 + \alpha_2/\mu_2)^2} - 1 \ge 1.$

75. Example: given $\alpha_1 = 0.4$, $\mu_1 = 2$ and $\mu_2 = 1/2$:

$E[X] = \frac{0.4}{2} + \frac{0.6}{0.5} = 1.40, \qquad E[X^2] = \frac{0.8}{4} + \frac{1.2}{0.25} = 5$

$\sigma_X = \sqrt{5 - 1.4^2} = \sqrt{3.04} = 1.7436$

$C_X^2 = \frac{5}{1.4^2} - 1 = 2.5510 - 1.0 = 1.5510.$
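The same arithmetic, written as a small helper that works for any number of parallel phases (function and variable names are illustrative):

```python
def hyper_moments(alphas, mus):
    """First two moments of an r-phase hyperexponential:
    E[X] = sum alpha_i/mu_i, E[X^2] = sum 2*alpha_i/mu_i^2."""
    m1 = sum(a / m for a, m in zip(alphas, mus))
    m2 = sum(2 * a / m ** 2 for a, m in zip(alphas, mus))
    return m1, m2

m1, m2 = hyper_moments([0.4, 0.6], [2.0, 0.5])
var = m2 - m1 ** 2
c2 = var / m1 ** 2
print(m1, m2, var, c2)  # 1.4, 5.0, 3.04, about 1.551
```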

76. With $r$ parallel phases and branching probabilities $\sum_{i=1}^r \alpha_i = 1$:

[Diagram: with probability $\alpha_i$ the customer enters the phase with parameter $\mu_i$, $i = 1, \ldots, r$.]

$f_X(x) = \sum_{i=1}^r \alpha_i\mu_i e^{-\mu_i x}, \quad x \ge 0; \qquad L_X(s) = \sum_{i=1}^r \alpha_i\frac{\mu_i}{s+\mu_i}.$

77. $E[X] = \sum_{i=1}^r \frac{\alpha_i}{\mu_i} \quad \text{and} \quad E[X^2] = 2\sum_{i=1}^r \frac{\alpha_i}{\mu_i^2}$

$C_X^2 = \frac{E[X^2]}{(E[X])^2} - 1 = \frac{2\sum_{i=1}^r \alpha_i/\mu_i^2}{\left(\sum_{i=1}^r \alpha_i/\mu_i\right)^2} - 1.$

To show that this squared coefficient of variation is greater than or equal to one, it suffices to show that

$\left(\sum_{i=1}^r \frac{\alpha_i}{\mu_i}\right)^2 \le \sum_{i=1}^r \frac{\alpha_i}{\mu_i^2}.$

Use the Cauchy-Schwarz inequality: for real $a_i$ and $b_i$,

$\left(\sum_i a_i b_i\right)^2 \le \left(\sum_i a_i^2\right)\left(\sum_i b_i^2\right).$

78. Substituting $a_i = \sqrt{\alpha_i}$ and $b_i = \sqrt{\alpha_i}/\mu_i$ implies that

$\left(\sum_i \frac{\alpha_i}{\mu_i}\right)^2 = \left(\sum_i \sqrt{\alpha_i}\,\frac{\sqrt{\alpha_i}}{\mu_i}\right)^2 \le \left(\sum_i \alpha_i\right)\left(\sum_i \frac{\alpha_i}{\mu_i^2}\right) = \sum_i \frac{\alpha_i}{\mu_i^2},$

using Cauchy-Schwarz and $\sum_i \alpha_i = 1$. Therefore $C_X^2 \ge 1$.
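The inequality can also be spot-checked numerically over random parameter sets (an illustrative sketch, not a proof — the proof is the Cauchy-Schwarz argument above; the seed and ranges are arbitrary):

```python
import random

def hyper_c2(alphas, mus):
    """Squared coefficient of variation of an r-phase hyperexponential."""
    m1 = sum(a / m for a, m in zip(alphas, mus))
    m2 = sum(2 * a / m ** 2 for a, m in zip(alphas, mus))
    return m2 / m1 ** 2 - 1

random.seed(7)
ok = True
for _ in range(1000):
    r = random.randint(2, 6)
    weights = [random.random() for _ in range(r)]
    total = sum(weights)
    alphas = [w / total for w in weights]          # branching probabilities
    mus = [random.uniform(0.1, 10.0) for _ in range(r)]
    ok = ok and hyper_c2(alphas, mus) >= 1.0 - 1e-9
print(ok)  # every sampled case satisfies C2 >= 1
```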

79. The Coxian Distribution

[Diagram: $r$ phases in series with parameters $\mu_1, \ldots, \mu_r$; after phase $k$ the process continues to phase $k+1$ with probability $\alpha_k$ and terminates with probability $1-\alpha_k$.]

With probability $p_1 = 1 - \alpha_1$, the process terminates after phase 1.
With probability $p_2 = \alpha_1(1 - \alpha_2)$, it terminates after phase 2.
With probability $p_k = (1 - \alpha_k)\prod_{i=1}^{k-1}\alpha_i$, it terminates after phase $k$ (after phase $r$ it terminates with certainty).

A Coxian distribution may therefore be represented as a probabilistic choice from among $r$ hypoexponential distributions:

80. [Diagram: with probability $p_k$, the service time is the hypoexponential consisting of phases $\mu_1, \mu_2, \ldots, \mu_k$, for $k = 1, \ldots, r$.]

81. Phase 1 is always executed and has expectation $E[X_1] = 1/\mu_1$. Phase 2 is executed with probability $\alpha_1$ and has $E[X_2] = 1/\mu_2$. Phase $k > 1$ is executed with probability $\prod_{j=1}^{k-1}\alpha_j$ and has $E[X_k] = 1/\mu_k$. Since the expectation of a sum is equal to the sum of the expectations:

$E[X] = \frac{1}{\mu_1} + \frac{\alpha_1}{\mu_2} + \frac{\alpha_1\alpha_2}{\mu_3} + \cdots + \frac{\alpha_1\alpha_2\cdots\alpha_{r-1}}{\mu_r} = \sum_{k=1}^r \frac{A_k}{\mu_k},$

where $A_1 = 1$ and, for $k > 1$, $A_k = \prod_{j=1}^{k-1}\alpha_j$.

The case of a Cox-2 random variable is especially important. Writing $\alpha$ for $\alpha_1$:

$E[X] = \frac{1}{\mu_1} + \frac{\alpha}{\mu_2} = \frac{\mu_2 + \alpha\mu_1}{\mu_1\mu_2}. \quad (20)$
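The sum $E[X] = \sum_k A_k/\mu_k$ is straightforward to compute programmatically; for two phases it should agree with (20). A sketch (function and argument names are illustrative):

```python
def coxian_mean(mus, alphas):
    """E[X] = sum_k A_k / mu_k, where A_1 = 1 and A_k = alpha_1 * ... * alpha_{k-1}.
    alphas[k] is the probability of continuing from phase k+1 to phase k+2."""
    mean, a = 0.0, 1.0
    for k, mu in enumerate(mus):
        mean += a / mu
        if k < len(alphas):
            a *= alphas[k]
    return mean

# Cox-2 check against (20): E[X] = (mu2 + alpha*mu1) / (mu1*mu2)
mu1, mu2, alpha = 2.0, 0.5, 0.25
print(coxian_mean([mu1, mu2], [alpha]))   # 1.0
print((mu2 + alpha * mu1) / (mu1 * mu2))  # 1.0
```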

82. Laplace transform of a Cox-2:

$L_X(s) = (1-\alpha)\frac{\mu_1}{s+\mu_1} + \alpha\frac{\mu_1}{s+\mu_1}\frac{\mu_2}{s+\mu_2}$

$E[X^2] = (-1)^2\frac{d^2}{ds^2}\left[\frac{(1-\alpha)\mu_1}{s+\mu_1} + \frac{\alpha\mu_1\mu_2}{(s+\mu_1)(s+\mu_2)}\right]_{s=0}$

$= \frac{d}{ds}\left[-\frac{(1-\alpha)\mu_1}{(s+\mu_1)^2} - \alpha\mu_1\mu_2\left(\frac{1}{(s+\mu_1)^2(s+\mu_2)} + \frac{1}{(s+\mu_1)(s+\mu_2)^2}\right)\right]_{s=0}$

$= \left[\frac{2(1-\alpha)\mu_1}{(s+\mu_1)^3} + \alpha\mu_1\mu_2\left(\frac{2}{(s+\mu_1)^3(s+\mu_2)} + \frac{2}{(s+\mu_1)^2(s+\mu_2)^2} + \frac{2}{(s+\mu_1)(s+\mu_2)^3}\right)\right]_{s=0}$

83. Evaluating at $s = 0$:

$E[X^2] = \frac{2(1-\alpha)}{\mu_1^2} + \frac{2\alpha}{\mu_1^2} + \frac{2\alpha}{\mu_1\mu_2} + \frac{2\alpha}{\mu_2^2} = \frac{2}{\mu_1^2} + \frac{2\alpha}{\mu_1\mu_2} + \frac{2\alpha}{\mu_2^2}$

$Var[X] = \frac{2}{\mu_1^2} + \frac{2\alpha}{\mu_1\mu_2} + \frac{2\alpha}{\mu_2^2} - \left(\frac{1}{\mu_1} + \frac{\alpha}{\mu_2}\right)^2 = \frac{2\mu_2^2 + 2\alpha\mu_1\mu_2 + 2\alpha\mu_1^2 - (\mu_2+\alpha\mu_1)^2}{\mu_1^2\mu_2^2} = \frac{\mu_2^2 + \alpha\mu_1^2(2-\alpha)}{\mu_1^2\mu_2^2}$

$C_X^2 = \frac{Var[X]}{(E[X])^2} = \frac{\mu_2^2 + \alpha\mu_1^2(2-\alpha)}{\mu_1^2\mu_2^2} \times \frac{\mu_1^2\mu_2^2}{(\mu_2+\alpha\mu_1)^2} = \frac{\mu_2^2 + \alpha\mu_1^2(2-\alpha)}{(\mu_2+\alpha\mu_1)^2}. \quad (21)$

84. Example: a Coxian-2 RV with parameters $\mu_1 = 2$, $\mu_2 = 0.5$ and $\alpha = 0.25$:

$E[X] = \frac{1}{\mu_1} + \frac{\alpha}{\mu_2} = \frac{1}{2} + \frac{1/4}{1/2} = 1$

$E[X^2] = \frac{2}{\mu_1^2} + \frac{2\alpha}{\mu_1\mu_2} + \frac{2\alpha}{\mu_2^2} = \frac{2}{4} + \frac{1/2}{1} + \frac{1/2}{1/4} = 3$

$Var[X] = E[X^2] - (E[X])^2 = 3 - 1 = 2$

$C_X^2 = Var[X]/(E[X])^2 = 2.$
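The closed form (21) can be cross-checked against the raw moment formulas for this example (an illustrative sketch; names are illustrative):

```python
def cox2_c2_moments(mu1, mu2, alpha):
    """C2 of a Cox-2 computed from its first two moments."""
    m1 = 1 / mu1 + alpha / mu2
    m2 = 2 / mu1 ** 2 + 2 * alpha / (mu1 * mu2) + 2 * alpha / mu2 ** 2
    return (m2 - m1 ** 2) / m1 ** 2

def cox2_c2_closed(mu1, mu2, alpha):
    """Closed form (21): (mu2^2 + alpha*mu1^2*(2 - alpha)) / (mu2 + alpha*mu1)^2."""
    return (mu2 ** 2 + alpha * mu1 ** 2 * (2 - alpha)) / (mu2 + alpha * mu1) ** 2

print(cox2_c2_moments(2.0, 0.5, 0.25))  # 2.0
print(cox2_c2_closed(2.0, 0.5, 0.25))   # 2.0
```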
