Decomposition by Subpopulations of the Bonferroni Indexes I. Valli June 23, 2016 I. Valli Università degli Studi di Milano Bicocca 1 / 66
Intro

Let $Y$ be a non-negative variate, usually income, observed on $N$ units of a finite population, and let
$$0 \le y_{(1)} \le \dots \le y_{(i)} \le \dots \le y_{(N)}, \qquad y_{(N)} > 0,$$
be the $N$ ordered values.
Intro

The study of concentration (hereafter “inequality”) can be traced back to the end of the 19th century. Since, for all $i = 1, \dots, N$,
$$p_{(i)} \ge q_{(i)}, \qquad \text{where } p_{(i)} = \frac{i}{N}, \quad q_{(i)} = \frac{\sum_{t=1}^{i} y_{(t)}}{T(Y)},$$
Corrado Gini claimed in 1914 that the stronger the above inequality, the stronger the inequality of the distribution. On this basis, Gini suggested as point inequality measure the relative variation
$$R_{(i)}(Y) = \frac{p_{(i)} - q_{(i)}}{p_{(i)}}$$
and, as synthetic measure, their weighted mean
$$\tilde{R}(Y) = \frac{\sum_{i=1}^{N-1} \frac{p_{(i)} - q_{(i)}}{p_{(i)}} \cdot p_{(i)}}{\sum_{i=1}^{N-1} p_{(i)}} = \frac{\sum_{i=1}^{N-1} \left( p_{(i)} - q_{(i)} \right)}{\sum_{i=1}^{N-1} p_{(i)}}.$$
Intro

In 1930, Carlo Emilio Bonferroni suggested, as measures of inequality, the point index
$$V_{(i)}(Y) = \frac{M(Y) - \frac{1}{i} \sum_{t=1}^{i} y_{(t)}}{M(Y)}$$
and, as synthetic index, their arithmetic mean:
$$\tilde{V}(Y) = \frac{1}{N-1} \sum_{i=1}^{N-1} V_{(i)}(Y).$$
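Intro

In 1940 Mario De Vergottini showed that
$$R_{(i)}(Y) = V_{(i)}(Y)$$
and
$$\tilde{R}(Y) = \frac{1}{N-1} \sum_{i=1}^{N-1} V_{(i)}(Y) \cdot \frac{2i}{N}.$$
The second identity follows from $p_{(i)} - q_{(i)} = p_{(i)} \cdot V_{(i)}(Y)$, with $p_{(i)} = i/N$, and from $\sum_{i=1}^{N-1} p_{(i)} = (N-1)/2$.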
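The two identities can be checked numerically. The following is a minimal sketch rather than part of the original slides: the function names and the four sample values are hypothetical, and exact rational arithmetic is used so that equality can be tested directly.

```python
from fractions import Fraction


def point_measures(values):
    """Return the lists R_(i) and V_(i), i = 1..N, for non-negative values."""
    y = sorted(Fraction(v) for v in values)
    n, total = len(y), sum(y)
    mean = total / n
    r_points, v_points, cum = [], [], Fraction(0)
    for i, y_i in enumerate(y, start=1):
        cum += y_i
        p, q = Fraction(i, n), cum / total
        r_points.append((p - q) / p)               # Gini point measure
        v_points.append((mean - cum / i) / mean)   # Bonferroni point index
    return r_points, v_points


def gini_as_ratio(values):
    """Synthetic Gini measure: sum of (p - q) over sum of p, for i = 1..N-1."""
    y = sorted(Fraction(v) for v in values)
    n, total = len(y), sum(y)
    cum, num, den = Fraction(0), Fraction(0), Fraction(0)
    for i, y_i in enumerate(y[:-1], start=1):
        cum += y_i
        p, q = Fraction(i, n), cum / total
        num += p - q
        den += p
    return num / den


def gini_from_bonferroni(values):
    """Same measure, written as a weighted mean of V_(i) with weights 2i/N."""
    _, v_points = point_measures(values)
    n = len(v_points)
    weighted = sum(v * Fraction(2 * i, n)
                   for i, v in enumerate(v_points[:-1], start=1))
    return weighted / (n - 1)


sample = [1, 2, 3, 10]  # hypothetical incomes, for illustration only
r_points, v_points = point_measures(sample)
print(r_points == v_points)
print(gini_as_ratio(sample), gini_from_bonferroni(sample))
```

On the sample values both point measures coincide and both synthetic expressions give 7/12.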
Definitions and notation

Let
$$Q_{(i)}(Y) = \sum_{t=1}^{i} y_{(t)}, \qquad i = 1, \dots, N \tag{1}$$
be the income of the $i$ poorest population units, and let
$$T(Y) = Q_{(N)}(Y) = \sum_{i=1}^{N} y_{(i)}; \tag{2}$$
$$\overset{-}{M}_{(i)}(Y) = \frac{Q_{(i)}(Y)}{i}; \tag{3}$$
$$M(Y) = \frac{T(Y)}{N} = \overset{-}{M}_{(N)}(Y). \tag{4}$$
Definitions and notation

The Bonferroni (1930) point and synthetic inequality measures are
$$V_{(i)}(Y) = \frac{M(Y) - \overset{-}{M}_{(i)}(Y)}{M(Y)}, \qquad i = 1, \dots, N \tag{5}$$
$$\tilde{V}(Y) = \frac{1}{N-1} \sum_{i=1}^{N-1} V_{(i)}(Y), \tag{6}$$
respectively. Note that $V_{(i)}(Y)$ is the relative variation of the lower mean $\overset{-}{M}_{(i)}(Y)$ with respect to $M(Y)$; hence $\tilde{V}(Y)$ is the (simple) arithmetic mean of these relative variations.
Definitions and notation

In the case of maximum inequality,
$$\left\{ y_{(1)} = \dots = y_{(N-1)} = 0, \; y_{(N)} > 0 \right\},$$
it is known that $\tilde{V}(Y) = 1$ for all $N \ge 2$. Note that $\tilde{V}(Y)$ does not distinguish among maximum inequality cases with different values of $N$. In the case of maximum inequality, it seems more reasonable to require that the value of an inequality index $C_N(Y)$, evaluated on $N$ units, be such that:
(a) $C_N(Y)$ is an increasing and positive function of $N$;
(b) $\lim_{N \to \infty} C_N(Y) = 1$.
Definitions and notation

Now, multiplying both sides of (6) by $\frac{N-1}{N}$, we have:
$$V'(Y) = \frac{N-1}{N} \cdot \tilde{V}(Y) = \frac{1}{N} \sum_{i=1}^{N-1} V_{(i)}(Y) = \frac{1}{N} \sum_{i=1}^{N} V_{(i)}(Y), \tag{7}$$
where the last equality holds because $V_{(N)}(Y) = 0$. Thus, in the case of maximum inequality, $V'(Y) = 1 - \frac{1}{N}$. Note that $1 - \frac{1}{N}$ is an increasing and positive function of $N$ such that $\lim_{N \to \infty} \left( 1 - \frac{1}{N} \right) = 1$.
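The behaviour of the two synthetic indexes under maximum inequality can be illustrated with a short sketch. The function name and the test distributions (one unit holding the whole income, for a few arbitrary values of N) are assumptions made only for this illustration.

```python
def bonferroni_indexes(values):
    """Return (V~, V') as defined in (6) and (7)."""
    y = sorted(values)
    n = len(y)
    mean = sum(y) / n
    cum = 0.0
    points = []  # V_(i), i = 1..N, as in (5)
    for i, y_i in enumerate(y, start=1):
        cum += y_i
        points.append((mean - cum / i) / mean)
    v_tilde = sum(points[:-1]) / (n - 1)   # mean of the first N-1 point indexes
    v_prime = sum(points) / n              # mean of all N point indexes
    return v_tilde, v_prime


for n in (2, 5, 10, 100):
    extreme = [0] * (n - 1) + [1]  # hypothetical maximum-inequality distribution
    v_tilde, v_prime = bonferroni_indexes(extreme)
    print(n, v_tilde, round(v_prime, 4))
```

For each N the first index stays at 1, while the second equals 1 - 1/N and therefore grows with N.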
Definitions and notation

Table 1: Distribution of $N = 10$ units and calculation of $V_{(i)}(Y)$ and $V'(Y) = 0.5157$

| $i$ | $y_{(i)}$ | $Q_{(i)}(Y)$ | $\overset{-}{M}_{(i)}(Y)$ | $V_{(i)}(Y)$ |
|---|---|---|---|---|
| 1 | 2 | 2 | 2.0000 | 0.9333 |
| 2 | 2 | 4 | 2.0000 | 0.9333 |
| 3 | 8 | 12 | 4.0000 | 0.8667 |
| 4 | 24 | 36 | 9.0000 | 0.7000 |
| 5 | 29 | 65 | 13.0000 | 0.5667 |
| 6 | 37 | 102 | 17.0000 | 0.4333 |
| 7 | 37 | 139 | 19.8571 | 0.3381 |
| 8 | 37 | 176 | 22.0000 | 0.2667 |
| 9 | 62 | 238 | 26.4444 | 0.1185 |
| 10 | 62 | 300 | 30.0000 | 0.0000 |
| Total | | | | 5.1566 |
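The columns of Table 1 can be recomputed from the ten ordered values using definitions (1), (3), (5) and (7). The sketch below is only an illustration; the function and variable names are not taken from the slides.

```python
incomes = [2, 2, 8, 24, 29, 37, 37, 37, 62, 62]  # y_(i) column of Table 1


def bonferroni_table(values):
    """Return rows (i, y_(i), Q_(i), lower mean, V_(i)) following (1)-(5)."""
    y = sorted(values)
    mean = sum(y) / len(y)
    rows, cum = [], 0
    for i, y_i in enumerate(y, start=1):
        cum += y_i                 # Q_(i): income of the i poorest units
        lower_mean = cum / i       # lower mean of the i poorest units
        rows.append((i, y_i, cum, lower_mean, (mean - lower_mean) / mean))
    return rows


rows = bonferroni_table(incomes)
for i, y_i, q, m, v in rows:
    print(f"{i:2d} {y_i:3d} {q:4d} {m:8.4f} {v:7.4f}")

total_v = sum(row[4] for row in rows)
v_prime = total_v / len(rows)      # V'(Y) as in (7)
print(f"Total {total_v:.4f}  V' {v_prime:.4f}")
```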
Definitions and notation

The value of $V'(Y) = \frac{1}{N} \sum_{i=1}^{N} V_{(i)}(Y)$ can be interpreted as the sum of the areas of $N$ rectangles, each with base $1/N$ and height $V_{(i)}(Y)$. To draw the inequality diagram of $V_{(i)}(Y)$, we first obtain the $N$ points of coordinates $\left( \frac{i}{N}, V_{(i)}(Y) \right)$. Then we obtain $N$ rectangles by the following procedure: the first rectangle has abscissas in the interval $\left[ 0, \frac{1}{N} \right]$ and ordinates in the interval $\left[ 0, V_{(1)}(Y) \right]$. The $i$-th rectangle, $i = 2, \dots, N$, has abscissas in the interval $\left( \frac{i-1}{N}, \frac{i}{N} \right]$ and ordinates in the interval $\left[ 0, V_{(i)}(Y) \right]$. Figure 1 reports the graph of $V_{(i)}(Y)$.
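The rectangle construction can be sketched directly on the Table 1 data. The function name and the tuple layout (left abscissa, right abscissa, height) are choices made for this illustration only, and interval endpoints are not distinguished as open or closed.

```python
incomes = [2, 2, 8, 24, 29, 37, 37, 37, 62, 62]  # y_(i) column of Table 1


def bonferroni_rectangles(values):
    """Return one (left, right, height) tuple per unit, with base 1/N."""
    y = sorted(values)
    n = len(y)
    mean = sum(y) / n
    rectangles, cum = [], 0
    for i, y_i in enumerate(y, start=1):
        cum += y_i
        height = (mean - cum / i) / mean          # V_(i)(Y)
        rectangles.append(((i - 1) / n, i / n, height))
    return rectangles


rectangles = bonferroni_rectangles(incomes)
# Summing base times height over the N rectangles gives V'(Y).
area = sum((right - left) * height for left, right, height in rectangles)
print(round(area, 4))
```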
Definitions and notation

Figure 1: Graph of $V_{(i)}(Y)$ (horizontal axis: $p_i$, from 0 to 1; vertical axis: $V_{(i)}(Y)$, from 0 to 1).
Definitions and notation in the frequency distribution framework

The last column of Table 1 shows that $V_{(i)}(Y)$ may not be constant for units taking the same value of $Y$. This behavior is undesirable in the decomposition by subpopulations, because units with the same value of $Y$ may belong to different subpopulations. We overcome this by replacing the values of $V_{(i)}(Y)$ corresponding to units with the same value $y_h$ of $Y$ with the value $V_{(P_{h\cdot})}(Y)$, where $P_{h\cdot}$ is the number of units with $Y \le y_h$.
Definitions and notation in the frequency distribution framework

Let $\{ 0 \le y_1 < \dots < y_h < \dots < y_r \}$ denote the set of the $r$ distinct values assumed by the variate $Y$ over the $k$ subpopulations. The whole $r \times k$ bivariate distribution of the $N$ units can be reported as shown in Table 2, where:
$n_{hg}$ denotes the frequency of $y_h$ in subpopulation $g$;
$n_{h\cdot} = \sum_{g=1}^{k} n_{hg}$ is the frequency of $y_h$ in the whole population;
$n_{\cdot g} = \sum_{h=1}^{r} n_{hg}$ is the size of subpopulation $g$.
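Definitions and notation in the frequency distribution framework

Table 2: Bivariate $r \times k$ distribution of the whole population partitioned into $k$ subpopulations (columns 1 to $k$ are the subpopulations).

| $Y$ | 1 | $\dots$ | $g$ | $\dots$ | $k$ | Total |
|---|---|---|---|---|---|---|
| $y_1$ | $n_{11}$ | $\dots$ | $n_{1g}$ | $\dots$ | $n_{1k}$ | $n_{1\cdot}$ |
| $\vdots$ | $\vdots$ | | $\vdots$ | | $\vdots$ | $\vdots$ |
| $y_h$ | $n_{h1}$ | $\dots$ | $n_{hg}$ | $\dots$ | $n_{hk}$ | $n_{h\cdot}$ |
| $\vdots$ | $\vdots$ | | $\vdots$ | | $\vdots$ | $\vdots$ |
| $y_r$ | $n_{r1}$ | $\dots$ | $n_{rg}$ | $\dots$ | $n_{rk}$ | $n_{r\cdot}$ |
| Total | $n_{\cdot 1}$ | $\dots$ | $n_{\cdot g}$ | $\dots$ | $n_{\cdot k}$ | $N$ |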
Definitions and notation in the frequency distribution framework

Let us define, for the overall distribution $\{ (y_h, n_{h\cdot}) : h = 1, \dots, r \}$:
$$P_{h\cdot} = P_{h\cdot}(Y) = \sum_{t=1}^{h} n_{t\cdot}, \tag{8}$$
$$Q_{h\cdot}(Y) = Q_{(P_{h\cdot})}(Y) = \sum_{t=1}^{h} y_t \cdot n_{t\cdot}, \tag{9}$$
$$T(Y) = Q_{r\cdot}(Y) = \sum_{h=1}^{r} y_h \cdot n_{h\cdot}, \tag{10}$$
$$\overset{-}{M}_{h\cdot}(Y) = \frac{Q_{h\cdot}(Y)}{P_{h\cdot}}. \tag{11}$$
Note that $M(Y) = \overset{-}{M}_{r\cdot}(Y)$.
Definitions and notation in the frequency distribution framework

For the distribution $\{ (y_h, n_{hg}) : h = 1, \dots, r \}$ of subpopulation $g$, $g = 1, \dots, k$, let:
$$P_{hg} = P_{hg}(Y) = \sum_{t=1}^{h} n_{tg}, \tag{12}$$
$$Q_{hg}(Y) = Q_{(P_{hg})}(Y) = \sum_{t=1}^{h} y_t \cdot n_{tg}, \tag{13}$$
$$T_g(Y) = Q_{rg}(Y) = \sum_{h=1}^{r} y_h \cdot n_{hg}, \tag{14}$$
$$M_g(Y) = \frac{T_g(Y)}{n_{\cdot g}}, \tag{15}$$
$$o(g) = \min \{ h : n_{hg} > 0 \}, \tag{16}$$
$$\overset{-}{M}_{hg}(Y) = \begin{cases} y_{o(g)} & \text{for } h < o(g) \\ \dfrac{Q_{hg}(Y)}{P_{hg}} & \text{for } h \ge o(g) \end{cases} \tag{17}$$
where, for $h \ge o(g)$, $\overset{-}{M}_{hg}(Y)$ in (17) denotes the mean of the $P_{hg}$ poorest units of subpopulation $g$. Note that, from (17) and (15), it follows that $M_g(Y) = \overset{-}{M}_{rg}(Y)$.
Definitions and notation in the frequency distribution framework

Note that, from (12) to (14), we can deduce the quantities defined in (8)-(11):
$$P_{h\cdot} = \sum_{g=1}^{k} P_{hg}(Y), \tag{18}$$
$$Q_{h\cdot} = \sum_{g=1}^{k} Q_{hg}(Y), \tag{19}$$
$$T(Y) = \sum_{g=1}^{k} T_g(Y). \tag{20}$$
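Relations (18)-(20) and the lower means in (17) can be illustrated on a small frequency table. In the sketch below, the distinct values are those of Table 1, while the split of their frequencies into k = 2 subpopulations is purely hypothetical, as are all function and variable names; indices h and g are 0-based in the code.

```python
from itertools import accumulate

y = [2, 8, 24, 29, 37, 62]   # distinct values y_h (those of Table 1)
# n[h][g]: hypothetical split of the Table 1 frequencies into 2 subpopulations
n = [[2, 0], [1, 0], [0, 1], [1, 0], [2, 1], [0, 2]]
k = len(n[0])

# Subpopulation quantities (12)-(14), one list per subpopulation g
P_g = [list(accumulate(row[g] for row in n)) for g in range(k)]
Q_g = [list(accumulate(y_h * row[g] for y_h, row in zip(y, n)))
       for g in range(k)]
T_g = [q[-1] for q in Q_g]

# Whole-population quantities (8)-(10) from the marginal frequencies n_h.
n_tot = [sum(row) for row in n]
P = list(accumulate(n_tot))
Q = list(accumulate(y_h * m for y_h, m in zip(y, n_tot)))


def lower_means(g):
    """Lower means of subpopulation g, following (16) and (17)."""
    o = next(h for h, row in enumerate(n) if row[g] > 0)  # o(g), 0-based
    return [y[o] if h < o else Q_g[g][h] / P_g[g][h] for h in range(len(y))]


print(P == [sum(col) for col in zip(*P_g)])   # relation (18)
print(Q == [sum(col) for col in zip(*Q_g)])   # relation (19)
print(Q[-1] == sum(T_g))                      # relation (20)
print(lower_means(1))
```

In this hypothetical split the second subpopulation has no units below 24, so its lower mean is set to 24 for the two smaller values of Y, and its last lower mean equals the subpopulation mean.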
Definitions and notation in the frequency distribution framework

We can now define the Bonferroni inequality measures in the frequency distribution framework. From (7) we have:
$$V'(Y) = \frac{1}{N} \sum_{i=1}^{N} V_{(i)}(Y) = \frac{1}{N} \sum_{h=1}^{r} \sum_{i = 1 + P_{h\cdot} - n_{h\cdot}}^{P_{h\cdot}} V_{(i)}(Y). \tag{21}$$
Definitions and notation in the frequency distribution framework

In order to assign the same point inequality measure to units that have the same value $Y = y_h$, we set, for $1 + P_{h\cdot} - n_{h\cdot} \le i \le P_{h\cdot}$:
$$V_{(i)}(Y) = V_{(P_{h\cdot})}(Y) = \frac{M(Y) - \overset{-}{M}_{(P_{h\cdot})}(Y)}{M(Y)} = \frac{M(Y) - \overset{-}{M}_{h\cdot}(Y)}{M(Y)} = V_h(Y). \tag{22}$$
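The effect of (22) can be seen by recomputing the point measures of the Table 1 data from its frequency distribution. This is a minimal sketch with hypothetical names; it assumes every listed value has a positive frequency.

```python
from itertools import accumulate

y = [2, 8, 24, 29, 37, 62]      # distinct values y_h in Table 1
n_tot = [2, 1, 1, 1, 3, 2]      # their frequencies n_h.


def bonferroni_frequency(y, n_tot):
    """Return (list of V_h, V') with tied units sharing V_h as in (22)."""
    N = sum(n_tot)
    P = list(accumulate(n_tot))                                # (8)
    Q = list(accumulate(y_h * m for y_h, m in zip(y, n_tot)))  # (9)
    mean = Q[-1] / N
    v_h = [(mean - q / p) / mean for p, q in zip(P, Q)]        # (22)
    # In (21) each group of n_h. tied units contributes n_h. times V_h.
    v_prime = sum(v * m for v, m in zip(v_h, n_tot)) / N
    return v_h, v_prime


v_h, v_prime = bonferroni_frequency(y, n_tot)
print([round(v, 4) for v in v_h], round(v_prime, 4))
```

On these data the sketch gives a synthetic value of about 0.48, lower than the 0.5157 obtained unit by unit in Table 1, because the units tied at 37 and at 62 all receive the value of the last unit in their group.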