
Biased and Unbiased Samples, James J. Heckman, Econ 312, Spring 2019

Definitions and Some Examples of Biased Samples. May 14, 2019.


  1. Distribution of $Y^*$ is
     $$G(y^* \mid Y > c) = F(y^* \mid Y > c) = F(y^* \mid \Delta = 1) = \frac{F(y^*)}{1 - F(c)}, \quad y^* > c, \tag{6a}$$
     with a point mass at $Y^* = 0$ (by convention) when $\Delta = 0$. (6b)

  2. Observe that (6a) is obtained from (1) by setting $\omega(y^*) = 1$ if $y^* > c$, and $\omega(y^*) = 0$ otherwise, and integrating up with respect to $y^*$. The distribution of $\Delta$ is
     $$\Pr(\Delta = \delta) = [1 - F(c)]^{\delta} [F(c)]^{1 - \delta}, \quad \delta \in \{0, 1\}.$$
     The joint distribution of $(Y^*, \Delta)$ for a censored sample:
     $$\begin{aligned} F(y^*, \delta) &= F(y^* \mid \delta) \Pr(\delta) \\ &= \left\{ \frac{F(y^*)}{1 - F(c)} \right\}^{\delta} (1)^{1 - \delta} \, [1 - F(c)]^{\delta} [F(c)]^{1 - \delta} \\ &= [F(y^*)]^{\delta} [F(c)]^{1 - \delta}. \end{aligned} \tag{7}$$

  3. (7) is obtained from (4) by setting $\omega(y) = 0$ for $y < c$ and $\omega(y) = 1$ otherwise, by setting $i(y) = \omega(y)$, and by integrating up with respect to $y^*$. For normally distributed $Y$, (7) is the "Tobit" model.

  4. A censored sample contains more information than a truncated sample, because one can obtain (6a) from (7) (by conditioning on $\Delta = 1$) but not vice versa.

  5. Inferences about the population distribution based on assuming that $F(y^* \mid Y > c)$ closely approximates $F(y)$ are potentially very misleading. A description of population income inequality based on a subsample of high income people may convey no information about the true population distribution.

  6. Without further information about $F$ and its support, it is not possible to recover $F$ from $G(y^*)$ from either a censored or a truncated sample. Access to a censored sample enables the analyst to recover $F(y)$ for $y > c$ but obviously does not provide any information on the shape of the true distribution for values of $y \le c$.
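The contrast between censored and truncated samples can be illustrated with a small simulation. This is only a sketch under assumed values: the standard normal population, the cutoff `c = 1.0`, the sample size, and all variable names are hypothetical choices, not taken from the slides.

```python
import random
import statistics

random.seed(0)
c = 1.0  # hypothetical truncation point
population = [random.gauss(0.0, 1.0) for _ in range(100_000)]

# Censored sample: every draw is kept, but Y is recorded only when Y > c.
# By convention Y* = 0 and Delta = 0 otherwise.
censored = [(y, 1) if y > c else (0.0, 0) for y in population]

# Truncated sample: only the draws with Y > c are seen at all.
truncated = [y for y in population if y > c]

# The censored sample identifies Pr(Delta = 1) = 1 - F(c) ...
share_selected = sum(delta for _, delta in censored) / len(censored)

# ... and conditioning on Delta = 1 reproduces the truncated sample.
recovered = [y_star for y_star, delta in censored if delta == 1]

pop_mean = statistics.fmean(population)
trunc_mean = statistics.fmean(truncated)
print(f"estimated Pr(Delta = 1): {share_selected:.3f}")
print(f"population mean: {pop_mean:.3f}, truncated-sample mean: {trunc_mean:.3f}")
```

The truncated sample can be rebuilt from the censored one, but the share of observations below `c` cannot be learned from the truncated sample alone, and the truncated-sample mean says little about the population mean.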

  7. The problem is routinely "solved" by assuming that $F$ is of a known functional form. This solution strategy does not always work. If $F$ is normal, then it can be recovered from a censored or truncated sample (Pearson, 1900). If $F$ is Pareto, $F$ cannot be recovered from either a truncated or a censored sample (see Flinn and Heckman, 1982b). Show this. If $F$ is real analytic (i.e., possesses derivatives of all orders) and the support of $Y$ is known, then $F$ can be recovered (Heckman and Singer, 1986).

  8. Example 2. Expand the previous discussion to a linear regression setting. Let
     $$Y = X\beta + U \tag{8}$$
     be the population earnings function, where $Y$ is earnings, $\beta$ is a suitably dimensioned parameter vector, and $X$ is a regressor vector assumed to be distributed independently of the mean zero disturbance $U$: $U \perp\!\!\!\perp X$; $E(XX')$ full rank; $E(U) = 0$.

  9. Data are collected on incomes of persons for whom $Y$ exceeds $c$. The weight depends solely on $y$: $\omega(y, x) = 0$ for $y \le c$, and $\omega(y, x) = 1$ for $y > c$. One can identify the sample distribution of $Y$ above $c$, the sample distribution of $X$ for $Y$ above $c$, and the proportion of the original random sample with income below $c$. One does not know $Y$ below $c$.

  10. As before, let $Y^* = Y$ if $Y > c$, and define $Y^* = 0$ otherwise. $\Delta = 1$ if $Y > c$; $\Delta = 0$ otherwise. The probability of the event $\Delta = 1$ given $X = x$ is
      $$\Pr(\Delta = 1 \mid X = x) = \Pr(Y > c \mid X = x) = \Pr(U > c - x\beta \mid X = x).$$

  11. Invoking independence between $U$ and $X$, and letting $F_u$ denote the distribution of $U$,
      $$\Pr(\Delta = 1 \mid X = x) = 1 - F_u(c - x\beta) \tag{9a}$$
      and
      $$\Pr(\Delta = 0 \mid X = x) = F_u(c - x\beta). \tag{9b}$$

  12. The distribution of $Y^*$ conditional on $X$:
      $$G(y^* \mid Y > c, X = x) = F(y^* \mid X = x, Y > c) = F(y^* \mid X = x, \Delta = 1) = \frac{F_u(y^* - x\beta)}{1 - F_u(c - x\beta)}, \quad y^* > c, \tag{10a}$$
      $$G(y^* \mid Y \le c) = 1 \ \text{ for } Y^* = 0 \ (\Delta = 0). \tag{10b}$$

  13. The joint distribution of $(Y^*, \Delta)$ given $X = x$ is
      $$F(y^*, \delta \mid X = x) = F(y^* \mid \delta, x) \Pr(\delta \mid x) = \{F_u(y^* - x\beta)\}^{\delta} \{F_u(c - x\beta)\}^{1 - \delta}. \tag{11}$$
      In particular,
      $$E(Y^* \mid X = x, \Delta = 1) = x\beta + E(U \mid X = x, \Delta = 1) = x\beta + \frac{\int_{c - x\beta}^{\infty} z \, dF_u(z)}{1 - F_u(c - x\beta)}, \tag{12}$$
      where $z$ is a dummy variable of integration.

  14. The population mean regression function is
      $$E(Y \mid X = x) = x\beta. \tag{13}$$
      The contrast between (12) and (13) is illuminating. When the theoretical model is estimated on a selected sample ($\Delta = 1$), the true conditional expectation is (12), not (13).
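To make the contrast between (12) and (13) concrete, the sketch below evaluates both regression functions under the added assumption that $U \sim N(0, \sigma^2)$, in which case the integral in (12) has the closed form $\sigma \phi(a) / (1 - \Phi(a))$ with $a = (c - x\beta)/\sigma$. The function names and the parameter values (`beta = 1`, `c = 0`) are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def population_mean(x, beta):
    """Equation (13): E(Y | X = x) = x * beta."""
    return x * beta

def selected_mean(x, beta, c, sigma=1.0):
    """Equation (12), assuming U ~ N(0, sigma^2) for this sketch.

    E(U | U > c - x*beta) = sigma * pdf(a) / (1 - cdf(a)), a = (c - x*beta) / sigma.
    """
    a = (c - x * beta) / sigma
    return x * beta + sigma * STD_NORMAL.pdf(a) / (1.0 - STD_NORMAL.cdf(a))

beta, c = 1.0, 0.0  # hypothetical parameter values
for x in (-1.0, 0.0, 1.0, 3.0):
    gap = selected_mean(x, beta, c) - population_mean(x, beta)
    print(f"x = {x:+.1f}: selected minus population mean = {gap:.4f}")
```

Under these assumptions the gap between the two functions is positive everywhere and shrinks as $x$ grows, which is the pattern the following slides describe.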

  15. The conditional mean of $U$ depends on $x$. In terms of omitted variable analysis, $E(U \mid X = x, \Delta = 1)$ is omitted from the regression and is likely to be correlated with $x$. Least squares estimates of $\beta$ obtained on selected samples which do not account for selection are biased and inconsistent.

  16. To illustrate the nature of the bias, it is useful to draw on the work of Cain and Watts (1973). Suppose that $X$ is a scalar random variable (e.g., education) and that its associated coefficient is positive ($\beta > 0$). Under conventional assumptions about $U$ (e.g., mean zero, independently and identically distributed, and distributed independently of $X$), the population regression of $Y$ on $X$ is a straight line. The scatter about the regression line and the regression line are given in Figure 1.

  17. Figure 1: $Y$ plotted against $X$, showing the population regression line, the selected sample regression line, and the cutoff $c$ on the $Y$ axis.

  18. When $Y > c$ is imposed as a sample inclusion requirement, lower population values of $U$ are excluded from the sample in a way that systematically depends on $x$ ($Y > c$, or $U > c - x\beta$). As $x$ increases and $\beta > 0$, the conditional mean of $U$, $E(U \mid X = x, \Delta = 1)$, decreases. Regression estimates of $\beta$ that do not correct for sample selection (i.e., do not include $E(U \mid X = x, \Delta = 1)$ as a regressor) are downward biased because of the negative correlation between $x$ and $E(U \mid X = x, \Delta = 1)$. This produces the flattened regression line for the selected sample in Figure 1.
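The flattening of the regression line can be reproduced in a short simulation. The sketch assumes a scalar standard normal regressor, standard normal $U$, `beta = 1`, and `c = 0`; these values and the helper `ols_slope` are illustrative choices only.

```python
import random

def ols_slope(xs, ys):
    """Slope from a simple least squares regression of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

random.seed(1)
beta, c = 1.0, 0.0  # hypothetical parameter values
xs = [random.gauss(0.0, 1.0) for _ in range(50_000)]
ys = [beta * x + random.gauss(0.0, 1.0) for x in xs]

# Sample inclusion rule: keep an observation only if Y > c.
selected = [(x, y) for x, y in zip(xs, ys) if y > c]

slope_population = ols_slope(xs, ys)
slope_selected = ols_slope([x for x, _ in selected], [y for _, y in selected])
print(f"slope on the random sample:   {slope_population:.3f}")
print(f"slope on the selected sample: {slope_selected:.3f}")
```

With these settings the selected-sample slope comes out well below the population slope, in line with the downward bias described above.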

  19. In models with more than one regressor, no sharp result is available on the sign of the bias in the regression estimate that results from ignoring the selected nature of the sample. It remains true that conventional least squares estimates of $\beta$ obtained from selected samples are biased and inconsistent.

  20. It is fruitful to distinguish between the case of a truncated sample and the case of a censored sample. In the truncated sample case, no information is available about the fraction of the population that would be allocated to the truncated sample [$\Pr(\Delta = 1)$]. In the censored sample case, this fraction is known or can be consistently estimated. It is fruitful to distinguish two further cases. Case (a): $X$ is not observed when $\Delta = 0$. Case (b), the one most fully developed in the literature: $X$ is observed when $\Delta = 0$.

  21. The conditional mean $E(U \mid X = x, \Delta = 1)$ is a function of $c - x\beta$ solely through $\Pr(\Delta = 1 \mid x)$, since $\Pr(\Delta = 1 \mid x)$ is monotonic in $c - x\beta$. The conditional mean therefore depends solely on $\Pr(\Delta = 1 \mid x)$ and the parameters of $F_u$; i.e., since $F_u^{-1}(1 - \Pr(\Delta = 1 \mid x)) = c - x\beta$,
      $$E(U \mid X = x, \Delta = 1) = \frac{\int_{F_u^{-1}[1 - \Pr(\Delta = 1 \mid x)]}^{\infty} z \, dF_u(z)}{\Pr(\Delta = 1 \mid x)} = K(P(\Delta = 1 \mid x)),$$
      $$\lim_{P(\Delta = 1 \mid x) \to 1} K(P(\Delta = 1 \mid x)) = 0.$$

  22. This relationship demonstrates that the conditional mean is a function of the probability of selection. As the probability of selection goes to 1, the conditional mean goes to zero. For samples chosen so that the values of $x$ are such that the observations are certain to be included in the sample, there is no problem in using ordinary least squares on selected samples to estimate $\beta$. Thus in Figure 1, ordinary least squares regressions fit on samples selected to have large $x$ values closely approximate the true regression function and become arbitrarily close as $x$ becomes large.
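The dependence of the conditional mean on the selection probability can be tabulated for one specific case. The sketch below assumes $U \sim N(0, 1)$, so that the threshold is $\Phi^{-1}(1 - p)$ and the conditional mean is $\phi(\text{threshold}) / p$; the function name `K` mirrors the notation above, and the normality assumption is ours.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def K(p):
    """E(U | Delta = 1) as a function of p = Pr(Delta = 1 | x), for U ~ N(0, 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p == 1.0:
        return 0.0  # limiting value: no selection on U
    threshold = STD_NORMAL.inv_cdf(1.0 - p)  # plays the role of c - x*beta
    return STD_NORMAL.pdf(threshold) / p

for p in (0.1, 0.5, 0.9, 0.99, 0.999):
    print(f"Pr(Delta = 1 | x) = {p}: K = {K(p):.4f}")
```

In this normal case `K` falls monotonically toward zero as the selection probability approaches one.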

  23. The conditional mean in (12) is a surrogate for $\Pr(\Delta = 1 \mid x)$. As this probability goes to one, the problem of sample selection in regression analysis becomes negligibly small. This is a much more general idea. Heckman (1976) demonstrates that $\beta$ and $F_u$ are identified if $U$ is normally distributed and standard conditions invoked in regression analysis are satisfied. In Newey; Gallant and Nychka; Powell; etc., $F_u$ is consistently nonparametrically estimated.

  24. Example 3: censored random variables. This concept extends the notion of a truncated random variable by letting a more general rule than truncation on the outcome of interest generate the selected sample. Because the sample generating rule may be different from a simple truncation of the outcome being studied, the concept of a censored random variable in general requires at least two distinct random variables.

  25. Let $Y_1$ be the outcome of interest. Let $Y_2$ be another random variable. Denote observed $Y_1$ by $Y_1^*$. If $Y_2 < c$, $Y_1$ is observed. Otherwise $Y_1$ is not observed and we can set $Y_1^* = 0$ or any other convenient value (assuming that $Y_1$ has no point mass at $Y_1 = 0$ or at the alternative convenient value). In terms of the weighting function $\omega$: $\omega(y_1, y_2) = 0$ if $y_2 > c$, and $\omega(y_1, y_2) = 1$ if $y_2 \le c$.

  26. The selection rule $Y_2 < c$ does not necessarily restrict the range of $Y_1$. Thus $Y_1^*$ is not in general a truncated random variable. Define $\Delta = 1$ if $Y_2 < c$; $\Delta = 0$ otherwise.

  27. If $F(y_1, y_2)$ is the population distribution of $(Y_1, Y_2)$, the distribution of $\Delta$ is
      $$\Pr(\Delta = \delta) = [1 - F_2(c)]^{1 - \delta} [F_2(c)]^{\delta}, \quad \delta = 0, 1,$$
      where $F_2$ is the marginal distribution of $Y_2$.

  28. The distribution of $Y_1^*$ is
      $$G(y_1^*) = F(y_1^* \mid \delta = 1) = \frac{F(y_1^*, c)}{F_2(c)}, \quad \Delta = 1, \tag{14a}$$
      $$G(y_1^* = 0) = 1, \quad \Delta = 0. \tag{14b}$$
      (14a) is the distribution function corresponding to the density in (1) when $\omega(y_1, y_2) = 1$ if $y_2 \le c$ and $\omega(y_1, y_2) = 0$ otherwise.

  29. The joint distribution of $(Y_1^*, \Delta)$ is
      $$G(y_1^*, \delta) = [F(y_1^*, c)]^{\delta} [1 - F_2(c)]^{1 - \delta}. \tag{15}$$
      This is the distribution function corresponding to density (4) for the special weighting rule of this example. In a censored sample, under general conditions it is possible to consistently estimate $\Pr(\Delta = \delta)$ and $G(y_1^*)$.

  30. In a truncated sample, only the conditional distribution (14a) can be estimated. A degenerate version of this model has $Y_1 \equiv Y_2$. In that case, the censored random variable $Y_1$ is also a truncated random variable. Note that a censored random variable may be defined for a truncated or censored sample.

  31. Example 3: Let $Y_1$ be the wage of a woman. Wages of women are observed only if women work. Let $Y_2$ be an index of a woman's propensity to work.

  32. $Y_2$ is postulated as the difference between reservation wages (the value of time at home determined from household preference functions) and potential market wages $Y_1$. Then if $Y_2 < 0$, the woman works. Otherwise, she does not. $Y_1^* = Y_1$ if $Y_2 < 0$ is the observed wage.

  33. If $Y_1$ is the offered wage of an unemployed worker, and $Y_2$ is the difference between reservation wages (the return to searching) and offered market wages, $Y_1^* = Y_1$ if $Y_2 < 0$ is the accepted wage for an unemployed worker (see Flinn and Heckman, 1982a). If $Y_1$ is the potential output of a firm and $Y_2$ is its profitability, $Y_1^* = Y_1$ if $Y_2 > 0$. If $Y_1$ is the potential income in occupation one and $Y_2$ is the potential income in occupation two, then

  34. $Y_1^* = Y_1$ if $Y_1 - Y_2 < 0$, while $Y_2^* = Y_2$ if $Y_1 - Y_2 \ge 0$.

  35. Example 4. Builds on Example 3 by introducing regressors. This produces the censored regression model (Heckman, 1976, 1979). In Example 3 set
      $$Y_1 = X_1 \beta_1 + U_1 \tag{16a}$$
      $$Y_2 = X_2 \beta_2 + U_2 \tag{16b}$$
      where $(X_1, X_2)$ are distributed independently of $(U_1, U_2)$, a mean zero, finite variance random vector.

  36. Conventional assumptions are invoked to ensure that if $Y_1$ and $Y_2$ can be observed, least squares applied to a random sample of data on $(Y_1, Y_2, X_1, X_2)$ would consistently estimate $\beta_1$ and $\beta_2$. $Y_1^* = Y_1$ if $Y_2 < 0$. If $Y_2 < 0$, $\Delta = 1$. The regression function for the selected sample is
      $$E(Y_1^* \mid X_1 = x_1, Y_2 < 0) = E(Y_1^* \mid X_1 = x_1, \Delta = 1) = x_1 \beta_1 + E(U_1 \mid X_1 = x_1, \Delta = 1). \tag{17}$$
      The regression function for the population is
      $$E(Y_1 \mid X_1 = x_1) = x_1 \beta_1. \tag{18}$$

  37. The conditional mean is a surrogate for the probability of selection [$\Pr(\Delta = 1 \mid x_2)$]. As $\Pr(\Delta = 1 \mid x_2)$ goes to one, the problem of sample selection bias becomes negligible. In the censored regression case, a new phenomenon appears. If there are variables in $X_2$ not in $X_1$, such variables may appear to be statistically important determinants of $Y_1$ when ordinary least squares is applied to data generated from censored samples.

  38. Example: suppose that survey statisticians use some extraneous (to $X_1$) variables to determine sample enrollment. Such variables may appear to be important determinants of $Y_1$ when in fact they are not. They are important determinants of $Y_1^*$.

  39. In an analysis of self-selection, let $Y_1$ be the wage that a potential worker could earn were they to accept a market offer. Let $Y_2$ be the difference between the best non-market opportunity available to the potential worker and $Y_1$. If $Y_2 < 0$, the agent works. The conditional expectation of observed wages ($Y_1^* = Y_1$ if $Y_2 < 0$) given $x_1$ and $x_2$ will be a non-trivial function of $x_2$.

  40. Thus variables determining non-market opportunities will determine $Y_1^*$, even though they do not determine $Y_1$. For example, the number of children less than six may appear to be a significant determinant of $Y_1$ when inadequate account is taken of sample selection, even though the market does not place any value or penalty on small children in generating wage offers for potential workers.
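A small simulation can illustrate how a variable that only drives selection ends up looking like a wage determinant. Everything here is a hypothetical setup for illustration: the wage offer $Y_1$ is a constant plus $U_1$, the number of small children `z` enters only $Y_2$, and the errors are jointly normal with correlation $-0.8$.

```python
import random

def ols_slope(xs, ys):
    """Slope from a simple least squares regression of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

random.seed(4)
kids, wage_offers, works = [], [], []
for _ in range(100_000):
    z = random.choice([0, 1, 2])                    # hypothetical: children under six
    u1 = random.gauss(0.0, 1.0)
    u2 = -0.8 * u1 + 0.6 * random.gauss(0.0, 1.0)   # corr(U1, U2) = -0.8, unit variance
    y1 = 2.0 + u1                                   # wage offer: z does NOT enter
    y2 = 1.0 * z + u2                               # reservation wage minus offer rises with z
    kids.append(z)
    wage_offers.append(y1)
    works.append(y2 < 0)                            # Delta = 1 if Y2 < 0

slope_all_offers = ols_slope(kids, wage_offers)
slope_workers = ols_slope(
    [z for z, w in zip(kids, works) if w],
    [y for y, w in zip(wage_offers, works) if w],
)
print(f"slope of wage offers on z, everyone:  {slope_all_offers:.3f}")
print(f"slope of observed wages on z, workers: {slope_workers:.3f}")
```

In the full set of offers the coefficient on `z` is near zero, while among workers it is clearly positive: only those with high wage draws work when `z` is large.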

  41. Example 5. Length biased sampling. Let $T$ be the duration of an event such as a completed unemployment spell or a completed duration of a job with an employer. The population distribution of $T$ is $F(t)$ with density $f(t)$. The sampling rule is such that population unemployment spells are sampled at random. Data are recorded on a completed spell provided that at the time of the interview the individual is experiencing the event. Such sampling rules are in wide use in many national surveys of employment and unemployment. Make a distinction between (1) the population distribution of $T$ and (2) the sampled distribution of $T$.

  42. In order to have a sampled completed spell, a person must be in the state at the time of the interview. Let "0" be the date of the survey. Decompose any completed spell $T$ into a component that occurs before the survey, $T_b$, and a component that occurs after the survey, $T_a$. Then $T = T_a + T_b$. For a person to be sampled, $T_b > 0$. The density of $T$ given $T_b = t_b$ is
      $$f(t \mid t_b) = \frac{f(t)}{1 - F(t_b)}, \quad t \ge t_b. \tag{19}$$

  43. Suppose that the environment is stationary. The population entry rate into the state at each instant of time is $k$. From each vintage of entrants into the state, distinguished by their distance from the survey date $t_b$, only $1 - F(t_b) = \Pr(T > t_b)$ survive. People with this duration entered at time $t = -t_b$. Aggregating over all cohorts of entrants, the population proportion in the state at the date of the interview is $P$, where
      $$P = k \int_0^{\infty} (1 - F(t_b)) \, dt_b, \tag{20}$$
      which is assumed to exist (a requirement for finite mean of $T_b$). In a duration of unemployment example, $P$ is the aggregate unemployment rate (proportion of the population unemployed at the date of the survey).

  44. Let "$*$" denote random variables defined in the sampled population. The density of $T_b^*$, the sampled presurvey duration, is
      $$g(t_b^* \mid t_b^* > 0) = \frac{k \, (1 - F(t_b^*))}{P}. \tag{21}$$
      The density of sampled completed durations is thus
      $$\begin{aligned} g(t^*) &= \int_0^{t^*} f(t^* \mid t_b^*) \, g(t_b^* \mid t_b^* > 0) \, dt_b^* \\ &= k \int_0^{t^*} \frac{f(t^*)}{1 - F(t_b^*)} \cdot \frac{1 - F(t_b^*)}{P} \, dt_b^* \\ &= \frac{k \, t^* f(t^*)}{P}. \end{aligned}$$
      This is length biased sampling.

  45. Integration by parts:
      $$P = k \int_0^{\infty} (1 - F(z)) \, dz = k \int_0^{\infty} z \, dF(z) = k E(T).$$
      Note that
      $$g(t^*) = \frac{t^* f(t^*)}{E(T)}. \tag{22}$$
      We know $g(t^*)$ from data. Can form $g(t^*) / t^*$ for $t^* > 0$, so we know $f(t^*) / E(T)$. Apply the analysis of (5):
      $$\underbrace{\int_0^{\infty} \frac{g(t^*)}{t^*} \, dt^*}_{\text{known}} = \frac{1}{E(T)} \underbrace{\int_0^{\infty} f(t^*) \, dt^*}_{= 1} = \frac{1}{E(T)},$$
      so $E(T)$ can be determined. $\therefore$ we know $f(t^*)$.

  46. In this form (22) is equivalent to (1) with $\omega(t) = t$; the normalizing constant is $E(T)$. This is length biased sampling. Intuitively, longer spells are oversampled when the requirement is imposed that a spell be in progress at the time the survey is conducted ($T_b > 0$). Suppose, instead, that individuals are randomly sampled and data are recorded on the next spell of the event (after the survey date). We recover the population $f(t)$ if spells are independent.
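The oversampling of long spells can be seen by resampling a simulated population of spells with weights proportional to length, which mimics $\omega(t) = t$. The exponential population with rate `theta = 2.0` and the sample sizes are hypothetical choices for this sketch.

```python
import random

random.seed(2)
theta = 2.0  # hypothetical exponential rate, so E(T) = 1 / theta
spells = [random.expovariate(theta) for _ in range(100_000)]

# Spells in progress at the survey date are drawn with probability
# proportional to their length: omega(t) = t.
sampled = random.choices(spells, weights=spells, k=50_000)

m = sum(spells) / len(spells)
second_moment = sum(t * t for t in spells) / len(spells)
sigma2 = second_moment - m * m

# Under g(t) = t f(t) / E(T), the sampled mean is E(T^2) / E(T) = m (1 + sigma^2 / m^2).
predicted_mean = m * (1.0 + sigma2 / m ** 2)
sampled_mean = sum(sampled) / len(sampled)
print(f"population mean spell:       {m:.3f}")
print(f"length-biased sample mean:   {sampled_mean:.3f}")
print(f"predicted m(1 + s2 / m^2):   {predicted_mean:.3f}")
```

For an exponential population the length-biased mean is roughly twice the population mean, since $\sigma^2 = m^2$ in that case.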

  47. As long as successive spells are independent, such a sampling frame does not distort the sampled distribution because no requirement is imposed that the sampled spell be in progress at the date of the interview. It is important to notice that the source of the bias is the requirement that $T_b > 0$ (i.e., sampled spells are in progress), not that only a fraction of the population experiences the event ($P < 1$).

  48. The simple length weight ($\omega(t) = t$) that produces (22) is an artefact of the stationarity assumption. Heckman and Singer (1986) analyze non-stationarity and unobservables when there is selection on the event that a person be in the state at the time of the interview. They also demonstrate the bias that results from estimating parametric models on samples generated by length biased sampling rules when inadequate account is taken of the sampling plan.

  49. The density of a spell that lasts until $t_c$ given that it has lasted $t_b$ is
      $$g(t_c \mid t_b) = \frac{f(t_c)}{1 - F(t_b)}.$$
      So the density of a spell that lasts for $t_c$ is (with $m = E(T)$)
      $$g(t_c) = \int_0^{t_c} f(t_c \mid T_c > T > T_b) \Pr(T_c \ge T) \, dt_b = \int_0^{t_c} \frac{f(t_c)}{m} \, dt_b = \frac{f(t_c) \, t_c}{m}.$$

  50. Likewise, the density of a spell that lasts until $t_a$ is
      $$\begin{aligned} g(t_a) &= \int_0^{\infty} f(t_a + t_b \mid T \ge T_b \ge 0) \Pr(T \ge T_b \ge 0) \, dt_b \\ &= \int_0^{\infty} \frac{f(t_a + t_b)}{m} \, dt_b = \frac{1}{m} \int_{t_a}^{\infty} f(t_b) \, dt_b = \frac{1 - F(t_a)}{m}. \end{aligned}$$
      So the functional forms of $g(t_b)$ and $g(t_a)$ are the same. Stationarity $\Rightarrow$ backward and forward densities are the same. Mirror images. "Back to the future."

  51. Some useful results that follow from this model: If $f(t) = \theta e^{-t\theta}$, then $g(t_b) = \theta e^{-t_b \theta}$ and $g(t_a) = \theta e^{-t_a \theta}$.
      Proof: $f(t) = \theta e^{-t\theta} \Rightarrow m = 1/\theta$, and $F(t) = 1 - e^{-t\theta} \Rightarrow g(t_a) = \dfrac{1 - F(t_a)}{m} = \theta e^{-t_a \theta}$.
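The exponential result can be checked by simulation. The sketch relies on two assumptions of ours that are consistent with the densities above: a length-biased exponential spell has density $t \theta^2 e^{-\theta t}$ (a gamma distribution with shape 2 and scale $1/\theta$), and the survey date falls uniformly within a sampled spell, so $T_b = U \cdot T_c$ with $U$ uniform on $(0, 1)$. The rate `theta = 2.0` is a hypothetical value.

```python
import math
import random

random.seed(3)
theta = 2.0  # hypothetical exponential rate
n = 100_000

backward, forward = [], []
for _ in range(n):
    # Length-biased exponential spell: density t * theta^2 * exp(-theta * t).
    t_c = random.gammavariate(2.0, 1.0 / theta)
    u = random.random()           # position of the survey date within the spell
    backward.append(u * t_c)      # T_b: elapsed duration at the survey
    forward.append((1.0 - u) * t_c)  # T_a: remaining duration after the survey

mean_b = sum(backward) / n
mean_a = sum(forward) / n
share_b_above = sum(t > 0.5 for t in backward) / n
print(f"mean T_b = {mean_b:.3f}, mean T_a = {mean_a:.3f}, 1/theta = {1 / theta:.3f}")
print(f"Pr(T_b > 0.5) = {share_b_above:.3f}, exp(-theta * 0.5) = {math.exp(-theta * 0.5):.3f}")
```

Both the backward and forward durations behave like draws from the original exponential distribution, as the result states.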

  52. $E(T_a) = \dfrac{m}{2} \left( 1 + \dfrac{\sigma^2}{m^2} \right)$.
      Proof:
      $$\begin{aligned} E(T_a) &= \int t_a \, g(t_a) \, dt_a = \int t_a \frac{1 - F(t_a)}{m} \, dt_a \\ &= \frac{1}{m} \left[ \tfrac{1}{2} t_a^2 (1 - F(t_a)) \Big|_0^{\infty} - \int \tfrac{1}{2} t_a^2 \, d(1 - F(t_a)) \right] \\ &= \frac{1}{m} \int \tfrac{1}{2} t_a^2 f(t_a) \, dt_a = \frac{1}{2m} \left[ \operatorname{var}(T) + E^2(T) \right] \\ &= \frac{1}{2m} \left[ \sigma^2 + m^2 \right]. \end{aligned}$$

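  53. $E(T_b) = \dfrac{m}{2} \left( 1 + \dfrac{\sigma^2}{m^2} \right)$.
      Proof: See proof of Proposition 2 (the $E(T_a)$ result), since $g(t_b)$ and $g(t_a)$ have the same functional form.
      $E(T_c) = m \left( 1 + \dfrac{\sigma^2}{m^2} \right)$.
      Proof:
      $$E(T_c) = \int \frac{t_c^2 f(t_c)}{m} \, dt_c = \frac{1}{m} \left( \operatorname{var}(T) + E^2(T) \right) = m \left( 1 + \frac{\sigma^2}{m^2} \right).$$
      Hence $E(T_c) = 2 E(T_a) = 2 E(T_b)$, and $E(T_c) > m$ unless $\sigma^2 = 0$.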
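These mean formulas can be verified numerically for a specific distribution. The sketch below assumes a Weibull population in the scale and shape parameterization, with survivor function $\exp(-(t/\lambda)^k)$, and the hypothetical values $\lambda = 0.5$, $k = 2$; it integrates $t \, (1 - F(t)) / m$ with a simple midpoint rule and compares the result with $\frac{m}{2}(1 + \sigma^2 / m^2)$.

```python
import math

def weibull_mean_var(lam, k):
    """Mean and variance of a Weibull with scale lam and shape k."""
    mean = lam * math.gamma(1.0 + 1.0 / k)
    var = lam ** 2 * math.gamma(1.0 + 2.0 / k) - mean ** 2
    return mean, var

def mean_forward_duration(lam, k, upper=20.0, steps=200_000):
    """E(T_a): midpoint-rule integral of t * (1 - F(t)) / m over [0, upper]."""
    m, _ = weibull_mean_var(lam, k)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t * math.exp(-((t / lam) ** k))  # t times the survivor function
    return total * h / m

lam, k = 0.5, 2.0  # hypothetical parameter values
m, sigma2 = weibull_mean_var(lam, k)
formula_ta = 0.5 * m * (1.0 + sigma2 / m ** 2)
formula_tc = m * (1.0 + sigma2 / m ** 2)
numeric_ta = mean_forward_duration(lam, k)
print(f"E(T) = {m:.4f}")
print(f"E(T_a): formula {formula_ta:.4f}, numerical {numeric_ta:.4f}")
print(f"E(T_c): formula {formula_tc:.4f}")
```

The numerical integral agrees with the closed form, and the sampled completed spell is on average twice as long as the forward or backward duration.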

  54. Examples

  55. Specification of the Distribution: Weibull Distribution.
      Parameters: $\lambda > 0$, $k > 0$.
      Probability Density Function (PDF):
      $$f(t) = \frac{k}{\lambda} \left( \frac{t}{\lambda} \right)^{k - 1} \exp\left( -\left( \frac{t}{\lambda} \right)^{k} \right).$$
      Cumulative Distribution Function (CDF):
      $$F(t) = 1 - \exp\left( -\left( \frac{t}{\lambda} \right)^{k} \right).$$
      Sets of parameters: $(\lambda, k) = (0.1, 0.5)$, $(0.5, 1.0)$, $(0.5, 2.0)$, $(1.0, 3.0)$, respectively.

  56. Basic Distribution Graphs. [Figure: two panels, the PDF and the CDF of the Weibull distribution, plotted against $t$ from 0 to 2 for $(\lambda, k) = (0.1, 0.5)$, $(0.5, 1.0)$, $(0.5, 2.0)$, and $(1.0, 3.0)$.]

  57. Basic Duration Graphs. [Figure: two panels, the hazard function and the integrated hazard function of the Weibull distribution, plotted against $t$ from 0 to 2 for $(\lambda, k) = (0.1, 0.5)$, $(0.5, 1.0)$, $(0.5, 2.0)$, and $(1.0, 3.0)$.]

  58. Observed and Original Distribution for $T_b$ (Example 1). [Figure: the observed PDF of spells ($T_b$) and the original PDF (Weibull distribution, $\lambda = 0.1$, $k = 0.5$), plotted against $t$.]

  59. Observed and Original Distribution for $T_b$ (Example 2). [Figure: the observed PDF of spells ($T_b$) and the original PDF (Weibull distribution, $\lambda = 0.5$, $k = 2.0$), plotted against $t$.]

  60. Observed and Original Distribution for $T_b$ (Example 3). [Figure: the observed PDF of spells ($T_b$) and the original PDF (Weibull distribution, $\lambda = 1.0$, $k = 3.0$), plotted against $t$.]

  61. Observed and Original Distribution for $T_c$ (Example 1). [Figure: the observed PDF of spells ($T_c$) and the original PDF (Weibull distribution, $\lambda = 0.1$, $k = 0.5$), plotted against $t$.]

  62. Observed and Original Distribution for $T_c$ (Example 2). [Figure: the observed PDF of spells ($T_c$) and the original PDF (Weibull distribution, $\lambda = 0.5$, $k = 1.0$), plotted against $t$.]

  63. Observed and Original Distribution for $T_c$ (Example 3). [Figure: the observed PDF of spells ($T_c$) and the original PDF (Weibull distribution, $\lambda = 0.5$, $k = 2.0$), plotted against $t$.]

  64. Observed and Original Distribution for $T_c$ (Example 4). [Figure: the observed PDF of spells ($T_c$) and the original PDF (Weibull distribution, $\lambda = 1.0$, $k = 3.0$), plotted against $t$.]

  65. Example 6. Choice based sampling. Let $D$ be a discrete valued random variable which assumes a finite number of values $I$. Discrete choice model: $D = i$, $i = 1, \ldots, I$, corresponds to the occurrence of state $i$. States are mutually exclusive. In the existing literature the states may be modes of transportation choice for commuters (Domencich and McFadden, 1975), occupations, migration destinations, financial solvency status of firms, schooling choices of students, etc.

  66. Interest centers on estimating a population choice model
      $$\Pr(D = i \mid X = x), \quad i = 1, \ldots, I. \tag{23}$$
      The population density of $(D, X)$ is
      $$f(d, x) = \Pr(D = d \mid X = x) \, h(x), \tag{24}$$
      where, in this example, $h(x)$ is the population density of $X$.

  67. For example, interviews about transportation preferences conducted at train stations tend to over-sample train riders and under-sample bus riders. Interviews about occupational choice preferences conducted at leading universities over-sample those who select professional occupations.

  68. In choice based sampling, selection occurs solely on the $D$ coordinate of $(D, X)$. In terms of (1) (extended to allow for discrete random variables), $\omega(d, x) = \omega(d)$. Then sampled $(D^*, X^*)$ has density
      $$g(d^*, x^*) = \frac{\omega(d^*) f(d^*, x^*)}{\sum_{i=1}^{I} \int \omega(i) f(i, x^*) \, dx^*}. \tag{25}$$

  69. Notice that the denominator can be simplified to $\sum_{i=1}^{I} \omega(i) f(i)$, where $f(\cdot)$ denotes the population marginal distribution of $D$, so that
      $$g(d^*, x^*) = \frac{\omega(d^*) f(d^*, x^*)}{\sum_{i=1}^{I} \omega(i) f(i)}. \tag{26}$$

  70. Integrating (25) with respect to $x$ using (26) we obtain
      $$g(d^*) = \frac{\omega(d^*) f(d^*)}{\sum_{i=1}^{I} \omega(i) f(i)}. \tag{27}$$
      The sampling rule causes the sampled proportions to deviate from the population proportions.

  71. Note further that as a consequence of sampling only on $D$, the population conditional density
      $$h(x^* \mid d^*) = \frac{f(d^*, x^*)}{f(d^*)} \tag{28}$$
      can be recovered from the choice based sample. The density of $x$ in the sample is thus
      $$g(x^*) = \sum_{i=1}^{I} h(x^* \mid i) \, g(i). \tag{29}$$

  72. Then using (26)-(29) we reach
      $$g(d^* \mid x^*) = f(d^* \mid x^*) \left\{ \frac{\omega(d^*)}{\sum_{i=1}^{I} \omega(i) f(i)} \right\} \times \left\{ \frac{1}{\sum_{i=1}^{I} f(i \mid x^*) \dfrac{g(i)}{f(i)}} \right\}. \tag{30}$$
      The bias that results from using choice based samples to make inference about $f(d^* \mid x^*)$ is a consequence of neglecting the terms in braces on the right-hand side of (30).

  73. Notice that if the data are generated by a random sampling rule, $\omega(d^*) = 1$, $g(d^*) = f(d^*)$, and the terms in braces are each equal to one.
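Equation (30) can be checked with exact arithmetic on a tiny discrete example. The two choices, the binary regressor, and all probabilities and weights below are hypothetical numbers chosen for illustration; the code computes $g(d \mid x)$ once directly from (26) and once from the right-hand side of (30).

```python
from fractions import Fraction as Fr

# Hypothetical population: two choices d in {1, 2}, binary regressor x in {0, 1}.
h = {0: Fr(3, 5), 1: Fr(2, 5)}                       # population density of X
f_cond = {(1, 0): Fr(1, 5), (2, 0): Fr(4, 5),        # f(d | x)
          (1, 1): Fr(7, 10), (2, 1): Fr(3, 10)}
omega = {1: Fr(3), 2: Fr(1)}                         # choice 1 is over-sampled
choices = (1, 2)

f_joint = {(d, x): f_cond[d, x] * h[x] for (d, x) in f_cond}          # (24)
f_marg = {d: sum(f_joint[d, x] for x in h) for d in choices}
norm = sum(omega[i] * f_marg[i] for i in choices)
g_joint = {(d, x): omega[d] * f_joint[d, x] / norm for (d, x) in f_joint}  # (26)
g_marg = {d: omega[d] * f_marg[d] / norm for d in choices}                 # (27)

def g_cond_direct(d, x):
    """g(d | x) computed from the sampled joint density."""
    return g_joint[d, x] / sum(g_joint[i, x] for i in choices)

def g_cond_formula(d, x):
    """g(d | x) computed from the right-hand side of (30)."""
    first = omega[d] / norm
    second = 1 / sum(f_cond[i, x] * g_marg[i] / f_marg[i] for i in choices)
    return f_cond[d, x] * first * second

for d in choices:
    for x in h:
        print(d, x, f_cond[d, x], g_cond_direct(d, x), g_cond_formula(d, x))
```

The two computations agree exactly, and the sampled conditional probabilities differ from the population ones, e.g. `g_cond_direct(1, 0)` is 3/7 while the population value is 1/5.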

  74. Further Discussion of Choice Based Samples

  75. Pick $D$ first (e.g., travel mode). The probability of selecting $D$ is $C(D)$. $f(D, X)$ is the joint density of $D$ and $X$ in the population:
      $$f(D, X \mid \theta) = g(D \mid X, \theta) \, h(X) = \varphi(X \mid D) \, f(D \mid \theta),$$
      $$f(D \mid \theta) = \int g(D \mid X, \theta) \, h(X) \, dX.$$
      Given $D$ we observe $X$ (the implicit assumption is that we are sampling only on $D$, not on $D$ and $X$). The probability of sampled $(X, D)$ is $\varphi(X \mid D) \, C(D)$.

  76. A fact we use later is
      $$\varphi(X \mid D) \, C(D) = \left[ \frac{g(D \mid X) \, h(X)}{f(D)} \right] C(D) = \frac{g(D \mid X) \, h(X) \, C(D)}{\int g(D \mid X) \, h(X) \, dX}.$$
      When $C(D) = f(D) = \int g(D \mid X) \, h(X) \, dX$, choice based sampling is random sampling.

  77. Note that the likelihood function in an exogenous sampling scheme is
      $$L = \prod_{i=1}^{I} f(D_i, X_i) = \prod_{i=1}^{I} f(D_i \mid X_i, \theta) \, h(X_i),$$
      $$\ln L = \sum_{i=1}^{I} \left[ \ln f(D_i \mid X_i) + \ln h(X_i) \right].$$
      By exogeneity, the distribution of $X$ does not depend on $\theta$.

  78. The likelihood function for a choice-based sampling scheme is
      $$\ln L = \sum_{i=1}^{I} \left[ \ln g(D_i \mid X_i) + \ln h(X_i) - \ln f(D_i) + \ln C(D_i) \right].$$
      In general, $f(D)$ depends on the parameters $\theta$. $\therefore$ maximize with respect to $\theta$:
      $$\frac{\partial \ln L}{\partial \theta} = \sum_{i=1}^{I} \frac{\partial \ln g(D_i \mid X_i)}{\partial \theta} - \underbrace{\sum_{i=1}^{I} \frac{\partial \ln f(D_i)}{\partial \theta}}_{\text{source of bias}}.$$
      The usual estimators are formed using only the first term and neglect the second. That is the source of the inconsistency.

  79. Further Analysis of Choice Based Samples: An example in discrete choice. (c) Draw $d$ by $\varphi(d)$. (d) Draw $X$ by $f(X \mid d = 1)$. Joint density of data:
      $$\varphi(d = 1) \, f(X \mid d = 1, \theta) = \varphi(d = 1) \left[ \frac{\Pr(d = 1 \mid X, \theta) \, f(X)}{\Pr(d = 1 \mid \theta)} \right].$$
