Biased and Unbiased Samples James J. Heckman Econ 312, Spring 2019

Definitions and Some Examples of Biased Samples

Distribution of $Y^*$ is
$$G(y^* \mid Y > c) = F(y^* \mid Y > c) = F(y^* \mid \Delta = 1) = \frac{F(y^*)}{1 - F(c)}, \quad y^* > c. \tag{6a}$$
Point mass at $Y^* = 0$ (by convention) when $\Delta = 0$. (6b)

Observe that (6a) is obtained from (1) by setting $\omega(y^*) = 1$ if $y > c$, and $\omega(y^*) = 0$ otherwise, and integrating up with respect to $y^*$.
The distribution of $\Delta$ is
$$\Pr(\Delta = \delta) = [1 - F(c)]^{\delta} [F(c)]^{1-\delta}, \quad \delta \in \{0, 1\}.$$
The joint distribution of $(Y^*, \Delta)$ for a censored sample:
$$F(y^*, \delta) = F(y^* \mid \delta) \Pr(\delta) = \left\{ \frac{F(y^*)}{1 - F(c)} \right\}^{\delta} (1)^{1-\delta} [1 - F(c)]^{\delta} [F(c)]^{1-\delta} = [F(y^*)]^{\delta} [F(c)]^{1-\delta}. \tag{7}$$

(7) is obtained from (4) by setting $\omega(y) = 0$ for $y < c$ and $\omega(y) = 1$ otherwise, by setting $i(y) = \omega(y)$, and by integrating up with respect to $y^*$.
For normally distributed $Y$, (7) is the "Tobit" model.

A censored sample contains more information than a truncated sample because one can obtain (6a) from (7) (by conditioning on $\Delta = 1$) but not vice versa.

Inferences about the population distribution based on assuming that $F(y^* \mid Y > c)$ closely approximates $F(y)$ are potentially very misleading. A description of population income inequality based on a subsample of high income people may convey no information about the true population distribution.
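The contrast between a truncated and a censored sample can be sketched with a small simulation. The normal population and the values of `mu`, `sigma`, and `c` below are illustrative assumptions, not taken from the slides.

```python
import random
from statistics import NormalDist, mean

random.seed(0)

# Hypothetical population: Y ~ N(mu, sigma^2) with cutoff c (illustrative values).
mu, sigma, c = 10.0, 2.0, 11.0
population = [random.gauss(mu, sigma) for _ in range(100_000)]

# Truncated sample: only draws with Y > c survive; nothing else is recorded.
truncated = [y for y in population if y > c]

# Censored sample: every draw is recorded as (Y*, Delta), with Y* = 0 when Y <= c.
censored = [(y, 1) if y > c else (0.0, 0) for y in population]

# The censored sample identifies Pr(Delta = 1) = 1 - F(c); the truncated one cannot.
share_selected = mean(delta for _, delta in censored)
theoretical_share = 1.0 - NormalDist(mu, sigma).cdf(c)

# Conditioning the censored sample on Delta = 1 reproduces the truncated sample.
recovered = [y for y, delta in censored if delta == 1]

print(f"population mean       : {mean(population):.3f}")
print(f"truncated-sample mean : {mean(truncated):.3f}")
print(f"Pr(Delta = 1), sample : {share_selected:.3f}  theory: {theoretical_share:.3f}")
```

The truncated-sample mean sits well above the population mean, which is the sense in which treating $F(y^* \mid Y > c)$ as $F(y)$ misleads.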

Without further information about $F$ and its support, it is not possible to recover $F$ from $G(y^*)$ from either a censored or a truncated sample. Access to a censored sample enables the analyst to recover $F(y)$ for $y > c$ but obviously does not provide any information on the shape of the true distribution for values of $y \le c$.

The problem is routinely "solved" by assuming that $F$ is of a known functional form. This solution strategy does not always work.
If $F$ is normal, then it can be recovered from a censored or truncated sample (Pearson, 1900).
If $F$ is Pareto, $F$ cannot be recovered from either a truncated or a censored sample (see Flinn and Heckman, 1982b). Show this.
If $F$ is real analytic (i.e., it possesses derivatives of all orders and is locally equal to its Taylor series) and the support of $Y$ is known, then $F$ can be recovered (Heckman and Singer, 1986).

Example 2. Expand the previous discussion to a linear regression setting. Let
$$Y = X\beta + U \tag{8}$$
be the population earnings function, where $Y$ is earnings.
$\beta$: suitably dimensioned parameter vector.
$X$ is a regressor vector assumed to be distributed independently of the mean zero disturbance $U$.
$U \perp\!\!\!\perp X$; $E(XX')$ full rank; $E(U) = 0$.

Data are collected on incomes of persons for whom $Y$ exceeds $c$.
Weight depends solely on $y$: $\omega(y, x) = 0$ for $y \le c$, $\omega(y, x) = 1$ for $y > c$.
Can identify the sample distribution of $Y$ above $c$, the sample distribution of $X$ for $Y$ above $c$, and the proportion of the original random sample with income below $c$.
Do not know $Y$ below $c$.

As before, let $Y^* = Y$ if $Y > c$. Define $Y^* = 0$ otherwise. $\Delta = 1$ if $Y > c$, $\Delta = 0$ otherwise.
The probability of the event $\Delta = 1$ given $X = x$ is
$$\Pr(\Delta = 1 \mid X = x) = \Pr(Y > c \mid X = x) = \Pr(U > c - x\beta \mid X = x).$$

Invoking independence between $U$ and $X$ and letting $F_u$ denote the distribution of $U$,
$$\Pr(\Delta = 1 \mid X = x) = 1 - F_u(c - x\beta) \tag{9a}$$
and
$$\Pr(\Delta = 0 \mid X = x) = F_u(c - x\beta). \tag{9b}$$
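Equation (9a) can be checked numerically once a form for $F_u$ is chosen. The sketch below assumes a normal $U$ with a scalar regressor; the function name `selection_probability` and the parameter values are hypothetical.

```python
import random
from statistics import NormalDist

def selection_probability(x, beta, c, sigma=1.0):
    """Pr(Delta = 1 | X = x) = 1 - F_u(c - x*beta), assuming U ~ N(0, sigma^2)."""
    return 1.0 - NormalDist(0.0, sigma).cdf(c - x * beta)

random.seed(1)
beta, c, x = 0.5, 1.0, 3.0   # illustrative values, not from the source
n = 100_000

# Simulate Y = x*beta + U at a fixed x and count how often Y > c.
hits = sum(1 for _ in range(n) if x * beta + random.gauss(0.0, 1.0) > c)
simulated = hits / n
exact = selection_probability(x, beta, c)

print(f"simulated: {simulated:.4f}   formula (9a): {exact:.4f}")
```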

The distribution of $Y^*$ conditional on $X$:
$$G(y^* \mid Y > c, X = x) = F(y^* \mid X = x, Y > c) = F(y^* \mid X = x, \Delta = 1) = \frac{F_u(y^* - x\beta)}{1 - F_u(c - x\beta)}, \quad y^* > c. \tag{10a}$$
$$G(y^* \mid Y \le c) = 1 \text{ for } Y^* = 0 \ (\Delta = 0). \tag{10b}$$

The joint distribution of $(Y^*, \Delta)$ given $X = x$ is
$$F(y^*, \delta \mid X = x) = F(y^* \mid \delta, x) \Pr(\delta \mid x) = \{F_u(y^* - x\beta)\}^{\delta} \{F_u(c - x\beta)\}^{1-\delta}. \tag{11}$$
In particular,
$$E(Y^* \mid X = x, \Delta = 1) = x\beta + E(U \mid X = x, \Delta = 1) = x\beta + \frac{\int_{c - x\beta}^{\infty} z \, dF_u(z)}{1 - F_u(c - x\beta)}, \tag{12}$$
where $z$ is a dummy variable of integration.
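For a concrete view of the second term in (12), the sketch below assumes $U$ is normal, in which case the ratio of integrals reduces to $\sigma\,\phi(a)/(1 - \Phi(a))$ with $a = (c - x\beta)/\sigma$. The function name and the parameter values are hypothetical; the formula is compared with a simulated conditional mean.

```python
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()

def truncated_error_mean(x, beta, c, sigma=1.0):
    """E(U | X = x, Delta = 1) for U ~ N(0, sigma^2) (normality is an assumption)."""
    a = (c - x * beta) / sigma
    return sigma * STD_NORMAL.pdf(a) / (1.0 - STD_NORMAL.cdf(a))

random.seed(2)
beta, c = 1.0, 2.0   # illustrative values
rows = []
for x in (0.0, 2.0, 4.0, 6.0):
    draws = (random.gauss(0.0, 1.0) for _ in range(100_000))
    kept = [u for u in draws if x * beta + u > c]      # errors of selected observations
    rows.append((x, truncated_error_mean(x, beta, c), mean(kept)))

for x, formula, simulated in rows:
    print(f"x = {x:3.1f}   formula: {formula:.4f}   simulated: {simulated:.4f}")
```

In this sketch the conditional mean of $U$ falls toward zero as $x\beta$ moves far above $c$, that is, as the probability of selection approaches one.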

Population mean regression function is
$$E(Y \mid X = x) = x\beta. \tag{13}$$
The contrast between (12) and (13) is illuminating. When the theoretical model is estimated on a selected sample ($\Delta = 1$), the true conditional expectation is (12), not (13).

The conditional mean of $U$ depends on $x$.
Omitted variable analysis: $E(U \mid X = x, \Delta = 1)$ is omitted from the regression and is likely to be correlated with $x$.
Least squares estimates of $\beta$ obtained on selected samples which do not account for selection are biased and inconsistent.

To illustrate the nature of the bias, it is useful to draw on the work of Cain and Watts (1973). Suppose that $X$ is a scalar random variable (e.g., education) and that its associated coefficient is positive ($\beta > 0$). Under conventional assumptions about $U$ (e.g., mean zero, independently and identically distributed, and distributed independently of $X$), the population regression of $Y$ on $X$ is a straight line. The scatter about the regression line and the regression line are given in Figure 1.

Figure 1: Population regression line and selected sample regression line (vertical axis $Y$, with the truncation point $c$ marked).

When $Y > c$ is imposed as a sample inclusion requirement, lower population values of $U$ are excluded from the sample in a way that systematically depends on $x$ ($Y > c$, or $U > c - x\beta$).
As $x$ increases and $\beta > 0$, the conditional mean of $U$, $E(U \mid X = x, \Delta = 1)$, decreases.
Regression estimates of $\beta$ that do not correct for sample selection (i.e., that do not include $E(U \mid X = x, \Delta = 1)$) are downward biased because of the negative correlation between $x$ and $E(U \mid X = x, \Delta = 1)$.
This gives the flattened regression line for the selected sample in Figure 1.
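The flattening of the fitted line can be reproduced in a short simulation. This is a minimal sketch under assumed values ($\beta = 1$, $c = 2$, standard normal $U$, $X$ uniform on $[0, 4]$); `ols_slope` is a hypothetical helper written for this example.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least squares line of ys on xs (with an intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

random.seed(3)
beta, c = 1.0, 2.0   # illustrative values
xs = [random.uniform(0.0, 4.0) for _ in range(50_000)]
ys = [beta * x + random.gauss(0.0, 1.0) for x in xs]

# Selected sample: only observations with Y > c are retained.
selected = [(x, y) for x, y in zip(xs, ys) if y > c]

slope_population = ols_slope(xs, ys)
slope_selected = ols_slope([x for x, _ in selected], [y for _, y in selected])

print(f"slope on full sample     : {slope_population:.3f}")
print(f"slope on selected sample : {slope_selected:.3f}")
```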

In models with more than one regressor, no sharp result is available on the sign of the bias in the regression estimate that results from ignoring the selected nature of the sample.
It remains true that conventional least squares estimates of $\beta$ obtained from selected samples are biased and inconsistent.

It is fruitful to distinguish between the case of a truncated sample and the case of a censored sample.
In the truncated sample case, no information is available about the fraction of the population that would be allocated to the truncated sample [$\Pr(\Delta = 1)$].
In the censored sample case, this fraction is known or can be consistently estimated.
It is fruitful to distinguish two further cases:
Case (a): $X$ is not observed when $\Delta = 0$.
Case (b), the one most fully developed in the literature: $X$ is observed when $\Delta = 0$.

This relationship demonstrates that the conditional mean is a function of the probability of selection.
As the probability of selection goes to 1, the conditional mean goes to zero.
For samples chosen so that the values of $x$ are such that the observations are certain to be included in the sample, there is no problem in using ordinary least squares on selected samples to estimate $\beta$.
Thus in Figure 1, ordinary least squares regressions fit on samples selected to have large $x$ values closely approximate the true regression function and become arbitrarily close as $x$ becomes large.

The conditional mean in (12) is a surrogate for $\Pr(\Delta = 1 \mid x)$.
As this probability goes to one, the problem of sample selection in regression analysis becomes negligibly small. This is a much more general idea.
Heckman (1976) demonstrates that $\beta$ and $F_u$ are identified if $U$ is normally distributed and standard conditions invoked in regression analysis are satisfied.
In Newey; Gallant and Nychka; Powell; etc., $F_u$ is consistently nonparametrically estimated.

Example 3: censored random variables. This concept extends the notion of a truncated random variable by letting a more general rule than truncation on the outcome of interest generate the selected sample. Because the sample generating rule may be different from a simple truncation of the outcome being studied, the concept of a censored random variable in general requires at least two distinct random variables.

Let $Y_1$ be the outcome of interest. Let $Y_2$ be another random variable. Denote observed $Y_1$ by $Y_1^*$.
If $Y_2 < c$, $Y_1$ is observed. Otherwise $Y_1$ is not observed and we can set $Y_1^* = 0$ or any other convenient value (assuming that $Y_1$ has no point mass at $Y_1 = 0$ or at the alternative convenient value).
In terms of the weighting function $\omega$: $\omega(y_1, y_2) = 0$ if $y_2 > c$, and $\omega(y_1, y_2) = 1$ if $y_2 \le c$.

Selection rule $Y_2 < c$ does not necessarily restrict the range of $Y_1$. Thus $Y_1^*$ is not in general a truncated random variable.
Define $\Delta = 1$ if $Y_2 < c$; $\Delta = 0$ otherwise.

If $F(y_1, y_2)$ is the population distribution of $(Y_1, Y_2)$, the distribution of $\Delta$ is
$$\Pr(\Delta = \delta) = [1 - F_2(c)]^{1-\delta} [F_2(c)]^{\delta}, \quad \delta = 0, 1,$$
where $F_2$ is the marginal distribution of $Y_2$.

The distribution of $Y_1^*$ is
$$G(y_1^*) = F(y_1^* \mid \delta = 1) = \frac{F(y_1^*, c)}{F_2(c)}, \quad \Delta = 1, \tag{14a}$$
$$G(y_1^* = 0) = 1, \quad \Delta = 0. \tag{14b}$$
(14a) is the distribution function corresponding to the density in (1) when $\omega(y_1, y_2) = 1$ if $y_2 \le c$ and $\omega(y_1, y_2) = 0$ otherwise.

The joint distribution of $(Y_1^*, \Delta)$ is
$$G(y_1^*, \delta) = [F(y_1^*, c)]^{\delta} [1 - F_2(c)]^{1-\delta}. \tag{15}$$
This is the distribution function corresponding to density (4) for the special weighting rule of this example.
In a censored sample, under general conditions it is possible to consistently estimate $\Pr(\Delta = \delta)$ and $G(y_1^*)$.

In a truncated sample, only conditional distribution (14a) can be estimated. A degenerate version of this model has $Y_1 \equiv Y_2$. In that case, censored random variable $Y_1$ is also a truncated random variable. Note that a censored random variable may be defined for a truncated or censored sample.

Example 3: Let $Y_1$ be the wage of a woman. Wages of women are observed only if women work. Let $Y_2$ be an index of a woman's propensity to work.

$Y_2$ is postulated as the difference between reservation wages (the value of time at home determined from household preference functions) and potential market wages $Y_1$.
Then if $Y_2 < 0$, the woman works. Otherwise, she does not.
$Y_1^* = Y_1$ if $Y_2 < 0$ is the observed wage.

If $Y_1$ is the offered wage of an unemployed worker, and $Y_2$ is the difference between reservation wages (the return to searching) and offered market wages, $Y_1^* = Y_1$ if $Y_2 < 0$ is the accepted wage for an unemployed worker (see Flinn and Heckman, 1982a).
If $Y_1$ is the potential output of a firm and $Y_2$ is its profitability, $Y_1^* = Y_1$ if $Y_2 > 0$.
If $Y_1$ is the potential income in occupation one and $Y_2$ is the potential income in occupation two, $Y_1^* = Y_1$ if $Y_1 - Y_2 < 0$ while $Y_2^* = Y_2$ if $Y_1 - Y_2 \ge 0$.

Example 4. Builds on Example 3 by introducing regressors. This produces the censored regression model (Heckman, 1976, 1979). In Example 3 set
$$Y_1 = X_1 \beta_1 + U_1 \tag{16a}$$
$$Y_2 = X_2 \beta_2 + U_2 \tag{16b}$$
where $(X_1, X_2)$ are distributed independently of $(U_1, U_2)$, a mean zero, finite variance random vector.

Conventional assumptions are invoked to ensure that if $Y_1$ and $Y_2$ can be observed, least squares applied to a random sample of data on $(Y_1, Y_2, X_1, X_2)$ would consistently estimate $\beta_1$ and $\beta_2$.
$Y_1^* = Y_1$ if $Y_2 < 0$. If $Y_2 < 0$, $\Delta = 1$.
Regression function for the selected sample is
$$E(Y_1^* \mid X_1 = x_1, Y_2 < 0) = E(Y_1^* \mid X_1 = x_1, \Delta = 1) = x_1 \beta_1 + E(U_1 \mid X_1 = x_1, \Delta = 1). \tag{17}$$
Regression function for the population is
$$E(Y_1 \mid X_1 = x_1) = x_1 \beta_1. \tag{18}$$

The conditional mean is a surrogate for the probability of selection [$\Pr(\Delta = 1 \mid x_2)$]. As $\Pr(\Delta = 1 \mid x_2)$ goes to one, the problem of sample selection bias becomes negligible.
In the censored regression case, a new phenomenon appears. If there are variables in $X_2$ not in $X_1$, such variables may appear to be statistically important determinants of $Y_1$ when ordinary least squares is applied to data generated from censored samples.

Example: suppose that survey statisticians use some extraneous (to $X_1$) variables to determine sample enrollment. Such variables may appear to be important determinants of $Y_1$ when in fact they are not. They are important determinants of $Y_1^*$.

In an analysis of self-selection, let $Y_1$ be the wage that a potential worker could earn were they to accept a market offer. Let $Y_2$ be the difference between the best non-market opportunity available to the potential worker and $Y_1$. If $Y_2 < 0$, the agent works.
The conditional expectation of observed wages ($Y_1^* = Y_1$ if $Y_2 < 0$) given $x_1$ and $x_2$ will be a non-trivial function of $x_2$.

Thus variables determining non-market opportunities will determine $Y_1^*$, even though they do not determine $Y_1$.
For example, the number of children under age six may appear to be a significant determinant of $Y_1$ when inadequate account is taken of sample selection, even though the market does not place any value or penalty on small children in generating wage offers for potential workers.
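The spurious-determinant phenomenon can be sketched in a simulation. Everything numerical here is an assumption made for illustration: the variable `z` stands in for a component of $X_2$ that is excluded from $X_1$ (for example, young children), wage offers do not depend on it, and the reservation wage does.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least squares line of ys on xs (with an intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx

random.seed(4)
n = 40_000
z = [random.uniform(-1.0, 1.0) for _ in range(n)]    # enters Y2 only
u1 = [random.gauss(0.0, 1.0) for _ in range(n)]
y1 = [1.0 + u for u in u1]                           # wage offer: no dependence on z

# Y2 = reservation wage - market wage, with reservation wage = 1 + z + noise.
y2 = [(1.0 + zi + random.gauss(0.0, 1.0)) - y for zi, y in zip(z, y1)]

# Wages are observed only for those who work (Y2 < 0).
workers = [(zi, yi) for zi, yi, y2i in zip(z, y1, y2) if y2i < 0.0]

slope_all = ols_slope(z, y1)                                    # uses unobservable offers
slope_workers = ols_slope([zi for zi, _ in workers],
                          [yi for _, yi in workers])            # selected sample

print(f"slope of Y1 on z, everyone     : {slope_all:.3f}")
print(f"slope of Y1* on z, workers only: {slope_workers:.3f}")
```

In this sketch `z` has essentially no relationship with wage offers in the full population, yet it has a clearly nonzero coefficient among workers.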

Example 5. Length biased sampling. Let $T$ be the duration of an event such as a completed unemployment spell or a completed duration of a job with an employer.
The population distribution of $T$ is $F(t)$ with density $f(t)$.
The sampling rule is such that population unemployment spells are sampled at random.
Data are recorded on a completed spell provided that at the time of the interview the individual is experiencing the event.
Such sampling rules are in wide use in many national surveys of employment and unemployment.
Make a distinction between:
1. Population distribution of $T$
2. Sampled distribution of $T$

In order to have a sampled completed spell, a person must be in the state at the time of the interview.
Let "0" be the date of the survey.
Decompose any completed spell $T$ into a component that occurs before the survey, $T_b$, and a component that occurs after the survey, $T_a$. Then $T = T_a + T_b$.
For a person to be sampled, $T_b > 0$.
The density of $T$ given $T_b = t_b$ is
$$f(t \mid t_b) = \frac{f(t)}{1 - F(t_b)}, \quad t \ge t_b. \tag{19}$$

Suppose that the environment is stationary. The population entry rate into the state at each instant of time is $k$.
From each vintage of entrants into the state, distinguished by their distance from the survey date $t_b$, only $1 - F(t_b) = \Pr(T > t_b)$ survive. People with this duration entered at time $t = -t_b$.
Aggregating over all cohorts of entrants, the population proportion in the state at the date of the interview is $P$, where
$$P = \int_0^{\infty} k \, (1 - F(t_b)) \, dt_b, \tag{20}$$
which is assumed to exist (a requirement for a finite mean of $T$).
In a duration of unemployment example, $P$ is the aggregate unemployment rate (proportion of population unemployed at the date of the survey).

Let $*$ denote random variables defined in the sampled population.
The density of $T_b^*$, sampled presurvey duration, is
$$g(t_b^* \mid t_b^* > 0) = \frac{k \, (1 - F(t_b^*))}{P}. \tag{21}$$
The density of sampled completed durations is thus
$$g(t^*) = \int_0^{t^*} f(t^* \mid t_b^*) \, g(t_b^* \mid t_b^* > 0) \, dt_b^* = k \int_0^{t^*} \frac{f(t^*)}{1 - F(t_b^*)} \cdot \frac{1 - F(t_b^*)}{P} \, dt_b^* = \frac{k \, t^* f(t^*)}{P}.$$
Length biased sampling.

Integration by parts:
$$P = k \int_0^{\infty} (1 - F(z)) \, dz = k \int_0^{\infty} z \, dF(z) = k E(T).$$
Note that
$$g(t^*) = \frac{t^* f(t^*)}{E(T)}. \tag{22}$$
We know $g(t^*)$ from data. Can form $\dfrac{g(t^*)}{t^*}$, $t^* > 0$. $\therefore$ we know $\dfrac{f(t^*)}{E(T)}$.
Apply analysis of (5):
$$\underbrace{\int_0^{\infty} \frac{g(t^*)}{t^*} \, dt^*}_{\text{known}} = \frac{\overbrace{\int_0^{\infty} f(t^*) \, dt^*}^{=1}}{\underbrace{E(T)}_{\text{can determine this}}}.$$
$\therefore$ know $f(t^*)$.
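The recovery argument built on (22) can be illustrated numerically. The sketch below assumes a Gamma(3, 1) population of completed spells (so $E(T) = 3$ and $\mathrm{Var}(T) = 3$); that choice, and drawing the length biased sample by resampling with weights proportional to $t$, are illustrative devices rather than anything specified in the slides.

```python
import random
from statistics import mean

random.seed(5)

# Hypothetical population of completed spells: Gamma(shape = 3, scale = 1).
population = [random.gammavariate(3.0, 1.0) for _ in range(200_000)]

# Length biased sample: each spell is drawn with probability proportional to t,
# which matches the weight omega(t) = t behind (22).
sampled = random.choices(population, weights=population, k=50_000)

mean_population = mean(population)
mean_sampled = mean(sampled)

# Since the integral of g(t)/t equals 1/E(T), the sample average of 1/t
# estimates 1/E(T); its reciprocal recovers the population mean.
recovered_mean = 1.0 / mean(1.0 / t for t in sampled)

print(f"population mean E(T)        : {mean_population:.3f}")
print(f"mean of length biased sample: {mean_sampled:.3f}")
print(f"E(T) recovered from sample  : {recovered_mean:.3f}")
```

With $E(T)$ in hand, $f(t^*) = E(T)\, g(t^*) / t^*$ follows pointwise from (22).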

In this form (22) is equivalent to (1) with $\omega(t) = t$; the normalizing constant is $E(T)$. Length biased sampling.
Intuitively, longer spells are oversampled when the requirement is imposed that a spell be in progress at the time the survey is conducted ($T_b > 0$).
Suppose, instead, that individuals are randomly sampled and data are recorded on the next spell of the event (after the survey date).
We recover the population $f(t)$ if spells are independent.

As long as successive spells are independent, such a sampling frame does not distort the sampled distribution because no requirement is imposed that the sampled spell be in progress at the date of the interview. It is important to notice that the source of the bias is the requirement that $T_b > 0$ (i.e., sampled spells are in progress), not that only a fraction of the population experiences the event ($P < 1$).

The simple length weight ($\omega(t) = t$) that produces (22) is an artefact of the stationarity assumption.
Heckman and Singer (1986) consider non-stationarity and unobservables when there is selection on the event that a person be in the state at the time of the interview.
They also demonstrate the bias that results from estimating parametric models on samples generated by length biased sampling rules when inadequate account is taken of the sampling plan.

The density of a spell lasting until $t_c$ given that it has lasted $t_b$ is
$$g(t_c \mid t_b) = \frac{f(t_c)}{1 - F(t_b)}.$$
So the density of a spell that lasts for $t_c$ is, with $m = E(T)$,
$$g(t_c) = \int_0^{t_c} f(t_c \mid T_c > T > T_b) \Pr(T_c \ge T) \, dt_b = \int_0^{t_c} \frac{f(t_c)}{m} \, dt_b = \frac{f(t_c) \, t_c}{m}.$$

Likewise, the density of a spell that lasts until $t_a$ is
$$g(t_a) = \int_0^{\infty} f(t_a + t_b \mid T \ge T_b \ge 0) \Pr(T \ge T_b \ge 0) \, dt_b = \int_0^{\infty} \frac{f(t_a + t_b)}{m} \, dt_b = \frac{1}{m} \int_{t_a}^{\infty} f(t_b) \, dt_b = \frac{1 - F(t_a)}{m}.$$
So the functional form of $g(t_b)$ is the same as that of $g(t_a)$.
Stationarity $\Rightarrow$ backward and forward densities are the same. Mirror images. "Back to the future."

Some useful results that follow from this model:
1. If $f(t) = \theta e^{-t\theta}$, then $g(t_b) = \theta e^{-t_b \theta}$ and $g(t_a) = \theta e^{-t_a \theta}$.
Proof: $f(t) = \theta e^{-t\theta} \rightarrow m = \dfrac{1}{\theta}$, and $F(t) = 1 - e^{-t\theta} \rightarrow g(t_a) = \dfrac{1 - F(t_a)}{m} = \theta e^{-t_a \theta}$.

2. $E(T_a) = \dfrac{m}{2} \left( 1 + \dfrac{\sigma^2}{m^2} \right)$.
Proof:
$$E(T_a) = \int t_a \, g(t_a) \, dt_a = \int t_a \frac{1 - F(t_a)}{m} \, dt_a = \frac{1}{m} \left[ \frac{1}{2} t_a^2 (1 - F(t_a)) \Big|_0^{\infty} - \int \frac{1}{2} t_a^2 \, d(1 - F(t_a)) \right]$$
$$= \frac{1}{m} \int \frac{1}{2} t_a^2 f(t_a) \, dt_a = \frac{1}{2m} \left[ \operatorname{var}(T) + E^2(T) \right] = \frac{1}{2m} \left[ \sigma^2 + m^2 \right].$$

3. $E(T_b) = \dfrac{m}{2} \left( 1 + \dfrac{\sigma^2}{m^2} \right)$.
Proof: See proof of Proposition 2.
4. $E(T_c) = m \left( 1 + \dfrac{\sigma^2}{m^2} \right)$.
Proof:
$$E(T_c) = \int \frac{t_c^2 f(t_c)}{m} \, dt_c = \frac{1}{m} \left( \operatorname{var}(T) + E^2(T) \right)$$
$\rightarrow E(T_c) = 2E(T_a) = 2E(T_b)$, and $E(T_c) > m$ unless $\sigma^2 = 0$.
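Results 2 to 4 can be checked by simulating the stationary environment directly. This is a sketch under assumptions: spells are Gamma(3, 1), so $m = 3$ and $\sigma^2 = 3$; entries arrive at a constant rate over a window of length `HORIZON` before the survey date 0; both the distribution and the window are illustrative choices.

```python
import random
from statistics import mean

random.seed(6)

HORIZON = 30.0          # entry window before the survey; long relative to E(T) = 3
m, sigma2 = 3.0, 3.0    # mean and variance of the assumed Gamma(3, 1) spells

before, after, complete = [], [], []
for _ in range(300_000):
    entry = -random.uniform(0.0, HORIZON)     # constant entry rate (stationarity)
    t = random.gammavariate(3.0, 1.0)         # completed spell length
    if entry + t > 0.0:                       # spell in progress at survey date 0
        t_b = -entry                          # elapsed duration before the survey
        before.append(t_b)
        after.append(t - t_b)                 # remaining duration after the survey
        complete.append(t)

predicted_half = 0.5 * m * (1.0 + sigma2 / m ** 2)   # E(T_a) = E(T_b)
predicted_full = m * (1.0 + sigma2 / m ** 2)         # E(T_c)

print(f"E(T_b): simulated {mean(before):.3f}, predicted {predicted_half:.3f}")
print(f"E(T_a): simulated {mean(after):.3f}, predicted {predicted_half:.3f}")
print(f"E(T_c): simulated {mean(complete):.3f}, predicted {predicted_full:.3f}")
```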

Examples

Specification of the Distribution
Weibull distribution. Parameters: $\lambda > 0$, $k > 0$.
Probability density function (PDF):
$$f(t) = \frac{k}{\lambda} \left( \frac{t}{\lambda} \right)^{k-1} \exp\left\{ -\left( \frac{t}{\lambda} \right)^{k} \right\}$$
Cumulative distribution function (CDF):
$$F(t) = 1 - \exp\left\{ -\left( \frac{t}{\lambda} \right)^{k} \right\}$$
Sets of parameters: $(\lambda, k) = (0.1, 0.5)$, $(0.5, 1.0)$, $(0.5, 2.0)$, $(1.0, 3.0)$, respectively.
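A minimal sketch of the Weibull functions, assuming the standard scale ($\lambda$) and shape ($k$) parameterization; the slide's own parameterization may differ. The function names are hypothetical, and the four parameter pairs are the ones listed in the figure legends.

```python
import math

def weibull_pdf(t, lam, k):
    """Density f(t) under the assumed scale-shape parameterization."""
    return (k / lam) * (t / lam) ** (k - 1) * math.exp(-((t / lam) ** k))

def weibull_cdf(t, lam, k):
    """Distribution function F(t) = 1 - exp(-(t/lam)^k)."""
    return 1.0 - math.exp(-((t / lam) ** k))

def weibull_hazard(t, lam, k):
    """Hazard f(t) / (1 - F(t)), which simplifies to (k/lam) (t/lam)^(k-1)."""
    return (k / lam) * (t / lam) ** (k - 1)

def weibull_integrated_hazard(t, lam, k):
    """Integrated hazard -ln(1 - F(t)) = (t/lam)^k."""
    return (t / lam) ** k

PARAMETER_SETS = [(0.1, 0.5), (0.5, 1.0), (0.5, 2.0), (1.0, 3.0)]

for lam, k in PARAMETER_SETS:
    t = 0.5
    print(f"lambda = {lam}, k = {k}: "
          f"pdf = {weibull_pdf(t, lam, k):.4f}, cdf = {weibull_cdf(t, lam, k):.4f}, "
          f"hazard = {weibull_hazard(t, lam, k):.4f}, "
          f"integrated hazard = {weibull_integrated_hazard(t, lam, k):.4f}")
```

For $k < 1$ the hazard falls with $t$, for $k = 1$ it is constant (the exponential case), and for $k > 1$ it rises.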

Basic Distribution Graphs
[Figure: PDF and CDF of the Weibull distribution against $t$, for $(\lambda, k) = (0.1, 0.5)$, $(0.5, 1.0)$, $(0.5, 2.0)$, $(1.0, 3.0)$.]

Basic Duration Graphs
[Figure: hazard function and integrated hazard function of the Weibull distribution against $t$, for the same four parameter sets.]

Observed and Original Distribution for $T_b$ (Example 1)
[Figure: observed PDF of spells ($T_b$) and original PDF (Weibull, $\lambda = 0.1$, $k = 0.5$) against $t$.]

Observed and Original Distribution for $T_b$ (Example 2)
[Figure: observed PDF of spells ($T_b$) and original PDF (Weibull, $\lambda = 0.5$, $k = 2.0$) against $t$.]

Observed and Original Distribution for $T_b$ (Example 3)
[Figure: observed PDF of spells ($T_b$) and original PDF (Weibull, $\lambda = 1.0$, $k = 3.0$) against $t$.]

Observed and Original Distribution for $T_c$ (Example 1)
[Figure: observed PDF of spells ($T_c$) and original PDF (Weibull, $\lambda = 0.1$, $k = 0.5$) against $t$.]

Observed and Original Distribution for $T_c$ (Example 2)
[Figure: observed PDF of spells ($T_c$) and original PDF (Weibull, $\lambda = 0.5$, $k = 1.0$) against $t$.]

Observed and Original Distribution for $T_c$ (Example 3)
[Figure: observed PDF of spells ($T_c$) and original PDF (Weibull, $\lambda = 0.5$, $k = 2.0$) against $t$.]

Observed and Original Distribution for $T_c$ (Example 4)
[Figure: observed PDF of spells ($T_c$) and original PDF (Weibull, $\lambda = 1.0$, $k = 3.0$) against $t$.]

Example 6. Choice based sampling. Let $D$ be a discrete valued random variable which assumes a finite number $I$ of values. Discrete choice model.
$D = i$, $i = 1, \ldots, I$, corresponds to the occurrence of state $i$. States are mutually exclusive.
In the existing literature the states may be modes of transportation choice for commuters (Domencich and McFadden, 1975), occupations, migration destinations, financial solvency status of firms, schooling choices of students, etc.

Interest centers on estimating a population choice model
$$\Pr(D = i \mid X = x), \quad i = 1, \ldots, I. \tag{23}$$
The population density of $(D, X)$ is
$$f(d, x) = \Pr(D = d \mid X = x) \, h(x), \tag{24}$$
where, in this example, $h(x)$ is the population density of $X$.

For example, interviews about transportation preferences conducted at train stations tend to over-sample train riders and under-sample bus riders. Interviews about occupational choice preferences conducted at leading universities over-sample those who select professional occupations.

In choice based sampling, selection occurs solely on the $D$ coordinate of $(D, X)$.
In terms of (1) (extended to allow for discrete random variables), $\omega(d, x) = \omega(d)$.
Then sampled $(D^*, X^*)$ has density
$$g(d^*, x^*) = \frac{\omega(d^*) f(d^*, x^*)}{\sum_{i=1}^{I} \int \omega(i) f(i, x^*) \, dx^*}. \tag{25}$$

Notice that the denominator can be simplified to
$$\sum_{i=1}^{I} \omega(i) f(i),$$
where $f(d^*)$ is the marginal distribution of $D$, so that
$$g(d^*, x^*) = \frac{\omega(d^*) f(d^*, x^*)}{\sum_{i=1}^{I} \omega(i) f(i)}. \tag{26}$$

Integrating (25) with respect to $x$ using (26) we obtain
$$g(d^*) = \frac{\omega(d^*) f(d^*)}{\sum_{i=1}^{I} \omega(i) f(i)}. \tag{27}$$
The sampling rule causes the sampled proportions to deviate from the population proportions.

Note further that as a consequence of sampling only on $D$, the population conditional density
$$h(x^* \mid d^*) = \frac{f(d^*, x^*)}{f(d^*)} \tag{28}$$
can be recovered from the choice based sample. The density of $x$ in the sample is thus
$$g(x^*) = \sum_{i=1}^{I} h(x^* \mid i) \, g(i). \tag{29}$$

Then using (26)-(29) we reach
$$g(d^* \mid x^*) = f(d^* \mid x^*) \left\{ \frac{\omega(d^*)}{\sum_{i=1}^{I} \omega(i) f(i)} \times \frac{1}{\sum_{i=1}^{I} f(i \mid x^*) \dfrac{g(i)}{f(i)}} \right\}. \tag{30}$$
The bias that results from using choice based samples to make inference about $f(d^* \mid x^*)$ is a consequence of neglecting the terms in braces on the right-hand side of (30).

Notice that if the data are generated by a random sampling rule, $\omega(d^*) = 1$, $g(d^*) = f(d^*)$ and the term in braces is one.
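Equations (24) to (30) can be traced through a small discrete example. The two choices (bus and train), the two covariate cells, and all probabilities and weights below are made-up illustrative numbers; the point is only to confirm that the right-hand side of (30) reproduces the sampled conditional choice probabilities.

```python
# Hypothetical population: d = 0 (bus), d = 1 (train); covariate cells x = 0, 1.
choices, cells = (0, 1), (0, 1)
h = {0: 0.6, 1: 0.4}                              # population density of X
f_d_given_x = {0: {0: 0.8, 1: 0.2},               # f(d | x), indexed [x][d]
               1: {0: 0.5, 1: 0.5}}
omega = {0: 1.0, 1: 4.0}                          # train riders are oversampled

# (24): population joint density; then the population marginal of D.
f_joint = {(d, x): f_d_given_x[x][d] * h[x] for d in choices for x in cells}
f_d = {d: sum(f_joint[d, x] for x in cells) for d in choices}

# (26) and (27): sampled joint density and sampled choice proportions.
norm = sum(omega[i] * f_d[i] for i in choices)
g_joint = {(d, x): omega[d] * f_joint[d, x] / norm for d in choices for x in cells}
g_d = {d: omega[d] * f_d[d] / norm for d in choices}

# Sampled conditional choice probabilities g(d | x), computed directly.
g_x = {x: sum(g_joint[d, x] for d in choices) for x in cells}
g_d_given_x = {(d, x): g_joint[d, x] / g_x[x] for d in choices for x in cells}

def rhs_of_30(d, x):
    """f(d | x) times the term in braces in (30)."""
    braces = (omega[d] / norm) / sum(f_d_given_x[x][i] * g_d[i] / f_d[i] for i in choices)
    return f_d_given_x[x][d] * braces

for d in choices:
    for x in cells:
        print(f"d = {d}, x = {x}: f(d|x) = {f_d_given_x[x][d]:.3f}, "
              f"g(d|x) = {g_d_given_x[d, x]:.3f}, (30) = {rhs_of_30(d, x):.3f}")
```

Setting every weight in `omega` to the same value makes the term in braces equal to one, which is the random sampling case.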

Further Discussion of Choice Based Samples

Pick $D$ first (e.g., travel mode). Probability of selecting $D$ is $C(D)$.
$f(D, X)$ is the joint density of $D$ and $X$ in the population:
$$f(D, X \mid \theta) = g(D \mid X, \theta) \, h(X) = \varphi(X \mid D) \, f(D \mid \theta),$$
$$f(D \mid \theta) = \int g(D \mid X, \theta) \, h(X) \, dX.$$
Given $D$ we observe $X$ (the implicit assumption is that we are sampling only on $D$, not on $D$ and $X$).
Probability of sampled $(X, D)$ is $\varphi(X \mid D) \, C(D)$.

A fact we use later is
$$\varphi(X \mid D) \, C(D) = \left[ \frac{g(D \mid X) \, h(X)}{f(D)} \right] C(D) = \frac{g(D \mid X) \, h(X) \, C(D)}{\int g(D \mid X) \, h(X) \, dX}.$$
When $C(D) = f(D) = \int g(D \mid X) \, h(X) \, dX$, choice based sampling is random sampling.

Note, the likelihood function in an exogenous sampling scheme is
$$L = \prod_{i=1}^{I} f(D_i, X_i) = \prod_{i=1}^{I} f(D_i \mid X_i, \theta) \, h(X_i),$$
$$\ln L = \sum_{i=1}^{I} \ln f(D_i \mid X_i) + \sum_{i=1}^{I} \ln h(X_i).$$
By exogeneity, the distribution of $X$ does not depend on $\theta$.

Likelihood function for a choice-based sampling scheme is
$$\ln L = \sum_{i=1}^{I} \left[ \ln g(D_i \mid X_i) + \ln h(X_i) - \ln f(D_i) + \ln C(D_i) \right].$$
In general, $f(D)$ depends on parameters $\theta$. $\therefore$ Maximize with respect to $\theta$:
$$\frac{\partial \ln L}{\partial \theta} = \sum_{i=1}^{I} \frac{\partial \ln g(D_i \mid X_i)}{\partial \theta} - \underbrace{\sum_{i=1}^{I} \frac{\partial \ln f(D_i)}{\partial \theta}}_{\text{source of bias}}.$$
We neglect the second term in forming the usual estimators using only the first term. That is the source of the inconsistency.

Further Analysis of Choice Based Samples: An example in discrete choice.
(c) Draw $d$ by $\varphi(d)$.
(d) Draw $X$ by $f(X \mid d = 1)$.
Joint density of data:
$$\varphi(d = 1) \, f(X \mid d = 1, \theta) = \varphi(d = 1) \left[ \frac{\Pr(d = 1 \mid X, \theta) \, f(X)}{\Pr(d = 1 \mid \theta)} \right]$$
