The LEHD Infrastructure Files

In this paper
■ Describe the construction of the LEHD infrastructure
  ✦ ... in particular the imputation mechanisms used
■ Describe the computation of the QWI statistics
  ✦ ... in particular the imputation mechanisms used
■ Describe the disclosure-proofing mechanism
■ Describe researcher access to infrastructure files and confidential QWI files

May 6, 2005
Input Files
➲ Wage records: UI
➲ Employer reports: ES202
➲ Demographics
Wage records: UI
■ report of an individual’s UI-covered earnings by an employing entity
■ appears if at least one dollar was earned by that individual during the quarter
■ identifies EARNINGS, EMPLOYER, TIME PERIOD
■ some limited other state-dependent information available
■ in particular, for Minnesota, the ESTABLISHMENT is reported
Employer reports: ES202 ... or QCEW
■ collected as part of the Covered Employment and Wages (CEW) program (administered by the BLS)
■ also used as the input to the Business Employment Dynamics (BED)
■ collects from employers covered by state unemployment insurance programs:
  ✦ employment
  ✦ payroll
  ✦ geographic information
■ fundamental unit: ’reporting unit’ (≈ establishment)
■ one report per establishment per quarter is filed
Demographics
■ Demographics are taken from a number of Census-internal files derived from administrative data:
  ✦ Person Characteristics File (PCF)
  ✦ Census Numident
■ Where available, more detailed data on individuals is also extracted from surveys and censuses:
  ✦ CPS
  ✦ SIPP
  ✦ ACS
  ✦ 1990 Census
  ✦ 2000 Census
Infrastructure Files
➲ EHF: Employment History Files
➲ ICF: Individual Characteristics File
➲ ECF: Employer Characteristics File
➲ GAL: Geocoded Address List
➲ Flow so far
EHF: Employment History Files
■ Job-level EHF
  ✦ complete in-state work history for each individual on UI wage records
  ✦ one record for each employee-employer combination (a job)
  ✦ earnings and employment patterns
■ Employer and establishment-level employment history
  ✦ QCEW-based employment-activity history for every SEIN (employer) and SEINUNIT (establishment)
■ Comparison of employment and activity of SEINs between UI and QCEW files is done for QA purposes, and in preparation of weighting.
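The job-level EHF described above can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the record layout, the identifiers, and the function name `build_job_histories` do not come from the LEHD code. It only shows the idea of collapsing quarterly wage records into one history per employee-employer combination.

```python
from collections import defaultdict

# Hypothetical UI wage records: (PIK, SEIN, year, quarter, earnings).
wage_records = [
    ("P001", "S100", 2004, 1, 5200.0),
    ("P001", "S100", 2004, 2, 5400.0),
    ("P001", "S200", 2004, 2, 800.0),
    ("P002", "S100", 2004, 1, 3100.0),
    ("P002", "S100", 2004, 3, 2900.0),
]


def build_job_histories(records):
    """Collapse wage records into one earnings history per (PIK, SEIN) job."""
    jobs = defaultdict(dict)
    for pik, sein, year, quarter, earnings in records:
        # A wage record only exists if at least one dollar was earned.
        if earnings < 1:
            continue
        period = (year, quarter)
        history = jobs[(pik, sein)]
        history[period] = history.get(period, 0.0) + earnings
    return dict(jobs)


job_histories = build_job_histories(wage_records)

for (pik, sein), history in sorted(job_histories.items()):
    print(pik, sein, sorted(history.items()))
```

Quarters with no record for a job simply do not appear in its history, which is how gaps in an employment pattern would show up in this toy layout.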
ICF: Individual Characteristics File
■ Demographic information from the PCF is merged with the universe of PIKs from wage records
■ Records without a valid match are flagged
■ CPS and SIPP identifiers are merged on
■ ... as is gender, education, and age information from the CPS
■ Data completion: the following are each imputed ten times
  ✦ Age
  ✦ Gender
  ✦ Education
  ✦ County of residence
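A rough sketch of the match-flag and multiple-imputation steps listed above follows. The slides do not describe the imputation model, so the draw from the empirical distribution of matched records is a stand-in chosen for brevity. The data, field names, and `build_icf` are hypothetical.

```python
import random

N_IMPLICATES = 10  # the slide states that each variable is imputed ten times

# Hypothetical PCF extract and universe of PIKs from wage records.
pcf = {"P001": {"gender": "F"}, "P002": {"gender": "M"}, "P003": {"gender": "F"}}
wage_record_piks = ["P001", "P002", "P003", "P004"]


def build_icf(piks, pcf_records, seed=0):
    """Merge PCF data onto the PIK universe, flag non-matches, add implicates."""
    rng = random.Random(seed)
    # Assumption: donors are simply all observed values among matched records.
    donors = [rec["gender"] for rec in pcf_records.values()]
    icf = {}
    for pik in piks:
        matched = pik in pcf_records
        if matched:
            # Observed values are repeated unchanged across all implicates.
            implicates = [pcf_records[pik]["gender"]] * N_IMPLICATES
        else:
            # Missing values get ten independent draws.
            implicates = [rng.choice(donors) for _ in range(N_IMPLICATES)]
        icf[pik] = {"pcf_match": matched, "gender_implicates": implicates}
    return icf


icf = build_icf(wage_record_piks, pcf)
print(icf["P004"])
```

Keeping all ten implicates, rather than a single filled-in value, is what lets downstream statistics reflect the uncertainty of the imputation.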
ECF: Employer Characteristics File
■ Two files: firm and establishment level, quarterly records
■ Inputs:
  1. ES202
  2. UI: supplement information on the ES202, extend published BLS county-level employment data
  3. GAL: establishment geocodes
  4. LDB (BLS) for backfilling NAICS information
■ Longitudinal edits for consistency and data completion
■ Imputation:
  ✦ impute SIC if NAICS non-missing and vice-versa
  ✦ unconditional impute of missing SIC and NAICS codes
  ✦ geography conditional on industry
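The first two imputation bullets above can be sketched as a conditional draw with an unconditional fallback. This is an illustration under stated assumptions, not the ECF procedure: the slides do not say how the draws are made, so the sketch samples from observed frequencies. The records and the function `impute_sic` are hypothetical, and only the SIC direction is shown (the NAICS direction would be symmetric).

```python
import random
from collections import Counter, defaultdict

# Hypothetical establishment records; the codes are illustrative only.
establishments = [
    {"id": "E1", "sic": "5812", "naics": "722110"},
    {"id": "E2", "sic": "5812", "naics": "722110"},
    {"id": "E3", "sic": "5813", "naics": "722410"},
    {"id": "E4", "sic": None, "naics": "722410"},  # SIC missing, NAICS known
    {"id": "E5", "sic": None, "naics": None},      # both missing
]


def impute_sic(records, seed=0):
    """Fill missing SIC: conditional on NAICS if known, else unconditional."""
    rng = random.Random(seed)
    by_naics = defaultdict(Counter)  # SIC frequencies within each NAICS code
    overall = Counter()              # SIC frequencies over all records
    for rec in records:
        if rec["sic"] is not None:
            overall[rec["sic"]] += 1
            if rec["naics"] is not None:
                by_naics[rec["naics"]][rec["sic"]] += 1

    completed = []
    for rec in records:
        sic = rec["sic"]
        if sic is None:
            # Fall back to the unconditional distribution when no donor
            # shares this record's NAICS code (or NAICS is missing too).
            pool = by_naics.get(rec["naics"]) or overall
            codes, weights = zip(*sorted(pool.items()))
            sic = rng.choices(codes, weights=weights, k=1)[0]
        completed.append({**rec, "sic": sic})
    return completed


imputed = impute_sic(establishments)
print([(rec["id"], rec["sic"]) for rec in imputed])
```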
GAL: Geocoded Address List
■ ... is a data set containing unique commercial and residential addresses
■ geocoded to the Census Block and latitude/longitude coordinates
■ Inputs:
  1. ES202 data
  2. Census Bureau’s Business Register (BR)
  3. Census Bureau’s Master Address File (MAF)
  4. American Community Survey Place of Work file (ACS-POW)
■ Addresses are
  1. geocoded
  2. standardized
  3. unduplicated (by firm name)
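The standardize and unduplicate steps listed above might look like the following toy sketch. Geocoding is left out because it needs an external reference file. The suffix table, the choice of (name, address) as the duplicate key, and both function names are assumptions, not the GAL rules.

```python
import re

# Hypothetical abbreviation table; a real standardizer would be far larger.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "SUITE": "STE"}


def standardize(text):
    """Uppercase, strip punctuation, collapse spaces, abbreviate suffixes."""
    cleaned = re.sub(r"[^\w\s]", " ", text.upper())
    return " ".join(SUFFIXES.get(token, token) for token in cleaned.split())


def unduplicate(rows):
    """Keep one entry per standardized (firm name, address) pair."""
    keys = ((standardize(name), standardize(address)) for name, address in rows)
    return list(dict.fromkeys(keys))  # dict preserves first-seen order


raw_addresses = [
    ("Acme Corp.", "12 Main Street"),
    ("ACME CORP", "12 main st."),
    ("Acme Corp", "40 Oak Avenue"),
]
unique_addresses = unduplicate(raw_addresses)
print(unique_addresses)
```

Standardizing before comparing is what allows the first two rows to collapse into one entry even though their raw strings differ.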
Flow so far
Forming Aggregated Estimates: QWI
➲ Correction of spurious worker flows
➲ Solution: Successor-Predecessor File
➲ Attaching establishment characteristics to jobs
➲ U2W: Unit to Worker Impute
➲ Probability Model
➲ Implementation
➲ Implementation
➲ Computing the statistics
Correction of spurious worker flows
■ Firm identifier: state-specific account number
■ Account numbers can and do change:
  ✦ change in legal form
  ✦ a merger
■ Change in firm identifier is the component determining when a worker changes employers
■ → non-economic change in identifier creates spurious flow
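To make the problem concrete, here is a minimal sketch, not the LEHD production logic. It assumes a worker's job history is simply a list of firm identifiers, one per quarter; the function name `job_changes` and the account numbers are hypothetical.

```python
def job_changes(history):
    """Count quarter-to-quarter changes in firm identifier for one worker.

    history: list of firm identifiers (account numbers), one per quarter.
    A change in identifier is read as a change of employer.
    """
    return sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)


# Hypothetical worker who never leaves the firm, but whose employer receives
# a new account number ("A100" becomes "B200") after a change in legal form.
same_employer = ["A100", "A100", "B200", "B200"]

# The identifier-based rule still registers one employer change: a spurious flow.
spurious = job_changes(same_employer)
```

Nothing economic happened to this worker, yet the count is nonzero, which is the kind of flow the correction is meant to remove.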
Solution: Successor-Predecessor File
■ track large worker movements between SEINs
■ → link entities that have different account numbers but constitute the same economic entity
■ SPF provides a variety of link characteristics, based on the number of workers leaving an SEIN, in both absolute and relative terms, and the number of workers entering an SEIN, again in absolute and relative terms.
■ QWI: if 80% of an SEIN's workers (the predecessor) are observed to move to a single successor, and that successor absorbs 80% of its employees from a single predecessor, then all flows between those two account numbers are filtered out and treated as if they had never existed.
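The 80% rule can be sketched as follows. This is an illustration under stated assumptions, not the actual SPF code: the input layout (a dictionary of worker counts per predecessor-successor pair, plus employment counts for each side), the function names, and the choice of "at least 80%" as the comparison are all assumptions made for the example.

```python
def linked_pairs(flows, pred_emp, succ_emp, threshold=0.8):
    """Return (predecessor, successor) pairs that pass the two-sided rule.

    flows:    dict mapping (pred_sein, succ_sein) -> number of workers moving
    pred_emp: dict mapping pred_sein -> predecessor employment before the move
    succ_emp: dict mapping succ_sein -> successor employment after the move
    """
    return {
        (p, s)
        for (p, s), n in flows.items()
        if n >= threshold * pred_emp[p] and n >= threshold * succ_emp[s]
    }


def filter_flows(flows, links):
    """Drop every flow between linked account numbers."""
    return {pair: n for pair, n in flows.items() if pair not in links}


# Hypothetical data: 90 of A100's 100 workers reappear at B200, which has
# 95 employees in total, so both shares exceed 80%.
flows = {("A100", "B200"): 90, ("A100", "C300"): 5, ("D400", "B200"): 3}
pred_emp = {"A100": 100, "D400": 50}
succ_emp = {"B200": 95, "C300": 40}

links = linked_pairs(flows, pred_emp, succ_emp)
remaining = filter_flows(flows, links)
```

Only the A100 to B200 movement is treated as a non-economic identifier change; the two smaller movements remain as ordinary worker flows.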
Attaching establishment characteristics to jobs
■ Goal: achieve a high level of accuracy and detail
■ Problem: no establishment identification on wage record
■ 30-40% of state-wide employment in multi-establishment firms
■ Solution: probability model for employment location and imputation
■ Key elements are:
  1. distance between place-of-work and place-of-residence
  2. distribution of employment across establishments of multi-establishment firms.
■ Important practical aspects:
  ✦ Non-ignorable missing data imputation
  ✦ Several million imputations every quarter
U2W: Unit to Worker Impute
■ workers i = 1, ..., I
■ firms j = 1, ..., J