The LEHD Infrastructure Files and the Creation of the Quarterly Workforce Indicators

John M. Abowd, Bryce E. Stephens and Lars Vilhuber
Cornell University; U.S. Census Bureau, LEHD Program
May 6, 2005


  1. In this paper
     ■ Describe the construction of the LEHD infrastructure
       ✦ ... in particular the imputation mechanisms used
     ■ Describe the computation of the QWI statistics
       ✦ ... in particular the imputation mechanisms used
     ■ Describe the disclosure-proofing mechanism
     ■ Describe researcher access to infrastructure files and confidential QWI files

  2. Input Files
     ➲ Wage records: UI
     ➲ Employer reports: ES202
     ➲ Demographics

  7. Wage records: UI
     ■ report of an individual’s UI-covered earnings by an employing entity
     ■ appears if at least one dollar was earned by that individual during the quarter
     ■ identifies EARNINGS, EMPLOYER, TIME PERIOD
     ■ some limited other state-dependent information available
     ■ in particular, for Minnesota, the ESTABLISHMENT is reported

  13. Employer reports: ES202 (or QCEW)
     ■ collected as part of the Covered Employment and Wages (CEW) program, administered by the BLS
     ■ also used as the input to the Business Employment Dynamics (BED)
     ■ collects from employers covered by state unemployment insurance programs:
       ✦ employment
       ✦ payroll
       ✦ geographic information
     ■ fundamental unit: ’reporting unit’ ( ≈ establishment)
     ■ one report per establishment per quarter is filed

  15. Demographics
     ■ Demographics are taken from a number of Census-internal files derived from administrative data:
       ✦ Person Characteristics File (PCF)
       ✦ Census Numident
     ■ Where available, more detailed data on individuals are also extracted from surveys and censuses:
       ✦ CPS
       ✦ SIPP
       ✦ ACS
       ✦ 1990 Census
       ✦ 2000 Census

  16. Infrastructure Files
     ➲ EHF: Employment History Files
     ➲ ICF: Individual Characteristics File
     ➲ ECF: Employer Characteristics File
     ➲ GAL: Geocoded Address List
     ➲ Flow so far

  19. EHF: Employment History Files
     ■ Job-level EHF
       ✦ complete in-state work history for each individual on UI wage records
       ✦ one record for each employee-employer combination (a job)
       ✦ earnings and employment patterns
     ■ Employer and establishment-level employment history
       ✦ QCEW-based employment-activity history for every SEIN (employer) and SEINUNIT (establishment)
     ■ Employment and activity of SEINs are compared between the UI and QCEW files for QA purposes and in preparation for weighting
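
As a rough illustration of the job-level EHF described above, the sketch below collapses quarterly UI wage records into one entry per employee-employer combination. The field names (`pik`, `sein`, `year`, `quarter`, `earnings`) and the function `build_job_history` are assumptions made for this example; they are not the actual LEHD file layout or processing code.

```python
from collections import defaultdict
from typing import NamedTuple


class WageRecord(NamedTuple):
    """One UI wage record: a person's earnings at an employer in a quarter."""
    pik: str        # person identifier (hypothetical field name)
    sein: str       # state employer identifier (hypothetical field name)
    year: int
    quarter: int    # 1-4
    earnings: float


def build_job_history(records):
    """Collapse wage records into one entry per (pik, sein) pair, i.e. per job.

    Each job maps (year, quarter) -> total earnings, which gives both the
    earnings history and the pattern of quarters with positive earnings.
    """
    jobs = defaultdict(dict)
    for r in records:
        if r.earnings < 1:
            # A wage record appears only if at least one dollar was earned.
            continue
        key = (r.pik, r.sein)
        period = (r.year, r.quarter)
        jobs[key][period] = jobs[key].get(period, 0.0) + r.earnings
    return dict(jobs)


records = [
    WageRecord("P1", "E1", 2004, 1, 5200.0),
    WageRecord("P1", "E1", 2004, 2, 5400.0),
    WageRecord("P1", "E2", 2004, 2, 800.0),
    WageRecord("P2", "E1", 2004, 1, 7100.0),
]
history = build_job_history(records)
```

Here person `P1` holds two jobs (at `E1` and `E2`), so `history` contains three job entries in total, each carrying its own quarterly earnings pattern.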

  29. ICF: Individual Characteristics File
     ■ Demographic information from the PCF is merged with the universe of PIKs from wage records
     ■ records without a valid match are flagged
     ■ CPS and SIPP identifiers are merged on
     ■ ... gender, education, and age information from the CPS
     ■ Data completion: the following are each imputed ten times
       ✦ Age
       ✦ Gender
       ✦ Education
       ✦ County of residence
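
The merge, flag and impute steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionary layout of `pcf`, the function `build_icf`, and above all the imputation rule (a random draw from the observed values) are stand-ins chosen for brevity. The slides do not specify the imputation model, and the real one is certainly richer than this.

```python
import random

N_IMPLICATES = 10  # the slides state that each missing item is imputed ten times


def build_icf(wage_piks, pcf, seed=0):
    """Merge PCF demographics onto the universe of PIKs from wage records.

    wage_piks: iterable of person identifiers found on wage records.
    pcf: dict mapping pik -> {"gender": ...} (hypothetical layout).
    Returns pik -> {"matched": bool, "gender": list of N_IMPLICATES values}.
    """
    rng = random.Random(seed)
    piks = sorted(set(wage_piks))
    # Donor pool of observed values; a stand-in for a real imputation model.
    donors = [pcf[p]["gender"] for p in piks if p in pcf]
    if not donors:
        raise ValueError("no matched records to draw imputations from")
    icf = {}
    for p in piks:
        matched = p in pcf
        if matched:
            # Observed value: every implicate carries the same value.
            values = [pcf[p]["gender"]] * N_IMPLICATES
        else:
            # No valid match: flag the record and draw ten implicates.
            values = [rng.choice(donors) for _ in range(N_IMPLICATES)]
        icf[p] = {"matched": matched, "gender": values}
    return icf


pcf = {"P1": {"gender": "F"}, "P2": {"gender": "M"}}
icf = build_icf(["P1", "P2", "P3", "P1"], pcf)
```

Keeping all ten implicates per person, rather than a single filled-in value, is what lets downstream estimates reflect the uncertainty introduced by the imputation.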

  39. ECF: Employer Characteristics File
     ■ Two files: firm and establishment level, quarterly records
     ■ Inputs:
       1. ES202
       2. UI: supplement information on the ES202, extend published BLS county-level employment data
       3. GAL: establishment geocodes
       4. LDB (BLS) for backfilling NAICS information
     ■ Longitudinal edits for consistency and data completion
     ■ Imputation:
       ✦ impute SIC if NAICS non-missing and vice-versa
       ✦ unconditional impute of missing SIC and NAICS codes
       ✦ geography conditional on industry
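
One way to picture the first two imputation bullets is the sketch below: a missing SIC is drawn from the SIC codes observed among complete records with the same NAICS (and vice versa), and a unit missing both codes copies them from a randomly chosen complete record. The donor-draw rule, the record layout and the function `impute_industry` are assumptions for illustration only; the slides do not describe the actual ECF imputation model, and the industry codes in the example are illustrative values.

```python
import random
from collections import defaultdict


def impute_industry(units, seed=0):
    """Fill missing 'sic' / 'naics' codes on a list of dicts (returns copies)."""
    rng = random.Random(seed)
    complete = [u for u in units if u["sic"] and u["naics"]]
    if not complete:
        raise ValueError("need at least one unit with both codes")
    sic_given_naics = defaultdict(list)
    naics_given_sic = defaultdict(list)
    for u in complete:
        sic_given_naics[u["naics"]].append(u["sic"])
        naics_given_sic[u["sic"]].append(u["naics"])

    out = []
    for u in units:
        u = dict(u)  # leave the caller's records untouched
        if not u["sic"] and not u["naics"]:
            # Unconditional impute: copy both codes from a random donor.
            donor = rng.choice(complete)
            u["sic"], u["naics"] = donor["sic"], donor["naics"]
        elif not u["sic"]:
            # Conditional impute of SIC given the observed NAICS.
            pool = sic_given_naics.get(u["naics"]) or [c["sic"] for c in complete]
            u["sic"] = rng.choice(pool)
        elif not u["naics"]:
            # Conditional impute of NAICS given the observed SIC.
            pool = naics_given_sic.get(u["sic"]) or [c["naics"] for c in complete]
            u["naics"] = rng.choice(pool)
        out.append(u)
    return out


units = [
    {"id": "A", "sic": "5812", "naics": "722110"},
    {"id": "B", "sic": "7372", "naics": "511210"},
    {"id": "C", "sic": None, "naics": "722110"},
    {"id": "D", "sic": "7372", "naics": None},
    {"id": "E", "sic": None, "naics": None},
]
filled = impute_industry(units)
```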

  50. GAL: Geocoded Address List
     ■ ... is a data set containing unique commercial and residential addresses
     ■ geocoded to the Census Block and latitude/longitude coordinates
     ■ Inputs:
       1. ES202 data
       2. Census Bureau’s Business Register (BR)
       3. Census Bureau’s Master Address File (MAF)
       4. American Community Survey Place of Work file (ACS-POW)
     ■ Addresses are
       1. geocoded
       2. standardized
       3. unduplicated (by firm name)
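
The standardize and unduplicate steps above might look roughly like the sketch below. Geocoding is omitted because it depends on external reference data. The abbreviation table, the functions `standardize` and `unduplicate`, and the choice to key duplicates on the (firm name, address) pair are all assumptions made for illustration; the slides say only that addresses are unduplicated "by firm name".

```python
import re

# Hypothetical abbreviation table; a real standardizer is far more thorough.
ABBREVIATIONS = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "SUITE": "STE"}


def standardize(text):
    """Uppercase, strip punctuation, collapse whitespace, abbreviate words."""
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", text.upper())
    return " ".join(ABBREVIATIONS.get(w, w) for w in cleaned.split())


def unduplicate(entries):
    """Keep the first entry for each standardized (firm name, address) pair."""
    seen = set()
    unique = []
    for name, address in entries:
        key = (standardize(name), standardize(address))
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


entries = [
    ("Acme Tools", "12 Main Street, Suite 4"),
    ("ACME TOOLS", "12 main st. ste 4"),
    ("Acme Tools", "98 Oak Avenue"),
]
gal = unduplicate(entries)
```

The first two entries differ only in case, punctuation and abbreviation, so they collapse to a single address once standardized, while the third remains a separate entry.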

  53. Flow so far [diagram]

  54. Forming Aggregated Estimates: QWI
     ➲ Correction of spurious worker flows
     ➲ Solution: Successor-Predecessor File
     ➲ Attaching establishment characteristics to jobs
     ➲ U2W: Unit to Worker Impute
     ➲ Probability Model
     ➲ Implementation
     ➲ Implementation
     ➲ Computing the statistics

  55. Correction of spurious worker flows
      ■ Firm identifier: state-specific account number
      ■ Account numbers can and do change:
        ✦ change in legal form
        ✦ a merger
      ■ Change in firm identifier is the component determining when a worker changes employers
      ■ → non-economic change in identifier creates spurious flow

  63. Solution: Successor-Predecessor File
      ■ track large worker movements between SEINs
      ■ → link entities that have different account numbers, but constitute the same economic entity
      ■ SPF provides a variety of link characteristics, based on the number of workers leaving an SEIN, in both absolute and relative terms, and the number of workers entering an SEIN, again in absolute and relative terms.
      ■ QWI: if 80% of an SEIN’s workers (the predecessor) are observed to move to a single successor, and that successor absorbs 80% of its employees from a single predecessor, then all flows between those two account numbers are filtered out, and treated as if they had never existed.
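
The 80%/80% rule on this slide can be illustrated with a minimal sketch. It is not the LEHD production code: the function names, the one-job-per-worker simplification, and the input layout (a mapping from worker ID to SEIN for each of two consecutive quarters) are assumptions made for illustration only.

```python
from collections import Counter


def successor_links(prev_q, next_q, threshold=0.8):
    """Find (predecessor, successor) SEIN pairs that pass the two-sided rule.

    prev_q, next_q: dict mapping worker id -> SEIN in consecutive quarters
    (hypothetical layout; assumes one job per worker per quarter).
    """
    prev_size = Counter(prev_q.values())  # employment at each SEIN, earlier quarter
    next_size = Counter(next_q.values())  # employment at each SEIN, later quarter
    # Count workers observed moving from one SEIN to a different SEIN.
    moves = Counter(
        (prev_q[w], next_q[w])
        for w in prev_q
        if w in next_q and prev_q[w] != next_q[w]
    )
    return {
        (src, dst)
        for (src, dst), n in moves.items()
        if n >= threshold * prev_size[src] and n >= threshold * next_size[dst]
    }


def worker_flows(prev_q, next_q, links=frozenset()):
    """Return (separations, accessions), ignoring flows along linked SEIN pairs."""
    separations = [
        (w, src) for w, src in prev_q.items()
        if next_q.get(w) != src and (src, next_q.get(w)) not in links
    ]
    accessions = [
        (w, dst) for w, dst in next_q.items()
        if prev_q.get(w) != dst and (prev_q.get(w), dst) not in links
    ]
    return separations, accessions


# Ten workers at SEIN "A"; nine reappear under SEIN "B" (e.g. a change in
# legal form), one moves to "C", and "B" also hires one new worker.
prev_q = {f"w{i}": "A" for i in range(1, 11)}
prev_q["w11"] = "C"
next_q = {f"w{i}": "B" for i in range(1, 10)}
next_q.update({"w10": "C", "w11": "C", "w12": "B"})

links = successor_links(prev_q, next_q)
raw_sep, raw_acc = worker_flows(prev_q, next_q)
net_sep, net_acc = worker_flows(prev_q, next_q, links)
print(links, len(raw_sep), len(raw_acc), len(net_sep), len(net_acc))
```

In this toy data the pair ("A", "B") is linked, so the nine moves between those account numbers drop out of both the separation and the accession counts, while the genuine move to "C" and the new hire at "B" remain.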

  67. Attaching establishment characteristics to jobs
      ■ Goal: achieve a high level of accuracy and detail
      ■ Problem: no establishment identification on wage record
      ■ 30-40% of state-wide employment in multi-establishment firms
      ■ Solution: probability model for employment location and imputation
      ■ Key elements are:
        1. distance between place-of-work and place-of-residence
        2. distribution of employment across establishments of multi-establishment firms.
      ■ Important practical aspects:
        ✦ Non-ignorable missing data imputation
        ✦ Several million imputations every quarter
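
To make the two key elements concrete, here is a toy sketch of drawing an establishment for a worker at a multi-establishment firm. The functional form (employment weight times an exponential decay in distance), the `scale_km` parameter, the coordinate layout, and all names are illustrative assumptions; the slide does not give the model's actual specification.

```python
import math
import random


def establishment_probabilities(worker_xy, establishments, scale_km=25.0):
    """Assign each establishment a probability for one worker.

    establishments: list of (unit_id, (x_km, y_km), employment) tuples.
    Weight = employment * exp(-distance / scale_km); this form is an
    assumption for illustration, not the specification used for the QWI.
    """
    weights = {}
    for unit_id, xy, employment in establishments:
        distance = math.dist(worker_xy, xy)  # residence-to-workplace distance
        weights[unit_id] = employment * math.exp(-distance / scale_km)
    total = sum(weights.values())
    return {unit_id: w / total for unit_id, w in weights.items()}


def impute_establishment(worker_xy, establishments, rng, scale_km=25.0):
    """Draw one establishment at random according to the probabilities."""
    probs = establishment_probabilities(worker_xy, establishments, scale_km)
    units = list(probs)
    return rng.choices(units, weights=[probs[u] for u in units], k=1)[0]


# Hypothetical firm with three establishments; the worker lives at the origin.
firm = [
    ("unit1", (0.0, 0.0), 100),   # close, medium-sized
    ("unit2", (50.0, 0.0), 100),  # same size, 50 km away
    ("unit3", (0.0, 0.0), 300),   # close and large
]
probs = establishment_probabilities((0.0, 0.0), firm)
draw = impute_establishment((0.0, 0.0), firm, random.Random(0))
print(probs, draw)
```

Under these assumptions, a nearby establishment is favored over an equally sized distant one, and a larger establishment is favored over a smaller one at the same distance. Drawing from the distribution, rather than always taking the most likely unit, keeps the imputed assignments spread across establishments.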

  78. U2W: Unit to Worker Impute
      ■ workers i = 1, ..., I
      ■ firms j = 1, ..., J
