Census Subject Tables
ANALYZING US CENSUS DATA IN PYTHON
Lee Hachadoorian, Asst. Professor of Instruction, Temple University

Census Data Products
- Decennial Census of Population and Housing
- American Community Survey (annual)
- Current Population Survey (monthly)
- Economic Survey (5 years)
- Annual Survey of State and Local Government Finances

Course Prerequisites
- Lists
- Dictionaries
- Package imports
- Control flow, looping
- List comprehensions
- pandas data frames

Introduction to Census Topics
- Decennial Census of Population and Housing
  - Demographics (age, sex, race, family structure)
  - Housing Occupancy and Ownership (vacant/occupied, rent/own)
  - Group Quarters Population (prisons, college dorms)
- American Community Survey
  - Educational Attainment
  - Commuting (mode, time leaving, time travelled)
  - Disability Status

Structure of a Subject Table

Subject Table to Data Frame

    states.head()

                   total  ...  hispanic_multiracial
    Alabama      4779736  ...                 10806
    Alaska        710231  ...                  6507
    Arizona      6392017  ...                103669
    Arkansas     2915918  ...                 11173
    California  37253956  ...                846688

    [5 rows x 17 columns]

Basic Data Visualization

    import seaborn as sns
    sns.set()

    sns.barplot(
        x="total",
        y=states.index,
        data=states
    )

Going further: Data Visualization with Seaborn

Let's practice!

Using the Census API

Structure of a Census API Request

    https://api.census.gov/data/2010/dec/sf1?get=NAME,P001001&for=state:*

Base URL: https://api.census.gov/data/2010/dec/sf1
- Host = https://api.census.gov/data
- Year = 2010
- Dataset = dec/sf1

Parameters
- get: list of variables
- for: geography of interest

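As a minimal sketch of how these pieces combine, the request URL can be assembled with the standard library alone, without sending anything over the network. The `safe` argument is a choice made here so that commas, colons, and asterisks stay readable; the slides do not prescribe it.

```python
from urllib.parse import urlencode

HOST = "https://api.census.gov/data"
year = "2010"
dataset = "dec/sf1"

# Base URL = host + year + dataset
base_url = "/".join([HOST, year, dataset])

# Parameters: which variables to get, and for which geography
params = {"get": "NAME,P001001", "for": "state:*"}

# Leave ",", ":" and "*" unescaped so the query string stays readable
url = base_url + "?" + urlencode(params, safe=",:*")
print(url)
```

Printing the URL before requesting it is a cheap way to check that the year, dataset, and predicates landed where expected.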
The requests Library

    import requests

    HOST = "https://api.census.gov/data"
    year = "2010"
    dataset = "dec/sf1"
    base_url = "/".join([HOST, year, dataset])

    predicates = {}
    get_vars = ["NAME", "AREALAND", "P001001"]
    predicates["get"] = ",".join(get_vars)
    predicates["for"] = "state:*"

    r = requests.get(base_url, params=predicates)

Examine the Response

    print(r.text)

    [["NAME","AREALAND","P001001","state"],
    ["Alabama","131170787086","4779736","01"],
    ["Alaska","1477953211577","710231","02"],
    ["Arizona","294207314414","6392017","04"],
    ...

Response Errors

    print(r.text)

    error: unknown variable 'nonexistentvariable'

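One way to guard against such errors is to check the response before decoding it. The sketch below is illustrative only: `parse_census_body` is a hypothetical helper, and the 400 status code used in the example is an assumption about how the API signals a bad request, not something shown on the slides.

```python
import json

def parse_census_body(status_code, text):
    """Return the decoded rows, or raise if the body looks like an error."""
    if status_code != 200 or text.lstrip().startswith("error"):
        raise ValueError(
            f"Census API request failed ({status_code}): {text.strip()}"
        )
    return json.loads(text)

# Example bodies modelled on the two response slides
good = '[["NAME","P001001","state"],["Alabama","4779736","01"]]'
bad = "error: unknown variable 'nonexistentvariable'"

rows = parse_census_body(200, good)
print(rows[1])

try:
    parse_census_body(400, bad)  # 400 is an assumed status code
except ValueError as exc:
    print(exc)
```

With a real `requests` response this could be called as `parse_census_body(r.status_code, r.text)`.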
Create User-Friendly Column Names

    print(r.json()[0])

    ['NAME', 'AREALAND', 'P001001', 'state']

Create easy-to-remember column names using snake_case:

    col_names = ["name", "area_m2", "total_pop", "state"]

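To make the header/row split concrete, here is a small standard-library sketch that pairs each API column with its friendly name. The response text is abbreviated from the earlier slide, and the `rename` and `records` names are illustrative.

```python
import json

# Response body abbreviated from the "Examine the Response" slide
text = (
    '[["NAME","AREALAND","P001001","state"],'
    '["Alabama","131170787086","4779736","01"],'
    '["Alaska","1477953211577","710231","02"]]'
)

# The first row is the header; the remaining rows are data
header, *rows = json.loads(text)

col_names = ["name", "area_m2", "total_pop", "state"]

# Map each API column name to its friendly name
rename = dict(zip(header, col_names))

# One dict per state, keyed by the friendly names
records = [dict(zip(col_names, row)) for row in rows]
print(rename)
print(records[0])
```

Note that every value is still a string at this point, which is why the next step converts the numeric columns.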
Load into Pandas Data Frame

    import pandas as pd

    df = pd.DataFrame(columns=col_names, data=r.json()[1:])

    # Fix data types (the API returns every value as a string)
    # Land areas in square metres exceed the 32-bit integer range, so use int64
    df["area_m2"] = df["area_m2"].astype("int64")
    df["total_pop"] = df["total_pop"].astype(int)

    print(df.head())

             name        area_m2  total_pop state
    0     Alabama   131170787086    4779736    01
    1      Alaska  1477953211577     710231    02
    2     Arizona   294207314414    6392017    04
    3    Arkansas   134771261408    2915918    05
    4  California   403466310059   37253956    06

Find 3 Most Densely Settled States

    # Create new column
    df["pop_per_km2"] = 1000**2 * df["total_pop"] / df["area_m2"]

    # Find top 3
    df.nlargest(3, "pop_per_km2")

                        name      area_m2  total_pop state  pop_per_km2
    8   District of Columbia    158114680     601723    11  3805.611218
    30            New Jersey  19047341691    8791894    34   461.581156
    51           Puerto Rico   8867536532    3725789    72   420.160547

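The factor `1000**2` converts square metres to square kilometres. As a quick check of that arithmetic, the sketch below recomputes the District of Columbia figure from the table using plain Python; `pop_per_km2` here is an illustrative function, not part of the course code.

```python
M2_PER_KM2 = 1000 ** 2  # 1 km = 1000 m, so 1 km^2 = 1,000,000 m^2

def pop_per_km2(total_pop, area_m2):
    """People per square kilometre, given a population and an area in m^2."""
    return total_pop / (area_m2 / M2_PER_KM2)

# District of Columbia values taken from the table
dc_density = pop_per_km2(601723, 158114680)
print(round(dc_density, 2))  # 3805.61
```

Dividing the area by 1,000,000 first is algebraically the same as multiplying the ratio by `1000**2`, as the slide does.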
Let's practice!

Census Geography

Request All Geographies

    import requests

    HOST = "https://api.census.gov/data"
    year = "2010"
    dataset = "dec/sf1"
    base_url = "/".join([HOST, year, dataset])

    predicates = {}
    predicates["get"] = "NAME,P001001"
    predicates["for"] = "state:*"

    r = requests.get(base_url, params=predicates)

Request Specific Geographies

    import requests

    HOST = "https://api.census.gov/data"
    year = "2010"
    dataset = "dec/sf1"
    base_url = "/".join([HOST, year, dataset])

    predicates = {}
    predicates["get"] = "NAME,P001001"
    predicates["for"] = "state:42"  # FIPS code 42 = Pennsylvania

    r = requests.get(base_url, params=predicates)

1. https://census.missouri.edu/geocodes/

Geographic Entities

    Legal/Administrative       Statistical
    State                      Block
    County                     (Census) Tract
    Congressional Districts    Metropolitan/Micropolitan Statistical Area
    School Districts           ZIP Code Tabulation Area
    etc.                       etc.

1. https://www.census.gov/geo/education/legstat_geo.html

The "in" Predicate

Request all counties in specific states:

    predicates["for"] = "county:*"
    predicates["in"] = "state:33,50"  # 33 = New Hampshire, 50 = Vermont

Request specific counties in one state:

    predicates["for"] = "county:001,003"
    predicates["in"] = "state:33"

    r = requests.get(base_url, params=predicates)

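The two patterns above differ only in which codes are listed, so they can be produced by one small function. This is a sketch under stated assumptions: `geo_predicates` and its parameters are hypothetical names introduced for illustration, and it only builds the dictionary; it does not contact the API.

```python
def geo_predicates(for_level, for_codes="*", within=None):
    """Build the `for` and `in` predicates for a Census API request.

    for_level: geography level name, e.g. "county"
    for_codes: "*" for all, or a list of codes such as ["001", "003"]
    within:    optional (level, codes) pair naming the enclosing geography
    """
    def fmt(level, codes):
        # Accept either a ready-made string or a list of codes
        code_str = codes if isinstance(codes, str) else ",".join(codes)
        return f"{level}:{code_str}"

    predicates = {"for": fmt(for_level, for_codes)}
    if within is not None:
        predicates["in"] = fmt(*within)
    return predicates

# All counties in two states
all_counties = geo_predicates("county", "*", within=("state", ["33", "50"]))
print(all_counties)

# Two specific counties in one state
two_counties = geo_predicates("county", ["001", "003"], within=("state", "33"))
print(two_counties)
```

The resulting dictionary can be merged into `predicates` alongside the `get` entry before calling `requests.get`.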
Places

"An incorporated place is established to provide governmental functions for a concentration of people…. An incorporated place usually is a city, town, village, or borough, but can have other legal descriptions."

"Census Designated Places (CDPs) are the statistical counterparts of incorporated places, and are delineated to provide data for settled concentrations of population that are identifiable by name but are not legally incorporated under the laws of the state in which they are located."

Source: https://www.census.gov/geo/reference/gtc/gtc_place.html

    Geography Level    Geography Hierarchy
    40                 state
    50                 state › county
    60                 state › county › county subdivision
    101                state › county › tract › block
    140                state › county › tract
    150                state › county › tract › block group
    160                state › place

https://api.census.gov/data/2010/dec/sf1/geography.html

Part Geographies

state › congressional district › county (or part)

    predicates = {}
    predicates["get"] = "NAME,P001001"
    predicates["for"] = "county (or part):*"
    predicates["in"] = "state:42;congressional district:02"

    r = requests.get(base_url, params=predicates)
    print(r.text)

    [["NAME","P001001","state","congressional district","county"],
    ["Montgomery County (part)","36793","42","02","091"],
    ["Philadelphia County (part)","593484","42","02","101"]]

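Because each row is only the part of a county that falls inside the district, the parts can be added up. The sketch below does this for the response printed above, assuming the listed county parts together make up the whole district, as the hierarchy suggests.

```python
import json

# Response text copied from the slide
text = """[["NAME","P001001","state","congressional district","county"],
["Montgomery County (part)","36793","42","02","091"],
["Philadelphia County (part)","593484","42","02","101"]]"""

header, *rows = json.loads(text)

# Locate the population column by name rather than by position
pop_idx = header.index("P001001")

# Values arrive as strings, so convert before summing
district_pop = sum(int(row[pop_idx]) for row in rows)
print(district_pop)  # 630277
```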
Let's practice!