This guide provides step-by-step instructions for downloading data from the National Historical Geographic Information System (NHGIS). NHGIS provides Census population, housing, agricultural, and economic data, along with GIS-compatible boundary files, for geographic units in the United States from 1790 to the present. It offers a user-friendly system for selecting and downloading raw data for processing in R.
Navigate to the NHGIS home page. You will need to sign up for a free account. Click on LOG IN at the top right of the screen, then click on Create an account and fill out the required information. You should get an email verifying your registration.
Once you’re registered, go back to the NHGIS home page. We’re going to download county-level data on marriage and educational attainment.
Click on Geographic Levels. A window pops up allowing you to select the level of geography at which you want your data. Select the checkbox underneath County. Then hit the Submit button.
Select Years. This will bring up a window allowing you to select which years you want data for. We want the 2014-2018 5-year ACS, so select the checkbox next to 2014-2018 underneath the heading 5-year ranges. Click Submit.
Click on Topics. This allows you to filter which variables appear for selection by broad topic. Click next to Marriage under Core Demographics and Educational Attainment under Education in the Table Topic Filter column (not the Breakdown Filter). Click on Submit.
On the next page, the two topic filters are joined by AND. This will give you only variables that contain information on both Marriage and Education. We want variables that contain info on each topic separately, so open the pull-down menu next to AND and select OR.
Select next to Educational Attainment for the Population 25 Years and Over. That’s our education variable.
Select next to Sex by Marital Status for the Population 15 Years and Over. That’s our marriage variable.
Select Continue located in the Data Cart box at the top right.
No need to change anything in the next screen. Select Continue in the Data Cart box.
Keep the defaults in the next page. You can add a description under the Description box if you want to remind yourself what these data represent. Otherwise, click on Submit.
When the data are ready to download, you will get an email. In that email, click on the link it provides to download the data. On that page, select tables under Download Table Data. This will download your data in a zipped folder. Save this zip file in an appropriate folder on your hard drive.
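The point-and-click steps above can also be expressed in code. What follows is only a rough sketch, assuming ipumsr version 0.6.0 or later and a registered IPUMS API key. The dataset code "2014_2018_ACS5a" and the table codes "B12001" (marital status) and "B15003" (educational attainment) are assumptions on my part and should be checked against the NHGIS metadata, for example with get_metadata_nhgis(), before you submit anything.

```r
library(ipumsr)

# Assumption: an IPUMS API key has been registered, e.g. with
# set_ipums_api_key("your-key-here"). Nothing below contacts NHGIS until
# the commented-out lines at the bottom are run.

# Describe the extract: county-level tables from the 2014-2018 5-year ACS.
# The dataset and table codes are assumptions; confirm them before submitting.
extract <- define_extract_nhgis(
  description = "County marriage and educational attainment, 2014-2018 ACS",
  datasets = ds_spec(
    "2014_2018_ACS5a",
    data_tables = c("B12001", "B15003"),
    geog_levels = "county"
  )
)

# Submit the request, wait for NHGIS to build it, then download the zip file.
# submitted <- submit_extract(extract)
# ready     <- wait_for_extract(submitted)
# zip_path  <- download_extract(ready)
```

The web interface remains the easier way to browse what is available; a scripted definition like this is mainly useful when you want the request to be reproducible.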
You can unzip the file and bring in the csv file by using read_csv(). Or you can use functions from the package ipumsr and automatically clean your dataset without even unzipping the zip file! To do this, first install the package ipumsr if you haven’t already done so.
install.packages("ipumsr")
And then load it in
library(ipumsr)
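Next, store the name of the zip file you downloaded. The file name below comes from the extract used in this guide; yours will likely have a different extract number. Use read_ipums_codebook() to read in the codebook, which holds the variable metadata. (In recent versions of ipumsr this function is deprecated in favor of read_nhgis_codebook().)
nhgis_csv_file <- "nhgis0161_csv.zip"
nhgis_ddi <- read_ipums_codebook(nhgis_csv_file)
Then use read_nhgis() to read in the data itself.
nhgis <- read_nhgis(
  data_file = nhgis_csv_file
)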
## Use of data from NHGIS is subject to conditions including that users should
## cite the data appropriately. Use command `ipums_conditions()` for more details.
##
##
## Reading data file...
View the tibble and you’ll find not only the variable names, but their descriptions!
View(nhgis)
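If you would rather see the variable descriptions as a table in the console than in the viewer, the labels attached to the data can be pulled out with ipums_var_info() from ipumsr. The wrapper below, list_variable_labels(), is a hypothetical helper written for this guide, not a function from the package, and it assumes the labels were attached when the data were read in.

```r
library(ipumsr)

# Hypothetical helper (not part of ipumsr): return one row per variable with
# its name and label, using the metadata attached to the data frame.
list_variable_labels <- function(data) {
  info <- ipums_var_info(data)
  info[, c("var_name", "var_label")]
}

# Usage, assuming `nhgis` was created with read_nhgis():
# list_variable_labels(nhgis)
```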
You can also download spatial shapefiles of different Census boundaries through NHGIS. To download Census tract shapefile boundaries, follow the steps below
Select Continue located in the Data Cart box at the top right.
No need to change anything in the next screen. Select Continue in the Data Cart box.
Keep the defaults in the next page. You can add a description under the Description box if you want to remind yourself what these data represent. Otherwise, click on Submit.
When the data are ready to download, you will get an email. In that email, click on the link it provides to download the data. On that page, select the download link for the GIS files (rather than the table data). This will download your shapefiles in a zipped folder. Save this download in an appropriate folder on your hard drive.
Unzip the folder you downloaded. You will likely need to unzip another folder found within that unzipped folder. After that you should find Census tract shapefiles. Note that you will get a shapefile of census tracts for the entire United States. Unlike with the Census API, you can’t subset your geometry to a lower scale of geography like the state. You’ll need to use a function like st_within() in the sf package or ms_clip() in the rmapshaper package.
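As a minimal sketch of the st_within() approach, the code below keeps only the tracts that fall entirely inside a boundary polygon. The helper subset_within() and both file names are hypothetical and stand in for your own files. Also note that if the boundary comes from a different source than the tracts, tracts along the edge may not line up exactly and could be dropped.

```r
library(sf)

# Hypothetical helper (not part of sf): keep only the tracts that fall
# entirely within a boundary polygon, such as a single state.
subset_within <- function(tracts, boundary) {
  # Both layers must share a coordinate reference system before comparing.
  boundary <- st_transform(boundary, st_crs(tracts))
  # st_filter() keeps rows of `tracts` for which the predicate is TRUE.
  st_filter(tracts, boundary, .predicate = st_within)
}

# Example usage with hypothetical file names; replace with your own paths.
# tracts <- st_read("US_tract_2018.shp")
# state  <- st_read("my_state_boundary.shp")
# state_tracts <- subset_within(tracts, state)
```

If the tract shapefile carries a state identifier column, filtering on that attribute is a simpler alternative that avoids the edge-alignment issue; check the attribute table of your file to see what is available.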
Check out NHGIS’ User Resources page for more tutorials, answers to Frequently Asked Questions, and other information.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Website created and maintained by Noli Brazil