This guide provides step-by-step instructions for downloading data from the National Historical Geographic Information System (NHGIS). NHGIS provides Census population, housing, agricultural, and economic data, along with GIS-compatible boundary files, for geographic units in the United States from 1790 to the present. It offers a user-friendly system for selecting and downloading raw data for processing in R.

Download tabular data


Navigate to the NHGIS home page. You will need to sign up for a free account: click LOG IN at the top right of the screen, click Create an account, and fill out the required information. You should get an email verifying your registration.

Once you’re registered, go back to the NHGIS home page. We’re going to download county-level data on marriage and educational attainment.

  1. Click on Get Data next to Start Here.
  2. On the next page, you should see four filters: Geographic Levels, Years, Topics, and Datasets.


Select Geographic Levels. A window pops up allowing you to choose the level of geography for your data. Select the option underneath County, then click the Submit button.

  1. Select Years. This will bring up a window allowing you to select which years you want data for. We want the 2014-2018 5-year American Community Survey (ACS), so select the checkbox next to 2014-2018 underneath the heading 5-year ranges. Click Submit.

  2. Click on Topics. This allows you to filter which variables appear for selection by broad topic. In the Table Topic Filter column (not the Breakdown Filter), select Marriage under Core Demographics and Educational Attainment under Education. Click Submit.

  3. On the next page you should see the following

This filter returns only variables that contain information on both Marriage AND Education. We want variables that contain information on each topic separately, so open the pull-down menu next to AND and select OR.
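To see why the switch matters, here is a small sketch of the two filter modes in R. The table-to-topic tags below are a made-up illustration (the third table in particular is hypothetical), not the actual NHGIS catalogue.

```r
# Illustrative catalogue: each table is tagged with one or more topics.
# The tags and the third table are assumptions made for this example only.
tables <- data.frame(
  table_name = c(
    "Educational Attainment for the Population 25 Years and Over",
    "Sex by Marital Status for the Population 15 Years and Over",
    "Hypothetical table tagged with both topics"
  ),
  marriage  = c(FALSE, TRUE, TRUE),
  education = c(TRUE, FALSE, TRUE)
)

# AND keeps only tables tagged with both topics
and_matches <- tables$table_name[tables$marriage & tables$education]

# OR keeps tables tagged with at least one of the topics
or_matches <- tables$table_name[tables$marriage | tables$education]

length(and_matches)
length(or_matches)
```

With AND, the two tables we are after would be filtered out because each covers only one topic; with OR, both remain available for selection.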

  1. Leave the Datasets filter alone. Below the filters are the variables available to download given the filters we selected. Click on the Popularity column to sort the variables by how often NHGIS users have downloaded them.

  1. Select the option next to Educational Attainment for the Population 25 Years and Over. That’s our education variable.

  2. Select the option next to Sex by Marital Status for the Population 15 Years and Over. That’s our marriage variable.

  3. Select Continue located in the Data Cart box at the top right.

  4. No need to change anything on the next screen. Select Continue in the Data Cart box.

  5. Keep the defaults on the next page. You can add a note in the Description box if you want to remind yourself what these data represent. Then click on Submit.

  6. When the data are ready to download, you will get an email. In that email, click on the link it provides to download the data. On that page, select tables under Download Table Data. This will download your data in a zipped folder. Save this zip file in an appropriate folder on your hard drive.

  7. You can unzip the file and read in the csv file using read_csv() from the readr package. Alternatively, you can use functions from the ipumsr package to read and automatically clean your dataset without even unzipping the zip file! To do this, first install ipumsr if you haven’t already done so.

install.packages("ipumsr")

Then load it:

library(ipumsr)
  1. Save the name of the NHGIS zip file you downloaded. For example, my file is named nhgis0161_csv.zip (yours will likely be different).
nhgis_csv_file <- "nhgis0161_csv.zip"
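  1. All NHGIS downloads also contain metadata (i.e. a codebook). This is a valuable file because it tells you what each variable in your file represents, among many other important pieces of information. Read it in using the function read_ipums_codebook(). Note that recent ipumsr releases have superseded this function with read_nhgis_codebook(); if read_ipums_codebook() is not found in your installed version, use that one instead.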
nhgis_ddi <- read_ipums_codebook(nhgis_csv_file)
  1. Finally, read in the data using the function read_nhgis()
nhgis <- read_nhgis(
  data_file = nhgis_csv_file
)
## Use of data from NHGIS is subject to conditions including that users should
## cite the data appropriately. Use command `ipums_conditions()` for more details.
## 
## 
## Reading data file...
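
For comparison, the manual route mentioned earlier (unzip, then read_csv()) might look roughly like the sketch below. The helper name read_csv_from_zip() is hypothetical, and the zip file name is the example one used above. Unlike read_nhgis(), a plain read_csv() call does not attach the codebook descriptions to the columns.

```r
library(readr)

# Assumed file name from the example above; yours will likely differ.
nhgis_csv_file <- "nhgis0161_csv.zip"

# Hypothetical helper: extract the first csv found in a zip archive into a
# temporary directory and read it with read_csv().
read_csv_from_zip <- function(zip_path) {
  # List the archive contents without extracting anything
  zip_contents <- unzip(zip_path, list = TRUE)
  csv_names <- grep("\\.csv$", zip_contents$Name, value = TRUE)
  if (length(csv_names) == 0) {
    stop("No csv file found in ", zip_path)
  }
  # Extract only the first csv and read it
  csv_path <- unzip(zip_path, files = csv_names[1], exdir = tempdir())
  read_csv(csv_path)
}

# Only run when the zip file is actually present in the working directory.
if (file.exists(nhgis_csv_file)) {
  nhgis_raw <- read_csv_from_zip(nhgis_csv_file)
}
```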

View the nhgis tibble and you’ll find not only the variable names but also their descriptions!

View(nhgis)
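
If you prefer to inspect the descriptions programmatically rather than in the viewer, ipumsr provides ipums_var_info(). The sketch below uses a toy data frame as a stand-in so it runs without a download; the column pop_25_over and its label are hypothetical, and it assumes the labels are stored as a "label" attribute on each column.

```r
library(ipumsr)

# Toy stand-in for an NHGIS tibble. The second column name and its label are
# hypothetical; real NHGIS files use coded variable names.
toy <- data.frame(
  GISJOIN = c("G0100010", "G0100030"),
  pop_25_over = c(100, 250)
)
attr(toy$GISJOIN, "label") <- "GIS Join Match Code"
attr(toy$pop_25_over, "label") <- "Population 25 years and over"

# One row per variable, with its label and description (if any)
var_info <- ipums_var_info(toy)
var_info[, c("var_name", "var_label")]

# The same call on the real data would be: ipums_var_info(nhgis)
```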

Download shapefiles


You can also download shapefiles for various Census geographies through NHGIS. To download Census tract boundaries, follow the steps below.

  1. Click on Get Data next to Start Here from the NHGIS homepage.
  2. On the next page, select the Geographic Levels filter at the top.
  3. A window pops up allowing you to choose the level of geography for your data. Select the option underneath Census Tract. Click on the Submit button.
  4. Click on the Years filter.
  5. Select the year you want your boundaries in. The Census changes boundaries for most geographies every 10 years. So, if you select, for example, the 5-year range 2014-2018, you will get Census tract boundaries from 2010.
  6. Click on the GIS FILES tab (see figure below) under the Select Data section. Then select the option next to the appropriate Census tract year.

  1. Select Continue located in the Data Cart box at the top right.

  2. No need to change anything on the next screen. Select Continue in the Data Cart box.

  3. Keep the defaults on the next page. You can add a note in the Description box if you want to remind yourself what these data represent. Then click on Submit.

  4. When the data are ready to download, you will get an email. In that email, click on the link it provides to download the data. On that page, select the GIS files download (rather than the table data download used for the tabular extract). This will download your data in a zipped folder. Save this download in an appropriate folder on your hard drive.

  5. Unzip the folder you downloaded. You will likely need to unzip another folder found within that unzipped folder. After that you should find the Census tract shapefiles. Note that you will get a shapefile of census tracts for the entire United States. Unlike with the Census API, you can’t restrict the download to a smaller area, such as a single state. To subset the tracts yourself, you’ll need to use a function like st_within() in the sf package or ms_clip() in the rmapshaper package.
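
As a rough sketch of the subsetting step, the example below keeps only the tracts that fall within a state boundary, using st_filter() from sf with st_within() as the predicate. The geometries are toy squares without a coordinate reference system so that the code runs on its own; the helper make_square() and the tract IDs are hypothetical. With real data you would first read the unzipped tract shapefile and a state boundary with st_read() and make sure both use the same coordinate reference system.

```r
library(sf)

# Hypothetical helper: build an axis-aligned square from its lower-left corner
make_square <- function(x, y, size = 1) {
  st_polygon(list(rbind(
    c(x, y),
    c(x + size, y),
    c(x + size, y + size),
    c(x, y + size),
    c(x, y)
  )))
}

# Toy "tracts": A and B sit next to each other, C is far away
tracts <- st_sf(
  tract_id = c("A", "B", "C"),
  geometry = st_sfc(
    make_square(0, 0),
    make_square(1, 0),
    make_square(5, 5)
  )
)

# Toy "state" boundary that fully contains tracts A and B but not C
state <- st_sf(
  state_name = "Toy State",
  geometry = st_sfc(make_square(-0.5, -0.5, size = 3))
)

# Keep only the tracts that lie entirely within the state polygon
tracts_in_state <- st_filter(tracts, state, .predicate = st_within)
tracts_in_state$tract_id
```

st_within() keeps only tracts that lie entirely inside the boundary. If the two layers come from different sources and their edges do not line up exactly, tracts along the border may be dropped, in which case a clipping approach such as ms_clip() may suit better.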

Resources


Check out NHGIS’ User Resources page for more tutorials, answers to Frequently Asked Questions, and other information.


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Website created and maintained by Noli Brazil