County-level geographic calculations

Author

Vagish Hemmige

This script performs county-level geographic preparation for the access analysis, including integration of HIV case counts, county boundaries, and related spatial attributes used to support later tract-level and catchment-level summaries. These county-level objects provide the initial geographic representation of HIV burden before finer tract-based aggregation.

Source code

The full R script is available at:

R/county_geographic_calculations.R

This script relies on the following helper files:

R/setup.R
R/functions.R

Preparing transplant-center `sf` objects

This code converts multiple transplant-center datasets into geographic (sf) objects and organizes them in nested lists by organ and year. The end result is four parallel list objects that can be used downstream for buffering, distance calculations, and map overlays.

What gets created

Transplant_centers_all_sf
Spatial (sf) version of all transplant centers, by organ.
Transplant_centers_active_SF
Spatial (sf) version of active transplant centers, by organ and year.
Transplant_centers_HIV_sf
Spatial (sf) subset of centers that performed an HIV R+ transplant at the given time point, by organ and year.
Transplant_centers_HOPE_sf
Spatial (sf) subset of centers that performed a HOPE (HIV D+/R+) transplant at the given time point, by organ and year.

Step 1: initialize empty containers

We start by creating empty lists. These will be filled inside the loops:

the outer list is keyed by organ_loop (from organ_list)
for year-specific objects, there is an inner list keyed by year_loop (from year_list)
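As a minimal sketch of this container layout, the snippet below builds an organ-by-year nested list in base R. The values of `organ_list` and `year_list` are hypothetical placeholders (the real vectors are defined elsewhere in the project), and years are stored as character strings on the assumption that the lists are keyed by name rather than by position.

```r
# Hypothetical placeholder values; the real vectors are defined elsewhere
organ_list <- c("kidney", "liver")
year_list <- c("2017", "2022")  # character keys, so [[year]] indexes by name

# Outer list keyed by organ only
centers_by_organ <- list()

# Nested list keyed by organ, then by year
centers_by_organ_year <- list()

for (organ_loop in organ_list) {
  centers_by_organ[[organ_loop]] <- paste("all", organ_loop, "centers")

  # Create the inner list explicitly before filling it
  centers_by_organ_year[[organ_loop]] <- list()

  for (year_loop in year_list) {
    centers_by_organ_year[[organ_loop]][[year_loop]] <-
      paste(organ_loop, "centers active in", year_loop)
  }
}

names(centers_by_organ_year)
names(centers_by_organ_year[["kidney"]])
centers_by_organ_year[["liver"]][["2022"]]
```

If the years were numeric instead, `x[[2017]]` would index by position and pad the list with empty elements up to position 2017, which is why this sketch uses character keys.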

Step 2: Loop Over Organ Types

For each organ in organ_list:

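The dataset of all transplant centers for that organ is prepared as an sf object. The code for this step only calls st_transform(), so the input is expected to be an sf object already.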
The coordinate reference system is transformed to EPSG:5070 (NAD83 / Conus Albers).

This projection is used because it is an equal-area projection covering the conterminous United States with coordinates in metres, so distances and buffer radii can be expressed in one consistent unit.
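The reprojection step can be illustrated with a small, self-contained sketch. The center codes and coordinates below are hypothetical, and starting from longitude/latitude in EPSG:4326 is an assumption made for illustration; the script itself only shows the call to `st_transform(5070)`.

```r
library(sf)

# Hypothetical center coordinates in longitude/latitude (WGS 84, EPSG:4326)
centers <- data.frame(
  OTCCode = c("AAAA", "BBBB"),
  lon = c(-95.37, -87.63),
  lat = c(29.76, 41.88)
)

# Build an sf point layer, then reproject to NAD83 / Conus Albers
centers_sf <- st_as_sf(centers, coords = c("lon", "lat"), crs = 4326)
centers_5070 <- st_transform(centers_sf, 5070)

# Confirm the EPSG code of the reprojected layer
st_crs(centers_5070)$epsg

# Planar distance between the two points, in the CRS's units (metres)
dist_m <- st_distance(centers_5070[1, ], centers_5070[2, ])

# A 100 km buffer can be specified directly in metres
buffers <- st_buffer(centers_5070, dist = 100000)
```

Because every layer is brought into the same projected CRS, later buffer and distance steps do not need per-layer unit conversions.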

Step 3: Loop Over Years Within Each Organ

For each year in year_list, three organ-year–specific spatial datasets are created:

A) Active Centers

The year-specific dataset of active transplant centers is converted to sf and projected to EPSG:5070.

B) HIV R+ Centers

From the full set of centers for that organ, centers are filtered to retain only those whose OTCCode appears in the HIV center volume file for that organ and year.

This identifies centers that performed at least one HIV R+ transplant at that time point.

C) HOPE (HIV D+/R+) Centers

Similarly, the full center list is filtered using the HOPE center volume file to identify centers that performed HIV D+/R+ transplants during that year.
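The filtering in B) and C) is a membership test on center codes. The sketch below shows the idea on plain data frames; only the column names `OTCCode` and `REC_CTR_CD` come from the script, while the table names, center codes, and the `n_transplants` column are hypothetical.

```r
library(dplyr)

# Hypothetical table of all centers for one organ
centers_all <- data.frame(
  OTCCode = c("AAAA", "BBBB", "CCCC"),
  name = c("Center A", "Center B", "Center C")
)

# Hypothetical volume table for one organ-year (n_transplants is invented)
hiv_volumes <- data.frame(
  REC_CTR_CD = c("AAAA", "CCCC"),
  n_transplants = c(3, 1)
)

# Keep only the centers whose code appears in the volume table
centers_hiv <- centers_all %>%
  filter(OTCCode %in% hiv_volumes$REC_CTR_CD)

centers_hiv$OTCCode
```

Note that `%in%` tests membership only. If a volume file ever listed centers with a count of zero, those centers would also be retained, so this approach relies on the volume files containing only centers with at least one qualifying transplant.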

Final Structure

After the loops complete, the resulting structure is:

Transplant_centers_all_sf[[organ]]
Transplant_centers_active_SF[[organ]][[year]]
Transplant_centers_HIV_sf[[organ]][[year]]
Transplant_centers_HOPE_sf[[organ]][[year]]

This consistent nested design allows downstream functions to iterate cleanly across organ types and years when generating maps, buffers, and catchment-area analyses.
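As an illustration of how such a nested structure can be traversed, the base R sketch below counts centers for every organ-year combination. The list `centers_nested` is a hypothetical stand-in built from plain data frames, not one of the project's objects.

```r
# Hypothetical stand-in for an organ -> year -> data nested list
centers_nested <- list(
  kidney = list(
    "2017" = data.frame(OTCCode = c("AAAA", "BBBB")),
    "2022" = data.frame(OTCCode = c("AAAA", "BBBB", "CCCC"))
  ),
  liver = list(
    "2017" = data.frame(OTCCode = "AAAA"),
    "2022" = data.frame(OTCCode = c("AAAA", "CCCC"))
  )
)

# Outer sapply walks the organs, inner sapply walks the years
center_counts <- sapply(centers_nested, function(by_year) {
  sapply(by_year, nrow)
})

# Rows are years, columns are organs
center_counts
```

The same two-level pattern applies when the inner function draws a map or builds a buffer instead of counting rows.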


# Containers for the spatial (sf) versions of the transplant-center data
Transplant_centers_all_sf <- list()
Transplant_centers_active_SF <- list()
Transplant_centers_HIV_sf <- list()
Transplant_centers_HOPE_sf <- list()

for (organ_loop in organ_list) {

  # Convert transplant center data to geographical data (projected to EPSG:5070)
  Transplant_centers_all_sf[[organ_loop]] <- Transplant_centers_all[[organ_loop]] %>%
    st_transform(5070)

  for (year_loop in year_list) {

    # Geographic file of centers active for this organ and year
    Transplant_centers_active_SF[[organ_loop]][[year_loop]] <- Transplant_centers_active[[organ_loop]][[year_loop]] %>%
      st_transform(5070)

    # Geographic file of centers that performed an HIV R+ transplant at the designated time point
    Transplant_centers_HIV_sf[[organ_loop]][[year_loop]] <- Transplant_centers_all_sf[[organ_loop]] %>%
      filter(OTCCode %in% HIV_center_volumes[[organ_loop]][[year_loop]]$REC_CTR_CD)

    # Geographic file of centers that performed an HIV D+/R+ transplant at the designated time point
    Transplant_centers_HOPE_sf[[organ_loop]][[year_loop]] <- Transplant_centers_all_sf[[organ_loop]] %>%
      filter(OTCCode %in% HOPE_center_volumes[[organ_loop]][[year_loop]]$REC_CTR_CD)
  }
}

Merging CDC HIV Data with County and Tract Geography

This section joins CDC Atlas HIV surveillance data to U.S. county and census tract population files, then derives tract-level HIV estimates for downstream spatial analysis.

Three list objects are created:

Merged_Counties
County-level spatial data merged with CDC HIV totals.
Merged_Counties_nonSF
County-level data with geometry removed (used for attribute joins).
Merged_tracts
Census tract–level spatial data with estimated HIV-positive and HIV-negative populations.

Each object is indexed by year.

Step 1: Initialize Storage Lists

Empty lists are created to store year-specific merged datasets.

Each list will ultimately contain one entry per year in year_list.

Step 2: Loop Over Years

For each year:

A) Merge County Population Data with CDC Atlas Data

County geographic population files are joined with CDC Atlas HIV totals.

The join matches GEOID (county FIPS) in the population file to geo_id in the CDC Atlas dataset
A left_join() keeps every county in the dataset, including counties with no matching CDC Atlas record (their HIV fields are NA)
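The behavior of this join can be sketched on two small data frames. Only the key names `GEOID` and `geo_id` and the `join_by()` call come from the script; the table names, FIPS codes, and counts below are hypothetical.

```r
library(dplyr)

# Hypothetical county population table
county_pop <- data.frame(
  GEOID = c("48201", "48113", "48301"),
  county_population = c(4700000, 2600000, 100)
)

# Hypothetical CDC Atlas extract with no row for county 48301
atlas <- data.frame(
  geo_id = c("48201", "48113"),
  county_cases = c(28000, 19000)
)

# Left join: every row of county_pop is kept, matched or not
merged <- left_join(county_pop, atlas, by = join_by(GEOID == geo_id))

merged

# Number of counties without CDC Atlas data
sum(is.na(merged$county_cases))
```

Both keys need the same type for the join to work. FIPS codes are usually kept as character strings so that leading zeros survive.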

The result is a spatial (sf) object containing:

County geometry
County population
County HIV case counts and rates

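This object is stored in Merged_Counties[[year_loop]]. A copy with the geometry dropped is kept in Merged_Counties_nonSF[[year_loop]] for the attribute join to tracts, and the tract-level result is stored in Merged_tracts[[year_loop]].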


# Containers for the year-specific merged datasets
Merged_Counties <- list()
Merged_Counties_nonSF <- list()
Merged_tracts <- list()

for (year_loop in year_list) {

  # Join county geographical data with CDC data
  Merged_Counties[[year_loop]] <- left_join(us_counties_population[[year_loop]],
                                            AtlasPlusTableData_county_totals[[year_loop]],
                                            by = join_by(GEOID == geo_id))

  # Drop geometry so the county attributes can be joined to tracts
  Merged_Counties_nonSF[[year_loop]] <- Merged_Counties[[year_loop]] %>%
    st_drop_geometry()

  # Merge county data with tracts
  Merged_tracts[[year_loop]] <- left_join(us_tracts_population[[year_loop]],
                                          Merged_Counties_nonSF[[year_loop]],
                                          by = c("county_fips" = "GEOID")) %>%
    select(-NAME.x, -NAME.y) %>%
    # Estimate census tract HIV population
    mutate(tract_cases = county_cases * tract_population / county_population) %>%
    # Estimate HIV-negative population in census tracts
    mutate(tract_noncases = tract_population - tract_cases) %>%
    st_transform(5070)
}
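The tract-level estimate in the loop allocates each county's HIV cases to its tracts in proportion to tract population. A small worked example, using a hypothetical county of 100,000 residents and 1,000 cases split into three tracts, may make the arithmetic concrete. The column names follow the script; the tract identifiers and values are invented.

```r
# Hypothetical county: 1,000 cases among 100,000 residents, three tracts
tracts <- data.frame(
  tract_id = c("T1", "T2", "T3"),
  tract_population = c(50000, 30000, 20000),
  county_population = 100000,
  county_cases = 1000
)

# Allocate county cases to tracts in proportion to population
tracts$tract_cases <-
  tracts$county_cases * tracts$tract_population / tracts$county_population

# The remainder of each tract's population is the HIV-negative estimate
tracts$tract_noncases <- tracts$tract_population - tracts$tract_cases

tracts[, c("tract_id", "tract_cases", "tract_noncases")]

# Tract estimates add back up to the county total
sum(tracts$tract_cases)
```

This allocation assumes that HIV prevalence is uniform across the tracts of a county. Tracts in counties with a missing case count will have NA estimates, because the missing value propagates through the arithmetic.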

