This code converts multiple transplant-center datasets into geographic (sf) objects and organizes them in nested lists by organ and year. The end result is four parallel list objects that can be used downstream for buffering, distance calculations, and map overlays.
What gets created
Transplant_centers_all_sf
Spatial (sf) version of all transplant centers, by organ.
Transplant_centers_active_SF
Spatial (sf) version of active transplant centers, by organ and year.
Transplant_centers_HIV_sf
Spatial (sf) subset of centers that performed an HIV R+ transplant at the given time point, by organ and year.
Transplant_centers_HOPE_sf
Spatial (sf) subset of centers that performed a HOPE (HIV D+/R+) transplant at the given time point, by organ and year.
Step 1: Initialize Empty Containers
We start by creating empty lists. These will be filled inside the loops:
the outer list is keyed by organ_loop (from organ_list)
for year-specific objects, there is an inner list keyed by year_loop (from year_list)
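The container pattern described above can be sketched as follows. The organ and year keys are made up for illustration (the real organ_list and year_list are defined elsewhere in the analysis), and the payload is a placeholder string rather than an sf object. The inner list is created explicitly here so that the nesting does not depend on how R fills in a missing element.

```r
# Hypothetical keys; the real organ_list and year_list come from the setup script.
organ_list <- c("kidney", "liver")
year_list  <- c("2017", "2022")   # character keys, so [[ ]] indexes by name

centers_by_organ_year <- list()   # outer list, keyed by organ

for (organ_loop in organ_list) {
  centers_by_organ_year[[organ_loop]] <- list()   # inner list, keyed by year
  for (year_loop in year_list) {
    # Placeholder payload; the real code stores an sf object here.
    centers_by_organ_year[[organ_loop]][[year_loop]] <-
      paste(organ_loop, year_loop)
  }
}

names(centers_by_organ_year)              # organ keys
names(centers_by_organ_year[["liver"]])   # year keys
centers_by_organ_year[["liver"]][["2022"]]
```

Character keys matter for this pattern: if year_list held numbers such as 2017, `[[2017]]` would index by position and pad the list with empty slots up to that index.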
Step 2: Loop Over Organ Types
For each organ in organ_list:
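The dataset of all transplant centers for that organ is taken as the starting sf object (the code calls st_transform() directly, so the input must already carry geometry and a coordinate reference system).
Its coordinate reference system is transformed to EPSG:5070 (NAD83 / Conus Albers).
This projection is used because it is an equal-area projection designed for the contiguous U.S. with coordinates in meters, which gives consistent units for buffers and other distance-based calculations.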
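As an illustration of the reprojection step, the sketch below builds two points from longitude/latitude and transforms them to EPSG:5070. The center codes and coordinates are invented, and creating the sf object with st_as_sf() is an assumption about how such data might be prepared, not a description of the upstream pipeline.

```r
library(sf)

# Two illustrative points in longitude/latitude (WGS84, EPSG:4326).
# Codes and coordinates are made up for this sketch.
centers <- data.frame(
  OTCCode = c("AAAA", "BBBB"),
  lon = c(-87.63, -95.37),
  lat = c(41.88, 29.76)
)

centers_sf <- st_as_sf(centers, coords = c("lon", "lat"), crs = 4326)

# Reproject to NAD83 / Conus Albers; coordinates are now in meters.
centers_5070 <- st_transform(centers_sf, 5070)

st_crs(centers_5070) == st_crs(5070)   # confirm the target CRS
st_distance(centers_5070)              # pairwise distance matrix, in meters
```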
Step 3: Loop Over Years Within Each Organ
For each year in year_list, three organ-year–specific spatial datasets are created:
A) Active Centers
The year-specific sf dataset of active transplant centers is projected to EPSG:5070.
B) HIV R+ Centers
From the full set of centers for that organ, centers are filtered to retain only those whose OTCCode appears in the REC_CTR_CD column of the HIV center volume file for that organ and year.
This identifies centers that performed at least one HIV R+ transplant at that time point.
C) HOPE (HIV D+/R+) Centers
Similarly, the full center list is filtered using the HOPE center volume file to identify centers that performed HIV D+/R+ transplants during that year.
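The filtering used for both the HIV R+ and HOPE subsets can be sketched with toy data. The center codes, coordinates, and the `n` column below are hypothetical; only the column names OTCCode and REC_CTR_CD are taken from the code on this page.

```r
library(sf)
library(dplyr)

# Toy stand-in for all centers of one organ, already in EPSG:5070.
all_centers <- st_as_sf(
  data.frame(
    OTCCode = c("AAAA", "BBBB", "CCCC"),
    x = c(0, 1000, 2000),
    y = c(0, 1000, 2000)
  ),
  coords = c("x", "y"),
  crs = 5070
)

# Hypothetical volume table for one organ-year: one row per center
# that performed at least one qualifying transplant.
hiv_volumes <- data.frame(REC_CTR_CD = c("AAAA", "CCCC"), n = c(3, 1))

# Keep only centers whose code appears in the volume table.
hiv_centers <- all_centers %>%
  filter(OTCCode %in% hiv_volumes$REC_CTR_CD)

hiv_centers$OTCCode
```

One property of this approach: if the volume table for an organ-year is missing or empty, `%in%` returns FALSE for every row, so the result is a zero-row sf object rather than an error.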
Final Structure
After the loops complete, the resulting structure is:
Transplant_centers_all_sf[[organ]]
Transplant_centers_active_SF[[organ]][[year]]
Transplant_centers_HIV_sf[[organ]][[year]]
Transplant_centers_HOPE_sf[[organ]][[year]]
This consistent nested design allows downstream functions to iterate cleanly across organ types and years when generating maps, buffers, and catchment-area analyses.
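A brief sketch of how downstream code might walk this structure. The nested list, the helper `make_points()`, and the 50 km buffer distance are all hypothetical; they only mimic the shape of the objects described above.

```r
library(sf)

# Hypothetical helper: n points along a line, in EPSG:5070 (meters).
make_points <- function(n) {
  st_as_sf(
    data.frame(id = seq_len(n), x = seq_len(n) * 1000, y = 0),
    coords = c("x", "y"),
    crs = 5070
  )
}

# Toy nested structure with the same shape as Transplant_centers_HIV_sf.
centers_nested <- list(
  kidney = list("2017" = make_points(3), "2022" = make_points(5)),
  liver  = list("2017" = make_points(1), "2022" = make_points(2))
)

# Count centers for every organ-year combination (rows = years, columns = organs).
center_counts <- sapply(centers_nested, function(by_year) sapply(by_year, nrow))
center_counts

# 50 km buffer around each kidney center in 2022 (CRS units are meters).
kidney_buffers <- st_buffer(centers_nested[["kidney"]][["2022"]], dist = 50000)
```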
R code:
Transplant_centers_all_sf <- list()
Transplant_centers_active_SF <- list()
Transplant_centers_HIV_sf <- list()
Transplant_centers_HOPE_sf <- list()

for (organ_loop in organ_list) {

  # Convert transplant center data to geographical data
  Transplant_centers_all_sf[[organ_loop]] <- Transplant_centers_all[[organ_loop]] %>%
    st_transform(5070)

  for (year_loop in year_list) {

    Transplant_centers_active_SF[[organ_loop]][[year_loop]] <-
      Transplant_centers_active[[organ_loop]][[year_loop]] %>%
      st_transform(5070)

    # Geographic file of centers that performed an HIV R+ transplant at the designated time point
    Transplant_centers_HIV_sf[[organ_loop]][[year_loop]] <- Transplant_centers_all_sf[[organ_loop]] %>%
      filter(OTCCode %in% HIV_center_volumes[[organ_loop]][[year_loop]]$REC_CTR_CD)

    # Geographic file of centers that performed an HIV D+/R+ transplant at the designated time point
    Transplant_centers_HOPE_sf[[organ_loop]][[year_loop]] <- Transplant_centers_all_sf[[organ_loop]] %>%
      filter(OTCCode %in% HOPE_center_volumes[[organ_loop]][[year_loop]]$REC_CTR_CD)
  }
}
Merging CDC HIV Data with County and Tract Geography
This section joins CDC Atlas HIV surveillance data to U.S. county and census tract population files, then derives tract-level HIV estimates for downstream spatial analysis.
Three list objects are created:
Merged_Counties
County-level spatial data merged with CDC HIV totals.
Merged_Counties_nonSF
County-level data with geometry removed (used for attribute joins).
Merged_tracts
Census tract–level spatial data with estimated HIV-positive and HIV-negative populations.
Each object is indexed by year.
Step 1: Initialize Storage Lists
Empty lists are created to store year-specific merged datasets.
Each list will ultimately contain one entry per year in year_list.
Step 2: Loop Over Years
For each year:
A) Merge County Population Data with CDC Atlas Data
County geographic population files are joined with CDC Atlas HIV totals.
The join matches GEOID (county FIPS) from the population file to geo_id from the CDC Atlas dataset.
A left_join() ensures that all counties remain in the dataset, including counties with no matching CDC record (their HIV fields are NA).
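The join behaviour can be illustrated with plain tibbles. The FIPS codes and counts are invented, and geometry is left out to keep the sketch small; dplyr's join_by() requires dplyr 1.1.0 or later.

```r
library(dplyr)

# Toy county population table and CDC-style table; all values are invented.
county_pop <- tibble(
  GEOID = c("01001", "01003", "01005"),
  county_population = c(50000, 200000, 25000)
)
cdc_totals <- tibble(
  geo_id = c("01001", "01003"),
  county_cases = c(100, 600)
)

# Every row of county_pop is kept; unmatched counties get NA for CDC columns.
merged <- left_join(county_pop, cdc_totals, by = join_by(GEOID == geo_id))

merged
```

Here county 01005 has no CDC record, so it stays in `merged` with `county_cases` set to NA.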
The result is a spatial (sf) object containing:
County geometry
County population
County HIV case counts and rates
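This object is stored in Merged_Counties[[year]]. A copy with the geometry dropped is stored in Merged_Counties_nonSF[[year]] and joined to the tract population file on county_fips. Tract-level HIV cases are then estimated by allocating county cases in proportion to tract population, and the result is stored in Merged_tracts[[year]].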
R code:
Merged_Counties <- list()
Merged_Counties_nonSF <- list()
Merged_tracts <- list()

for (year_loop in year_list) {

  # Join county geographical data with CDC data.
  Merged_Counties[[year_loop]] <- left_join(us_counties_population[[year_loop]],
                                            AtlasPlusTableData_county_totals[[year_loop]],
                                            by = join_by(GEOID == geo_id))

  Merged_Counties_nonSF[[year_loop]] <- Merged_Counties[[year_loop]] %>%
    st_drop_geometry()

  # Merge county data with tracts
  Merged_tracts[[year_loop]] <- left_join(us_tracts_population[[year_loop]],
                                          Merged_Counties_nonSF[[year_loop]],
                                          by = c("county_fips" = "GEOID")) %>%
    select(-NAME.x, -NAME.y) %>%
    # Estimate census tract HIV population
    mutate(tract_cases = county_cases * tract_population / county_population) %>%
    # Estimate HIV-negative population in census tracts
    mutate(tract_noncases = tract_population - tract_cases) %>%
    st_transform(5070)
}
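A small worked example of the tract-level estimate in the code above. The county and tract values are invented, and `tract_id` is a hypothetical column; the formulas are the ones used in the loop. Note that this allocation implicitly assumes HIV prevalence is uniform across tracts within a county.

```r
library(dplyr)

# One toy county with 100 cases among 50,000 residents.
county <- tibble(
  GEOID = "01001",
  county_population = 50000,
  county_cases = 100
)

# Three toy tracts that together make up the county population.
tracts <- tibble(
  tract_id = c("T1", "T2", "T3"),
  county_fips = "01001",
  tract_population = c(10000, 15000, 25000)
)

tract_estimates <- tracts %>%
  left_join(county, by = c("county_fips" = "GEOID")) %>%
  mutate(
    # County cases allocated in proportion to tract population
    tract_cases = county_cases * tract_population / county_population,
    # Remainder of the tract population is treated as HIV-negative
    tract_noncases = tract_population - tract_cases
  )

tract_estimates$tract_cases        # 20 30 50
sum(tract_estimates$tract_cases)   # 100, the county total
```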
Other portions of the analysis
Setup: Defines global paths, data sources, cohort inclusion criteria, and analysis-wide constants.
Functions: Reusable helper functions for cohort construction, matching, costing, and modeling.
Tables: Summary tables and regression outputs generated from the final models.
Figures: Visualizations of costs, risks, and model-based estimates.
About: Methods, assumptions, and disclosures.
Source code
The full R script for this page (County-level geographic calculations) is available at:
R/county_geographic_calculations.R (https://github.com/VagishHemmige/HOPE-CDC-analysis-2026/blob/master/R/county_geographic_calculations.R)
This R script is itself reliant on the following helper files:
R/setup.R (https://github.com/VagishHemmige/HOPE-CDC-analysis-2026/blob/master/R/setup.R)
R/functions.R (https://github.com/VagishHemmige/HOPE-CDC-analysis-2026/blob/master/R/functions.R)