Calculate aggregated results

Author: Vagish Hemmige

Affiliation: Montefiore Medical Center / Albert Einstein College of Medicine

This script aggregates tract-level geographic results into organ-level summary tables.

Earlier steps:

Here, we translate those geographic intersections into interpretable population coverage statistics.

Source code

The full R script is available at:

This R script relies on the following helper files:


Initialize Result Containers and Baseline Totals

This first section prepares the data structures that will store aggregated coverage results and calculates the population baselines used as denominators.

Step 1: Create Empty Results Tables

For each organ, two empty result tables are initialized:

  • Results_HIV_df — for people living with HIV (PLWH)
  • Results_nonHIV_df — for people without HIV

Each table is structured to store:

  • Characteristic — the center category and distance (e.g., active 60min)
  • Numerator — population within the catchment area
  • Denominator — total population nationwide (for that year)
  • Percentage — share of the total population within reach
  • Year — analysis year

These tables will be filled in later sections of the script.


Step 2: Calculate Total Population Denominators

Before computing coverage percentages, we calculate the total estimated population counts across all census tracts.

For each year:

  • The total number of estimated HIV cases (tract_cases) is summed across all tracts.
  • The total estimated HIV-negative population (tract_noncases) is summed across all tracts.
  • Geometry is dropped before summarizing to improve computational efficiency.

These totals are stored in:

  • Tract_total_HIV_cases[[year]]
  • Tract_total_nonHIV_cases[[year]]

These values serve as the denominators for all subsequent percentage calculations.
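
As a minimal sketch of the denominator calculation, the toy example below uses a made-up tibble in place of one year of Merged_tracts. The tract IDs, counts, and the toy_ object names are invented for illustration; the real objects are sf data frames, which is why the script drops geometry before summarizing.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for one year of Merged_tracts (geometry omitted)
toy_tracts <- tibble(
  GEOID          = c("A", "B", "C"),
  tract_cases    = c(10, NA, 5),      # NA = missing estimate (illustrative)
  tract_noncases = c(990, 2000, 495)
)

# na.rm = TRUE makes tracts with missing estimates contribute zero to the total
toy_total_HIV <- toy_tracts %>%
  summarize(total_cases = sum(tract_cases, na.rm = TRUE)) %>%
  pull(total_cases)

toy_total_nonHIV <- toy_tracts %>%
  summarize(total_cases = sum(tract_noncases, na.rm = TRUE)) %>%
  pull(total_cases)
```

Because summarize() without grouping returns a single row, pull() already yields one number per population; the script's extra first() call is a safeguard rather than a requirement in this ungrouped case.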

# Requires dplyr, tibble, and sf (assumed to be loaded by the setup/helper scripts)

# Containers for results (one table per organ) and yearly population totals
Results_HIV_df <- list()
Results_nonHIV_df <- list()
Tract_total_HIV_cases <- list()
Tract_total_nonHIV_cases <- list()

for (organ_loop in organ_list) {

  # Initialize an empty results table for PLWH for this organ
  Results_HIV_df[[organ_loop]] <- tibble(
    Characteristic = character(),
    Numerator = numeric(),
    Denominator = numeric(),
    Percentage = numeric(),
    Year = numeric()
  )

  # Initialize an empty results table for people without HIV for this organ
  Results_nonHIV_df[[organ_loop]] <- tibble(
    Characteristic = character(),
    Numerator = numeric(),
    Denominator = numeric(),
    Percentage = numeric(),
    Year = numeric()
  )
}

for (year_loop in year_list) {

  # Calculate the total number of HIV cases assigned to a tract
  Tract_total_HIV_cases[[year_loop]] <- Merged_tracts[[year_loop]] %>%
    st_drop_geometry() %>%
    summarize(total_cases = sum(tract_cases, na.rm = TRUE)) %>%
    pull(total_cases) %>%
    first()

  # Calculate the total number of non-HIV cases (i.e., people without HIV) assigned to a tract
  Tract_total_nonHIV_cases[[year_loop]] <- Merged_tracts[[year_loop]] %>%
    st_drop_geometry() %>%
    summarize(total_cases = sum(tract_noncases, na.rm = TRUE)) %>%
    pull(total_cases) %>%
    first()
}

Aggregate Coverage Results for People Living with HIV (PLWH)

This section calculates the proportion of people living with HIV who reside within transplant center catchment areas.

For each:

  • Organ
  • Distance (e.g., 50 miles, 60 minutes)
  • Year
  • Center category (active, HIV, HOPE)

the script computes:

  • The numerator: total estimated HIV cases (tract_cases) located within the united catchment area
  • The denominator: total estimated HIV cases across all census tracts for that year
  • The percentage: share of PLWH within geographic reach
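
Written out as a formula, the calculation can be sketched as follows. The notation is introduced here for illustration and does not appear in the script: $c_{t,y}$ is the tract_cases value for tract record $t$ in year $y$, $T_y$ is the set of all tract records for that year, and $U_{o,d,k,y}$ is the set of tract records in the united catchment object for organ $o$, distance $d$, and center category $k$.

```latex
\[
\text{Percentage}_{o,d,k,y}
  = 100 \times
    \frac{\sum_{t \in U_{o,d,k,y}} c_{t,y}}
         {\sum_{t \in T_y} c_{t,y}}
\]
```

The denominator depends only on the year, so it is the same for every organ, distance, and center category within a given year.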

How It Works

For each combination of organ, distance, year, and center category:

  1. The appropriate united tract-buffer object is dynamically selected.
  2. Geometry is dropped to improve efficiency.
  3. Tract-level HIV cases are summed within the catchment area.
  4. That sum is divided by the national total for the year.
  5. A new row is appended to Results_HIV_df[[organ]].
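
The five steps above can be sketched on toy data. Everything prefixed with toy_ is hypothetical, as are the organ key "kidney", the distance key "60min", and all counts; the nested list only mimics the organ, distance, year indexing used by the script, and plain tibbles stand in for the sf objects.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for tracts_in_buffers_active_united:
# a nested list indexed by organ, then distance, then year
toy_buffers_active_united <- list(
  kidney = list(
    "60min" = list(
      "2022" = tibble(
        catchment_area = "united",
        tract_cases    = c(10, 5, NA)
      )
    )
  )
)
toy_total_cases <- list("2022" = 20)  # made-up national total

type <- "active"

# Step 1: build the object name as a string and fetch the object with get()
toy_object <- get(paste0("toy_buffers_", type, "_united"))

# Steps 2-3: sum tract-level cases inside the catchment area
# (the real script calls st_drop_geometry() first because its objects are sf)
toy_num <- toy_object[["kidney"]][["60min"]][["2022"]] %>%
  summarize(total_cases = sum(tract_cases, na.rm = TRUE)) %>%
  pull(total_cases)

# Step 4: look up the national total for the same year
toy_denom <- toy_total_cases[["2022"]]

# Step 5: append one row to a results table
toy_results <- tibble(
  Characteristic = character(), Numerator = numeric(),
  Denominator = numeric(), Percentage = numeric(), Year = numeric()
) %>%
  add_row(
    Characteristic = paste(type, "60min"),
    Numerator      = round(toy_num, 0),
    Denominator    = toy_denom,
    Percentage     = round(100 * toy_num / toy_denom, 1),
    Year           = 2022
  )
```

Fetching objects by name with get() keeps the loop body identical across center categories, at the cost of requiring that each object already exists in the environment under exactly that name.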

The resulting table quantifies geographic access to transplant services for people living with HIV and enables comparison:

  • Across organs
  • Across distances
  • Across center categories
  • Across time (e.g., 2017 vs 2022)

#Calculate table for PLWH


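for (organ_loop in organ_list) {
  for (distance_loop in distance_list) {
    for (year_loop in year_list) {
      for (type in c("active", "HIV", "HOPE")) {

        # Build the name of the united tract-buffer object for this center category
        name_of_df <- paste0("tracts_in_buffers_", type, "_united")
        print(name_of_df)  # progress message

        # Numerator: HIV cases in tracts within the united catchment area.
        # The united object is expected to hold a single catchment_area group,
        # so first() returns that group's total.
        temp_Num <- get(name_of_df)[[organ_loop]][[distance_loop]][[year_loop]] %>%
          st_drop_geometry() %>%
          group_by(catchment_area) %>%
          summarize(total_cases = sum(tract_cases, na.rm = TRUE)) %>%
          pull(total_cases) %>%
          first()

        # Denominator: HIV cases across all tracts for this year
        temp_Denom <- Tract_total_HIV_cases[[year_loop]]

        # Append one row per organ/distance/year/category combination
        Results_HIV_df[[organ_loop]] <- Results_HIV_df[[organ_loop]] %>%
          add_row(
            Characteristic = paste(type, distance_loop),
            Numerator = round(temp_Num, 0),
            Denominator = temp_Denom,
            Percentage = round(100 * temp_Num / temp_Denom, 1),
            Year = as.numeric(year_loop)
          )

      }
    }
  }
}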

Aggregate Coverage Results for People Without HIV

This final section calculates geographic coverage for the population without HIV.

The structure mirrors the PLWH calculation but uses:

  • tract_noncases instead of tract_cases
  • Tract_total_nonHIV_cases as the denominator

What Is Being Calculated?

For each:

  • Organ
  • Distance
  • Year

the script computes:

  • The numerator: total estimated non-HIV population within the united active-center catchment area
  • The denominator: total estimated non-HIV population across all census tracts for that year
  • The percentage: share of the non-HIV population within geographic reach

Why Only Active Centers?

In this implementation, non-HIV coverage is calculated for active transplant centers only.

This provides a general population access benchmark that can be compared against:

  • Coverage for people living with HIV
  • Coverage within HIV-performing or HOPE-performing center categories

How It Works

For each organ, distance, and year:

  1. The united active-center tract object is selected.
  2. Geometry is dropped for efficient aggregation.
  3. tract_noncases is summed within the catchment area.
  4. The total is divided by the national non-HIV population estimate for that year.
  5. A new row is appended to Results_nonHIV_df[[organ]].

These results allow comparison between:

  • Geographic access for people living with HIV
  • Geographic access for the general population
  • Changes over time
  • Differences across organ types and service-area definitions

Together, the PLWH and non-HIV tables provide the quantitative foundation for the manuscript’s access comparisons.
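
One way such a comparison might be set up is sketched below. This is not part of the script: the toy_ tables, the Population column, and all numbers are made up, and the rows only imitate the shape of Results_HIV_df[[organ]] and Results_nonHIV_df[[organ]].

```r
library(dplyr)
library(tibble)

# Made-up rows in the shape of the two results tables for a single organ
toy_HIV <- tibble(
  Characteristic = c("active 60min", "HOPE 60min"),
  Numerator      = c(15, 8),
  Denominator    = c(20, 20),
  Percentage     = c(75, 40),
  Year           = c(2022, 2022)
)
toy_nonHIV <- tibble(
  Characteristic = "active 60min",
  Numerator      = 800,
  Denominator    = 1000,
  Percentage     = 80,
  Year           = 2022
)

# Stack the tables with a population label, then keep the rows that are
# directly comparable (same center category and distance)
toy_comparison <- bind_rows(
  mutate(toy_HIV, Population = "PLWH"),
  mutate(toy_nonHIV, Population = "Without HIV")
) %>%
  filter(Characteristic == "active 60min") %>%
  select(Population, Characteristic, Year, Percentage)
```

Because the non-HIV table contains only active-center rows, filtering on a shared Characteristic value is what keeps the comparison like-for-like.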

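# Calculate tract-level results for people without HIV
for (organ_loop in organ_list) {
  for (distance_loop in distance_list) {
    for (year_loop in year_list) {
      # Only active centers are used for the non-HIV benchmark;
      # the single-element loop mirrors the structure of the PLWH section
      for (type in c("active")) {

        # Build the name of the united tract-buffer object
        name_of_df <- paste0("tracts_in_buffers_", type, "_united")
        print(name_of_df)  # progress message

        # Numerator: people without HIV in tracts within the united catchment area.
        # The united object is expected to hold a single catchment_area group,
        # so first() returns that group's total.
        temp_Num <- get(name_of_df)[[organ_loop]][[distance_loop]][[year_loop]] %>%
          st_drop_geometry() %>%
          group_by(catchment_area) %>%
          summarize(total_noncases = sum(tract_noncases, na.rm = TRUE)) %>%
          pull(total_noncases) %>%
          first()

        # Denominator: people without HIV across all tracts for this year
        temp_Denom <- Tract_total_nonHIV_cases[[year_loop]]

        # Append one row per organ/distance/year combination
        Results_nonHIV_df[[organ_loop]] <- Results_nonHIV_df[[organ_loop]] %>%
          add_row(
            Characteristic = paste(type, distance_loop),
            Numerator = round(temp_Num, 0),
            Denominator = temp_Denom,
            Percentage = round(100 * temp_Num / temp_Denom, 1),
            Year = as.numeric(year_loop)
          )

      }
    }
  }
}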

Other portions of the analysis

  • Setup: Defines global paths, data sources, cohort inclusion criteria, and analysis-wide constants.
  • Functions: Reusable helper functions for cohort construction, matching, costing, and modeling.
  • Tables: Summary tables and regression outputs generated from the final models.
  • Figures: Visualizations of costs, risks, and model-based estimates.
  • About: Methods, assumptions, and disclosures.