Calculate aggregated results

Author

Vagish Hemmige

This script aggregates tract-level geographic results into organ-level summary tables.

Earlier steps:

Assigned estimated HIV and non-HIV populations to census tracts
Identified which tracts fall within transplant center catchment areas
United overlapping buffers to prevent double-counting

Here, we translate those geographic intersections into interpretable population coverage statistics.

Source code

The full R script is available at:

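R/calculate_aggregated_results.R (https://github.com/VagishHemmige/HOPE-CDC-analysis-2026/blob/master/R/calculate_aggregated_results.R)

This script relies on the following helper files:

R/setup.R (https://github.com/VagishHemmige/HOPE-CDC-analysis-2026/blob/master/R/setup.R)
R/functions.R (https://github.com/VagishHemmige/HOPE-CDC-analysis-2026/blob/master/R/functions.R)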

Initialize Result Containers and Baseline Totals

This first section prepares the data structures that will store aggregated coverage results and calculates the population baselines used as denominators.

Step 1: Create Empty Results Tables

For each organ, two empty result tables are initialized:

Results_HIV_df — for people living with HIV (PLWH)
Results_nonHIV_df — for people without HIV

Each table is structured to store:

Characteristic — the center category and distance (e.g., active 60min)
Numerator — population within the catchment area
Denominator — total population nationwide (for that year)
Percentage — share of the total population within reach
Year — analysis year

These tables will be filled in later sections of the script.
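
As a minimal sketch of how one of these tables grows, the snippet below creates an empty table with the same columns and appends a single row. The numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not results from the analysis.

```r
library(tibble)

# Empty results table with the same columns the script uses
results <- tibble(
  Characteristic = character(),
  Numerator = numeric(),
  Denominator = numeric(),
  Percentage = numeric(),
  Year = numeric()
)

# Hypothetical values, for illustration only
results <- add_row(
  results,
  Characteristic = "active 60min",
  Numerator = 800,
  Denominator = 1000,
  Percentage = round(100 * 800 / 1000, 1),
  Year = 2017
)
```

Each later pass through the loops adds one such row, so a finished table has one row per combination of center category, distance, and year.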

Step 2: Calculate Total Population Denominators

Before computing coverage percentages, we calculate the total estimated population counts across all census tracts.

For each year:

The total number of estimated HIV cases (tract_cases) is summed across all tracts.
The total estimated HIV-negative population (tract_noncases) is summed across all tracts.
Geometry is dropped before summarizing to improve computational efficiency.

These totals are stored in:

Tract_total_HIV_cases[[year]]
Tract_total_nonHIV_cases[[year]]

These values serve as the denominators for all subsequent percentage calculations.


# Containers: one results table per organ, one population total per year
Results_HIV_df <- list()
Results_nonHIV_df <- list()
Tract_total_HIV_cases <- list()
Tract_total_nonHIV_cases <- list()

for (organ_loop in organ_list) {

  # Initialize empty results tables for this organ
  Results_HIV_df[[organ_loop]] <- tibble(
    Characteristic = character(),
    Numerator = numeric(),
    Denominator = numeric(),
    Percentage = numeric(),
    Year = numeric()
  )

  Results_nonHIV_df[[organ_loop]] <- tibble(
    Characteristic = character(),
    Numerator = numeric(),
    Denominator = numeric(),
    Percentage = numeric(),
    Year = numeric()
  )
}

# The totals depend only on the year, so this loop runs outside the organ loop
for (year_loop in year_list) {

  # Total estimated HIV cases across all tracts (denominator for PLWH)
  Tract_total_HIV_cases[[year_loop]] <- Merged_tracts[[year_loop]] %>%
    st_drop_geometry() %>%
    summarize(total_cases = sum(tract_cases, na.rm = TRUE)) %>%
    pull(total_cases) %>%
    first()

  # Total estimated people without HIV across all tracts (denominator for non-HIV)
  Tract_total_nonHIV_cases[[year_loop]] <- Merged_tracts[[year_loop]] %>%
    st_drop_geometry() %>%
    summarize(total_noncases = sum(tract_noncases, na.rm = TRUE)) %>%
    pull(total_noncases) %>%
    first()
}
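
To make the denominator step concrete, here is a toy version that runs without the project data. It assumes plain tibbles in place of the `sf` objects in `Merged_tracts` (so the `st_drop_geometry()` call is omitted), and all counts are made up for illustration.

```r
library(dplyr)
library(tibble)

# Toy stand-in for Merged_tracts: one plain tibble per year (hypothetical counts)
toy_tracts <- list(
  "2017" = tibble(tract_cases = c(10, 5, NA), tract_noncases = c(990, 495, 300)),
  "2022" = tibble(tract_cases = c(12, 6, 2), tract_noncases = c(1000, 500, 298))
)

toy_total_cases <- list()
for (year_loop in names(toy_tracts)) {
  # Same pattern as the script: sum the tract-level column, ignoring missing values
  toy_total_cases[[year_loop]] <- toy_tracts[[year_loop]] %>%
    summarize(total_cases = sum(tract_cases, na.rm = TRUE)) %>%
    pull(total_cases)
}
```

Because of `na.rm = TRUE`, a tract with a missing estimate contributes nothing to the total rather than turning the whole sum into `NA`.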

Aggregate Coverage Results for People Living with HIV (PLWH)

This section calculates the proportion of people living with HIV who reside within transplant center catchment areas.

For each:

Organ
Distance (e.g., 50 miles, 60 minutes)
Year
Center category (active, HIV, HOPE)

the script computes:

The numerator: total estimated HIV cases (tract_cases) located within the united catchment area
The denominator: total estimated HIV cases across all census tracts for that year
The percentage: share of PLWH within geographic reach
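
The same calculation can be written as a formula. The notation is introduced here for illustration and does not appear in the script: $c_{t,y}$ is the estimated number of HIV cases (`tract_cases`) in tract $t$ for year $y$, $T_y$ is the set of all tracts for that year, and $A_{o,d,y,k}$ is the set of tract records in the united catchment area for organ $o$, distance $d$, year $y$, and center category $k$.

$$
\text{Percentage}_{o,d,y,k} = 100 \times \frac{\sum_{t \in A_{o,d,y,k}} c_{t,y}}{\sum_{t \in T_y} c_{t,y}}
$$

The denominator depends only on the year, which is why it is computed once per year in the initialization section.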

How It Works

For each combination of organ, distance, year, and center type:

The appropriate united tract-buffer object is dynamically selected.
Geometry is dropped to improve efficiency.
Tract-level HIV cases are summed within the catchment area.
That sum is divided by the national total for the year.
A new row is appended to Results_HIV_df[[organ]].

The resulting table quantifies geographic access to transplant services for people living with HIV and enables comparison:

Across organs
Across distances
Across center categories
Across time (e.g., 2017 vs 2022)
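
The dynamic selection in the first step relies on base R's `get()`, which returns the object whose name is supplied as a string. The sketch below uses toy stand-in objects; the organ name `"kidney"` and the string contents are hypothetical and only show how the lookup resolves.

```r
# Toy stand-ins for two of the united tract-buffer objects.
# The organ name "kidney" and the string payloads are hypothetical.
tracts_in_buffers_active_united <- list(
  kidney = list("60min" = list("2017" = "active object"))
)
tracts_in_buffers_HIV_united <- list(
  kidney = list("60min" = list("2017" = "HIV object"))
)

type <- "HIV"
name_of_df <- paste0("tracts_in_buffers_", type, "_united")

# get() looks up the object by name, then the nested list is indexed
selected <- get(name_of_df)[["kidney"]][["60min"]][["2017"]]

# Alternative (not what the script does): keep the objects in a named list
buffers_by_type <- list(
  active = tracts_in_buffers_active_united,
  HIV = tracts_in_buffers_HIV_united
)
selected_alt <- buffers_by_type[[type]][["kidney"]][["60min"]][["2017"]]
```

Both approaches return the same element. The named-list variant avoids depending on object names in the calling environment, at the cost of building the list first.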


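# Calculate the coverage table for PLWH
for (organ_loop in organ_list) {
  for (distance_loop in distance_list) {
    for (year_loop in year_list) {
      for (type in c("active", "HIV", "HOPE")) {

        # Build the name of the united tract-buffer object for this center type
        name_of_df <- paste0("tracts_in_buffers_", type, "_united")
        print(name_of_df) # progress message

        # Numerator: estimated HIV cases within the united catchment area.
        # first() keeps the first group's total; the united object is
        # expected to contain a single catchment area.
        temp_Num <- get(name_of_df)[[organ_loop]][[distance_loop]][[year_loop]] %>%
          st_drop_geometry() %>%
          group_by(catchment_area) %>%
          summarize(total_cases = sum(tract_cases, na.rm = TRUE)) %>%
          pull(total_cases) %>%
          first()

        # Denominator: national total of estimated HIV cases for the year
        temp_Denom <- Tract_total_HIV_cases[[year_loop]]

        # Append one row for this organ/distance/year/center-type combination
        Results_HIV_df[[organ_loop]] <- Results_HIV_df[[organ_loop]] %>%
          add_row(
            Characteristic = paste(type, distance_loop),
            Numerator = round(temp_Num, 0),
            Denominator = temp_Denom,
            Percentage = round(100 * temp_Num / temp_Denom, 1),
            Year = as.numeric(year_loop)
          )
      }
    }
  }
}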
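
Since the results are stored as one table per organ, a natural follow-up is to stack them into a single table for cross-organ comparison. The sketch below is not part of the script; it uses `dplyr::bind_rows()` on a toy list with hypothetical organ names and values.

```r
library(dplyr)
library(tibble)

# Toy stand-in for Results_HIV_df (hypothetical organs and values)
toy_results <- list(
  kidney = tibble(
    Characteristic = c("active 60min", "HOPE 60min"),
    Numerator = c(800, 400),
    Denominator = c(1000, 1000),
    Percentage = c(80, 40),
    Year = c(2017, 2017)
  ),
  liver = tibble(
    Characteristic = "active 60min",
    Numerator = 700,
    Denominator = 1000,
    Percentage = 70,
    Year = 2017
  )
)

# .id stores each list element's name in a new "Organ" column
combined <- bind_rows(toy_results, .id = "Organ")
```

The combined table keeps every row from the per-organ tables and adds an `Organ` column, which makes it straightforward to filter or plot across organs.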

Aggregate Coverage Results for People Without HIV

This final section calculates geographic coverage for the population without HIV.

The structure mirrors the PLWH calculation but uses:

tract_noncases instead of tract_cases
Tract_total_nonHIV_cases as the denominator

What Is Being Calculated?

For each:

Organ
Distance
Year

the script computes:

The numerator: total estimated non-HIV population within the united active-center catchment area
The denominator: total estimated non-HIV population across all census tracts for that year
The percentage: share of the non-HIV population within geographic reach

Why Only Active Centers?

In this implementation, non-HIV coverage is calculated for active transplant centers only.

This provides a general population access benchmark that can be compared against:

Coverage for people living with HIV
Coverage within HIV-performing or HOPE-performing center categories

How It Works

For each organ, distance, and year:

The united active-center tract object is selected.
Geometry is dropped for efficient aggregation.
tract_noncases is summed within the catchment area.
The total is divided by the national non-HIV population estimate for that year.
A new row is appended to Results_nonHIV_df[[organ]].

These results allow comparison between:

Geographic access for people living with HIV
Geographic access for the general population
Changes over time
Differences across organ types and service-area definitions

Together, the PLWH and non-HIV tables provide the quantitative foundation for the manuscript’s access comparisons.


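# Calculate aggregated coverage for people without HIV
for (organ_loop in organ_list) {
  for (distance_loop in distance_list) {
    for (year_loop in year_list) {
      # Only active centers are used for the non-HIV benchmark
      for (type in c("active")) {

        # Build the name of the united tract-buffer object for this center type
        name_of_df <- paste0("tracts_in_buffers_", type, "_united")
        print(name_of_df) # progress message

        # Numerator: estimated people without HIV within the united catchment area.
        # first() keeps the first group's total; the united object is
        # expected to contain a single catchment area.
        temp_Num <- get(name_of_df)[[organ_loop]][[distance_loop]][[year_loop]] %>%
          st_drop_geometry() %>%
          group_by(catchment_area) %>%
          summarize(total_noncases = sum(tract_noncases, na.rm = TRUE)) %>%
          pull(total_noncases) %>%
          first()

        # Denominator: national total of estimated people without HIV for the year
        temp_Denom <- Tract_total_nonHIV_cases[[year_loop]]

        # Append one row for this organ/distance/year combination
        Results_nonHIV_df[[organ_loop]] <- Results_nonHIV_df[[organ_loop]] %>%
          add_row(
            Characteristic = paste(type, distance_loop),
            Numerator = round(temp_Num, 0),
            Denominator = temp_Denom,
            Percentage = round(100 * temp_Num / temp_Denom, 1),
            Year = as.numeric(year_loop)
          )
      }
    }
  }
}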
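
One way to put the two tables side by side is to join them on `Characteristic` and `Year` and take the difference in coverage. This is a sketch of that comparison rather than code from the script, and the values are hypothetical.

```r
library(dplyr)
library(tibble)

# Toy stand-ins for one organ's PLWH and non-HIV tables (hypothetical values)
hiv <- tibble(
  Characteristic = c("active 60min", "HOPE 60min"),
  Percentage = c(80, 40),
  Year = c(2017, 2017)
)
non_hiv <- tibble(
  Characteristic = "active 60min",
  Percentage = 75,
  Year = 2017
)

# Keep only rows present in both tables, then compute the coverage gap
comparison <- inner_join(
  hiv, non_hiv,
  by = c("Characteristic", "Year"),
  suffix = c("_HIV", "_nonHIV")
) %>%
  mutate(Difference = Percentage_HIV - Percentage_nonHIV)
```

Because the non-HIV table covers active centers only, the inner join drops the `HIV` and `HOPE` rows; comparing those categories against the general-population benchmark would need a join on distance and year instead.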

Other portions of the analysis

Setup: Defines global paths, data sources, cohort inclusion criteria, and analysis-wide constants.
Functions: Reusable helper functions for cohort construction, matching, costing, and modeling.
Tables: Summary tables and regression outputs generated from the final models.
Figures: Visualizations of costs, risks, and model-based estimates.
About: Methods, assumptions, and disclosures.
