Initial data loading and preprocessing of USRDS core and transplant files
This portion of the code imports the USRDS files needed for the cryptococcus analysis. It uses the load_usrds_file() function from the usRds package along with several tidyverse cleaning functions.
Key files used:
patients: This is a part of the Core data set of the USRDS and contains key patient demographics.
tx: Also a part of the Core data set of the USRDS and contains summary information about kidney transplants from UNOS.
txunos_trr_ki and txunos_trr_kp: Part of the transplant data set of the USRDS, these files contain more detailed information derived from UNOS that is part of the Scientific Registry of Transplant Recipients.
The code below loads and merges transplant records across multiple UNOS-derived files, constructs a cumulative transplant count per patient, and reshapes the data into a time-varying format that tracks graft status over time.
    # Import core demographics from the "patients" file
    patients_raw <- usRds::load_usrds_file("patients") %>%
      select(-ZIPCODE) # This is ZIP code at time of USRDS initiation, but we want at time of crypto dx

    # Import key information about the transplants from the TX and UNOS databases
    tx_raw <- usRds::load_usrds_file("tx") %>%
      select(USRDS_ID, TDATE, FAILDATE, TRR_ID_CODE)

    ki_raw <- usRds::load_usrds_file("txunos_trr_ki") %>%
      select(USRDS_ID, ORGTYP, HRTX, LUTX, INTX, LITX, PITX, BMTX, TRR_ID_CODE)

    kp_raw <- usRds::load_usrds_file("txunos_trr_kp") %>%
      select(USRDS_ID, ORGTYP, HRTX, LUTX, INTX, LITX, PITX, BMTX, TRR_ID_CODE)

    # This combines the three datasets
    tx_clean <- tx_raw %>%
      left_join(bind_rows(ki_raw, kp_raw)) %>%
      arrange(USRDS_ID, TDATE) %>%
      group_by(USRDS_ID) %>%
      mutate(cumulative_transplant_total = row_number()) %>%
      ungroup()

    # Create a time-varying dataset that can be used to track whether a pt has
    # an active or inactive graft and the cumulative number of txs
    tx_status <- tx_clean %>%
      select(USRDS_ID, TDATE, FAILDATE, cumulative_transplant_total) %>%
      pivot_longer(
        cols = c(TDATE, FAILDATE),
        names_to = "event_type",
        values_to = "event_date"
      ) %>%
      filter(!is.na(event_date)) %>%
      mutate(
        graft_status = case_when(
          event_type == "TDATE" ~ "Active",
          event_type == "FAILDATE" ~ "Failed"
        )
      ) %>%
      select(-event_type) %>%
      arrange(USRDS_ID, cumulative_transplant_total, event_date, graft_status)
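To make the reshaping step concrete, here is a minimal sketch on a toy table. It uses base R only so that it runs without the USRDS files or the tidyverse; the data values and the `_toy` object names are invented for illustration and are not part of the analysis.

```r
# Toy transplant records (hypothetical values, not USRDS data)
tx_toy <- data.frame(
  USRDS_ID = c(1, 1, 2),
  TDATE    = as.Date(c("2010-03-01", "2015-06-15", "2012-01-10")),
  FAILDATE = as.Date(c("2014-11-30", NA, NA))
)

# Order by patient and transplant date, then number each patient's transplants
tx_toy <- tx_toy[order(tx_toy$USRDS_ID, tx_toy$TDATE), ]
tx_toy$cumulative_transplant_total <- ave(
  seq_len(nrow(tx_toy)), tx_toy$USRDS_ID, FUN = seq_along
)

# Stack transplant dates and failure dates into a single event column
tx_status_toy <- rbind(
  data.frame(
    USRDS_ID = tx_toy$USRDS_ID,
    cumulative_transplant_total = tx_toy$cumulative_transplant_total,
    event_date = tx_toy$TDATE,
    graft_status = "Active",
    stringsAsFactors = FALSE
  ),
  data.frame(
    USRDS_ID = tx_toy$USRDS_ID,
    cumulative_transplant_total = tx_toy$cumulative_transplant_total,
    event_date = tx_toy$FAILDATE,
    graft_status = "Failed",
    stringsAsFactors = FALSE
  )
)

# Drop failure rows for grafts that never failed, then sort chronologically
tx_status_toy <- tx_status_toy[!is.na(tx_status_toy$event_date), ]
tx_status_toy <- tx_status_toy[order(
  tx_status_toy$USRDS_ID,
  tx_status_toy$cumulative_transplant_total,
  tx_status_toy$event_date
), ]
rownames(tx_status_toy) <- NULL
print(tx_status_toy)
```

Each transplant contributes one "Active" row and, if the graft failed, one "Failed" row, so reading a patient's rows in order gives the graft status over time.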
Initialize flowchart
The next component makes use of the flowchart package. This package combines two key processes in an epidemiologic analysis:
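Dataset preparation: Sequential application of eligibility criteria to define the analytic cohort, with explicit tracking of excluded records at each step.
STROBE diagram preparation: STrengthening the Reporting of OBservational studies in Epidemiology (STROBE, https://www.strobe-statement.org/) diagrams visually depict the process of including/excluding patients in an observational study and grouping them into cohorts. Many journals require these as a standard Figure 1.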
The code below initializes a flowchart-aware cohort object and applies sequential eligibility filters, retaining both inclusion counts and labeled exclusion steps for downstream reporting.
    # Initialize a flowchart cohort
    patients_clean <- patients_raw %>%
      as_fc(label = "Patients in USRDS") %>%
      fc_filter(TOTTX > 0,
                label = "Prior transplant",
                label_exc = "Excluded: No prior transplant",
                show_exc = TRUE) %>%
      fc_filter(TX1DATE < as.Date("2021-01-01"),
                label = "Transplant prior to 2021",
                label_exc = "Excluded: Transplant 2021 or later",
                show_exc = TRUE)
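The idea of filtering while keeping a record of exclusions can be sketched without the flowchart package. The helper `apply_criterion()` and the toy data below are hypothetical and only illustrate the bookkeeping; they are not how `fc_filter()` is implemented. As with `dplyr::filter()`, this sketch treats a missing value as not meeting the criterion.

```r
# Toy patient table (hypothetical values); column names mirror those used above
patients_toy <- data.frame(
  USRDS_ID = 1:5,
  TOTTX    = c(0, 1, 2, 1, 0),
  TX1DATE  = as.Date(c(NA, "2015-05-01", "2021-03-10", "2009-08-20", NA))
)

# Hypothetical helper: keep rows meeting a criterion and log the counts
apply_criterion <- function(cohort, keep, label) {
  keep <- !is.na(keep) & keep  # NA counts as "criterion not met"
  cohort$log <- rbind(
    cohort$log,
    data.frame(
      step = label,
      n_included = sum(keep),
      n_excluded = sum(!keep),
      stringsAsFactors = FALSE
    )
  )
  cohort$data <- cohort$data[keep, , drop = FALSE]
  cohort
}

# A cohort is the current data plus a running log of each eligibility step
cohort <- list(data = patients_toy, log = NULL)
cohort <- apply_criterion(cohort, cohort$data$TOTTX > 0,
                          "Prior transplant")
cohort <- apply_criterion(cohort, cohort$data$TX1DATE < as.Date("2021-01-01"),
                          "Transplant prior to 2021")

print(cohort$log)
```

The log holds the same counts that a STROBE diagram displays: how many records entered each step and how many were excluded by it.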
Identification of comorbidities and confirmation of Medicare coverage continuity
This component identifies baseline comorbidities among transplant recipients using diagnosis codes derived from Medicare Institutional (Part A) and physician/supplier (Part B) claims. Diagnosis code lists defined in R/setup.R are applied uniformly across claims files to establish the presence and timing of comorbid conditions.
This portion uses the get_IN_ICD() and get_PS_ICD() functions from the usRds package to obtain the claim dates on which specific ICD codes were recorded for patients with a history of at least one kidney transplant in the study period.
The establish_dx_date() function from the usRds package uses the results of this analysis to ascertain the date when comorbidities meet the criteria for a formal diagnosis:
Two outpatient encounters or
One inpatient encounter
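The two criteria above can be sketched as follows. This is an illustrative reading of the rule, not the usRds implementation: it assumes the diagnosis is established on the earlier of the first inpatient claim and the second outpatient claim. The function `toy_dx_date()`, the `setting` column, and the data values are all hypothetical.

```r
# Toy claims (hypothetical values); `setting` is an assumed column that marks
# whether a claim came from an inpatient or an outpatient encounter
claims_toy <- data.frame(
  USRDS_ID = c(1, 1, 1, 2, 3),
  CLM_FROM = as.Date(c("2012-01-05", "2012-04-20", "2012-09-01",
                       "2013-02-14", "2014-07-07")),
  setting  = c("outpatient", "outpatient", "inpatient",
               "outpatient", "inpatient"),
  stringsAsFactors = FALSE
)

# Hypothetical rule: earliest of (first inpatient claim, second outpatient claim)
toy_dx_date <- function(dates, setting) {
  inpatient  <- sort(dates[setting == "inpatient"])
  outpatient <- sort(dates[setting == "outpatient"])
  candidates <- c(inpatient[1], outpatient[2])  # NA where a criterion is unmet
  if (all(is.na(candidates))) as.Date(NA) else min(candidates, na.rm = TRUE)
}

# Apply the rule patient by patient
ids <- unique(claims_toy$USRDS_ID)
dx_toy <- data.frame(
  USRDS_ID = ids,
  date_established = do.call(c, lapply(ids, function(id) {
    rows <- claims_toy$USRDS_ID == id
    toy_dx_date(claims_toy$CLM_FROM[rows], claims_toy$setting[rows])
  }))
)
print(dx_toy)
```

In this toy data, patient 1 qualifies at the second outpatient claim, patient 2 never qualifies with a single outpatient claim, and patient 3 qualifies at the inpatient claim.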
In parallel, Medicare coverage history is retrieved for all transplant recipients from the payhist file, which is part of the Core dataset. These data are used to confirm continuous enrollment during the analytic period. We verify that there are no gaps indicating missing data by checking that each subsequent coverage period for a patient starts the day after the previous period ends. This supports valid longitudinal assessment of diagnoses and downstream cost analyses.
Finally, we extract the cryptococcus diagnosis dates into their own data frame, since the date of cryptococcus diagnosis is central to subsequent analyses.
    # Create list of USRDS ids for patients who have undergone transplant
    transplant_id_list <- patients_clean$data %>%
      pull(USRDS_ID)

    # We now seek to determine comorbidities by using diagnosis codes from the setup.R file
    comorbidity_diagnosis_date <- list()

    # Combine all ICD codes from all comorbidities into one list
    comorbidity_ICD_combined_list <- unlist(comorbidity_ICD_list, use.names = FALSE)

    # Scrape files for any comorbidity claim
    comorbidity_claims_df <- bind_rows(
      get_IN_ICD(icd_codes = comorbidity_ICD_combined_list,
                 years = 2006:2021,
                 usrds_ids = transplant_id_list),
      get_PS_ICD(icd_codes = comorbidity_ICD_combined_list,
                 years = 2006:2021,
                 usrds_ids = transplant_id_list) %>%
        rename(CODE = DIAG)
    ) %>%
      arrange(USRDS_ID, CLM_FROM)

    # Create comorbidity_diagnosis_date data frame
    for (comorbidity in names(comorbidity_ICD_list)) {
      comorbidity_diagnosis_date[[comorbidity]] <- comorbidity_claims_df %>%
        filter(CODE %in% comorbidity_ICD_list[[comorbidity]]) %>%
        establish_dx_date(diagnosis_established = comorbidity)
    }

    # Load Medicare coverage history for all patients with transplant
    medicare_history <- load_usrds_file("payhist", usrds_ids = transplant_id_list) %>%
      arrange(USRDS_ID, BEGDATE) %>%
      group_by(USRDS_ID) %>%
      mutate(lag_ENDDATE = lag(ENDDATE)) %>%
      mutate(gap = as.numeric(BEGDATE - lag_ENDDATE)) %>%
      arrange(desc(gap))

    # Confirm no gaps (gap should always be 1 or missing)
    if (any(!is.na(medicare_history$gap) & medicare_history$gap != 1)) {
      stop("Gap assumption violated: `gap` contains values other than 1 or NA.")
    }

    # Format a df with the cryptococcus dx
    cryptococcus_df <- comorbidity_diagnosis_date$cryptococcus %>%
      select(-diagnosis) %>%
      rename(cryptococcus_dx_date = date_established)
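The gap check is easier to see on a small example. The sketch below repeats the same logic in base R on invented coverage periods; the values, including the open-ended final periods, are hypothetical and are not drawn from payhist. Patient 2 is deliberately given a 15-day gap so that the check has something to find.

```r
# Toy coverage periods (hypothetical values)
payhist_toy <- data.frame(
  USRDS_ID = c(1, 1, 1, 2, 2),
  BEGDATE  = as.Date(c("2010-01-01", "2011-07-01", "2013-01-01",
                       "2012-03-01", "2012-09-15")),
  ENDDATE  = as.Date(c("2011-06-30", "2012-12-31", NA,
                       "2012-08-31", NA))
)
payhist_toy <- payhist_toy[order(payhist_toy$USRDS_ID, payhist_toy$BEGDATE), ]

# End date of the previous period within each patient (NA for the first period)
prev_end <- ave(
  as.numeric(payhist_toy$ENDDATE), payhist_toy$USRDS_ID,
  FUN = function(x) c(NA_real_, head(x, -1))
)

# Days between the previous period's end and this period's start
payhist_toy$gap <- as.numeric(payhist_toy$BEGDATE) - prev_end

# Contiguous periods have gap == 1; anything else signals a break in coverage
violations <- payhist_toy[!is.na(payhist_toy$gap) & payhist_toy$gap != 1, ]
print(violations)
```

In the analysis code the same condition triggers `stop()`, so a violation halts the script rather than being silently carried forward.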
Further development of the STROBE flowchart
Subsequently, key diagnosis dates, including the date of cryptococcosis diagnosis, are merged into the analytic patient dataset.
This code block then extends the STROBE flowchart by integrating diagnosis timing, eligibility criteria, and coverage requirements to finalize the analytic cohorts. Patients are classified as cryptococcosis cases or potential controls based on the presence and timing of diagnosis, and sequential exclusion criteria are applied to ensure incident disease, adult status at diagnosis, and appropriate calendar-time eligibility.
The verify_medicare_primary() function from the usRds package is used to confirm that patients with a cryptococcus diagnosis had Medicare primary coverage for at least 365 days before the cryptococcus diagnosis date. This ensures that the diagnosis is new and not a carryover from an unobserved period.
These steps culminate in the final case and control cohorts, with all inclusion and exclusion decisions explicitly tracked and visualized in the updated STROBE flow diagram.
    # Join cryptococcus date to fc cohort
    patients_clean$data <- left_join(patients_clean$data, cryptococcus_df) %>%
      mutate(cryptococcus_case = ifelse(is.na(cryptococcus_dx_date), "Potential control", "Case"))

    # Continue to create analytic cohort
    patients_merged <- patients_clean %>%
      # Remove patients with a diagnosis of cryptococcus prior to transplant
      fc_filter(cryptococcus_dx_date > TX1DATE | is.na(cryptococcus_dx_date),
                label = "No cryptococcus dx prior to transplant",
                label_exc = "Excluded: cryptococcus dx prior to transplant",
                show_exc = TRUE) %>%
      fc_filter((time_length(interval(BORN, cryptococcus_dx_date), "years") >= 18) | is.na(cryptococcus_dx_date),
                label = "Age 18+ at time of cryptococcus if cryptococcus patient",
                label_exc = "Excluded: First cryptococcus prior to age 18",
                show_exc = TRUE) %>%
      fc_filter((year(cryptococcus_dx_date) >= 2007 & year(cryptococcus_dx_date) <= 2020) | is.na(cryptococcus_dx_date),
                label = "Incident cryptococcus between 1/1/2007 and 12/31/2020",
                label_exc = "Incident cryptococcus outside of specified date range",
                show_exc = TRUE) %>%
      # Split cohorts
      fc_split(cryptococcus_case)

    # Check Medicare coverage for 365-day lookback period from day of first episode of cryptococcus
    patients_merged$data <- patients_merged$data %>%
      verify_medicare_primary(index_date = "cryptococcus_dx_date",
                              lookback_days = 365,
                              coverage_start_variable = "coverage_start_date",
                              coverage_end_variable = "coverage_end_date") %>%
      mutate(medicare_primary_TF = ifelse(cryptococcus_case == "Potential control", TRUE, medicare_primary_TF))

    patients_merged2 <- patients_merged %>%
      fc_filter(medicare_primary_TF == TRUE,
                label = "365+ days of Medicare primary coverage\nprior to first cryptococcus claim",
                label_exc = "Excluded: Fewer than 365 days of coverage",
                show_exc = TRUE)

    patients_merged2$data <- patients_merged2$data %>%
      select(-medicare_primary_TF) %>%
      # Prepare data for cohort initialization
      mutate(terminal_date = coalesce(coverage_end_date, censor_date))

    patients_merged2 <- patients_merged2 %>%
      fc_filter((terminal_date - cryptococcus_dx_date >= minimum_followup) | is.na(cryptococcus_dx_date),
                label = "Minimum followup exceeded",
                label_exc = "Excluded: Minimum follow-up threshold not met",
                show_exc = TRUE)

    patients_merged2 %>%
      fc_draw()
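The lookback requirement can be sketched as a simple date comparison. The function `has_lookback_coverage()` below is hypothetical and is not the usRds implementation of `verify_medicare_primary()`. It assumes a single continuous coverage interval per patient, and it assumes that a missing coverage end date means coverage is ongoing.

```r
# Hypothetical check: was the patient covered for the whole lookback window
# ending on the index date?
has_lookback_coverage <- function(index_date, coverage_start, coverage_end,
                                  lookback_days = 365) {
  lookback_start <- index_date - lookback_days
  !is.na(index_date) &
    !is.na(coverage_start) & coverage_start <= lookback_start &
    (is.na(coverage_end) | coverage_end >= index_date)  # NA end = ongoing
}

# Toy patients (hypothetical values): two cases and one potential control
lookback_toy <- data.frame(
  USRDS_ID             = 1:3,
  cryptococcus_dx_date = as.Date(c("2015-06-01", "2015-06-01", NA)),
  coverage_start_date  = as.Date(c("2013-01-01", "2015-01-01", "2012-01-01")),
  coverage_end_date    = as.Date(c(NA, "2016-01-01", NA))
)

lookback_toy$medicare_primary_TF <- has_lookback_coverage(
  lookback_toy$cryptococcus_dx_date,
  lookback_toy$coverage_start_date,
  lookback_toy$coverage_end_date
)
print(lookback_toy)
```

The potential control has no index date and therefore comes back `FALSE` here, which is why the pipeline above sets `medicare_primary_TF` to `TRUE` for potential controls before filtering.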
Creation of a time-varying cohort
We use several functions from the usRds package:
create_usrds_cohort()
add_cohort_covariate()
finalize_usrds_cohort()
These functions split each patient's follow-up into multiple rows, with each row describing a discrete period of time. This structure captures covariates whose status changes over time, such as cirrhosis.
Patients with cryptococcus infection join the cohort on the date of cryptococcosis diagnosis, while patients without cryptococcus (potential controls) join on the date of their first kidney transplant.
“Time since transplant” resets after each new transplant.
    # Now we need to construct the time-varying data set
    # Ungroup
    initial_cohort <- patients_merged2$data %>%
      ungroup() %>%
      # Cases join when they experience cryptococcus
      # Controls start on date of first transplant
      mutate(
        cohort_join_date = coalesce(
          as.Date(cryptococcus_dx_date),
          as.Date(TX1DATE)
        )
      )

    # Initialize cohort
    prematching_cohort <- create_usrds_cohort(df = initial_cohort,
                                              start_date = "cohort_join_date",
                                              end_date = "terminal_date") %>%
      # Add cirrhosis
      add_cohort_covariate(covariate_data_frame = comorbidity_diagnosis_date[["cirrhosis"]],
                           covariate_date = "date_established",
                           covariate_variable_name = "cirrhosis") %>%
      # Add CMV
      add_cohort_covariate(covariate_data_frame = comorbidity_diagnosis_date[["CMV"]],
                           covariate_date = "date_established",
                           covariate_variable_name = "CMV") %>%
      # Add diabetes
      add_cohort_covariate(covariate_data_frame = comorbidity_diagnosis_date[["Diabetes"]],
                           covariate_date = "date_established",
                           covariate_variable_name = "diabetes") %>%
      # Add HIV
      add_cohort_covariate(covariate_data_frame = comorbidity_diagnosis_date[["HIV"]],
                           covariate_date = "date_established",
                           covariate_variable_name = "HIV") %>%
      # Add liver transplant
      add_cohort_covariate(covariate_data_frame = comorbidity_diagnosis_date[["Liver transplant"]],
                           covariate_date = "date_established",
                           covariate_variable_name = "liver_transplant") %>%
      # Add lung transplant
      add_cohort_covariate(covariate_data_frame = comorbidity_diagnosis_date[["Lung transplant"]],
                           covariate_date = "date_established",
                           covariate_variable_name = "lung_transplant") %>%
      # Add heart transplant
      add_cohort_covariate(covariate_data_frame = comorbidity_diagnosis_date[["Heart transplant"]],
                           covariate_date = "date_established",
                           covariate_variable_name = "heart_transplant") %>%
      # Add pancreas transplant
      add_cohort_covariate(covariate_data_frame = comorbidity_diagnosis_date[["Pancreas transplant"]],
                           covariate_date = "date_established",
                           covariate_variable_name = "pancreas_transplant") %>%
      # Add heart-lung transplant
      add_cohort_covariate(covariate_data_frame = comorbidity_diagnosis_date[["Heart-lung transplant"]],
                           covariate_date = "date_established",
                           covariate_variable_name = "heartlung_transplant") %>%
      # Add intestinal transplant
      add_cohort_covariate(covariate_data_frame = comorbidity_diagnosis_date[["Intestinal transplant"]],
                           covariate_date = "date_established",
                           covariate_variable_name = "intestinal_transplant") %>%
      # Add time-varying information about transplant status (cumulative number of transplants)
      add_cohort_covariate(covariate_data_frame = tx_status,
                           covariate_date = "event_date",
                           covariate_variable_name = "cumulative_transplant_total",
                           covariate_value = "cumulative_transplant_total") %>%
      # Add time-varying information about transplant status (whether current graft is active or failed)
      add_cohort_covariate(covariate_data_frame = tx_status,
                           covariate_date = "event_date",
                           covariate_variable_name = "current_graft_status",
                           covariate_value = "graft_status") %>%
      # Add time-varying information about transplant status (date of most recent transplant)
      add_cohort_covariate(covariate_data_frame = tx_status %>% filter(graft_status == "Active"),
                           covariate_date = "event_date",
                           covariate_variable_name = "most_recent_transplant_date",
                           covariate_value = "event_date") %>%
      # Add time-varying information about transplant status (date of most recent graft failure)
      add_cohort_covariate(covariate_data_frame = tx_status %>% filter(graft_status == "Failed"),
                           covariate_date = "event_date",
                           covariate_variable_name = "most_recent_failure_date",
                           covariate_value = "event_date") %>%
      # Add Medicare current coverage
      add_cohort_covariate(covariate_data_frame = medicare_history,
                           covariate_date = "BEGDATE",
                           covariate_variable_name = "current_medicare_coverage",
                           covariate_value = "PAYER") %>%
      finalize_usrds_cohort(baseline_date_variable = "most_recent_transplant_date")
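As a rough illustration of what splitting a patient into time periods means, the sketch below cuts one follow-up interval at the date a covariate becomes established. The function `split_followup()` is hypothetical and far simpler than the usRds cohort functions: it handles one binary covariate, assumes the covariate stays present once established, and treats each row as running from `period_start` up to `period_end`.

```r
# Hypothetical helper: split one follow-up interval at a covariate change date
split_followup <- function(start, end, change_date, name) {
  if (is.na(change_date) || change_date >= end) {
    # Covariate is never established during follow-up
    out <- data.frame(period_start = start, period_end = end, value = FALSE)
  } else if (change_date <= start) {
    # Covariate was already established when follow-up began
    out <- data.frame(period_start = start, period_end = end, value = TRUE)
  } else {
    # Covariate is established partway through: one row before, one row after
    out <- data.frame(
      period_start = c(start, change_date),
      period_end   = c(change_date, end),
      value        = c(FALSE, TRUE)
    )
  }
  names(out)[3] <- name
  out
}

# Toy patient (hypothetical dates): cirrhosis established mid-follow-up
periods <- split_followup(
  start       = as.Date("2012-01-01"),
  end         = as.Date("2016-12-31"),
  change_date = as.Date("2014-05-10"),
  name        = "cirrhosis"
)
print(periods)

# A patient who is never diagnosed keeps a single row
never <- split_followup(as.Date("2012-01-01"), as.Date("2016-12-31"),
                        as.Date(NA), "cirrhosis")
```

Adding further covariates repeats this idea: each new change date cuts the existing rows again, so every row ends up describing a period in which all covariates are constant.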
The analysis then proceeds to the execute_matching component.
Other portions of the analysis
Setup: Defines global paths, data sources, cohort inclusion criteria, and analysis-wide constants.
Functions: Reusable helper functions for cohort construction, matching, costing, and modeling.
Execute matching: Implements risk-set–based greedy matching without replacement to construct the analytic cohort.
Post-match processing: Derives analytic variables, time-aligned cost windows, and follow-up structure after matching.
Modeling: Fits prespecified cost and outcome models using the matched cohort.
Tables: Summary tables and regression outputs generated from the final models.
Figures: Visualizations of costs, risks, and model-based estimates.
About: methods, assumptions, and disclosures
Source code
The code in this script creates the cohort of kidney transplant patients eligible for the study for use in other scripts. The full R script is available at:
R/create_cohort.R (https://github.com/VagishHemmige/Cryptococcus-cost_analysis/blob/master/R/create_cohort.R)
This R script file is itself reliant on the following helper files:
R/setup.R (https://github.com/VagishHemmige/Cryptococcus-cost_analysis/blob/master/R/setup.R)
R/functions.R (https://github.com/VagishHemmige/Cryptococcus-cost_analysis/blob/master/R/functions.R)