
This function loads a USRDS data file using the internal registry and optionally applies:

  • Factor labels using format definitions from Appendix C

  • Variable labels using descriptions from Appendix B

Usage

load_usrds_file(
  file_key,
  factor_labels = TRUE,
  var_labels = FALSE,
  usrds_ids = NULL,
  col_select = NULL,
  ...
)

Arguments

file_key

Character. File key (e.g., "PATIENTS", "RXHIST"). Case-insensitive match to registered file names.

factor_labels

Logical. Whether to apply factor labels (from Appendix C). Default = TRUE.

var_labels

Logical. Whether to apply variable labels (from Appendix B). Default = FALSE.

usrds_ids

Optional. A vector of USRDS_ID values; only rows with a matching ID are retained. Applied only if the file contains a USRDS_ID column.

col_select

Optional. A character vector or tidyselect expression for selecting columns to load.

...

Additional arguments passed to the file reader (e.g., as_factor for read_sas)
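
The behavior of file_key and usrds_ids described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: registry, resolve_file_key(), and filter_usrds_ids() are hypothetical names, and the real registry structure is not described on this page.

```r
# Hypothetical registry mapping upper-case file keys to paths (illustrative only)
registry <- c(PATIENTS = "patients.sas7bdat", RXHIST = "rxhist.parquet")

# Case-insensitive lookup of a file key in the registry
resolve_file_key <- function(file_key, registry) {
  key <- toupper(file_key)
  if (!key %in% names(registry)) {
    stop("Unknown file key: ", file_key, call. = FALSE)
  }
  key
}

# Keep only rows whose USRDS_ID is in usrds_ids; return the data unchanged
# when no IDs are given or the column is absent
filter_usrds_ids <- function(df, usrds_ids = NULL) {
  if (is.null(usrds_ids) || !"USRDS_ID" %in% names(df)) {
    return(df)
  }
  df[df$USRDS_ID %in% usrds_ids, , drop = FALSE]
}

# Toy data standing in for a loaded file
patients <- data.frame(USRDS_ID = c(1, 2, 3), SEX = c("M", "F", "F"))

resolve_file_key("patients", registry)
filter_usrds_ids(patients, usrds_ids = c(1, 3))
```

In this sketch a file without a USRDS_ID column passes through unfiltered, which mirrors the "only applied if" wording of the usrds_ids argument.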

Value

A tibble with the contents of the selected file, optionally with factor labels and variable labels applied.

Details

All variable names are standardized to uppercase after loading.
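
As a small illustration of the uppercase standardization, here is the equivalent base R operation on a toy data frame. The column names are made up for the example, and the package may implement this step differently.

```r
# Toy data frame with mixed-case column names (hypothetical columns)
df <- data.frame(usrds_id = c(1, 2), Cdeath = c("A", "B"))

# Standardize all variable names to uppercase
names(df) <- toupper(names(df))

names(df)
# "USRDS_ID" "CDEATH"
```

Downstream code should therefore refer to columns by their uppercase names, whatever the case in the source file.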

col_select accepts tidyselect syntax for CSV and SAS files. It also works for Parquet files, where the selection is resolved to column names automatically.
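
One way a tidyselect expression can be resolved to plain column names is with tidyselect::eval_select() against a zero-row prototype of the file's columns. The sketch below assumes the tidyselect and rlang packages are installed; resolve_cols() and the prototype columns are hypothetical, and this is not necessarily how the package handles Parquet files.

```r
library(tidyselect)

# Zero-row prototype standing in for a file's schema (hypothetical columns)
proto <- data.frame(
  USRDS_ID = numeric(0),
  CDEATH   = character(0),
  CDTYPE   = character(0),
  SEX      = character(0)
)

# Resolve a tidyselect expression (or character vector) to column names
resolve_cols <- function(col_select, data) {
  pos <- eval_select(rlang::enquo(col_select), data)
  names(pos)
}

resolve_cols(starts_with("CD"), proto)
resolve_cols(c("USRDS_ID", "SEX"), proto)
```

Resolving against a prototype means no rows need to be read before the column list is known.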

Examples

if (FALSE) {
# Load the PATIENTS file and apply factor labels (Appendix C)
df <- load_usrds_file("PATIENTS", factor_labels = TRUE)

# Load the RXHIST file with both factor and variable labels
df <- load_usrds_file("RXHIST", factor_labels = TRUE, var_labels = TRUE)

# Select only columns starting with "CD" (works for all file types)
df <- load_usrds_file("PATIENTS", col_select = dplyr::starts_with("CD"))

# Load only specific columns by name
df <- load_usrds_file("PATIENTS", col_select = c("USRDS_ID", "CDEATH"))

# Filter to a subset of USRDS_IDs
df <- load_usrds_file("RXHIST", usrds_ids = c("100000123", "100000456"))
}