This function loads a USRDS data file using the internal registry and optionally applies:
- Factor labels using format definitions from Appendix C
- Variable labels using descriptions from Appendix B
Usage
load_usrds_file(
file_key,
factor_labels = TRUE,
var_labels = FALSE,
usrds_ids = NULL,
col_select = NULL,
...
)
Arguments
- file_key
Character. File key (e.g., "PATIENTS", "RXHIST"). Case-insensitive match to registered file names.
- factor_labels
Logical. Whether to apply factor labels (from Appendix C). Default = TRUE.
- var_labels
Logical. Whether to apply variable labels (from Appendix B). Default = FALSE.
- usrds_ids
Optional. A vector of USRDS_IDs to retain in the file. Only applied if the file contains a
USRDS_ID column.
- col_select
Optional. A character vector or tidyselect expression for selecting columns to load.
- ...
Additional arguments passed to the file reader (e.g., as_factor for read_sas).
Value
A tibble with the contents of the selected file, optionally labeled with factors and variable descriptions.
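As a rough illustration of what "labeled" means for a column, the base R sketch below applies a factor label and a variable label to a toy data frame. It is not the package's implementation: the column GRP, its codes, and the label text are hypothetical and do not come from Appendix B or C, and storing the variable description in a "label" attribute is an assumption based on a common R convention.

```r
# Hypothetical coded column and format definition (not from Appendix B or C).
toy <- data.frame(GRP = c("1", "2", "1"), stringsAsFactors = FALSE)
fmt <- c("1" = "Group one", "2" = "Group two")

# Factor labels: map the stored codes to descriptive factor levels.
toy$GRP <- factor(toy$GRP, levels = names(fmt), labels = unname(fmt))

# Variable label: attach a description as a "label" attribute
# (assumed convention; the package may store descriptions differently).
attr(toy$GRP, "label") <- "Hypothetical grouping variable"
```

With factor_labels = TRUE a coded column would be expected to look like toy$GRP above, and with var_labels = TRUE it would additionally carry a description.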
Details
All variable names are standardized to uppercase after loading.
col_select supports tidyselect syntax for CSV and SAS files, and is also available for
Parquet files (resolved automatically to column names).
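The name standardization described above, together with the usrds_ids filter from the Arguments section, can be sketched in base R on a toy data frame. This is a minimal sketch under stated assumptions, not the function's actual code: the data values are made up, and keep_ids is a hypothetical stand-in for the usrds_ids argument.

```r
# Toy stand-in for a freshly read file; the values are invented for illustration.
toy <- data.frame(
  usrds_id = c("100000123", "100000456", "100000789"),
  cdeath   = c("A", "B", "C"),
  stringsAsFactors = FALSE
)

# Standardize all variable names to uppercase after loading.
names(toy) <- toupper(names(toy))

# Retain only the requested IDs, and only if a USRDS_ID column is present.
keep_ids <- c("100000123", "100000456")  # hypothetical stand-in for usrds_ids
if ("USRDS_ID" %in% names(toy)) {
  toy <- toy[toy$USRDS_ID %in% keep_ids, , drop = FALSE]
}
```

Because names are uppercased after loading, columns in the returned tibble are addressed as USRDS_ID or CDEATH regardless of how they are cased in the source file.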
Examples
if (FALSE) {
# Load the PATIENTS file and apply factor labels (Appendix C)
df <- load_usrds_file("PATIENTS", factor_labels = TRUE)
# Load the RXHIST file with both factor and variable labels
df <- load_usrds_file("RXHIST", factor_labels = TRUE, var_labels = TRUE)
# Select only columns starting with "CD" (works for all file types)
df <- load_usrds_file("PATIENTS", col_select = dplyr::starts_with("CD"))
# Load only specific columns by name
df <- load_usrds_file("PATIENTS", col_select = c("USRDS_ID", "CDEATH"))
# Filter to a subset of USRDS_IDs
df <- load_usrds_file("RXHIST", usrds_ids = c("100000123", "100000456"))
}
