
The strobe package provides tools for tracking cohort inclusion/exclusion in observational studies using a STROBE-inspired flow. STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) is a widely adopted guideline for transparent reporting of cohort, case-control, and cross-sectional studies. Learn more at the official website: https://www.strobe-statement.org/.

The strobe package applies filtering criteria to a patient-level data frame and tracks the number of patients included and excluded at each step. It can also print and/or save the final STROBE diagram for inclusion in a publication.

By logging each filtering step—conditions applied, number remaining, and number excluded—strobe helps ensure reproducibility, clarity, and publication-quality documentation of cohort derivation.

Installation

You can install the development version of strobe from GitHub with:

# install.packages("devtools")
devtools::install_github("VagishHemmige/strobe")

Example

Note: The example below requires the medicaldata package. Install it with:

install.packages("medicaldata")

The code below initializes the cohort, applies two filters in sequence, and then prints the strobe log with the get_strobe_log() function.

library(strobe)
library(medicaldata)
data(cytomegalovirus)

df <- strobe_initialize(cytomegalovirus, "Initial CMV transplant cohort")
df <- strobe_filter(df, "age >= 30", "Age ≥ 30", "Excluded: Age < 30")
df <- strobe_filter(df, "recipient.cmv == 1", "Recipient CMV positive", "Excluded: CMV negative")

get_strobe_log()
#> # A tibble: 3 × 7
#>   id    parent inclusion_label         exclusion_reason filter remaining dropped
#>   <chr> <chr>  <chr>                   <chr>            <chr>      <int>   <int>
#> 1 start <NA>   Initial CMV transplant… <NA>             <NA>          64      NA
#> 2 step1 start  Age ≥ 30                Excluded: Age <… age >…        63       1
#> 3 step2 step1  Recipient CMV positive  Excluded: CMV n… recip…        40      23
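The log can be sanity-checked with a few lines of base R: for every filtering step, the parent's remaining count should equal the step's remaining count plus its dropped count. The sketch below is only an illustration. It copies the counts by hand from the output above into a plain data frame (named `log` here for illustration) instead of calling the package, and it assumes the `parent` column always refers to an `id` in the same log.

```r
# Log values copied by hand from the printed output above
# (in practice this could be the tibble returned by get_strobe_log()).
log <- data.frame(
  id        = c("start", "step1", "step2"),
  parent    = c(NA, "start", "step1"),
  remaining = c(64L, 63L, 40L),
  dropped   = c(NA, 1L, 23L),
  stringsAsFactors = FALSE
)

# Look up each step's parent count; this is NA for the root row.
parent_remaining <- log$remaining[match(log$parent, log$id)]

# For every filtering step, the parent count should equal remaining + dropped.
# The root row has no parent, so it is treated as consistent by definition.
consistent <- is.na(log$parent) | parent_remaining == log$remaining + log$dropped

stopifnot(all(consistent))
```

If a step fails this check, the filter was probably applied to a different data frame than the one the log expects.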

Next, we plot the STROBE diagram for the cohort derived above and export it as an SVG file. In an interactive session, plot_strobe_diagram() can also render the diagram directly in the RStudio viewer, but rendering it directly inside a knitted .Rmd file may cause formatting issues, so the example below exports the file and includes it with knitr::include_graphics().

svg_file_1 <- plot_strobe_diagram(export_file = "man/figures/strobe-diagram_readme1_1.svg", 
                                incl_fontsize = 90, excl_fontsize = 90, 
                                lock_width_min = TRUE, 
                                incl_width_min = 20, excl_width_min = 15)
knitr::include_graphics("man/figures/strobe-diagram_readme1_1.svg")
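Whether plot_strobe_diagram() creates a missing output directory is not stated here, so it may be safer to make sure the target directory exists before exporting. The following base R sketch shows one way to do that. It uses a temporary directory purely for illustration, and the names `fig_dir` and `svg_path` are hypothetical.

```r
# Build the export path inside a temporary directory for illustration;
# a package README would more likely use "man/figures" directly.
fig_dir  <- file.path(tempdir(), "man", "figures")
svg_path <- file.path(fig_dir, "strobe-diagram_readme1_1.svg")

# Create the directory (and any missing parents) if it does not exist yet.
if (!dir.exists(fig_dir)) {
  dir.create(fig_dir, recursive = TRUE)
}

# svg_path could then be passed as the export_file argument shown above.
```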

The strobe package can also support terminal branching based on the value of a factor variable, if the factor variable has six or fewer distinct values:

# Add a terminal branch based on the value of the cgvhd variable
df <- create_terminal_branch(df, variable = "cgvhd", label_prefix = "CGVHD value:")
get_strobe_log()
#> # A tibble: 5 × 7
#>   id    parent inclusion_label         exclusion_reason filter remaining dropped
#>   <chr> <chr>  <chr>                   <chr>            <chr>      <int>   <int>
#> 1 start <NA>   Initial CMV transplant… <NA>             <NA>          64      NA
#> 2 step1 start  Age ≥ 30                Excluded: Age <… age >…        63       1
#> 3 step2 step1  Recipient CMV positive  Excluded: CMV n… recip…        40      23
#> 4 step3 step2  CGVHD value:0           <NA>             <NA>          22      NA
#> 5 step4 step2  CGVHD value:1           <NA>             <NA>          18      NA
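Because branching is limited to variables with six or fewer distinct values, it can help to check a candidate variable before calling create_terminal_branch(). The sketch below uses base R on a small made-up data frame. The helper `can_branch()` is hypothetical and not part of strobe, and ignoring missing values when counting levels is an assumption, not documented package behavior.

```r
# Hypothetical helper (not part of strobe): TRUE when a variable has few
# enough distinct non-missing values to be a candidate for terminal branching.
can_branch <- function(x, max_levels = 6L) {
  length(unique(x[!is.na(x)])) <= max_levels
}

# Made-up toy data, for illustration only.
toy <- data.frame(
  cgvhd = c(0, 1, 1, 0, NA, 1, 0, 1),
  age   = c(31, 45, 52, 38, 60, 47, 29, 66)
)

can_branch(toy$cgvhd)  # two distinct values, so branching is plausible
can_branch(toy$age)    # eight distinct values, so branching is not

# Per-level counts that a terminal branch would be expected to report,
# with missing values left out.
branch_counts <- table(toy$cgvhd, useNA = "no")
```

In the log above, the two branch counts (22 and 18) sum to the 40 patients remaining in the parent step, which suggests that no rows with a missing cgvhd value were present at that point.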

The plot_strobe_diagram function can handle terminal branching:

svg_file_2 <- plot_strobe_diagram(export_file = "man/figures/strobe-diagram_readme1_2.svg", 
                                incl_fontsize = 90, excl_fontsize = 90, 
                                lock_width_min = TRUE, 
                                incl_width_min = 20, excl_width_min = 15)
knitr::include_graphics("man/figures/strobe-diagram_readme1_2.svg")