In this post I will demonstrate how R has helped me, in my role as a clinical research coordinator, complete daily tasks and make data-informed decisions. My goal is to share these experiences to shine a light on how R/RStudio can improve workflow and decision making in the workplace.
As a brief note, some of the programming power in R can be generalized to other languages and software, but R is my chosen language and RStudio my chosen Integrated Development Environment (IDE). Like many R users, I do not use R outside of an IDE.
Reusability
The beauty of writing programming scripts is that you can run the script repeatedly to produce the same result, and after loading new data into the environment you get the same structured output with the updated information. For instance, the REDCap system we use for data management tracks scheduling information. Loading these data in R makes it easy to query the large data table with strict parameters. In this example I want to see incomplete 6-month follow-up cases whose closing window ended before today. We'll consider these hypothetical cases no longer eligible for assessment. Below is pseudo-code and output to give an idea of what this looks like in R.
Script.R
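library(dplyr)
library(lubridate)
# Keep the scheduling columns, then keep incomplete 6-month cases whose window has closed
data |>
  select(record_id, rater, assessment, end_date, completed, date_scheduled) |>
  filter(assessment == "6-month", end_date < today(), completed == "incomplete")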
Console
record_id  rater    assessment  end_date    completed   date_scheduled
2000       Michael  6-MO        01-22-2024  Incomplete  NA
2011       Michael  6-MO        03-02-2024  Incomplete  NA
The code returns all the selected columns, filtered to the rows that meet the specified conditions. I now know there are two missed cases that I do not have to contact to schedule, because the current date is past the end of their window. I could also simply filter these out of the data set, then run a query to see the 6-month cases I do need to schedule. I can write as much syntax as I need to find the information that helps me schedule, without spending a lot of time reading through a large data table. Further, when the case load from the above inquiry changes, the script remains the same. After transferring the new data from REDCap, I hit run on the R script and get the new case load. Creating these kinds of scripts early in the project can save a lot of time later.
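To make the idea concrete, here is a minimal runnable sketch of that kind of reusable query. The data frame, the third record, the `open_six_month()` helper, and the `status` column are all hypothetical and only for illustration; in practice the table would come from a REDCap export. A fixed `as_of` date is passed so the result is reproducible, but leaving it out falls back to `today()`.

```r
library(dplyr)
library(lubridate)

# Hypothetical scheduling table standing in for a REDCap export.
scheduling <- data.frame(
  record_id      = c(2000, 2011, 2042),
  rater          = c("Michael", "Michael", "Michael"),
  assessment     = c("6-month", "6-month", "6-month"),
  end_date       = as.Date(c("2024-01-22", "2024-03-02", "2024-08-15")),
  completed      = c("incomplete", "incomplete", "incomplete"),
  date_scheduled = as.Date(c(NA, NA, NA))
)

# Hypothetical helper: the same query runs unchanged on any new export.
open_six_month <- function(data, as_of = today()) {
  data |>
    select(record_id, rater, assessment, end_date, completed, date_scheduled) |>
    filter(assessment == "6-month", completed == "incomplete") |>
    # Label each open case by whether its window has already closed.
    mutate(status = if_else(end_date < as_of, "missed", "needs scheduling"))
}

cases       <- open_six_month(scheduling, as_of = as.Date("2024-06-01"))
missed      <- filter(cases, status == "missed")
to_schedule <- filter(cases, status == "needs scheduling")
```

Wrapping the query in a function is one design choice among several; a plain script that is re-run after each export works just as well.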
Reporting
There are several ways to create reports and dashboards using RStudio. The most popular are R Markdown, Quarto (the new R Markdown), and Shiny. Each has its strengths for creating reproducible HTML files, PDFs, Word documents, Excel files, web applications, etc. Developers are consistently pushing new features for Quarto and Shiny. Quarto also works with Python and Julia. Shiny works with Python too.
REDCap covers most of the necessary reports for Project SUCCESS. With that said, R has proven to be a handy tool for reporting not only on outcomes data that involve more advanced calculations, but also on other aspects of the project related to my role that do not live within the REDCap ecosystem.
R packages (extensions)
Because R is an open-source language, there are thousands of packages built for use cases ranging from the very broad to the extremely specific. In fact, there are packages that make a direct connection between R and REDCap (redcapAPI and REDCapR). For many popular software applications there is an R package to make API calls, like the packages for Teams (Microsoft365R and teamR) and Qualtrics (qualtRics). In both research and management/coordination roles we often work with date and time variables, and the lubridate package remains undefeated for formatting and manipulating them.
Recently, the team wanted to know the average time spent completing part 1 and part 2 assessments by rater. We have all the assessment recordings in a Teams folder, and each file name follows the format "recordId_assessmentNumber_assessmentPart_rater_date.mp4".
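As a small illustration of the kind of date handling lubridate makes easy, the sketch below parses a text date and works out a follow-up window. The enrollment date and the 30-day window on either side of the target are hypothetical values chosen only for the example, not the project's actual rules.

```r
library(lubridate)

# Hypothetical enrollment date stored as text in month-day-year order.
enrolled <- mdy("07-22-2023")

# %m+% adds calendar months without rolling past the end of a short month.
six_month_target <- enrolled %m+% months(6)

# Hypothetical 30-day window on either side of the target date.
window_open  <- six_month_target - days(30)
window_close <- six_month_target + days(30)

# Check whether a given date still falls inside the window.
check_date <- as.Date("2024-02-01")
in_window  <- check_date >= window_open & check_date <= window_close
```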
After downloading the assessment recording data from Teams I was able to (1) parse the file names to grab the values, (2) use the lubridate package to apply math functions to the duration of videos, then (3) group and filter as needed to find the average time spent by assessment part and individual rater. After creating a few tables to answer the question with different breakdowns, the data was exported to multiple sheets in Excel using the openxlsx package.
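The three steps above can be sketched roughly as follows. This is an illustration under assumptions, not the exact script I used: the `recordings` table, its `duration` column stored as "H:M:S" text, the rater "RaterB", the sheet names, and the output file name are all hypothetical.

```r
library(dplyr)
library(lubridate)
library(openxlsx)

# Hypothetical listing of recordings: file name plus duration as "H:M:S" text.
recordings <- data.frame(
  file_name = c(
    "2000_1_part1_Michael_2024-01-10.mp4",
    "2000_1_part2_Michael_2024-01-10.mp4",
    "2011_1_part1_Michael_2024-02-05.mp4",
    "2015_1_part1_RaterB_2024-02-07.mp4"
  ),
  duration = c("00:42:10", "00:35:30", "00:47:50", "00:40:00")
)

# (1) Drop the ".mp4" extension, then split each file name on "_".
parts <- do.call(rbind, strsplit(sub("\\.mp4$", "", recordings$file_name), "_"))
colnames(parts) <- c("record_id", "assessment_number", "assessment_part", "rater", "date")

# (2) Convert the "H:M:S" text to minutes so it can be averaged.
parsed <- cbind(as.data.frame(parts), duration = recordings$duration) |>
  mutate(minutes = period_to_seconds(hms(duration)) / 60)

# (3) Average time by assessment part and rater, and by part alone.
by_part_rater <- parsed |>
  group_by(assessment_part, rater) |>
  summarise(mean_minutes = mean(minutes), n = n(), .groups = "drop")

by_part <- parsed |>
  group_by(assessment_part) |>
  summarise(mean_minutes = mean(minutes), n = n(), .groups = "drop")

# A named list of data frames is written as one sheet per element.
out_file <- file.path(tempdir(), "assessment_durations.xlsx")
write.xlsx(
  list(by_part_rater = by_part_rater, by_part = by_part),
  file = out_file,
  overwrite = TRUE
)
```

Writing to `tempdir()` keeps the sketch from touching a real project folder; point `out_file` at wherever the workbook should live.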
Conclusion
I discussed a couple of use cases where bringing R/RStudio into the tech stack helped keep me informed on my own work and the work of others. R works well as complementary software to REDCap and Teams (R could not replace either of those). It goes beyond our preconceptions of the tool as statistical software, thanks to the packages that allow for integration, data processing, reporting, and organization. To reiterate, some benefits include comparing data from independent data pools, writing flexible and reusable scripts, reporting data in various forms, and the enormous number of special-purpose packages.
Michael Rejtig, Clinical Research Coordinator
Connect with me on LinkedIn