The transplantr package provides a set of vectorised functions for audit and clinical research in solid organ transplantation. These are particularly intended to work well with multiple datapoints in large series of data, where manual calculations would be particularly tedious.
The functions provided fall into four groups:
Although the package was built with unit tests, inaccuracies cannot be completely excluded. it is not a medical device and should not be used for making clinical decisions. References to the publications of the formulae are provided in the vignettes and documentation of individual functions.
transplantr can be installed from CRAN:
# install transplantr
install.packages("transplantr")
# load transplantr once installed
library(transplantr)
The development version can be installed from GitHub, if you want all the latest features, together with all the latest bugs and errors. Installing from CRAN is the best option for most users as the submitted packages have to pass some very pedantic automated tests before they can be hosted on CRAN. If you do want the caveat emptor, you have been warned version, this is how:
As vectorised functions, the functions can be applied across a whole dataset fairly rapidly. I find that the easiest way to do this is using a “pipe” of functions from the dplyr package. dplyr can be installed on its own or, as I would recommend, by installing the whole tidyverse family of packages - a family which includes the legendary ggplot2 graphing package.
# install whole tidyverse
install.packages("tidyverse")
# install just dplyr
install.packages("dplyr")
Although recommended, dplyr is not necessary for most transplantr functions to work. dplyr is needed for the EPTS and KDPI functions, and additionally stringr is needed for the hla_mm_level_str()
function and also for the chi2dob()
function, one unlikely to be needed by anyone working outside Scotland!
By default, all the functions work with the units most commonly used in the UK, which for creatinine and bilirubin is µmol/l, but each function using either of these can be used with mg/dl instead by changing an optional units
parameter to "US"
or by calling a wrapper function suffixed with _US()
; e.g. when calculating eGFR, the ckd_epi_US()
function calls ckd_epi()
using creatinine in mg/dl.
Albumin is generally reported in g/l in the UK, but more commonly as g/dl in the US. The few functions using albumin default to g/l but change to g/dl if the units
parameter is set to "US"
or the _US()
wrapper function is called.
Which is the best option to use? Calling the wrapper function uses fewer keystrokes so is quicker to type, but as it is a function calling another function, there is a slight increase in computational overhead; for example, benchmarking the time taken to calculate 100,000 eGFR results, the median time was 106 milliseconds with ckd_epi()
and 108 milliseconds using the more convenient ckd_epi_US()
.
Let’s say you want to calculate MELD scores for a series of liver transplant candidates. OK, you probably actually want MELD-Na, but let’s go with MELD as it has fewer variables! The data is in a dataframe or tibble called “oltx.assessments” and the relevant variables are Patient.INR, Patient.Bilirubin, Patient.Creatinine and Patient.Dialysed. To add a new Patient.MELD variable to the dataframe, you would use a dplyr pipe with the mutate()
verb:
oltx.assessments <- oltx.assessments %>%
mutate(Patient.MELD = meld(INR = Patient.INR, bili = Patient.Bilirubin,
creat = Patient.Creatinine, dialysis = Patient.Dialysed, units = "SI"))
The units = "SI"
can be left out provided that creatinine and bilirubin are both in µmol/l. To switch to mg/dl, use units = "US"
or call meld_US()
instead.
Although I think dplyr makes life much easier when organising data, I concede that some people prefer to use base R functions instead. Using a vectorised function with multiple vector inputs is not easy in base R but can be done with the mapply()
, or more easily with the pmap_dbl()
from the purrr package.
# attach oltx.assessments to save a lot of typing!
attach(oltx.assessments)
# method using pmap_dbl()
oltx.assessments$Patient.MELD = pmap_dbl(list(Patient.INR, Patient.Bilirubin,
Patient.Creatinine, Patient.Dialysed),
meld, units = "SI")
# alternative method using mapply()
oltx.assessments$Patient.MELD = mapply(FUN = meld, Patient.INR, Patient.Bilirubin,
Patient.Creatinine, Patient.Dialysed,
MoreArgs = list(units = "SI"),
SIMPLIFY = TRUE)
# detach oltx.assessments to avoid namespace errors
detach(oltx.assessments)
The advantage of using a dplyr pipe, apart from easier code, is speed. Benchmarking on a basic Linux laptop showed that the median time to perform vectorised calculation of 100,000 MELD scores was 115 milliseconds, compared with 6007 milliseconds using `pmap_dbl()
and 6484 with mapply()
.
Due to rounding errors, functions may return slightly different values to those from other calculators, including online calculators and even using R and transplantr on different computers. But if you encounter a definite bug, please file a report with minimal reproducible example of your R code at GitHub. Requests for new features can also be made on GitHub.