Introducing the oasis R package

oasis_logo

Introducing the oasis R package — a R package implementation of OASIS is Automated Statistical Inference for Segmentation (OASIS) currently hosted on github. The OASIS algorithm takes structural brain MRI studies with a FLAIR, T1, T2, and PD volume from patients with multiple sclerosis (MS) and produces OASIS probability maps of MS lesion presence, which can be thresholded into a binary lesion segmentation. With the oasis R package, you can make OASIS probability maps of lesion presence using the model described in the OASIS paper. You can also retrain an OASIS model using your own data and gold standard lesion segmentations and make predictions from this model. This blog post provides a detailed tutorial on how to use the oasis R package. The post is divided into four sections, namely:

  1. SETUP — setup for your workstation to use the oasis R package
  2. CREATING FEATURES FOR THE OASIS MODEL — using the oasis R package to create a dataframe(s) of OASIS feature vectors from an MRI study(ies) for training an OASIS model from your data
  3. TRAINING THE OASIS MODEL — training an OASIS model from your data
  4. OASIS PROBABILITY MAPS — using your trained OASIS model or the OASIS model from the paper to make an OASIS probability map of lesion presence

Many thanks to my package coauthors John Muschelli and Taki Shinohara, as well as Amanda Mejia who helped test the package. I especially could not have made this package without my coauthor John, who taught me how to make an R package and helped with each step along the way — John you are the best!

SETUP

The first thing that you need to do is install FSL, which we will be calling through R for image preprocessing and image smoothing using the fslr R package. It is a good idea to have the most up to date version of FSL possible. Instructions for installing FSL can be found here:

http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FslInstallation

After installing FSL, load the oasis package from github. The package oasis depends on the packages oro.nifti and fslr, so both of these packages will be loaded when you load oasis.

library(devtools)
install_github('emsweene/oasis')
library(oasis)

Next, determine the location of FSL on your machine. You can set the path for FSL for fslr, as follows:

options(fsl.path=’/path/to/fsl’)

CREATING FEATURES FOR THE OASIS MODEL

To train your model and make OASIS segmentations you need a FLAIR, T1, T2, and PD volume for each MRI study. For training the model you will also need a binary masks of `gold standard’ lesion segmentations for each study, that are in the same space as the T1 study. The code below loads in the images from a single MRI study using the oro.nifti package.

FLAIR = readNIfTI(‘/path/to/flair’, reorient = FALSE)
T2 = readNIfTI(‘/path/to/t2’, reorient = FALSE)
T1 = readNIfTI(‘/path/to/t1’, reorient = FALSE)
PD = readNIfTI(‘/path/to/pd’, reorient = FALSE)
GOLD_STANDARD = readNIfTI(‘/path/to/gold_standard’, reorient = FALSE)

We can visualize the data by plotting the T1 volume with the ‘gold standard’ manual lesion segmentation overlayed onto the image.

overlay = GOLD_STANDARD
overlay[GOLD_STANDARD == FALSE] = NA
orthographic(T1, overlay, col.y = "orange")

Rplot_1

Next you can prepare the data for fitting the OASIS model using the function oasis_train_dataframe. This function returns a dataframe of the features for the OASIS model for voxels that are part of the voxel selection procedure. The output of the oasis_train_dataframe function can be used with the oasis_training function. In the case that you have not preprocessed the data, you can use the following code that will preprocess the data for you. The preprocessing steps are image inhomogeneity correction and rigid registration to the space of the T1 volume. Please note that if you have already preprocessed or skull stripped your data before analysis the preprocessing steps from FSL may fail. So it is best to only call this on data that has not yet been preprocessed.

study_1_dataframe = oasis_train_dataframe(flair = FLAIR,
t1 = T1,
t2 = T2,
pd = PD,
gold_standard = GOLD_STANDARD,
brain_mask = NULL,
preproc = TRUE,
normalize = TRUE,
cores = 4)

Here we have set brain_mask = NULL, which runs the FSL brain extraction tool with fslr and creates a brain mask. If you have already have skull stripped your data, running the brain extraction tool may produce poor results. We have also set preproc= TRUE, which means that the data will be preprocessed using FSL. We have set normalize = TRUE, which means that the z-score normalization over the brain mask will be performed on all images, as described in the paper.  Setting cores = 4 allows some of image preprocessing steps and feature extraction to be run in parallel.  The code for this package is written in a way so that no more than 4 cores are ever used, therefore setting cores any higher than 4 will not speed up the process.

names(study_1_dataframe)
##[1] "FLAIR" "T1" "T2" "PD" "FLAIR_10"
##[6] "T1_10" "T2_10" "PD_10" "FLAIR_20" "T1_20"
##[11] "T2_20" "PD_20" "GoldStandard"

Setting return_preproc = TRUE returns the preprocessed data and allows you to quality check the preprocessing. In the case where you set return_preproc = TRUE, the function oasis_train_dataframe will return a list containing a dataframe for use with the oasis_training function, the FLAIR volume, the T1 volume, the T2 volume, the PD volume, the brain mask for the subject, and the voxel selection mask.

study_1_dataframe_preproc = oasis_train_dataframe(flair = FLAIR,
t1 = T1,
t2 = T2,
pd = PD,
gold_standard = GOLD_STANDARD,
brain_mask = NULL,
preproc = TRUE,
normalize = TRUE,
return_preproc = TRUE,
cores = 4)

names(study_1_dataframe_preproc)
##[1] "oasis_dataframe" "flair" "t1" "t2"
##[5] "pd" "brain_mask" "voxel_selection"

names(study_1_dataframe_preproc[[1]])
##[1] "FLAIR" "T1" "T2" "PD" "FLAIR_10"
##[6] "T1_10" "T2_10" "PD_10" "FLAIR_20" "T1_20"
##[11] "T2_20" "PD_20" "GoldStandard"

Often, you may only have a subset of slices with gold standard segmentations. You can also pass in the slice numbers and the orientation of the slices using the slices and orientation option below.

study_1_dataframe_slices = oasis_train_dataframe(flair = FLAIR,
t1 = T1,
t2 = T2,
pd = PD,
gold_standard = GOLD_STANDARD,
brain_mask = NULL,
preproc = TRUE,
normalize = TRUE,
slices = c(80,90,100),
orientation = "axial",
cores = 4)

If you have your own brain mask in the space of the T1 volume and/or you have processed your own data, you can use the following code:

BRAIN_MASK = readNIfTI(‘/path/to/brain_mask’ , reorient = FALSE)

study_1_dataframe_brain_mask= oasis_train_dataframe(flair = FLAIR,
t1 = T1,
t2 = T2,
pd = PD,
gold_standard = GOLD_STANDARD,
brain_mask = BRAIN_MASK,
preproc = FALSE,
cores = 4)

TRAINING THE OASIS MODEL

In the case where you have data from just one subject, you can train the OASIS model as shown below. The oasis_training function works on objects returned from oasis_train_dataframe . Then function oasis_training produces a glm object, which can be used to make probability maps of lesion presence on future MRI studies.

oasis_model_one_subject = oasis_training(study_1_dataframe)

In the case where you have multiple studies, say study_1_dataframe, study_2_dataframe, and study_3_dataframe, which are all outputs of the function oasis_train_dataframe, then you can just run the function on all of the dataframes:

oasis_model_multi_subject = oasis_training (study_1_dataframe,
study_2_dataframe, study_3_dataframe)

When you have returned a list from oasis_train_dataframe that contains the preprocessed images, you need to set the option remove_preproc = TRUE on the oasis function, so that these images are removed before fitting the model.  For example, we returned the preprocessed images in study_1_dataframe_preproc, therefore we would run

oasis_model_one_subject = oasis_training (study_1_dataframe_preproc, remove_preproc = TRUE)

OASIS PROBABILITY MAPS

Making predictions from the OASIS model can be done using the oasis_predict function. To make predictions using the model from the paper, we can use the option model = NULL. Again, we have the option to preprocess the data, using the option preproc = TRUE.

oasis_maps_paper_model = oasis_predict(flair = FLAIR ,
t1 = T1,
t2 = T2,
pd = PD,
brain_mask = NULL,
preproc = TRUE,
normalize = TRUE,
model = NULL,
cores = 4)

We can also make predictions with our own model, using the model option.

oasis_maps_user_model = oasis_predict(flair = FLAIR,
t1 = T1,
t2 = T2,
pd = PD,
brain_mask = NULL,
preproc = TRUE,
normalize = TRUE,
model = oasis_model_one_subject,
cores = 4)

In addition, you can also return the binary lesion segmentation created from the OASIS probability map, by setting the parameter binary = TRUE. The default for this segmentation is created using the threshold of 0.16 in the OASIS paper (this threshold has a Dice Similarity Coefficient of 0.61 on the validation set in the paper). You can also supply your own threshold value using the threshold parameter.

oasis_maps_user_model_binary = oasis_predict(flair = FLAIR,
t1 = T1,
t2 = T2,
pd = PD,
brain_mask = NULL,
preproc = TRUE,
normalize = TRUE,
model = oasis_model_one_subject,
binary = TRUE,
threshold = 0.50,
cores = 4)

We now look at the overlaid binary segmentation on the T1 volumes,

names(oasis_maps_user_model_binary)
##[1] "oasis_map" "binary_map"

overlay = oasis_maps_user_model_binary$binary_map
overlay[oasis_maps_user_model_binary$binary_map == FALSE] = NA
orthographic(T1, overlay, col.y = "orange")

hi

This concludes the tutorial for oasis R package. If you have any questions or suggestions for functionality for the package, please leave them in the Comments section below.

Coursera Motivation

Over the past year, I have signed up for over 10 Coursera classes.   My research involves working with Neuroimaging data and I want to take Coursera classes in order to learn some of the basics of Neuroscience.   I’m tired of being in meetings with collaborators and referring to parts of the brain by frantically gesturing to that point on my head.  And please do not tell anyone that I imagine the blood brain barrier as a trash bag covering the brain, because I have no better biological intuition to go on.

I love the idea of Coursera, but more specifically I really love the idea when I sign up for the courses months in advance.  Whenever the courses start I always come up with a slew of excuses.  But this time, something is different.  A few months ago, I emailed my dad about a course I though we would both enjoy: Drugs and the Brain by Henry A. Lester at the California Institute of Technology.   My dad is a retired chemist from a pharmaceutical company, so this is where our interests overlap.  And when the course started, I did what I always do and made up excuses. Until I go this email:

To: Elizabeth Sweeney
From: Daniel Sweeney

Subject: Graduate School at CalTech

Elizabeth,

I passed my quiz with a 17.5/22 for the first week at Caltech and I am now off to a bar with some of my brighter classmates to celebrate until Saturday.  I hope you and some of the other students manage to pass your quiz.

Best of luck,

Dad

Oh snap. Ever confident, I started watching the lectures and replied:

To: Daniel Sweeney
From: Elizabeth Sweeney
Subject: Re: Graduate School at CalTech

Dad,

I’m done with 3 of the 11 mini lectures.  Myself and the other more studious members of our class are taking our time with the material and intend to score a bit higher on this weeks quiz. Have fun at the bar with the rest of the C students!

Elizabeth

And today I am proud to say that I have finished my first week of Coursera lectures.   But I need to be a little more careful when I trash talk the C students in this course.  This week’s lectures and quiz proved difficult for me, as they required a background in Chemistry.  On my first quiz attempt I scored an 11.75/22.  After reviewing my mistakes and going over some of the material again, I took the quiz a second time and scored  a 20.50/22.  Not so bad (although all the questions were exactly the same…).  I am happy that I am picking up many of the broader topics from the course, which I think will be useful.  For example I now know that the blood brain barrier is made up of tight junctions between endothelial cells (not a trash bag).  Polar molecules and proteins cannot diffuse across the blood brain barrier. But non polar molecules can, such as nicotine. I ‘m planning to continue with the course, as I am sure to receive more motivating emails from my dad along the way!

Converting .dcm to .nii Image Format in R

## This code reads a dicom image, converts it to a nifit and then writes the nifit file.
##  Requires the packages oro.dicom and oro.nifti.
##
## Elizabeth Sweeney
## August 26th 2013

library(oro.dicom)
library(oro.nifti)

dicom <- readDICOM(‘~/folder’, flipud = FALSE)
nifti <- dicom2nifti(dicom)
writeNIfTI(nifti, filename = “nifti”, verbose = TRUE, gzipped = TRUE)

From LaTeX to Word

Update: HBM now accepts LaTeX submissions (see comment below from A.M. Winkler). 

A recent submission to Human Brain Mapping required us to change a document from .tex to .doc.  This is not a rant about journals submissions in .doc format, but is instead a few tips for making the process a little less painful.  Thanks to John Muschelli and Leonardo Collado Tores for help with this!

Use Pandoc to convert your LaTeX document to Word and then convert your references from BibTeX to Endnote:

1) Install Pandoc http://johnmacfarlane.net/pandoc/installing.html

2) Open a terminal.  Go to the folder with your LaTeX document and run this from the command line:

pandoc -s -S Original_document.tex -o New_document.docx

3) Install Endnote (you can get a free 30 day trial from this website) http://endnote.com

4) This website has a great BibTeX to Endnote converter that is easy to use: http://sydney.edu.au/engineering/it/~tapted/bib2endnote.html

We still had to go through the document and double check the equations, add all the references by hand (groan…) and fix the tables.  But these steps  really help to shorten the job.