Aug 14, 2015 - Computational Analysis of Flow Cytometry Data (PART II). When comparing two cell populations, the events from the different populations.
Flow cytometry data are numbers-rich. Data from experiments can be population measurements (percent of CD4+ cells, for example) or expression levels (median fluorescence of CD69 on activated T cells).

Many times, researchers are content to show histograms to illustrate their point after a flow experiment. This approach misses the opportunity to take that content-rich data and extend it into a statistical analysis.

To properly perform statistical analysis, the first step is to understand the hypothesis. The hypothesis will guide the statistical analysis, identifying the correct test to be performed.
There are several things that need to be considered when beginning the statistical analysis of the data.

1. Design your experiment properly from the start.

Statistical power is the probability of correctly rejecting the null hypothesis when the null hypothesis is false. There are three factors that influence the power of an experiment: the sample size, the spread of the data and the number of replicates.
The power of the experiment is related to the ability of the experiment to avoid statistical errors.

2. Know the classes of statistical errors and how to avoid them.

False positives (Type I errors) occur when a true null hypothesis is incorrectly rejected. False negatives (Type II errors) occur when the test fails to reject a false null hypothesis. In fact, the power of the experiment is defined as 1 − β, where β is the Type II error rate; this is equal to true positives/(true positives + false negatives).

3. Use the appropriate statistical test.

The biological hypothesis and experimental design will determine the appropriate test for the data. The distribution of the data is also important to consider. How best to determine the correct test? A statistics reference or decision chart can help you determine which test is most appropriate.

4. Set the appropriate threshold.

The α value is the threshold that will be used to determine if the data is statistically significant or not. For historical reasons, this value is usually set at 0.05. This can be interpreted as the chance of finding significance where there is none (i.e. the chance of committing a Type I error).

5. Avoid the more significant trap.

Once the α value is set, if the P-value is below that value, the data is statistically significant.
With a threshold of 0.05, the data is not more significant if the P-value is far below the threshold than if the P-value is 0.04. If there is a desire to decrease the chance of a Type I error, the threshold should be set to a more stringent level (0.01 or lower) before the analysis.

6. Avoid multiple pairwise comparisons.

In the case where the experimental design has Drug X, Drug Y and the combination of Drug X and Y, to be compared to an untreated sample, what is the best test?
Pairwise comparisons should not be performed in this case for the following reason. With the threshold set to 0.05, there is a 5% chance of committing a Type I error in a single comparison. With each additional comparison, the chance of committing at least one Type I error increases, as shown in the chart below (values assume independent comparisons).

Number of pairwise comparisons    Chance of a Type I error
2                                 10%
3                                 14%
4                                 19%
5                                 23%

At the end of the day, the statistical analysis of your flow cytometry data is a critical step for proving the validity of the hypothesis that was being tested. With a careful and considered approach to performing the correct testing, the published data will stand up to the rigors of peer review and help lead to another discovery.
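Two of the quantities discussed in the list above can be checked numerically. The following is a minimal sketch, not the author's method: it assumes independent comparisons for the Type I error calculation, and it approximates power with a two-sided, two-sample z-test with equal group sizes and a known, shared standard deviation. The function names and the example effect size are illustrative assumptions.

```python
from statistics import NormalDist


def familywise_error_rate(alpha: float, n_comparisons: int) -> float:
    """Chance of at least one Type I error across independent comparisons."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


def approx_power(effect: float, sd: float, n_per_group: int,
                 alpha: float = 0.05) -> float:
    """Approximate power (1 - beta) of a two-sided, two-sample z-test.

    Assumes equal group sizes, normally distributed data and a known,
    shared standard deviation; real experiments may need another model.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    # Expected size of the test statistic when the effect is real.
    shift = abs(effect) / (sd * (2.0 / n_per_group) ** 0.5)
    # Probability that the statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    for k in range(1, 6):
        rate = familywise_error_rate(0.05, k)
        print(f"{k} comparisons: {rate:.1%} chance of a Type I error")
    # Hypothetical effect of one standard deviation between two groups.
    for n in (3, 5, 10, 20):
        print(f"n = {n:2d} per group: power ~ {approx_power(1.0, 1.0, n):.2f}")
```

Under these assumptions, the first loop shows how the Type I error rate grows with the number of comparisons, and the second shows power rising as the number of samples per group increases.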
About Tim Bushnell

I enjoy answering paradigm-shifting questions and trouble-shooting puzzling glitches. I also like finding new ways to enhance old procedures. I’m passionate about my professional relationships and strive to fill them with positive energy. My other passions include grilling, wine tasting, and real food. To be honest, my biggest passion is flow cytometry, which is something that Carol and I share. My personal mission is to make flow cytometry education accessible, relevant, and fun. I’ve had a long history in the field starting all the way back in graduate school.
So the project was born out of frustration. 'We both have backgrounds in more traditional engineering disciplines,' Castillo-Hair said. 'I'm from mechanical engineering; John is from electrical engineering.
We're familiar with having numbers that actually mean something.' 'We're engineers at heart, and we need to characterize the things we build,' Sexton added. 'To date, in this field, we've largely been playing in our own sandboxes, which is very frustrating for somebody who's trying to take things that other people have made and use them in your own research.
It was clear to us that we needed to use the same yardstick everywhere.' 'This software is a big step,' Castillo-Hair said. 'It allows you to refer gene expression units to a common number that you can very easily compare to what other people have measured in your lab, even when completely different instrument settings have been used. With a few additional steps, data from different instruments and labs can become comparable as well.' To make it accessible to the largest number of researchers, the program has been designed to work with Microsoft Excel, for which the Rice team has designed templates.
Once a simple Excel file is prepared, FlowCal reads and processes data from calibration particles. This information is then used to convert cell fluorescence to calibrated units. Finally, the program generates an Excel file with cell fluorescence statistics and a set of plots of calibration particles and cell samples.

Sexton said FlowCal has become a standard tool in the Tabor lab. 'This is easy enough to use that even people without any programming knowledge can pick it up really quickly,' he said.

The researchers programmed FlowCal in the common Python language and encourage others to add on or modify it to suit their own needs. 'We've made it pretty easy to collaborate, if other people want to add to it,' Sexton said.
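The idea of converting cell fluorescence to calibrated units from calibration particles can be illustrated with a small sketch. This is not FlowCal's API or its actual bead model; it assumes a simple straight-line fit in log-log space between measured bead peaks (arbitrary units) and the calibrated values assigned to those beads. All function names and numbers below are hypothetical.

```python
import math


def fit_standard_curve(bead_au, bead_calibrated):
    """Least-squares fit of log10(calibrated) = slope * log10(a.u.) + intercept."""
    xs = [math.log10(v) for v in bead_au]
    ys = [math.log10(v) for v in bead_calibrated]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def to_calibrated_units(values_au, slope, intercept):
    """Convert positive fluorescence values (a.u.) using the fitted curve."""
    return [10 ** (slope * math.log10(v) + intercept) for v in values_au]


if __name__ == "__main__":
    # Hypothetical bead peak positions and their assigned calibrated values.
    bead_au = [10.0, 100.0, 1000.0, 10000.0]
    bead_calibrated = [200.0, 2000.0, 20000.0, 200000.0]
    slope, intercept = fit_standard_curve(bead_au, bead_calibrated)

    # Hypothetical cell fluorescence values in arbitrary units.
    cells_au = [50.0, 500.0]
    for au, cal in zip(cells_au, to_calibrated_units(cells_au, slope, intercept)):
        print(f"{au:8.1f} a.u. -> {cal:10.1f} calibrated units")
```

Because the curve is fitted from beads run with the same instrument settings as the cells, the converted values no longer depend on those settings, which is the property the researchers describe. The log transform requires strictly positive fluorescence values.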