Department of Statistics
Iowa State University
Fully Bayesian analysis of RNAseq data for gene expression heterosis detection
Heterosis, or hybrid vigor, is the enhancement of the phenotype of hybrid progeny relative to their inbred parents. To identify genes whose expression displays a heterosis pattern, we construct a gene-specific overdispersed count regression model. Because there are roughly 40,000 genes but only about 10 samples, we build a hierarchical model for the gene-specific parameters, which provides data-based borrowing of information across genes. To implement a fully Bayesian analysis, we construct a novel parallelized Markov chain Monte Carlo algorithm that makes efficient use of the architecture of a graphics processing unit through embarrassingly parallel computations and parallel reductions. We demonstrate the utility of the method for identifying gene expression heterosis through a variety of simulation studies, and we analyze an RNAseq maize dataset to identify genes with six different types of heterosis.
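As a rough illustration of the two computational patterns named in the abstract, the sketch below uses NumPy on a CPU as a stand-in for a GPU. It is not the speaker's algorithm. The Poisson model with a latent log-scale error term, the three-column design matrix, the sample layout, and all function and variable names are assumptions made for illustration only. The per-gene log-likelihood is computed independently for every gene (the embarrassingly parallel step), and the summaries a hierarchical update would need are obtained by summing over genes (the reduction step).

```python
import numpy as np

def per_gene_loglik(y, X, beta, eps):
    """Poisson log-likelihood for each gene, up to an additive constant.

    y: (G, N) counts, X: (N, P) design, beta: (G, P), eps: (G, N).
    Each row is computed independently of every other row, so the work
    is embarrassingly parallel across genes.
    """
    eta = beta @ X.T + eps                      # (G, N) log-scale means
    return (y * eta - np.exp(eta)).sum(axis=1)  # (G,), log(y!) term dropped

def hyper_summaries(beta):
    """Sums over genes that a hierarchical update would consume.

    This is the reduction step: many per-gene values collapse to a few.
    """
    G = beta.shape[0]
    mean = beta.sum(axis=0) / G
    sum_sq = ((beta - mean) ** 2).sum(axis=0)
    return mean, sum_sq

# Hypothetical sizes and design (assumed): 4 + 4 parental samples, 2 hybrids.
rng = np.random.default_rng(0)
G, N, P = 1000, 10, 3
X = np.column_stack([
    np.ones(N),                               # intercept
    np.r_[np.ones(4), -np.ones(4), 0.0, 0.0], # parental half-difference
    np.r_[np.zeros(8), 1.0, 1.0],             # hybrid effect
])
beta = rng.normal([2.0, 0.0, 0.0], 0.5, size=(G, P))
eps = rng.normal(0.0, 0.3, size=(G, N))       # source of overdispersion
y = rng.poisson(np.exp(beta @ X.T + eps))     # simulated counts

loglik = per_gene_loglik(y, X, beta, eps)
beta_mean, beta_sum_sq = hyper_summaries(beta)
```

On a GPU, the first function would typically map to one thread or block per gene, and the second to a tree-style parallel reduction; here NumPy's vectorized operations play both roles.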
Refreshments at 3:45pm in Snedecor 2101.