Cluster Analysis of Gene Expression Profiles via Flexible Count Models for RNA-seq Data

Abstract

Clustering of RNA-seq data is used to characterize environment-induced (e.g., treatment-induced) differences in gene expression profiles by separating genes into clusters based on their expression patterns. Wang et al. [2013] recently adopted the bi-Poisson distribution, obtained via the trivariate reduction method, as a model for clustering bivariate RNA-seq data. We discuss the inadequacy of the bi-Poisson distribution in modelling the correlation between dependent Poisson counts, and the impact of this inadequacy on clustering such data. We introduce an alternative Gaussian copula model that incorporates a flexible dependence structure for the counts, report simulation results comparing the performance of the Gaussian copula and bi-Poisson models, and investigate how misspecified dependence structures affect the clustering of Poisson counts. We illustrate our methodology on a lung cancer RNA-seq data set.
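
The two constructions named in the abstract can be contrasted with a small simulation. The sketch below is illustrative only and is not taken from the thesis: all function names, parameter values, and the sample size are assumptions chosen for the example. It relies on a standard property of the trivariate reduction construction (X1 = Y1 + Y0, X2 = Y2 + Y0 with independent Poisson components), namely that the covariance equals the rate of the shared component and so cannot be negative, whereas a Gaussian copula with Poisson marginals can also produce negatively dependent counts.

```python
import math
import random
from statistics import NormalDist


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def poisson_quantile(u, lam):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(lam)."""
    u = min(u, 1.0 - 1e-12)  # guard against a CDF value that rounds to 1.0
    k, pmf = 0, math.exp(-lam)
    cdf = pmf
    while cdf < u and k < 10_000:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k


def bi_poisson_pair(rng, lam1, lam2, lam0):
    """Trivariate reduction: both counts share the common component Y0."""
    y0 = poisson_sample(rng, lam0)
    return poisson_sample(rng, lam1) + y0, poisson_sample(rng, lam2) + y0


def copula_poisson_pair(rng, mu1, mu2, rho):
    """Gaussian copula: correlated normals -> uniforms -> Poisson quantiles."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    std_normal = NormalDist()
    return (poisson_quantile(std_normal.cdf(z1), mu1),
            poisson_quantile(std_normal.cdf(z2), mu2))


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000                # illustrative sample size

# Bi-Poisson with rates 3, 3 and shared rate 2: both marginals are Poisson(5).
bp = [bi_poisson_pair(rng, 3.0, 3.0, 2.0) for _ in range(n)]
corr_bp = pearson([a for a, _ in bp], [b for _, b in bp])

# Gaussian copula with Poisson(5) marginals and a negative latent correlation.
gc = [copula_poisson_pair(rng, 5.0, 5.0, -0.7) for _ in range(n)]
corr_gc = pearson([a for a, _ in gc], [b for _, b in gc])
mean_gc = sum(a for a, _ in gc) / n

print(f"bi-Poisson sample correlation:      {corr_bp:.3f}")
print(f"Gaussian copula sample correlation: {corr_gc:.3f}")
```

With these illustrative settings the theoretical bi-Poisson correlation is 2/5 = 0.4, and no choice of non-negative rates can make it negative. The copula pair keeps the same Poisson(5) marginals while its sample correlation comes out negative, which is the kind of flexibility in the dependence structure that the abstract attributes to the Gaussian copula model.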

Citation

Ruan, J. (2015). Cluster Analysis of Gene Expression Profiles via Flexible Count Models for RNA-seq Data (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca. doi:10.11575/PRISM/25338
