Phase Diagram of QDA Classifier for Two-Component Mixture with Diagonal Covariance Matrices
Abstract
Consider a two-component mixture model in which the sample size n is much smaller than the number of predictors p (p >> n). This thesis addresses classification with rare and weak signals, a problem that often arises in this setting. For the two-class classification problem with normal covariates, zero means, and diagonal covariance matrices, Chen et al. (2023) provided a detection boundary for the quadratic discriminant analysis (QDA) classifier and proposed higher criticism thresholding (HCT), a feature selection procedure, to improve its classification accuracy. This thesis extends their work. They derived QDA successful and unsuccessful regions assuming the covariance matrices are known (i.e., at the population level), and an impossibility subregion when the covariance matrices are unknown (i.e., at the sample level), which forms a QDA unsuccessful subregion. In this thesis, we first derive the phase transition of QDA at the population level, which separates the QDA successful region from the unsuccessful region. At the sample level, we derive a sufficient condition for the QDA classifier to be successful, i.e., we find a QDA successful subregion. To illustrate our theoretical results, we conduct extensive simulation studies that examine the performance of the QDA classifier at both the population level and the sample level, both inside and outside the successful (sub)region. At the sample level, we compare the QDA classifier with and without HCT and find that HCT can improve QDA classification accuracy.
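As a rough illustration of the setting described in the abstract, the following minimal sketch simulates zero-mean normal covariates with diagonal covariance matrices and applies a QDA rule at the population level (known variances) and at the sample level (plug-in variance estimates). Everything here is an assumption made for illustration and is not taken from the thesis: the function names, the parameter values, the equal class priors, and in particular the "oracle" feature subset, which only stands in for a data-driven selection step such as HCT and does not implement HCT itself.

```python
import math
import random


def qda_llr(x, var0, var1, keep=None):
    """Log-likelihood ratio log f1(x) - log f0(x) for zero-mean diagonal Gaussians.

    keep is an optional list of feature indices to use (hypothetical stand-in
    for a feature selection step); by default all features are used.
    """
    idx = range(len(x)) if keep is None else keep
    return sum(
        -0.5 * (math.log(var1[j] / var0[j]) + x[j] ** 2 * (1.0 / var1[j] - 1.0 / var0[j]))
        for j in idx
    )


def draw(rng, var):
    """Draw one observation from N(0, diag(var))."""
    return [rng.gauss(0.0, math.sqrt(v)) for v in var]


def accuracy(test, var0, var1, keep=None):
    """Fraction of test points classified correctly (class 1 if LLR > 0, equal priors)."""
    hits = sum((qda_llr(x, var0, var1, keep) > 0) == (y == 1) for x, y in test)
    return hits / len(test)


def run(p=200, n_signal=10, signal_var=4.0, n_train=20, n_test=400, seed=0):
    rng = random.Random(seed)
    # Class 0 has unit variances; class 1 differs only on a few "signal" features.
    var0 = [1.0] * p
    var1 = [signal_var if j < n_signal else 1.0 for j in range(p)]

    test = []
    for _ in range(n_test):
        y = rng.randint(0, 1)
        test.append((draw(rng, var1 if y else var0), y))

    # Sample level: plug-in variance estimates from n_train points per class
    # (means are known to be zero, so the estimate is the mean of squares).
    train0 = [draw(rng, var0) for _ in range(n_train)]
    train1 = [draw(rng, var1) for _ in range(n_train)]
    est0 = [sum(x[j] ** 2 for x in train0) / n_train for j in range(p)]
    est1 = [sum(x[j] ** 2 for x in train1) / n_train for j in range(p)]

    oracle = list(range(n_signal))  # true signal features, unknown in practice
    return {
        "population": accuracy(test, var0, var1),
        "sample_all": accuracy(test, est0, est1),
        "sample_oracle": accuracy(test, est0, est1, keep=oracle),
    }


if __name__ == "__main__":
    for name, acc in run().items():
        print(f"{name}: {acc:.3f}")
```

Comparing the "sample_all" and "sample_oracle" entries gives a sense of how estimation noise from the many non-signal features can affect the plug-in rule, which is the motivation for a thresholding step; the actual HCT procedure and the regions studied in the thesis are not reproduced here.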