DSpace Repository

Testing the properties of selection criteria: an application to copy number polymorphism measurements

dc.contributor.advisor Finch, Stephen J. en_US
dc.contributor.author Saint Fleur, Rose Edy en_US
dc.contributor.other Department of Applied Mathematics and Statistics en_US
dc.date.accessioned 2012-05-15T18:06:38Z
dc.date.available 2012-05-15T18:06:38Z
dc.date.issued 1-Aug-10 en_US
dc.date.submitted Aug-10 en_US
dc.identifier SaintFleur_grad.sunysb_0771E_10193.pdf en_US
dc.identifier.uri http://hdl.handle.net/1951/55610
dc.description.abstract Variation in the human genome is present in many forms, including single-nucleotide polymorphisms (SNPs) and copy number polymorphisms (CNPs). CNPs fall into many categories, such as small insertion-deletion polymorphisms, variable numbers of repetitive sequences, and genomic structural alterations. A major question that researchers in the field of statistical genetics need to answer is the number of CNP categories in a given dataset. In this study, I compare five information criteria (BIC, AIC, NEC, CLC, and ICL-BIC) to determine whether there is a "best" measure among them for finding the correct number of components (the correct number of CNP categories). I consider six design factors: equal/unequal within-component variances, high/low separation, sample size, mixture proportion, multiple random starting values, and transformation, using two known numbers of components (3 and 6). The results indicate that under "ideal" conditions (that is, a small number of components, large separation between components, constant within-component variance, and no subsequent transformation of the mixture data), each criterion performs well. When the data are a monotonic transformation of data from a mixture, the BIC criterion, which is the most commonly used criterion in CNP research, has a low component number accuracy rate. I then considered applying the Box-Cox transformation regardless of whether it was needed. Applying the Box-Cox transformation when it was not needed did not reduce the component number accuracy rate of the CLC, ICL-BIC, or BIC criteria. The component number accuracy rates for the BIC criterion improved when the Box-Cox transformation was applied to transformed mixture data. The Box-Cox transformation should be used routinely with the CLC, ICL-BIC, or BIC criterion to estimate the number of components in a CNP mixture analysis. en_US
dc.description.sponsorship Stony Brook University Libraries. SBU Graduate School in Department of Applied Mathematics and Statistics. Lawrence Martin (Dean of Graduate School). en_US
dc.format Electronic Resource en_US
dc.language.iso en_US en_US
dc.publisher The Graduate School, Stony Brook University: Stony Brook, NY. en_US
dc.subject.lcsh Statistics en_US
dc.title Testing the properties of selection criteria: an application to copy number polymorphism measurements en_US
dc.type Dissertation en_US
dc.description.advisor Advisor(s): Stephen J. Finch. Committee Member(s): Nancy R. Mendell; Wei Zhu; Derek Gordon. en_US
dc.mimetype Application/PDF en_US
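
The abstract describes choosing the number of mixture components by comparing information criteria. The following is a minimal sketch of that idea for AIC and BIC only, and it is not the dissertation's code. It assumes a one-dimensional Gaussian mixture with unequal within-component variances, fitted by a basic EM routine with a single deterministic quantile-based start (the study used multiple random starts). The function names, the simulated means (0, 4, 8), the standard deviation (0.5), and the sample size are all hypothetical choices for illustration. NEC, CLC, ICL-BIC, and the Box-Cox step are not covered.

```python
import math
import random


def log_likelihood(x, w, mu, var):
    """Log-likelihood of data x under a 1-D Gaussian mixture, plus responsibilities."""
    total, resp = 0.0, []
    for v in x:
        logp = [
            math.log(w[j])
            - 0.5 * math.log(2 * math.pi * var[j])
            - 0.5 * (v - mu[j]) ** 2 / var[j]
            for j in range(len(w))
        ]
        m = max(logp)  # log-sum-exp trick to avoid underflow
        s = sum(math.exp(lp - m) for lp in logp)
        total += m + math.log(s)
        resp.append([math.exp(lp - m) / s for lp in logp])
    return total, resp


def fit_gmm_1d(x, k, iters=100, var_floor=1e-6):
    """Basic EM for a k-component 1-D Gaussian mixture; returns the final log-likelihood."""
    n = len(x)
    xs = sorted(x)
    # Deterministic start (an assumption): means at evenly spaced quantiles.
    mu = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    mean = sum(x) / n
    var = [max(sum((v - mean) ** 2 for v in x) / n, var_floor)] * k
    w = [1.0 / k] * k
    for _ in range(iters):
        _, resp = log_likelihood(x, w, mu, var)  # E-step
        for j in range(k):  # M-step
            nk = max(sum(r[j] for r in resp), 1e-10)  # guard against empty components
            w[j] = nk / n
            mu[j] = sum(r[j] * v for r, v in zip(resp, x)) / nk
            var[j] = max(
                sum(r[j] * (v - mu[j]) ** 2 for r, v in zip(resp, x)) / nk, var_floor
            )
    return log_likelihood(x, w, mu, var)[0]


def select_components(x, k_max=6):
    """Fit k = 1..k_max and report AIC, BIC, and the k minimizing each."""
    n = len(x)
    table = {}
    for k in range(1, k_max + 1):
        ll = fit_gmm_1d(x, k)
        p = 3 * k - 1  # k means + k variances + (k - 1) free mixing proportions
        table[k] = {
            "loglik": ll,
            "aic": -2 * ll + 2 * p,
            "bic": -2 * ll + p * math.log(n),
        }
    best_aic = min(table, key=lambda k: table[k]["aic"])
    best_bic = min(table, key=lambda k: table[k]["bic"])
    return table, best_aic, best_bic


def simulate_mixture(seed=0, n_per=100, means=(0.0, 4.0, 8.0), sd=0.5):
    """Simulate a well-separated 3-component mixture (hypothetical settings)."""
    rng = random.Random(seed)
    return [rng.gauss(m, sd) for m in means for _ in range(n_per)]
```

Calling `select_components(simulate_mixture())` returns the per-k table together with the AIC and BIC choices. Because the BIC penalty per parameter, log(n), exceeds the AIC penalty of 2 once n is at least 8, the BIC choice is never larger than the AIC choice for the same set of fits.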