    Testing the properties of selection criteria: an application to copy number polymorphism measurements

    SaintFleur_grad.sunysb_0771E_10193.pdf (1.349Mb)
    Date
    1-Aug-10
    Author
    Saint Fleur, Rose Edy
    Publisher
    The Graduate School, Stony Brook University: Stony Brook, NY.
    Abstract
    Variation in the human genome is present in many forms, including single-nucleotide polymorphisms (SNPs) and copy number polymorphisms (CNPs). CNPs fall into many categories, such as small insertion-deletion polymorphisms, variable numbers of repetitive sequences, and genomic structural alterations. A major question that researchers in statistical genetics need to answer is how many CNP categories are present in a given dataset. In this study, I compare five information criteria (BIC, AIC, NEC, CLC, and ICL-BIC) to determine whether any one of them is "best" at identifying the correct number of components (that is, the correct number of CNP categories). I consider six design factors: equal/unequal within-component variances, high/low separation, sample size, mixture proportion, multiple random starting values, and transformation, using two known numbers of components (3 and 6). The results indicate that under "ideal" conditions (that is, a small number of components, large separation between components, constant within-component variance, and no subsequent transformation of the mixture data), each criterion performs well. When the data are a monotonic transformation of data from a mixture, the BIC criterion, which is the most commonly used criterion in CNP research, has a low component number accuracy rate. I then considered applying the Box-Cox transformation regardless of whether it was needed. Applying the Box-Cox transformation when it was not needed did not reduce the component number accuracy rate of the CLC, ICL-BIC, or BIC. When the mixture data had been transformed, applying the Box-Cox transformation improved the component number accuracy rate of the BIC criterion. The Box-Cox transformation should be used routinely with the CLC, ICL-BIC, or BIC criterion to estimate the number of components in a CNP mixture analysis.
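    The abstract names the five criteria and the Box-Cox transformation without defining them. The sketch below shows one way they might be computed, assuming the forms commonly cited in the mixture-model literature (lower values are better for every criterion); the thesis itself may use different scalings or sign conventions. The function names, argument names, and the `posteriors` layout (one row of component membership probabilities per observation) are hypothetical and chosen only for illustration.

    ```python
    import math


    def entropy(posteriors):
        """Classification entropy EN = -sum_i sum_k t_ik * ln(t_ik).

        `posteriors` is an assumed layout: one row per observation, one
        posterior membership probability per component. Terms with t_ik = 0
        contribute nothing.
        """
        return -sum(t * math.log(t) for row in posteriors for t in row if t > 0.0)


    def information_criteria(loglik, n_params, n_obs, en, loglik_one=None):
        """Return commonly cited forms of AIC, BIC, CLC, ICL-BIC and NEC.

        loglik     -- maximized log-likelihood of the fitted mixture
        n_params   -- number of free parameters in that mixture
        n_obs      -- sample size
        en         -- classification entropy of the fitted mixture
        loglik_one -- maximized log-likelihood of the one-component model,
                      needed only for NEC (assumed form: EN / (logL_K - logL_1))
        """
        aic = -2.0 * loglik + 2.0 * n_params
        bic = -2.0 * loglik + n_params * math.log(n_obs)
        clc = -2.0 * loglik + 2.0 * en
        icl_bic = bic + 2.0 * en
        result = {"AIC": aic, "BIC": bic, "CLC": clc, "ICL-BIC": icl_bic}
        # NEC is only defined here when the mixture improves on one component.
        if loglik_one is not None and loglik > loglik_one:
            result["NEC"] = en / (loglik - loglik_one)
        return result


    def box_cox(x, lam):
        """Box-Cox transform of a single positive value x with parameter lam."""
        if x <= 0:
            raise ValueError("Box-Cox requires strictly positive data")
        return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam
    ```

    In a workflow like the one the abstract describes, such a function would be evaluated once per candidate number of components, and the candidate with the smallest value of the chosen criterion would be selected.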
    URI
    http://hdl.handle.net/1951/55610
    Collections
    • Stony Brook Theses & Dissertations [SBU] [1955]

    SUNY Digital Repository Support
    DSpace software copyright © 2002-2022  DuraSpace
    Contact Us | Send Feedback
    DSpace Express is a service operated by 
    Atmire NV
     

     

