    Partial Correlation Network Analysis for Mixed Data

    Date
    1-May-12
    Author
    Leong, Shirley Hui Yee
    Publisher
    The Graduate School, Stony Brook University: Stony Brook, NY.
    Abstract
    The partial correlation is well defined for continuous data and widely used in network analysis. Its strength lies in its interpretation as the relationship between two variables after the effects of other variables are removed. We follow up on a recently proposed analogue for categorical data whose properties had not been well studied. The new partial correlation is defined as the first canonical correlation of the Pearson residuals from logistic regressions. This is analogous to the continuous case, where the partial correlation is obtained by correlating residuals from linear regressions. A simulation study examines the properties of the new partial correlation and compares it with other measures, such as the partial phi coefficient. In the limiting case, the new partial correlation and the partial phi coefficient converge in estimate and inference. However, the partial phi coefficient cannot be applied to multi-categorical data, and it is not an efficient measure when controlling for more than one variable. The new partial correlation is well defined for the multi-categorical case and can readily control for more than one variable. Because it is derived as a canonical correlation, the new partial correlation can also measure the relationship between a continuous and a categorical variable: it is the multiple correlation between the Pearson residuals from the logistic regression (for the categorical response) and the usual residuals from the linear regression (for the continuous response). With partial correlation networks now obtainable for any data type (continuous, categorical, or mixed), our next goal is to compare network structure between groups and to examine the impact of continuous as well as categorical covariates on the pathway connections.
    This is accomplished by extending the two-level regression approach for continuous data, originally developed by our research group (Pradhan, 2009), to categorical and mixed-data network analysis. By linearly regressing the first canonical variates and replacing the slope coefficient with an expression in the covariates, we can test for the effect of covariates (both categorical and continuous) on the partial correlation and the network structure. This new covariate partial correlation network analysis approach is illustrated through two studies on the links between human genotypes (single-nucleotide polymorphisms) and disease phenotypes.
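The residual-based definitions in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes a single control variable z, binary (not multi-categorical) variables, and simulated data. Under those assumptions each residual set is a single column, so the first canonical correlation reduces to the absolute Pearson correlation of the two residual vectors. All function names here are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def linear_residuals(y, z):
    """Residuals from the least-squares fit y ~ 1 + z."""
    n = len(y)
    my, mz = sum(y) / n, sum(z) / n
    szz = sum((zi - mz) ** 2 for zi in z)
    slope = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y)) / szz
    return [yi - my - slope * (zi - mz) for yi, zi in zip(y, z)]


def logistic_pearson_residuals(y, z, iters=25):
    """Pearson residuals (y - p) / sqrt(p(1 - p)) from the logistic fit y ~ 1 + z.

    The two coefficients are fitted by Newton-Raphson (assumes no separation).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * zi))) for zi in z]
        w = [pi * (1.0 - pi) for pi in p]
        # Score vector (g0, g1) and 2x2 information matrix (h00, h01, h11).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * zi for yi, pi, zi in zip(y, p, z))
        h00 = sum(w)
        h01 = sum(wi * zi for wi, zi in zip(w, z))
        h11 = sum(wi * zi * zi for wi, zi in zip(w, z))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    p = [1.0 / (1.0 + math.exp(-(b0 + b1 * zi))) for zi in z]
    return [(yi - pi) / math.sqrt(pi * (1.0 - pi)) for yi, pi in zip(y, p)]


def partial_corr_continuous(x, y, z):
    """Continuous case: correlate residuals from two linear regressions on z."""
    return pearson(linear_residuals(x, z), linear_residuals(y, z))


def partial_corr_binary(x, y, z):
    """Binary case (assumption): one residual column per variable, so the
    first canonical correlation is the absolute Pearson correlation."""
    rx = logistic_pearson_residuals(x, z)
    ry = logistic_pearson_residuals(y, z)
    return abs(pearson(rx, ry))


# Simulated data (illustrative only): x and y depend on z but not on each other.
random.seed(0)
n = 300
z = [random.gauss(0, 1) for _ in range(n)]
x = [zi + random.gauss(0, 1) for zi in z]
y = [zi + random.gauss(0, 1) for zi in z]
bx = [1 if random.random() < 1 / (1 + math.exp(-zi)) else 0 for zi in z]
by = [1 if random.random() < 1 / (1 + math.exp(-zi)) else 0 for zi in z]

print("continuous partial correlation:", partial_corr_continuous(x, y, z))
print("binary partial correlation:", partial_corr_binary(bx, by, z))
```

For multi-categorical variables or several controls, each residual set has more than one column and a full canonical correlation analysis would be needed; the scalar shortcut above does not cover that case.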
    Description
    141 pg.
    URI
    http://hdl.handle.net/1951/60234
    Collections
    • Stony Brook Theses & Dissertations [SBU] [1955]

    SUNY Digital Repository Support
    DSpace software copyright © 2002-2023  DuraSpace
    Contact Us | Send Feedback
    DSpace Express is a service operated by 
    Atmire NV
     

     

