Machine Learning for Genome-Wide Association Study (R21GM085665)
Project Description: Susceptibility to complex diseases is determined by the coordinated function of multiple genetic variants and environmental factors that interact in a composite and potentially nonlinear manner. Advanced single nucleotide polymorphism (SNP) arrays provide genome-wide views that present an opportunity to discover previously unrecognized genomic patterns. The ability to recognize such complex features of genetic architecture has important implications for the use of genome-wide association studies to discover genetic determinants of health and disease from massive genomic data. The focus of this application is the development and validation of effective computational statistics approaches for detecting complex interaction effects among multi-locus SNPs. Such effects could be useful for classification and prediction of disease or dysfunction and could provide novel insights into the pathogenesis of complex phenotypes.
Based on very promising preliminary studies, the specific aims of this R21 application are: (1) to refine and evaluate the recently developed significant conditional association (SCA) criterion and the heuristic combinatorial interaction growing (HCIG) search strategy, which are specifically designed to discover complex interaction effects of multi-locus, jointly predictive SNPs, to test and validate them on realistic simulations based on real SNP data, and to compare them with a panel of the most relevant existing methods; and (2) to apply the SCA-HCIG method to real SNP data from the NIAMS-funded FMS cohort in relation to metabolic syndrome and other phenotypes, and to develop SNP-marker-based classification and prediction models, assessed by their predictive power and the initial biological plausibility of the implicated SNP subsets.
This proposal represents a unique cross-disciplinary collaboration focused on developing new analytical methods to more effectively identify interacting susceptibility SNPs and environmental factors. These can be used to determine an individual's risk of a specific disease and to estimate prognosis and response to treatment. The results could also suggest novel preventive interventions and therapeutic targets, reduce the burden of disease, and accelerate the realization of truly personalized medicine.
Public Health Relevance Statement: Susceptibility to complex diseases is determined by the coordinated function of multiple genetic and environmental factors. The interacting susceptibility SNPs and risk factors identified in this project can be used to determine an individual's risk of a specific disease and to estimate prognosis and response to treatment. The results could also suggest novel preventive interventions and therapeutic targets, reduce the burden of disease, and accelerate the realization of truly personalized medicine.
©2004, Computational Bioinformatics and Bioimaging Laboratory (CBIL), Advanced Research Institute, Virginia Tech.
Updated: 04/02/2011.