Intelligent Mapping and Exploration of Gene Expression Profiles (CA83231)
As a step toward understanding the complex differences between normal
and cancer cells, much research has been devoted to analyses of
genes that are differentially expressed in particular cells. Though
recent technological advances have made it possible to conduct
serial and/or simultaneous analysis of the expression patterns
of thousands of genes, no comprehensive study has been reported
on how many genes are expressed differentially and whether most
differences are cell line-specific. The long-term goal of this
research is to develop intelligent data mapping and visual explanation
technologies to improve information exploration and interpretation
from high-throughput gene expression profiles for molecular analysis
of cancer.
Preliminary evidence from mRNA profiles of breast/prostate cancer
cells suggests that transcriptome patterns are rich in information
about the mechanisms that underlie cancer development. In this R21
research, multidisciplinary knowledge of molecular biology and
computational intelligence is applied to (1) design cost-effective
molecular experiments to establish gene transcriptome distributions
across cell lines, (2) pilot test the existence of transcriptome
clusters in the molecular species space that correlate with cell
phenotypes, and (3) identify key biomarkers that distinguish
cell lines with the highest predictive value. Since
new knowledge can be gained only by exploring all of
the interesting aspects of complex transcriptome data in high-dimensional
space, this R33 application proposes a statistically principled
hierarchical visual exploration technique to reveal and interpret
the intrinsic but hidden characteristics of transcriptome clusters,
which should better define the nature of cancer biology
and therapeutic targets. A novel integration of information theory
and computer graphics will permit (1) an automatic identification
and modeling of biomarker clusters, (2) a probabilistic principal
component analysis to form hierarchical visualization spaces allowing
the complete data set to be analyzed at the top level with
best-separated sub-clusters analyzed at deeper levels, and (3) an interactive
intelligent interface for task/hypothesis-driven data mining and
decision making. The innovation of the research lies
in combining (1) a hybrid stepwise nonlinear discriminant
analysis for biomarker identification and (2) a hierarchical visual
exploration of multi-foci high-dimensional transcriptome distribution
to interpret the complex relationships between molecular events
and cell phenotypes.
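As a rough illustration of the biomarker identification aim, the sketch
below ranks genes by a simple Fisher-style score (between-class variance
divided by within-class variance). This is a deliberately simplified
stand-in, not the hybrid stepwise nonlinear discriminant analysis
proposed in the project. The function names (fisher_scores, rank_genes),
the gene and cell-line labels, and the toy expression values are all
hypothetical and chosen only for illustration.

```python
from statistics import mean, pvariance


def fisher_scores(profiles, labels):
    """Score each gene by between-class over within-class variance.

    profiles: list of samples, each a list of expression values (one per gene).
    labels:   list of class labels (e.g. cell-line names), one per sample.
    """
    classes = sorted(set(labels))
    n_genes = len(profiles[0])
    scores = []
    for g in range(n_genes):
        values = [sample[g] for sample in profiles]
        overall = mean(values)
        between = 0.0
        within = 0.0
        for c in classes:
            group = [v for v, lab in zip(values, labels) if lab == c]
            # Spread of the class mean around the overall mean.
            between += len(group) * (mean(group) - overall) ** 2
            # Spread of the samples around their own class mean.
            within += len(group) * pvariance(group)
        # Small constant avoids division by zero for perfectly tight groups.
        scores.append(between / (within + 1e-12))
    return scores


def rank_genes(gene_names, profiles, labels):
    """Return (gene, score) pairs sorted from most to least discriminative."""
    scores = fisher_scores(profiles, labels)
    return sorted(zip(gene_names, scores), key=lambda pair: pair[1], reverse=True)


# Toy data, invented for illustration only (not from the study).
genes = ["geneA", "geneB", "geneC"]
profiles = [
    [1.0, 5.0, 2.0],  # sample from hypothetical cell line 1
    [1.2, 5.1, 3.0],  # sample from hypothetical cell line 1
    [4.0, 5.0, 2.5],  # sample from hypothetical cell line 2
    [4.1, 4.9, 2.4],  # sample from hypothetical cell line 2
]
labels = ["line1", "line1", "line2", "line2"]

ranking = rank_genes(genes, profiles, labels)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data set, geneA separates the two hypothetical cell lines
far more cleanly than the other genes and so ranks first. A univariate
score like this ignores interactions between genes, which is one reason
a stepwise, multivariate approach may be preferred in practice.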
Copyright ©2004, Computational Bioinformatics and Bioimaging Laboratory
(CBIL), Advanced Research Institute, Virginia Tech.
Last Updated: 03/03/2009.