PREDICTING THE CLINICAL STATUS OF HUMAN BREAST CANCER USING GENE EXPRESSION PROFILES
Mike West, Carrie Blanchette, Holly Dressman, Erich Huang, Seiichi Ishida, Rainer Spang,
Harry Zuzan, Jeffrey R. Marks, and Joseph R. Nevins
Duke University
January 2001
The practice of tumor diagnosis depends largely on visual interpretation of gross
pathological and histological specimens together with limited biochemical data.
These visual features, as well as immunohistochemical staining patterns,
reflect the genes expressed within the tumor cell. By measuring gene
expression directly, there is the potential for refining the diagnosis and
classification of neoplastic tissues based on thousands of parameters where
previously only a few existed. To do this, we have developed Bayesian statistical
regression models that provide predictive capability based on gene expression data
derived from DNA microarray analysis of a series of primary breast cancer samples.
The resulting expression patterns can discriminate breast tumors on the basis of estrogen
receptor (ER) status, and also on a basic categorization of lymph node status. Most importantly,
we assess the utility and validity of such models in predicting the status of tumors in
cross-validation analyses. The practical value of such approaches relies
critically on the ability not only to assess the relative probabilities of clinical outcomes
for future samples, but also to provide an honest assessment of the uncertainties
associated with such predictive classifications based on the selection of gene subsets
for each validation analysis. This latter point is critical to applying these
methodologies to actual tumor diagnosis.
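
The validation idea described above can be illustrated with a minimal sketch, written under stated assumptions. It reads "the selection of gene subsets for each validation analysis" as meaning that genes are re-selected inside every leave-one-out fold, so the held-out tumor never influences which genes are used. The classifier is a simple nearest-centroid score passed through a logistic function, a stand-in chosen for brevity and not the Bayesian regression model developed in this work. All names (select_genes, fit_centroids, predict_prob, leave_one_out, make_toy_data) and the synthetic data are hypothetical.

```python
import math
import random


def select_genes(X, y, k):
    """Rank genes by |mean(class 1) - mean(class 0)| using training rows only."""
    scores = []
    for g in range(len(X[0])):
        pos = [row[g] for row, label in zip(X, y) if label == 1]
        neg = [row[g] for row, label in zip(X, y) if label == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return sorted(range(len(scores)), key=lambda g: -scores[g])[:k]


def fit_centroids(X, y, genes):
    """Per-class mean expression over the selected genes."""
    centroids = {}
    for label in (0, 1):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    return centroids


def predict_prob(x, genes, centroids):
    """Stand-in probability of class 1 (e.g. ER positive) for one sample."""
    d0 = sum((x[g] - c) ** 2 for g, c in zip(genes, centroids[0]))
    d1 = sum((x[g] - c) ** 2 for g, c in zip(genes, centroids[1]))
    z = 0.5 * (d0 - d1)           # closer to the class 1 centroid gives z > 0
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def leave_one_out(X, y, k):
    """Hold out each sample in turn; gene selection is redone in every fold."""
    probs = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        genes = select_genes(X_train, y_train, k)  # held-out sample not used
        centroids = fit_centroids(X_train, y_train, genes)
        probs.append(predict_prob(X[i], genes, centroids))
    return probs


def make_toy_data(n_per_class=10, n_genes=50, n_informative=5, shift=2.0, seed=0):
    """Synthetic expression matrix: the first few genes differ between classes."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for g in range(n_informative):
                row[g] += shift * label
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for label, p in zip(y, leave_one_out(X, y, k=5)):
        print(label, round(p, 3))
```

In a sketch like this, held-out probabilities near 0.5 mark the samples whose predicted status is uncertain. Selecting genes once on the full data set and only then cross-validating would let each held-out sample influence its own prediction and would tend to understate that uncertainty.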