PREDICTING THE CLINICAL STATUS OF HUMAN BREAST CANCER USING GENE EXPRESSION PROFILES

Mike West, Carrie Blanchette, Holly Dressman, Erich Huang, Seiichi Ishida, Rainer Spang, Harry Zuzan, Jeffrey R. Marks, and Joseph R. Nevins

Duke University

January 2001

The practice of tumor diagnosis depends largely on visual interpretation of gross pathological and histological specimens together with limited biochemical data. These visual features, as well as immunohistochemical staining patterns, reflect the genes expressed within the tumor cell. Measuring gene expression directly offers the potential to refine the diagnosis and classification of neoplastic tissues based on thousands of parameters where previously only a few existed. To this end, we have developed Bayesian statistical regression models that provide predictive capability based on gene expression data derived from DNA microarray analysis of a series of primary breast cancer samples. The resulting expression patterns can discriminate breast tumors on the basis of estrogen receptor (ER) status, and also on the basis of categorized lymph node status. Most importantly, we assess the utility and validity of such models in predicting the status of tumors in cross-validation determinations. The practical value of such approaches relies critically on the ability not only to assess relative probabilities of clinical outcomes for future samples, but also to provide an honest assessment of the uncertainties associated with such predictive classifications based on the selection of gene subsets for each validation analysis. This latter point is of critical importance in the ability to apply these methodologies to actual tumor diagnosis.
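
The role of gene subset selection in each validation analysis can be illustrated with a minimal sketch. It is not the Bayesian regression model described above: a simple nearest-centroid classifier with unit-variance Gaussian class densities and equal priors is swapped in so that the example stays self-contained. All function names, the toy data generator, and its parameters are hypothetical. The sketch shows only the validation structure: for each held-out sample, genes are reselected from the remaining samples alone, so the held-out sample never influences which genes are used to classify it.

```python
import math
import random


def select_genes(X, y, k):
    """Return indices of the k genes with the largest standardized
    difference in mean expression between the two classes."""
    n_genes = len(X[0])
    scores = []
    for g in range(n_genes):
        pos = [row[g] for row, label in zip(X, y) if label == 1]
        neg = [row[g] for row, label in zip(X, y) if label == 0]
        mean_pos = sum(pos) / len(pos)
        mean_neg = sum(neg) / len(neg)
        ss = (sum((v - mean_pos) ** 2 for v in pos)
              + sum((v - mean_neg) ** 2 for v in neg))
        pooled_sd = math.sqrt(ss / max(len(X) - 2, 1)) + 1e-9
        scores.append(abs(mean_pos - mean_neg) / pooled_sd)
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


def predict_prob(X_train, y_train, x_new, genes):
    """Stand-in classifier (an assumption, not the paper's model):
    P(class 1 | x) under unit-variance Gaussians centred on the class
    centroids of the selected genes, with equal class priors."""
    def centroid(label):
        rows = [row for row, lab in zip(X_train, y_train) if lab == label]
        return [sum(r[g] for r in rows) / len(rows) for g in genes]

    c0, c1 = centroid(0), centroid(1)
    d0 = sum((x_new[g] - c) ** 2 for g, c in zip(genes, c0))
    d1 = sum((x_new[g] - c) ** 2 for g, c in zip(genes, c1))
    # Log-likelihood ratio, clamped to keep exp() well behaved.
    z = max(-50.0, min(50.0, (d0 - d1) / 2.0))
    return 1.0 / (1.0 + math.exp(-z))


def loo_cv(X, y, k):
    """Leave-one-out cross-validation with gene selection redone
    inside every fold, using the training samples only."""
    probs = []
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        genes = select_genes(X_train, y_train, k)
        probs.append(predict_prob(X_train, y_train, X[i], genes))
    return probs


def make_toy_data(n_per_class=10, n_genes=50, n_informative=5,
                  shift=3.0, seed=0):
    """Synthetic expression matrix (purely illustrative): the first
    n_informative genes are shifted upward in class 1."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in range(n_informative):
                    row[g] += shift
            X.append(row)
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    probs = loo_cv(X, y, k=5)
    for label, p in zip(y, probs):
        print(f"true status {label}: predicted P(status = 1) = {p:.3f}")
```

Selecting genes once from all samples and only then cross-validating would let each held-out sample influence its own prediction, which tends to understate the predictive uncertainty. Reporting a probability for each held-out sample, rather than a hard class call, keeps that uncertainty visible.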


The text and figures are available in PDF.