PREDICTION AND UNCERTAINTY IN THE ANALYSIS OF GENE EXPRESSION PROFILES

Rainer Spang, Harry Zuzan, Mike West, Joseph Nevins, Carrie Blanchette and Jeffrey R. Marks

September 2000

We have developed a complete statistical model for the analysis of tumor-specific gene expression profiles. It gives investigators a global overview of large-scale gene expression data, indicating trends in the data as to which tumor phenotype a given sample belongs to, while also summarizing the uncertainties inherent in these trends. In this paper we demonstrate the use of this method in the context of a gene expression profiling study of 27 human breast cancers. The study aims at revealing the molecular differences between tumors with different estrogen receptor status. In addition to good predictive performance with respect to pure classification of the expression profiles, the model also uncovers conflicts in the data with respect to the classification of some of the tumors, highlighting them as critical cases where additional investigation is appropriate.


The manuscript is available in PostScript and PDF formats.