LIMITS OF HOMOLOGY DETECTION BY PAIRWISE SEQUENCE COMPARISON

Rainer Spang and Martin Vingron

August 2000

Noise in database searches, arising from random sequence similarities, increases as the databases expand rapidly. This noise is not a technical shortcoming of the database search programs but a logical consequence of the idea of homology searching. The effect can be observed in simulation experiments. We have investigated noise levels in database searches based on pairwise alignment. Across 38 releases of the Swiss-Prot database, the noise levels display perfect logarithmic growth with the total length of the database. Clustering of real biological sequences reduces noise levels, but the effect is marginal.

Paper version: Bioinformatics, Volume 17, Issue 4, pages 338-342.
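
The logarithmic growth described above is what standard extreme-value (Karlin-Altschul) alignment statistics would lead one to expect: if the number of chance hits scoring at least S is E = K * m * n * exp(-lambda * S) for query length m and database length n, then the score at which one chance hit is expected (E = 1) is ln(K * m * n) / lambda. The Python sketch below only illustrates that relationship and is not the authors' simulation. The function name noise_score, the query length of 300, and the values of lambda and K are assumptions chosen for illustration, not figures from the paper.

```python
import math


def noise_score(db_length, query_length=300, lam=0.267, k=0.041):
    """Score at which one chance hit is expected (E = 1).

    Solves E = k * m * n * exp(-lam * S) for S with E = 1.
    All default parameter values are illustrative assumptions.
    """
    return math.log(k * query_length * db_length) / lam


if __name__ == "__main__":
    # Each tenfold increase in database length raises the noise
    # score by the same additive amount, ln(10) / lam.
    for n in (10**6, 10**7, 10**8):
        print(n, round(noise_score(n), 1))
```

Under these assumptions, doubling the database length always adds the same constant, ln(2) / lambda, to the noise score, regardless of how large the database already is.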

Keywords: alignment statistics, molecular database searches, bioinformatics


The manuscript is available in PostScript format.