LIMITS OF HOMOLOGY DETECTION BY PAIRWISE SEQUENCE COMPARISON
Rainer Spang and Martin Vingron
August 2000
Noise in database searches resulting from random sequence
similarities increases as the databases expand rapidly. The noise
problems are not a technical shortcoming of the database search
programs, but a logical consequence of the idea of homology
searches. The effect can be observed in simulation experiments.
We have investigated noise levels in pairwise-alignment-based database
searches.
The noise levels of 38 releases of the Swiss-Prot database
display perfect logarithmic growth with the total length of the
database. Clustering of real biological sequences reduces
noise levels, but the effect is marginal.
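The logarithmic growth can be illustrated with a toy simulation. The sketch below is not the procedure used in the paper. It assumes, purely for illustration, that the score of each unrelated (random) database hit is an independent draw from an exponential distribution, and it uses the number of random hits as a stand-in for total database length. Under that assumption the expected best random score equals the harmonic number H_n, which grows like ln(n). The names mean_max_score and harmonic are hypothetical helpers introduced here.

```python
import math
import random


def mean_max_score(n_sequences, trials=200, rate=1.0, seed=0):
    """Average best score among n_sequences random (unrelated) hits.

    Assumption (not from the paper): each random hit's score is an
    independent draw from an exponential distribution with the given rate.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The "noise level" of one search is the best score among random hits.
        total += max(rng.expovariate(rate) for _ in range(n_sequences))
    return total / trials


def harmonic(n):
    """n-th harmonic number: the exact expected maximum of n Exp(1) draws."""
    return sum(1.0 / k for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        simulated = mean_max_score(n)
        print(f"n={n:5d}  simulated={simulated:.2f}  "
              f"H_n={harmonic(n):.2f}  ln(n)={math.log(n):.2f}")
```

In this toy model every tenfold increase in the number of random hits raises the expected noise level by roughly ln(10), which is the kind of logarithmic dependence on database size described above.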
Paper version: Bioinformatics, Volume 17, Issue 4, pages 338-342
Keywords: alignment statistics, molecular database searches, bioinformatics
The manuscript is available in PostScript format.