Publication
RECOMB 1999
Conference paper
Sequence homology detection through large scale pattern discovery
Abstract
We describe a new approach for identifying sequence similarity between a query sequence and a database of proteins. The central idea is the use of a set of patterns obtained from the underlying database through a one-time computation. These patterns are subsequently searched for in every query sequence presented to the system. A pattern matched by a region of the query points to a potential local similarity between that region and all the database sequences that also match that pattern. By using a set of carefully chosen patterns, the tool presented in this work is able to discover weak but biologically important similarities.
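
The workflow described in the abstract can be illustrated with a minimal Python sketch. This is a toy illustration under stated assumptions, not the authors' tool: the pattern syntax (a regular expression in which `.` stands for any residue), the function names `build_pattern_index` and `search_query`, and the sample sequences are all hypothetical, and the patterns are supplied by hand rather than discovered from the database.

```python
import re
from collections import defaultdict


def build_pattern_index(database, patterns):
    """One-time step: record where each pattern occurs in the database.

    database: dict mapping a sequence id to a protein sequence string.
    patterns: iterable of regex strings ('.' matches any residue).
    Returns a dict mapping each pattern to a list of (seq_id, offset).
    """
    index = defaultdict(list)
    for pattern in patterns:
        regex = re.compile(pattern)
        for seq_id, seq in database.items():
            # finditer reports non-overlapping matches, which is
            # sufficient for this illustration.
            for match in regex.finditer(seq):
                index[pattern].append((seq_id, match.start()))
    return index


def search_query(query, index):
    """Per-query step: find patterns matched by the query and report the
    database sequences that match the same pattern.

    Returns a sorted list of (query_offset, pattern, seq_id, db_offset).
    """
    hits = []
    for pattern, occurrences in index.items():
        for match in re.finditer(pattern, query):
            for seq_id, db_offset in occurrences:
                hits.append((match.start(), pattern, seq_id, db_offset))
    return sorted(hits)


# Hypothetical data, for illustration only.
database = {"seqA": "MKTAYIAKQR", "seqB": "GGKTLYIAPP"}
patterns = ["KT.YIA"]

index = build_pattern_index(database, patterns)
for hit in search_query("AAKTVYIAGG", index):
    print(hit)
```

Each reported hit pairs a region of the query with a region of a database sequence that matches the same pattern, which is the candidate local similarity the abstract refers to. A real system would also need to discover the patterns from the database and to score the candidate regions, neither of which is attempted here.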