Near-neighbor search in pattern distance spaces
Abstract
In this paper, we study the near-neighbor problem based on pattern similarity, a new type of similarity that conventional distance metrics such as the Lp norm cannot model effectively. The problem is nonetheless important to many applications. For example, in DNA microarray analysis, the expression levels of two closely related genes may rise and fall under different external conditions or at different times. Although the magnitudes of their expression levels may not be close, the patterns they exhibit over time or under different conditions can be very similar. We measure the distance between two objects by pattern similarity, i.e., by whether the two objects exhibit a synchronous pattern of rise and fall under different conditions. We then present an efficient algorithm for near-neighbor search based on pattern similarity, and we report tests on several real and synthetic data sets that show its effectiveness. Copyright © by SIAM.
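To make the contrast with Lp norms concrete, the following is a minimal illustrative sketch, not the distance function or search algorithm of the paper. It assumes one simple way to score synchronous rise and fall: the spread (maximum minus minimum) of the per-condition differences between two objects, which is zero exactly when one profile is the other shifted by a constant. The function names and the gene profiles are hypothetical.

```python
import math


def euclidean(u, v):
    """Ordinary L2 distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pattern_distance(u, v):
    """Assumed pattern score: spread of the per-condition differences.

    Returns 0 when v equals u plus a constant offset, i.e. when the two
    profiles rise and fall in step regardless of magnitude.
    """
    if len(u) != len(v) or not u:
        raise ValueError("profiles must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


# Hypothetical expression levels under four conditions.
gene_a = [10.0, 40.0, 20.0, 50.0]
gene_b = [110.0, 140.0, 120.0, 150.0]  # same pattern as gene_a, shifted by 100
gene_c = [50.0, 20.0, 40.0, 10.0]      # similar magnitude, opposite pattern

euclid_ab = euclidean(gene_a, gene_b)
euclid_ac = euclidean(gene_a, gene_c)
pattern_ab = pattern_distance(gene_a, gene_b)
pattern_ac = pattern_distance(gene_a, gene_c)

print(f"Euclidean: a-b = {euclid_ab:.1f}, a-c = {euclid_ac:.1f}")
print(f"Pattern:   a-b = {pattern_ab:.1f}, a-c = {pattern_ac:.1f}")
```

Under this assumed score, gene_b is the nearer neighbor of gene_a even though the Euclidean distance ranks gene_c as closer, which is the kind of case the abstract describes.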