An efficient indexing method for nearest neighbor searches in high-dimensional image databases
Abstract
Nearest neighbor (NN) search is emerging as an important search paradigm in a variety of applications in which objects are represented as vectors of d numeric features. However, despite decades of efforts, except for the filtering approach such as the VA-file, the current solutions to find exact kNNs are far from satisfactory for large d. The filtering approach represents vectors as compact approximations and by first scanning these smaller approximations, only a small fraction of the real vectors are visited. In this paper, we introduce the local polar coordinate file (LPC-file) using the filtering approach for nearest-neighbor searches in high-dimensional image databases. The basic idea is to partition the vector space into rectangular cells and then to approximate vectors by polar coordinates on the partitioned local cells. The LPC information significantly enhances the discriminatory power of the approximation. To demonstrate the effectiveness of the LPC-file, we conducted extensive experiments and compared the performance with the VA-file and the sequential scan by using synthetic and real data sets. The experimental results demonstrate that the LPC-file outperforms both of the VA-file and the sequential scan in total elapsed time and in the number of disk accesses and that the LPC-file is robust in both "good" distributions (such as random) and "bad" distributions (such as skewed and clustered).