[ICDE'12] Jia Pan, et.al.

Bi-level Locality Sensitive Hashing for k-Nearest Neighbor Computation

Jia Pan, et.al. on April 1, 2012
doi.org
obsidian에서 수정하기

Abstract

A new Bi-level LSH algorithm to perform approximate k-nearest neighbor search in high dimensional spaces that maps well to current GPU architectures and can improve the quality of approximate KNN queries as compared to prior LSH-based algorithms. We present a new Bi-level LSH algorithm to perform approximate k-nearest neighbor search in high dimensional spaces. Our formulation is based on a two-level scheme. In the first level, we use a RP-tree that divides the dataset into sub-groups with bounded aspect ratios and is used to distinguish well-separated clusters. During the second level, we compute a single LSH hash table for each sub-group along with a hierarchical structure based on space-filling curves. Given a query, we first determine the sub-group that it belongs to and perform k-nearest neighbor search within the suitable buckets in the LSH hash table corresponding to the sub-group. Our algorithm also maps well to current GPU architectures and can improve the quality of approximate KNN queries as compared to prior LSH-based algorithms. We highlight its performance on two large, high-dimensional image datasets. Given a runtime budget, Bi-level LSH can provide better accuracy in terms of recall or error ration. Moreover, our formulation reduces the variation in runtime cost or the quality of results.

Figure

figure 1 figure 1

figure 2 figure 2

figure 3 figure 3

figure 4 figure 4

figure 5 figure 5

figure 6 figure 6

figure 7 figure 7

figure 8 figure 8

figure 9 figure 9

figure 10 figure 10

figure 11 figure 11

figure 12 figure 12

figure 13 figure 13

figure 14 figure 14

Table