Abstract
Ranking Preserving Hashing for Fast Similarity Search / 3911
Qifan Wang, Zhiwei Zhang, Luo Si
PDF
Hashing method becomes popular for large scale similarity search due to its storage and computational efficiency. Many machine learning techniques, ranging from unsupervised to supervised, have been proposed to design compact hashing codes. Most of the existing hashing methods generate binary codes to efficiently find similar data examples to a query. However, the ranking accuracy among the retrieved data examples is not modeled. But in many real world applications, ranking measure is important for evaluating the quality of hashing codes.In this paper, we propose a novel Ranking Preserving Hashing (RPH) approach that directly optimizes a popular ranking measure, Normalized Discounted Cumulative Gain (NDCG), to obtain effective hashing codes with high ranking accuracy. The main difficulty in the direct optimization of NDCG measure is that it depends on the ranking order of data examples, which forms a non-convex non-smooth optimization problem. We address this challenge by optimizing the expectation of NDCG measure calculated based on a linear hashing function. A gradient descent method is designed to achieve the goal. An extensive set of experiments on two large scale datasets demonstrate the superior ranking performance of the proposed approach over several state-of-the-art hashing methods.