RHash: Robust Hashing via L_infinity-norm Distortion

Amirali Aghazadeh, Andrew Lan, Anshumali Shrivastava, Richard Baraniuk

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 1386-1394. https://doi.org/10.24963/ijcai.2017/192

Hashing is an important tool in large-scale machine learning. Unfortunately, current data-dependent hashing algorithms are not robust to small perturbations of the data points, which degrades the performance of nearest neighbor (NN) search. The culprit is the minimization of the average L_2-norm distortion among pairs of points to find the hash function. Inspired by recent progress in robust optimization, we develop a novel hashing algorithm, dubbed RHash, that instead minimizes the worst-case, L_infinity-norm distortion among pairs of points. We develop practical and efficient implementations of RHash that couple the alternating direction method of multipliers (ADMM) framework with column generation to scale well to large datasets. A range of experimental evaluations demonstrates the superiority of RHash over ten state-of-the-art binary hashing schemes. In particular, we show that RHash achieves the same retrieval performance as the state-of-the-art algorithms in terms of average precision while using up to 60% fewer bits.
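The distinction the abstract draws can be made concrete: given binary codes for a set of points, the average L_2-style distortion and the worst-case L_infinity distortion over all pairs are two different objectives. The sketch below is purely illustrative (it uses random-hyperplane codes, not RHash's learned hash) and only shows how the two distortion measures are computed over pairs; all variable names are my own.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 20 points in 8 dimensions.
X = rng.normal(size=(20, 8))

# A random-hyperplane hash (LSH-style), used only for illustration;
# RHash itself *learns* the hash by minimizing worst-case distortion.
W = rng.normal(size=(8, 16))            # 16-bit codes
B = (X @ W > 0).astype(int)             # binary codes in {0, 1}

def pairwise_distortions(X, B):
    """|normalized Hamming distance - Euclidean distance| for every pair."""
    out = []
    n = len(X)
    for i in range(n):
        for j in range(i + 1, n):
            d_orig = np.linalg.norm(X[i] - X[j])
            d_hash = np.count_nonzero(B[i] != B[j]) / B.shape[1]
            out.append(abs(d_hash - d_orig))
    return np.array(out)

delta = pairwise_distortions(X, B)
print("average distortion (L_2-style objective):", delta.mean())
print("worst-case distortion (L_infinity)      :", delta.max())
```

Minimizing `delta.mean()` tolerates a few badly distorted pairs, which is exactly the non-robustness the paper targets; minimizing `delta.max()` bounds every pair, at the cost of a harder optimization (hence the ADMM-plus-column-generation machinery).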
Keywords:
Machine Learning: Machine Learning
Natural Language Processing: Information Retrieval