Bilevel Visual Words Coding for Image Classification / 1883
Jiemi Zhang, Chenxia Wu, Deng Cai, Jianke Zhu

Bag-of-Words approach has played an important role in recent works for image classification. In consideration of efficiency, most methods use k-means clustering to generate the codebook. The obtained codebooks often lose the cluster size and shape information with distortion errors and low discriminative power. Though some efforts have been made to optimize codebook in sparse coding, they usually incur higher computational cost. Moreover, they ignore the correlations between codes in the following coding stage, that leads to low discriminative power of the final representation. In this paper, we propose a bilevel visual words coding approach in consideration of representation ability, discriminative power and efficiency. In the bilevel codebook generation stage, k-means and an efficient spectral clustering are respectively run in each level by taking both class information and the shapes of each visual word cluster into account. To obtain discriminative representation in the coding stage, we design a certain localized coding rule with bilevel codebook to select local bases. To further achieve an efficient coding referring to this rule, an online method is proposed to efficiently learn a projection of local descriptor to the visual words in the codebook. After projection, coding can be efficiently completed by a low dimensional localized soft-assignment. Experimental results show that our proposed bilevel visual words coding approach outperforms the state-of-the-art approaches for image classification.