HAVT: Hierarchical Attention Vision Transformer for Fine-Grained Visual Classification

30 Pages Posted: 10 Jun 2022

Xiaobin Hu

affiliation not provided to SSRN

Shining Zhu

affiliation not provided to SSRN

Taile Peng

affiliation not provided to SSRN

Abstract

Recently, the Vision Transformer has made a breakthrough in image recognition: its self-attention mechanism generates attention weights that extract discriminative token information from each image patch and connect it to the class token, making it well suited to fine-grained image classification. Nevertheless, the class token in the deep layers tends to ignore local features between layers. In addition, the embedding layer feeds fixed-size patches into the network, inevitably introducing additional image noise. We therefore propose HAVT, a Hierarchical Attention Vision Transformer built on the Transformer framework. An attention-guided data augmentation module uses the attention weights as a guide, reducing the impact of noise from fixed-size patches. A Hierarchical Attention Selection (HAS) module is then proposed to filter and fuse the tokens between hierarchies, effectively guiding the network to select discriminative tokens at each level. The effectiveness of HAVT is finally validated on two popular fine-grained datasets.

Keywords: Fine-Grained visual classification, Vision transformer, Hierarchical attention selection, Attention-guided data augmentation
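The full paper is not reproduced on this page, but the hierarchical attention selection idea the abstract describes — using the class token's attention weights to pick out discriminative patch tokens at each layer and fuse them across hierarchies — can be sketched as follows. The function name, tensor shapes, and top-k selection rule are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hierarchical_attention_select(layer_tokens, layer_attn, k=4):
    """Sketch: select the top-k patch tokens from each layer, guided by
    the class token's attention, and concatenate them across layers.

    layer_tokens: list of [B, N+1, D] token arrays (index 0 = class token).
    layer_attn:   list of [B, H, N+1, N+1] attention maps, one per layer.
    """
    selected = []
    for tokens, attn in zip(layer_tokens, layer_attn):
        # Average over heads; row 0 is the class token's attention,
        # columns 1: are its weights on the patch tokens.
        cls_to_patch = attn.mean(axis=1)[:, 0, 1:]           # [B, N]
        topk = np.argsort(-cls_to_patch, axis=-1)[:, :k]     # [B, k]
        sel = np.take_along_axis(tokens[:, 1:, :], topk[..., None], axis=1)
        selected.append(sel)                                 # [B, k, D]
    return np.concatenate(selected, axis=1)                  # [B, L*k, D]

# Toy usage: 2 layers, batch 1, 9 patches (+1 class token), 8-dim tokens, 3 heads.
rng = np.random.default_rng(0)
tokens = [rng.standard_normal((1, 10, 8)) for _ in range(2)]
attn = [rng.random((1, 3, 10, 10)) for _ in range(2)]
fused = hierarchical_attention_select(tokens, attn, k=4)
print(fused.shape)  # (1, 8, 8)
```

The fused tokens from all hierarchies would then feed a classification head; the same class-token attention map could also drive the attention-guided augmentation the abstract mentions (e.g., cropping or masking the most-attended image regions).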

Suggested Citation

Hu, Xiaobin and Zhu, Shining and Peng, Taile, HAVT: Hierarchical Attention Vision Transformer for Fine-Grained Visual Classification. Available at SSRN: https://ssrn.com/abstract=4132949 or http://dx.doi.org/10.2139/ssrn.4132949

Taile Peng (Contact Author)

affiliation not provided to SSRN

Paper statistics

Downloads: 118
Abstract Views: 362
Rank: 510,990