LightMHS: A Smaller Hippocampus Segmentation Network Based on MobileViT
13 Pages Posted: 20 Jul 2023
Abstract
Morphological analysis and volume measurement of the hippocampus are crucial to the study of many brain diseases. A rapid and accurate hippocampus segmentation method is therefore needed to assist physicians in diagnosis and treatment. U-Net and its latest variants have become the leading networks for medical image segmentation in recent years, and Transformer-based architectures have also received extensive attention. However, many of these networks cannot be applied quickly and efficiently to medical image segmentation because of their large parameter counts, high computational complexity, and slow running speed. To this end, we combine the advantages of CNNs and Vision Transformers (ViTs) and propose LightMHS, a lightweight model for 3D hippocampus segmentation. To obtain local context information, the encoder first uses a 3D CNN to extract spatial feature maps, and we propose an attention module to learn spatial and channel relationships. Considering the importance of both local features and global semantics for 3D segmentation, we introduce a lightweight ViT to learn scale-invariant high-level features and further fuse local and global information. To evaluate the effectiveness of the encoder's feature representation, we design three decoders of different complexity to generate segmentation maps. We validate our model on three public hippocampus datasets, and the experimental results show that, compared with other models, LightMHS achieves better performance with fewer parameters and lower computational complexity. The code is available at https://github.com/zyh202127/LightMHS.
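As a rough illustration of what "learning spatial and channel relationships" on a 3D feature map can look like, the NumPy sketch below applies a channel gate followed by a spatial gate to a tensor of shape (channels, depth, height, width). It is not the attention module from the paper: the function names, the squeeze-and-excitation style channel gate, the weight shapes, and the use of a plain sum in place of a learned 3D convolution in the spatial gate are all assumptions made for this example.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Hypothetical channel gate (squeeze-and-excitation style).

    feat: array of shape (C, D, H, W)
    w1:   array of shape (C // r, C), reduction weights (assumed)
    w2:   array of shape (C, C // r), expansion weights (assumed)
    """
    squeezed = feat.mean(axis=(1, 2, 3))      # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)   # ReLU bottleneck -> (C // r,)
    gate = sigmoid(w2 @ hidden)               # per-channel weight -> (C,)
    return feat * gate[:, None, None, None]


def spatial_attention(feat):
    """Hypothetical spatial gate: one weight per voxel.

    The channel-wise mean and max maps are simply summed here; a trained
    model would typically mix them with a learned 3D convolution instead.
    """
    avg_map = feat.mean(axis=0)               # (D, H, W)
    max_map = feat.max(axis=0)                # (D, H, W)
    gate = sigmoid(avg_map + max_map)         # (D, H, W)
    return feat * gate[None]


# Toy input: 8 channels over a 4 x 6 x 6 volume, with random weights.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 6, 6))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1

out = spatial_attention(channel_attention(feat, w1, w2))
print(out.shape)  # (8, 4, 6, 6)
```

Both gates only rescale the input, so the output keeps the input's shape and can be passed on to later encoder stages unchanged in size. The repository linked in the abstract is the place to look for the module actually used in LightMHS.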
Note:
Funding declaration: This work was supported by the Natural Science Foundation of Jiangsu Province under Grant BK20190079.
Conflict of Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Keywords: Vision Transformer, 3D CNN, Lightweight, Multi-scale feature fusion, Hippocampus segmentation