Efficient Combination of CNN and Transformer for Dual-Teacher Uncertainty-Aware Guided Semi-Supervised Medical Image Segmentation
19 Pages Posted: 12 Apr 2022
Abstract
Background and objective: Deep learning-based methods for fast target segmentation of magnetic resonance (MR) images have become increasingly popular in recent years. However, their success in medical image segmentation usually depends on a large amount of labeled data, so time-consuming and laborious data annotation remains a major challenge. The purpose of this work is to improve the segmentation of MR images with a semi-supervised learning approach that requires only a small amount of labeled data and exploits the useful information available in a large amount of unlabeled data.
Methods: To exploit the information in the unlabeled data, we designed a method in which a Dual-Teacher structure, built from a CNN and a transformer that together form the main network, simultaneously guides a Student segmentation model. Teacher A and the Student are both CNNs, and the TA-S module they form is a mean teacher structure with added data noise. In the TB-S module, formed by the Student and Teacher B, the CNN and transformer backbones capture the local and global information of the image, respectively, and the two models create pseudo labels for each other to perform cross-supervision. The two Teachers guide the Student through synchronous training, and they rectify and exchange knowledge with each other through consistency regularization constraints, which makes better use of the valid information in the unlabeled data. In addition, the segmentation predictions of Teacher A and the Student, and of Teacher A and Teacher B, are screened by uncertainty assessment during training to improve the prediction accuracy and generalization of the model. The combined structure of the TA-S and TB-S modules is trained simultaneously to jointly guide the optimization of the Student model and improve its segmentation ability.
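The abstract does not give implementation details, so the following is only a minimal sketch of two ingredients named above: a mean teacher weight update and uncertainty-based screening of predictions. It assumes the usual mean teacher convention of updating the teacher as an exponential moving average of the student weights, and it assumes predictive entropy as the uncertainty measure. The function names, the decay value, and the threshold are all hypothetical and are not taken from the paper.

```python
import math

def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights as an exponential moving average of the
    student weights (decay value is an assumption, not from the paper)."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

def entropy(probs):
    """Shannon entropy (natural log) of one pixel's class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def confident_mask(pixel_probs, threshold):
    """Mark pixels whose predictive entropy is below the threshold.
    Entropy as the uncertainty measure is an assumption for illustration."""
    return [entropy(p) < threshold for p in pixel_probs]

def masked_consistency(student_probs, teacher_probs, mask):
    """Mean squared error between student and teacher predictions,
    averaged over the pixels kept by the mask."""
    total, count = 0.0, 0
    for s, t, keep in zip(student_probs, teacher_probs, mask):
        if keep:
            total += sum((a - b) ** 2 for a, b in zip(s, t)) / len(s)
            count += 1
    return total / count if count else 0.0
```

In this sketch, the mask would be computed from the teacher's predictions on unlabeled images, so that only low-uncertainty pixels contribute to the consistency term that trains the Student.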
Results: We evaluated the proposed method on a publicly available MRI dataset from a cardiac segmentation competition organized by MICCAI in 2017. Compared with several existing state-of-the-art semi-supervised segmentation methods, the proposed method achieves better segmentation results: a Dice coefficient of 0.878 and a Hausdorff distance (HD) of 4.9 mm with a training set containing only 10% labeled data, and 0.886 and 5.0 mm with 20% labeled data.
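For reference, the Dice coefficient reported above measures the overlap between a predicted mask P and a ground-truth mask T as 2|P ∩ T| / (|P| + |T|). The sketch below computes it for flat binary masks. The function name and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details from the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2|P ∩ T| / (|P| + |T|) for flat binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if denom == 0 else 2.0 * intersection / denom
```

For example, a prediction of `[1, 1, 0, 0]` against a ground truth of `[1, 0, 0, 0]` shares one foreground pixel out of three in total, giving a Dice coefficient of 2/3.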
Conclusion: This method combines a CNN and a transformer in a new Teacher-Student semi-supervised learning optimization strategy, which greatly improves both the utilization of large numbers of unlabeled medical images and the quality of the segmentation results.
Note:
Funding Information: This work was supported by the Natural Science Foundation of Jiangsu Province, China, under Grant BK20190079; the National Natural Science Foundation of China under Grants U2141234 and 62176105; and the Key Research and Development Program of Guangdong Province, China, under Grant 2020B1111010002.
Declaration of Interests: The authors declare no conflicts of interest.
Keywords: Magnetic resonance imaging (MRI), Deep learning, Transformer, Semi-supervised learning, Image segmentation