Multimodal Classification for Multi-Task Medical Datasets

12 Pages. Posted: 17 May 2025

Eduard Lloret

Shanghai Jiao Tong University (SJTU)

Jing Ke

Shanghai Jiao Tong University (SJTU) - School of Electronic Information and Electrical Engineering

Yiqing Shen

Johns Hopkins University

Xiaohang Wang

affiliation not provided to SSRN

Caifeng Wan

affiliation not provided to SSRN

Abstract

Clinical data for diagnosis comes in multiple forms, and a diagnosis rarely relies on a single type of information. In most cases, medical diagnoses integrate three key types of input: 3D or volumetric data (such as CT or MRI scans), 2D image data (such as X-rays or retinal scans), and metadata or quantitative data (including lab results, patient demographics, and clinical history). However, many current models are constrained to specific input modalities, which makes it difficult to process and analyze the full breadth of available data. Furthermore, when modalities are missing, traditional approaches often require additional computational resources to handle the gaps, making both training and inference inefficient. This paper addresses these limitations by proposing a multi-task multimodal architecture that is agnostic to the most common medical data modalities. Our approach enables a single model to classify multiple conditions, such as tumors, glaucoma, and other medical abnormalities, using a combination of 3D, 2D, and metadata inputs, without a separate model for each task. Additionally, we introduce a new batch computation technique that allows the model to handle missing modalities flexibly, making it less dependent on specific input types and improving overall efficiency. The model can therefore adapt to varying input combinations without retraining or extra computational resources. Another challenge in healthcare AI is that multi-task learning models struggle to understand and simultaneously solve diverse tasks, which usually leads to multiple models, each dedicated to a specific task. To overcome this, we leverage the general capabilities of Large Language Models (LLMs) to tackle the multi-task problem. By doing so, our system can classify different conditions (e.g., tumor detection, glaucoma diagnosis) in a single unified model, improving both performance and resource efficiency. By integrating these diverse data types and tasks into one system, our approach not only simplifies the diagnostic workflow but also improves diagnostic accuracy, reduces errors, and accelerates decision-making. This paper proposes a flexible, efficient, and scalable solution that can handle the complexity of modern medical data, offering a more adaptable AI model that works across varying modalities and tasks. As healthcare continues to rely on a multitude of data sources, such models are critical to improving patient outcomes and optimizing clinical workflows.
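The abstract does not spell out the batch computation technique, so the following is only an illustrative sketch of one common way to handle missing modalities within a batch, not the authors' method. It assumes each sample is a dictionary whose missing modalities are `None`, runs each encoder only on the sub-batch of samples that actually carry that modality, and averages the resulting embeddings per sample. The names `fuse_batch`, `mean_encoder`, and the modality keys are hypothetical, and the encoder is a toy stand-in for a real network.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types: a sample maps modality name -> feature list (or None if missing).
Sample = Dict[str, Optional[List[float]]]
Encoder = Callable[[List[List[float]]], List[List[float]]]


def mean_encoder(sub_batch: List[List[float]]) -> List[List[float]]:
    """Toy stand-in for a modality encoder: one 1-dim embedding per input."""
    return [[sum(x) / len(x)] for x in sub_batch]


def fuse_batch(
    batch: List[Sample], encoders: Dict[str, Encoder], dim: int = 1
) -> Tuple[List[Optional[List[float]]], List[int]]:
    """Encode only the modalities that are present, then average per sample.

    Returns the fused embedding of each sample (None if it has no modality)
    and the number of modalities that contributed to it.
    """
    sums = [[0.0] * dim for _ in batch]
    counts = [0] * len(batch)
    for name, encoder in encoders.items():
        # Indices of the samples that actually have this modality.
        present = [i for i, s in enumerate(batch) if s.get(name) is not None]
        if not present:
            continue  # no sample has this modality: skip the encoder entirely
        embeddings = encoder([batch[i][name] for i in present])
        # Scatter the sub-batch results back to their original positions.
        for i, emb in zip(present, embeddings):
            for d in range(dim):
                sums[i][d] += emb[d]
            counts[i] += 1
    fused = [
        [v / c for v in row] if c else None for row, c in zip(sums, counts)
    ]
    return fused, counts
```

Under these assumptions, no encoder spends compute on placeholder inputs for absent modalities, and a sample with any subset of the three input types still receives a fused representation of the same size.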

Note:
Funding Information: Natural Science Foundation of Shanghai (Grant Number: 23ZR1430700)

Conflict of Interests: The authors have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Keywords: Multimodal, Dataset, Multi-task, Diagnosis, Missing Modality

Suggested Citation

Lloret, Eduard and Ke, Jing and Shen, Yiqing and Wang, Xiaohang and Wan, Caifeng, Multimodal Classification for Multi-Task Medical Datasets. Available at SSRN: https://ssrn.com/abstract=5245393 or http://dx.doi.org/10.2139/ssrn.5245393

Eduard Lloret

Shanghai Jiao Tong University (SJTU)

800 Dongchuan Road
Shanghai, 200240
China

Jing Ke (Contact Author)

Shanghai Jiao Tong University (SJTU) - School of Electronic Information and Electrical Engineering

Yiqing Shen

Johns Hopkins University

Baltimore, MD 20036-1984
United States

Xiaohang Wang

affiliation not provided to SSRN

Nigeria

Caifeng Wan

affiliation not provided to SSRN

Nigeria
