Improved Performance of Nanotoxicity Prediction Models Using Automated Machine Learning

Xiao, Xiao; Trinh, Tung X.; Yoon, Tae-Hyun

doi:10.2139/ssrn.4010487

NANOTODAY-D-21-01586

22 Pages Posted: 17 Jan 2022

Xiao Xiao

Tung X. Trinh

Seoul National University - Department of Plastic and Reconstructive Surgery

Tae-Hyun Yoon

Hanyang University - College of Natural Science

Abstract

Computational modeling, particularly with machine learning, has attracted significant interest as a non-animal approach to nanotoxicity testing. Machine learning algorithms relate an endpoint to descriptors through mathematical functions. However, tuning all of an algorithm's parameters requires time, expertise, and an intensive search to produce optimized predictive models. Current approaches for optimizing machine learning algorithms also still require substantial computing power (e.g., graphics processing units and multi-core central processing units). Automated machine learning (autoML) approaches and publicly available platforms (e.g., Google Vertex AI, Microsoft Azure, and Dataiku) have been shown to benefit users with little machine learning knowledge by automating data preprocessing, algorithm selection, and hyperparameter selection, and by producing models from various combinations of these choices. In this study, we used autoML to develop predictive models for the cellular toxicity of metal and oxide nanoparticles and benchmarked the autoML models against machine learning (ML) models. Our results demonstrated that autoML produced higher-performance models than the ML approach. Models from all three autoML platforms performed satisfactorily, and no platform outperformed the others. Models built from datasets with higher data quality (measured using physicochemical scores) performed better. Dataset size also affected the performance of autoML models, but those effects resulted from a relationship between data quality and model performance.
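
As an illustration of the combinatorial search the abstract describes, the following is a minimal sketch of an autoML-style loop that tries every combination of a preprocessing step, an algorithm, and a hyperparameter value, then ranks the combinations by cross-validated accuracy. It is not the authors' pipeline and does not use the API of any platform named above. The data are synthetic, and the descriptor names (size_nm, dose), the labeling rule, and all function names are assumptions made only for this example.

```python
import itertools
import random


def make_toy_data(n=120, seed=0):
    """Synthetic stand-in data (NOT from the paper): two hypothetical
    descriptors and a binary 'toxic' label from an arbitrary rule."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        size_nm = rng.uniform(5.0, 100.0)   # hypothetical descriptor
        dose = rng.uniform(0.0, 200.0)      # hypothetical descriptor
        data.append(((size_nm, dose), int(dose / size_nm > 2.0)))
    return data


def identity(train_x):
    """Preprocessing option 1: leave descriptors unchanged."""
    return lambda x: x


def min_max(train_x):
    """Preprocessing option 2: scale each descriptor to [0, 1] on the training fold."""
    lows = [min(col) for col in zip(*train_x)]
    highs = [max(col) for col in zip(*train_x)]
    spans = [(h - l) or 1.0 for l, h in zip(lows, highs)]
    return lambda x: tuple((v - l) / s for v, l, s in zip(x, lows, spans))


def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def knn(k):
    """Algorithm 1: k-nearest-neighbours majority vote; k is the hyperparameter."""
    def fit(xs, ys):
        def predict(x):
            nearest = sorted(zip(xs, ys), key=lambda xy: sq_dist(xy[0], x))[:k]
            return int(2 * sum(y for _, y in nearest) > len(nearest))
        return predict
    return fit


def nearest_centroid():
    """Algorithm 2: assign the label of the closest class centroid."""
    def fit(xs, ys):
        cents = {}
        for label in set(ys):
            pts = [x for x, y in zip(xs, ys) if y == label]
            cents[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
        return lambda x: min(cents, key=lambda lab: sq_dist(cents[lab], x))
    return fit


def cv_accuracy(data, scaler_factory, fit, folds=4):
    """Plain k-fold cross-validation; the scaler is fitted on each training fold only."""
    correct = 0
    for i in range(folds):
        test = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        scale = scaler_factory([x for x, _ in train])
        predict = fit([scale(x) for x, _ in train], [y for _, y in train])
        correct += sum(predict(scale(x)) == y for x, y in test)
    return correct / len(data)


def auto_search(data):
    """Evaluate every preprocessing x model combination; return results, best first."""
    scalers = {"none": identity, "min_max": min_max}
    models = {f"knn(k={k})": knn(k) for k in (1, 3, 5, 7)}
    models["nearest_centroid"] = nearest_centroid()
    results = []
    for (s_name, scaler), (m_name, fit) in itertools.product(
        scalers.items(), models.items()
    ):
        results.append((cv_accuracy(data, scaler, fit), s_name, m_name))
    return sorted(results, reverse=True)


ranked = auto_search(make_toy_data())
for acc, s_name, m_name in ranked[:3]:
    print(f"{acc:.3f}  scaler={s_name}  model={m_name}")
```

Production autoML platforms search far larger spaces and typically use smarter strategies than exhaustive enumeration, but the basic pattern of scoring each candidate pipeline on held-out data and keeping the best one is the same idea.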

Keywords: nanotoxicity modeling, automated machine learning, oxide, metal, data quality

Suggested Citation:

Xiao, Xiao and Trinh, Tung X. and Yoon, Tae-Hyun, Improved Performance of Nanotoxicity Prediction Models Using Automated Machine Learning. NANOTODAY-D-21-01586, Available at SSRN: https://ssrn.com/abstract=4010487 or http://dx.doi.org/10.2139/ssrn.4010487