Multifactorial Analysis of Fluorescence Detection for Soil Total Petroleum Hydrocarbons Using Random Forest and Multiple Linear Regression
26 Pages Posted: 30 Jul 2024
Abstract
This study combined Random Forest (RF) and Multiple Linear Regression (MLR) approaches to analyze how various factors influence the fluorescence detection of total petroleum hydrocarbons in soil. We considered the effects of soil moisture, organic matter, and minerals, and tested samples of three common soil types at varying petroleum hydrocarbon concentrations using a self-developed fluorescence imaging technology. The fluorescence signals are strongly influenced by moisture, organic matter, and minerals, and these effects differ with soil type and hydrocarbon concentration. The RF model improves accuracy and consistency by aggregating many decision trees, which makes it appropriate for non-linear and high-dimensional data, although it underperformed in our study. The MLR model describes the linear relationships between variables and showed better statistical performance and consistency in most cases of our experiment, with a coefficient of determination (R²) above 0.8 and with Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE) all lower than those of the RF. Our research provides a scientific basis for monitoring, evaluating, and managing soil petroleum hydrocarbon pollution, supports the formulation of effective soil pollution prevention strategies, and offers a foundation for further research into environmental risk assessment and soil remediation.
Keywords: Polycyclic aromatic hydrocarbons, Random forest, Multiple linear regression, Influencing factors
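The four statistics used in the abstract to compare the RF and MLR models (R², MAE, MSE, RMSE) can be computed from paired observed and predicted values. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, MSE and RMSE for paired observed/predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(r) for r in residuals) / n   # mean absolute error
    mse = sum(r * r for r in residuals) / n    # mean squared error
    rmse = math.sqrt(mse)                      # root mean squared error

    # R2 = 1 - SS_res / SS_tot; undefined when the observations are constant.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(r * r for r in residuals)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"R2": r2, "MAE": mae, "MSE": mse, "RMSE": rmse}


# Hypothetical values for illustration only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
metrics = regression_metrics(observed, predicted)
```

Applying the same function to each model's predictions on a shared test set gives the kind of comparison the abstract reports: a higher R² together with lower MAE, MSE, and RMSE indicates the better-fitting model.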