Ancestry Analysis Using an Innovative 56 AIM-InDel Panel and Machine Learning Methods

24 Pages Posted: 6 Dec 2022

Zhu Bofeng

Southern Medical University - Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification

Liu Liu

Southern Medical University

Shuanglin Li

Southern Medical University

Wei Cui

Southern Medical University

Yating Fang

Southern Medical University

Shuyan Mei

Southern Medical University

Man Chen

Southern Medical University

Hui Xu

Southern Medical University

Xiaole Bai

Southern Medical University

Abstract

Insertion/deletion polymorphisms (InDels) can serve as ancestry informative markers (AIMs) in ancestry analysis. In this study, an innovative panel of 56 AIM-InDel loci was used to investigate the genetic structure of the Inner Mongolia Manchu (IMM) group and its genetic relationships with 26 reference populations. The IMM group was genetically close to East Asian populations, especially the Han Chinese in Beijing. Moreover, populations from northern and southern East Asia showed clear differences in ancestral components, suggesting that the panel may help distinguish northern from southern East Asian populations. Four machine learning models were then built on the 56 InDel loci to evaluate the panel's performance in ancestry prediction. The random forest model performed better in ancestry prediction, with accuracies of 91.87% for the five continental populations and 99.73% for the three continental populations. The random forest model assigned all IMM individuals to East Asian populations, and these individuals were more closely related to the northern East Asian populations. Furthermore, the random forest model distinguished 87.18% of the IMM individuals from the six East Asian groups, suggesting that a random forest model based on the 56 AIM-InDels could be a potential tool for ancestry analysis.

Keywords: Ancestry analysis / Insertion/deletion polymorphisms / Inner Mongolia Manchus / Genetic relationship / Machine learning
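
To make the idea of predicting ancestry from biallelic InDel genotypes concrete, here is a minimal sketch in Python. It does not reproduce the study's random forest or its data. Instead it swaps in a simpler, standard-library-only technique: each simulated individual is assigned to the population under which its genotype vector has the highest likelihood, assuming Hardy-Weinberg proportions and independent loci. The population labels, the allele frequencies, and all function names are hypothetical; only the panel size of 56 loci comes from the abstract.

```python
import math
import random

N_LOCI = 56  # panel size from the abstract; everything else here is simulated


def simulate_frequencies(populations, rng):
    """Draw a made-up insertion-allele frequency per locus for each population."""
    return {pop: [rng.uniform(0.05, 0.95) for _ in range(N_LOCI)]
            for pop in populations}


def simulate_individual(freqs, rng):
    """Genotype at each locus = number of insertion alleles (0, 1 or 2)."""
    return [(rng.random() < p) + (rng.random() < p) for p in freqs]


def log_likelihood(genotype, freqs):
    """Log-probability of a genotype vector under Hardy-Weinberg proportions."""
    total = 0.0
    for g, p in zip(genotype, freqs):
        q = 1.0 - p
        # Probabilities of carrying 0, 1 or 2 insertion alleles.
        total += math.log((q * q, 2.0 * p * q, p * p)[g])
    return total


def assign(genotype, freq_table):
    """Return the population label with the highest genotype likelihood."""
    return max(freq_table,
               key=lambda pop: log_likelihood(genotype, freq_table[pop]))


def accuracy(freq_table, n_per_pop, rng):
    """Fraction of simulated individuals assigned back to their source population."""
    correct = 0
    total = 0
    for pop, freqs in freq_table.items():
        for _ in range(n_per_pop):
            correct += assign(simulate_individual(freqs, rng), freq_table) == pop
            total += 1
    return correct / total


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the simulation is repeatable
    table = simulate_frequencies(["POP_A", "POP_B", "POP_C"], rng)
    print(f"assignment accuracy on simulated data: {accuracy(table, 100, rng):.3f}")
```

Because the simulated frequencies differ strongly between populations, accuracy on this toy data says nothing about the real panel; the sketch only shows the shape of the computation (genotypes coded 0/1/2, one score per candidate population, highest score wins).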

Suggested Citation

Bofeng, Zhu and Liu, Liu and Li, Shuanglin and Cui, Wei and Fang, Yating and Mei, Shuyan and Chen, Man and Xu, Hui and Bai, Xiaole, Ancestry Analysis Using an Innovative 56 AIM-InDel Panel and Machine Learning Methods. Available at SSRN: https://ssrn.com/abstract=4282572 or http://dx.doi.org/10.2139/ssrn.4282572

Zhu Bofeng (Contact Author)

Southern Medical University - Guangzhou Key Laboratory of Forensic Multi-Omics for Precision Identification

Liu Liu

Southern Medical University

Guangzhou
China

Shuanglin Li

Southern Medical University

Guangzhou
China

Wei Cui

Southern Medical University

Guangzhou
China

Yating Fang

Southern Medical University

Guangzhou
China

Shuyan Mei

Southern Medical University

Guangzhou
China

Man Chen

Southern Medical University

Guangzhou
China

Hui Xu

Southern Medical University

Guangzhou
China

Xiaole Bai

Southern Medical University

Guangzhou
China
