Mind Your Ps: Using Probability in the Interpretation of Molecular Epidemiology Data

61 pages. Posted: 14 Jul 2021


Ana Raquel Penedos

Public Health England

Aurora Fernández-García

Carlos III Institute of Health - National Centre for Microbiology

Mihaela Lazar

Romanian Institute of Science and Technology - National Institute of Research and Development for Microbiology and Immunology „Cantacuzino”

Kajal Ralh

Government of the United Kingdom - National Infection Service

David Williams

Government of the United Kingdom - National Infection Service

Kevin Brown

Public Health England

Date Written: May 16, 2021

Abstract

Background
Assessing the relatedness of pathogen sequences in clinical samples is a core goal in molecular epidemiology. Tools for Bayesian phylogenetic analysis, such as the BEAST software package, have typically been used to analyse sequence/time data in the field. However, they are intensive in computation, time, and expertise, demanding resources that many laboratories do not have available or cannot allocate frequently.

Methods
To evaluate a faster and simpler alternative method to support the routine interpretation of sequence data for epidemiology, we obtained sequences for two regions in the measles virus genome, N-450 and MF-NCR, from patient samples of genotypes B3, D4 and D8 taken between 2011 and 2017 in the UK and Romania. We developed a mathematical model to exclude epidemiological relatedness between pairs of sequences; it incorporates time, possible shared ancestry, and the Poisson distribution describing the expected number of substitutions at a given time point. The model was validated against the commonly used Bayesian phylogenetic method using an independent dataset collected in 2017-19.
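As a rough illustration of the kind of calculation the Methods describe, the sketch below computes the Poisson probability of seeing at least a given number of substitutions between two sequences. It is not the authors' model or their spreadsheet. The function names, the way shared ancestry is handled (adding twice a hypothetical ancestor window to the sampling interval), and the substitution rate used in the example are all assumptions made for illustration. The only value taken from the locus itself is the N-450 length of 450 nucleotides.

```python
import math


def poisson_sf(k: int, lam: float) -> float:
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    # Sum the probability mass for 0 .. k-1 and take the complement.
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def prob_at_least(differences: int,
                  rate_per_site_year: float,
                  n_sites: int,
                  years_apart: float,
                  ancestor_window_years: float = 0.0) -> float:
    """Probability of observing at least `differences` substitutions.

    ASSUMPTION (not from the paper): the two sequences are separated by the
    sampling interval plus two branches of length `ancestor_window_years`
    leading back to a possible shared ancestor.
    """
    total_time = years_apart + 2.0 * ancestor_window_years
    lam = rate_per_site_year * n_sites * total_time  # expected substitutions
    return poisson_sf(differences, lam)


# Hypothetical usage: the rate below is a placeholder, not a published estimate.
p = prob_at_least(differences=3, rate_per_site_year=1e-3, n_sites=450,
                  years_apart=0.5, ancestor_window_years=0.25)
print(f"P(at least 3 substitutions) = {p:.4f}")
```

Under this sketch, a very small probability would argue against the two samples being related within the chosen time frame, while a larger one would mean relatedness cannot be excluded. The threshold for "very small" is a choice left to the user.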

Findings
We demonstrate that our model, using time and sequence information to predict whether two samples may be related within a given time frame, minimises the risk of erroneous exclusion of relatedness. An easy-to-use implementation in the form of a guide and spreadsheet is provided for convenient application.

Interpretation
The proposed model only requires a previously calculated substitution rate for the locus and pathogen of interest. It allows for an informed but quick decision on the likelihood of relatedness between two samples within a time frame, without the need for phylogenetic reconstruction, thus facilitating rapid epidemiological interpretation of sequence data.

Funding
This work was funded by Public Health England (PHE). The World Health Organization European Regional Office funded the training visits of Aurora Fernández-García and Mihaela Lazar to PHE.

Keywords: Measles, outbreak, elimination, epidemiology, molecular epidemiology, clinical virology

Suggested Citation

Penedos, Ana Raquel and Fernández-García, Aurora and Lazar, Mihaela and Ralh, Kajal and Williams, David and Brown, Kevin, Mind Your Ps: Using Probability in the Interpretation of Molecular Epidemiology Data (May 16, 2021). Available at SSRN: https://ssrn.com/abstract=3872835 or http://dx.doi.org/10.2139/ssrn.3872835

Ana Raquel Penedos (Contact Author)

Public Health England

61 Colindale Avenue
London, NW9 5EQ
United Kingdom

Aurora Fernández-García

Carlos III Institute of Health - National Centre for Microbiology

Monforte de Lemos 5
Madrid, Madrid 28029
Spain

Mihaela Lazar

Romanian Institute of Science and Technology - National Institute of Research and Development for Microbiology and Immunology „Cantacuzino”

Splaiul Independenţei nr. 103, Sector 5
050096 Municipiul București
Romania

Kajal Ralh

Government of the United Kingdom - National Infection Service

United Kingdom

David Williams

Government of the United Kingdom - National Infection Service

United Kingdom

Kevin Brown

Public Health England

Wellington House,
133-155 Waterloo Rd,
London, SE1 8UG
