Preprints with The Lancet is a collaboration between The Lancet Group of journals and SSRN to facilitate the open sharing of preprints for early engagement, community comment, and collaboration. Preprints available here are not Lancet publications or necessarily under review with a Lancet journal. These preprints are early-stage research papers that have not been peer-reviewed. The usual SSRN checks and a Lancet-specific check for appropriateness and transparency have been applied. The findings should not be used for clinical or public health decision-making or presented without highlighting these facts. For more information, please see the FAQs.
Low-Cost, Transcriptional Diagnostic to Accurately Categorize Lymphomas in Low- and Middle-Income Countries
43 Pages Posted: 4 Jun 2020
Abstract
Background: The lack of access to adequate pathology services is a critical roadblock for both improvements in health and sustainable development across lower- and middle-income countries (LMICs). We hypothesized that a low-cost, parsimonious gene expression assay using paraffin-embedded biopsies from LMICs could distinguish lymphoma subtypes and guide treatment.
Methods: We reviewed all biopsies obtained between 2006 and 2018 for suspicion of lymphoma at INCAN hospital in Guatemala City. Gold-standard diagnoses were established by immunohistochemistry and FISH, then binned into 9 categories: nonmalignant, aggressive B-cell, diffuse large B-cell (DLBCL), follicular, Hodgkin, mantle cell, marginal zone, NK/T-cell, or mature T-cell lymphoma. We established a chemical ligation probe-based assay (CLPA) that quantifies expression of 37 genes by capillary electrophoresis for <$10 USD/sample. To assign bins based on gene expression, 13 models were evaluated as candidate base learners, and class probabilities from each model were then used as predictors in an extreme gradient boosting super learner. An additional two-class model was developed to classify DLBCL cell-of-origin (COO). Cases with call probabilities <0.6 were classified as indeterminate.
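The indeterminate-call rule in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the toy probabilities, and the choice to score indeterminate cases by their top call when computing overall accuracy are all assumptions made for illustration. Only the 0.6 cutoff comes from the abstract.

```python
from typing import Dict, List, Optional, Tuple

THRESHOLD = 0.6  # cutoff stated in the abstract: calls below this are indeterminate


def make_call(probs: Dict[str, float], threshold: float = THRESHOLD) -> Tuple[str, float, bool]:
    """Return (top label, its probability, indeterminate flag) for one case."""
    label, p = max(probs.items(), key=lambda kv: kv[1])
    return label, p, p < threshold


def accuracy(calls: List[Tuple[str, float, bool]], truths: List[str],
             exclude_indeterminate: bool = False) -> Optional[float]:
    """Fraction of cases whose top label matches the gold-standard label.

    Assumption: when indeterminate cases are kept, they are scored by their
    top label. The abstract does not say how such cases were scored.
    """
    pairs = [(c, t) for c, t in zip(calls, truths)
             if not (exclude_indeterminate and c[2])]
    if not pairs:
        return None
    return sum(c[0] == t for c, t in pairs) / len(pairs)


# Made-up class probabilities and gold-standard labels, for illustration only.
toy_probs = [
    {"DLBCL": 0.90, "follicular": 0.10},
    {"DLBCL": 0.55, "follicular": 0.45},
    {"Hodgkin": 0.70, "nonmalignant": 0.30},
    {"mantle cell": 0.58, "marginal zone": 0.42},
]
toy_truth = ["DLBCL", "follicular", "Hodgkin", "mantle cell"]

toy_calls = [make_call(p) for p in toy_probs]
overall = accuracy(toy_calls, toy_truth)
confident_only = accuracy(toy_calls, toy_truth, exclude_indeterminate=True)
```

In this toy set, two of the four cases fall below the cutoff, so accuracy is computed over four cases in the first figure and over the two confident calls in the second.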
Findings: Assay failure occurred in 60 of 670 biopsies (8·9%) and was enriched among Hodgkin lymphomas (24·8%). 560 diagnostic samples were divided into 70% (n=397) training and 30% (n=163) validation cohorts. Overall accuracy for the validation cohort was 86% [95% CI; 80-91%]. After excluding 28 (17%) indeterminate calls, accuracy increased to 94% [95% CI; 89-97%]. Accuracy for a cohort of relapsed/refractory biopsies (n=39) was 79%, rising to 88% after excluding indeterminate cases. Accuracy for DLBCL COO classification compared to the Hans IHC algorithm (n=51) was 80% [95% CI; 67-90%].
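As a rough check on how an interval of this width follows from the validation cohort size, the sketch below computes a Wilson score interval for a binomial proportion. The abstract states neither the number of correct calls nor the interval method, so both are assumptions here: 140 correct of 163 is simply the count that rounds to 86%, and the authors may have used a different method (for example, an exact binomial interval).

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.959964) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.959964 for 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Assumed count: 140/163 is about 86%. The true count is not given in the abstract.
low, high = wilson_interval(140, 163)
print(f"accuracy {140 / 163:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

Under these assumptions the result lands close to the reported interval, though it need not match it exactly, because a different interval method gives slightly different bounds.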
Interpretation: Machine-learning analysis of gene expression accurately classifies paraffin-embedded lymphoma biopsies from LMICs. Low-cost, open source assays could transform diagnosis, subtyping, and assessment of therapeutic targets for patients with cancer worldwide.
Funding Statement: American Society of Hematology, US State Department, ASCO, LLS, Celgene and NIH
Declaration of Interests: T.G., S.L.D. and R.T. are employees of DxTerity Diagnostics. D.M.W. is a co-founder of Travera, Ajax and Root Diagnostics. He receives consulting or advisory board fees from Magnetar, Bantam, ASELL, Ossium, Myeloid Therapeutics, Daiichi Sankyo, and Elstar. He receives research funding from Daiichi Sankyo and Verastem. The remaining authors declare no conflicts-of-interest.
Ethics Approval Statement: This study was approved by the Institutional Review Boards of Dana-Farber Cancer Institute and Stanford University and the Ethics Committee of La Liga Nacional Contra el Cáncer Research.
Keywords: Lymphoma; global health; global oncology; transcriptional profiling; diagnosis; lymphoma classification; global diagnostics; cancer diagnostics; machine learning; artificial intelligence