Abstract
The diagnostic evaluation and clinical characterization of rare hereditary anemia (RHA) remain challenging. In particular, little is known about the broad metabolic impact of many of the molecular defects underlying RHA. In this study we explored the potential of untargeted metabolomics to diagnose a relatively common type of RHA: pyruvate kinase deficiency (PKD). In total, 1,903 unique metabolite features were identified in dried blood spot samples from 16 PKD patients and 32 healthy controls. A metabolic fingerprint was identified using a machine learning algorithm, and a binary classification model was subsequently designed. The model showed high performance characteristics (AUC 0.990, 95% CI: 0.981-0.999), and class assignment was accurate for all newly added control samples (n=13) and for all but one of the newly added patient samples (n=6) (accuracy 94%). Important metabolites in the metabolic fingerprint included glycolytic intermediates, polyamines and several acyl carnitines. In general, the application of untargeted metabolomics in dried blood spots is a novel functional tool that holds promise for diagnostic stratification and for studies of disease pathophysiology in RHA.
Introduction
The group of rare hereditary anemias (RHA) includes a large variety of intrinsic defects of red blood cells and erythropoiesis. Our knowledge of the pathophysiology of RHA has recently vastly improved, powered by genetic testing and subsequent increased knowledge of underlying molecular defects.1-4 However, in a substantial number of patients, the clinical phenotype does not fit classical disease criteria, the response to therapy is unexpectedly poor, or a molecular defect cannot be identified.5-7 In addition, in patients with well-described genetic defects, there is often no clear genotype-phenotype correlation.7-9
Pyruvate kinase deficiency (PKD; OMIM 266200), the most common red cell glycolytic enzyme defect, is no exception in this respect. The clinical phenotype of PKD varies widely, from well-compensated hemolytic anemia to severe hemolysis and neonatal mortality. Currently the diagnosis of PKD relies on the measurement of PK activity and/or the identification of homozygous or compound heterozygous mutations in the PKLR gene.10,11
However, in a significant number of patients only one mutation is identified. In addition, the exact mechanisms leading to reduced lifespan of PK-deficient erythrocytes are still largely unknown. Thus, in order to improve the diagnostic evaluation as well as our understanding of PKD pathophysiology and the genotype-to-phenotype correlation, novel functional tests are needed.
In this study we demonstrate the potential of untargeted metabolomics in dried blood spots (DBS) in the diagnostic evaluation of PKD and report for the first time a metabolic fingerprint for PKD.
Methods
Samples
Sixteen patients diagnosed with PKD based on clinical phenotype, enzyme activity assays and molecular defect were included. Samples from healthy individuals (HC; institutional blood donor service) served as controls. All patients or their legal guardians approved the use of remnant samples for method development and validation, in agreement with institutional and national regulations. All procedures followed were in accordance with the ethical standards of the University Medical Center Utrecht and with the Helsinki Declaration of 1975, as revised in 2000. In order to obtain DBS, 50 µL aliquots of blood were spotted onto Guthrie card filter paper (Whatman no. 903 Protein Saver™ cards). Filter paper was left to dry for at least 4 hours at room temperature, and subsequently stored at -80°C in a foil bag with a desiccant package pending further analysis.
Metabolic profiling
Sample preparation, direct infusion high resolution mass spectrometry (DI-HRMS) and data processing were performed as previously reported.12,13 Mass peak intensities for metabolite annotation were averaged over technical triplicates. In addition, as DI-HRMS is unable to separate isomers, mass peak intensities consisted of the summed intensities of these isomers. Metabolite annotation was performed using a peak calling bioinformatics pipeline developed in R (https://github.com/UMCUGenetics/DIMS), based on the Human Metabolome Database (version 3.6). This resulted in 3,835 metabolite annotations corresponding to 1,903 unique metabolite features.14
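In order to compare the metabolic profiles between HC and PKD, mass peak intensities for each identified feature were converted to Z-scores. These scores, based on metabolic control samples that were added to each DI-HRMS run, were calculated by the following formula:
$$Z = \frac{I_{\text{sample}} - \mu_{\text{controls}}}{\sigma_{\text{controls}}}$$
where $I_{\text{sample}}$ is the mass peak intensity of the feature in the sample, and $\mu_{\text{controls}}$ and $\sigma_{\text{controls}}$ are the mean and standard deviation of the intensity of that feature in the metabolic control samples**.
**Metabolic controls consist of a batch of banked DBS samples from individuals in whom an inborn error of metabolism (IEM) was excluded after an extensive diagnostic workup.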
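The Z-score conversion described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline (which is written in R): it assumes the conventional definition (intensity minus the mean of the metabolic controls, divided by their standard deviation) and the sample standard deviation (n - 1). The function name, feature names and intensities are hypothetical.

```python
from statistics import mean, stdev


def z_scores(sample_intensities, control_intensities):
    """Return one Z-score per feature for a single sample.

    sample_intensities:  dict mapping feature -> intensity in the sample
    control_intensities: dict mapping feature -> list of intensities in the
                         metabolic control samples of the same run
    """
    scores = {}
    for feature, intensity in sample_intensities.items():
        controls = control_intensities[feature]
        mu = mean(controls)
        sd = stdev(controls)  # sample SD (n - 1); an assumption
        scores[feature] = (intensity - mu) / sd
    return scores


# Hypothetical intensities, for illustration only.
controls = {"feature_a": [90.0, 100.0, 110.0], "feature_b": [48.0, 50.0, 52.0]}
patient = {"feature_a": 130.0, "feature_b": 46.0}
result = z_scores(patient, controls)
```

Because the controls are included in every run, scoring each sample against the controls of its own run keeps Z-scores comparable across runs.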
Data analysis
T-test and multivariate analysis were conducted in MetaboAnalyst.15 Classification was performed in R (version 3.6.1) using the caret package, which provides a set of data processing functions that facilitate the generation of predictive models. A support vector machine (SVM) with a linear kernel, the simplest kernel function, was used to classify HC and PKD samples. SVM algorithms rely on a set of mathematical functions known as the kernel, which takes the data as input and transforms them into the required form (for example, linear or polynomial). An SVM is a supervised machine learning model whose classification method is based on mapping the data into a high-dimensional space.
This allows the separation of two groups of samples into distinctive regions by the identification of a small fraction of samples that separates the groups, also referred to as ‘support vectors’. Separation can be achieved by identifying a separating hyperplane, or decision boundary, between the support vectors.16 Classification of the test set was determined by projecting each of the new samples into this space. Data and R code are available upon request.
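To make the idea of a separating hyperplane concrete, the following minimal sketch fits a linear SVM to toy two-feature data. It is not the authors' R/caret code: it swaps in plain batch sub-gradient descent on the regularised hinge loss as the optimiser, and the hyperparameters (`lam`, `lr`, `epochs`), function names and data points are all assumptions chosen for illustration.

```python
def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the side of the hyperplane."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Fit w and b by batch sub-gradient descent on the regularised hinge loss.

    X: list of feature vectors; y: labels coded as -1 (HC) or +1 (PKD).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of (lam / 2) * ||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Only samples on or inside the margin contribute to the update.
            if yi * decision_value(w, b, xi) < 1:
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Assign +1 (PKD) or -1 (HC) from the sign of the decision value."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical Z-scores for two features: four controls, four patients.
X_train = [[0.1, -0.2], [-0.3, 0.4], [0.2, 0.1], [-0.1, -0.4],
           [3.0, 2.5], [2.6, 3.2], [3.4, 2.9], [2.8, 3.1]]
y_train = [-1, -1, -1, -1, 1, 1, 1, 1]

w, b = train_linear_svm(X_train, y_train)
train_predictions = [predict(w, b, x) for x in X_train]
new_predictions = [predict(w, b, [2.9, 3.0]), predict(w, b, [0.0, 0.1])]
```

Only samples on or inside the margin change the hyperplane during training, which is why the final boundary depends on a small subset of samples, the support vectors.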
Results
Explorative untargeted metabolomics analysis
A total of 1,903 unique metabolite features (and their respective isomers) were analyzed for 16 PKD patients and 32 HC samples. Clinical and laboratory characteristics, and baseline comparison are summarized in Table 1. The most significant differences between the groups, identified by a t-test, included glycolytic intermediates such as phosphoenolpyruvic acid and 2-/3-phosphoglyceric acid, polyamines (spermidine and spermine) and several acyl carnitines (methylmalonylcarnitine and propionylcarnitine) (Figure 1A). Broad data exploration to assess the variation between samples and separation between groups was performed by unsupervised principal component analysis (PCA) and supervised partial least squares discriminant analysis (PLS-DA), the latter taking group label into account as a response variable. Both analyses revealed close clustering of control samples and a more heterogeneous delineation for PKD patients (Online Supplementary Figure S1).
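A per-feature t-test of this kind can be sketched as follows. The text does not state which t-test variant was used, so Welch's unequal-variance statistic is assumed here; the function name and Z-score values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for the difference in means of two groups."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical Z-scores of one feature in patients and controls.
pkd_values = [2.0, 3.0, 4.0]
hc_values = [0.0, 1.0, -1.0]
t_statistic = welch_t(pkd_values, hc_values)
```

Applying the statistic to every feature and ranking by absolute value gives an ordering of the features that differ most between groups.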
Machine learning algorithm identifies metabolic profile for PKD
In order to explore the potential of this extensive metabolic fingerprint in predicting PKD, a binary classification model was constructed using an SVM with a linear kernel. SVM has advantages over PLS-DA with regard to robustness to outliers, resistance to overfitting and predictive power.16 An optimal hyperplane to separate classes based on all metabolomics data was determined by cross-validation (4-fold, five repeats). The final model had high performance characteristics, with an average accuracy of 96%.
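The repeated cross-validation scheme (4-fold, five repeats) can be illustrated by generating the train/test index splits for a cohort of 16 PKD and 32 HC samples. This is a sketch only: stratifying the folds by class is an assumption, and the function name and seed are hypothetical.

```python
import random


def repeated_stratified_folds(labels, k=4, repeats=5, seed=0):
    """Return (train_idx, test_idx) pairs for repeated stratified k-fold CV."""
    rng = random.Random(seed)
    splits = []
    for _ in range(repeats):
        folds = [[] for _ in range(k)]
        # Deal the shuffled indices of each class round-robin over the folds,
        # so every fold keeps the class proportions of the whole cohort.
        for cls in sorted(set(labels)):
            idx = [i for i, lab in enumerate(labels) if lab == cls]
            rng.shuffle(idx)
            for pos, i in enumerate(idx):
                folds[pos % k].append(i)
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            splits.append((train_idx, sorted(test_idx)))
    return splits


labels = ["PKD"] * 16 + ["HC"] * 32
splits = repeated_stratified_folds(labels)
```

With these settings the model is fitted 20 times, each time on 36 samples and evaluated on the remaining 12, and the reported accuracy is the average over those 20 evaluations.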
In addition, receiver operating characteristic curves with area under the curve (AUC) were used as a performance indicator (Online Supplementary Figure S2A). Important features for classification in this model include the polyamines spermidine and spermine, as well as phosphoenolpyruvic acid, 2-/3-phosphoglyceric acid and glutathione (Figure 1B). Most of these features were increased in PKD, with the exception of glutathione and asparaginyl-proline/prolylasparagine (Figure 1C).
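For readers less familiar with the AUC, it equals the probability that a randomly chosen patient receives a higher classifier score than a randomly chosen control. The sketch below computes it by direct pairwise comparison; the function name and the scores are hypothetical and unrelated to the study data.

```python
def roc_auc(scores, labels, positive="PKD"):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, lab in zip(scores, labels) if lab == positive]
    neg = [s for s, lab in zip(scores, labels) if lab != positive]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # patient ranked above control
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores, for illustration only.
example_scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1]
example_labels = ["PKD", "PKD", "PKD", "HC", "HC", "HC"]
auc = roc_auc(example_scores, example_labels)
```

In this toy example eight of the nine patient-control pairs are ordered correctly, so the AUC is 8/9.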
Metabolic profile predicts new samples with high accuracy
External model validation was performed by predicting new control (n=13) and PKD samples (n=6). This resulted in accurate prediction for all controls, and all but one patient (accuracy 94%) (Figure 1D). In order to assess uncertainty of the model and its predictive ability, bootstrap resampling was applied to the complete dataset. By randomly generating training and validation (test) data from the original data, a similarly high prediction performance was achieved, supporting the validity of the presented model (Online Supplementary Figure S2B).
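The external validation result can be restated as a confusion matrix: 13 of 13 controls and 5 of 6 patients were classified correctly. The short sketch below derives accuracy, sensitivity and specificity from those counts; the function name is hypothetical, and sensitivity and specificity are not reported in the text but follow directly from the same counts.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Counts taken from the external validation set described in the text:
# 5 of 6 patients and 13 of 13 controls were assigned correctly.
metrics = classification_metrics(tp=5, fn=1, tn=13, fp=0)
```

The resulting accuracy is 18/19 (about 0.947), which the text reports as 94%.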
Metabolic profiles reflect PKD disease severity
In order to explore the heterogeneity of PKD metabolic profiles in relation to clinical phenotype, PCA and PLS-DA were performed for the entire group of patients and controls. Based on the presence of a spleen and transfusion frequency, phenotypes were classified as mild, moderate or severe. The metabolic profiles of patients with a mild phenotype most closely resembled those of controls, followed by those of severely affected patients (Online Supplementary Figure S4).
Discussion
In this study we performed untargeted metabolomics on DBS and report for the first time a metabolic disease fingerprint for PKD. By establishing a predictive machine learning model, the diagnostic potential of this approach was demonstrated. This metabolic fingerprint has potential to mature into a powerful clinical tool, capable of confirming or ruling out the diagnosis of PKD. However, the limitations of machine learning models were also demonstrated by the incorrect classification of one PKD patient who was homozygous for the common p.(Arg510Gln) mutation.17 Clinically, this patient exhibited very mild phenotypic features. As confirmed by the clinical severity PLS-DA, patients with a mild phenotype and controls overlap most in their DBS metabolome (Online Supplementary Figure S4). Since approximately 30% of the initial cohort consists of such mildly affected patients, this could further explain why PCA and PLS-DA were unable to achieve separation between groups.
Interestingly, severely affected patients who are heavily transfused (>6 erythrocyte transfusions in the past 12 months) despite having undergone a splenectomy, still showed a clearly distinctive metabolic profile compared to HC and two of them were furthermore correctly assigned as patients (Figure 1D; Online Supplementary Table S1). Although numbers are modest and further studies are needed, this indicates that this approach is reliable even in the setting of transfusions.
Our approach using untargeted metabolomics provides novel insights regarding the broad metabolic impact of PKD that could be relevant to better understand the etiology of PKD-related symptoms. While glycolytic metabolites and their disturbance have been characterized to some extent, little is known regarding the broad-scale impact of PKD on metabolism. In this respect, novel distinctive metabolites such as polyamines, which have been found to stabilize the red blood cell (RBC) plasma membrane,18 and acyl carnitines, which are involved in turnover and repair of the RBC membrane,19 are promising starting points for further studies of PKD pathophysiology.
We here report for the first time a metabolic profile for PKD obtained from dried whole blood spots. Such a profile reflects the integrated disease-specific metabolome more fully than an investigation restricted to the red blood cell metabolome.20,21
In addition, this analysis requires only 50 µL of whole blood, which can be obtained in a minimally invasive manner by sampling a single blood drop, making it very attractive for (international) sample exchange. Further advantages of DI-HRMS include relatively uncomplicated sample extraction steps and a short run-time of 3 minutes per sample.
The rise of ‘omic’ approaches in the recent past has provided new opportunities for understanding and classifying a wide range of disorders. In contrast to conventional medical biology approaches, which focus on individual genes, proteins or metabolites, modern biology regards diseases as a complex, dynamic and especially integrated network.22 Our study demonstrates the potential diagnostic application of untargeted metabolomics for PKD. However, the current model was constructed for the binary classification of healthy controls and PKD patients. Future applications that include more samples from various types of RHA could enable the development of an algorithm suited to the broader differential diagnosis of RHA.
In conclusion, we demonstrate by proof of principle for PKD, that untargeted metabolomics in DBS is a novel functional tool to identify disease fingerprints and study the pathophysiology in RHA.
This approach opens up a novel area of diagnosis and research in the field of RBC disorders and has the potential to improve diagnostic evaluation and clinical management of patients.
Footnotes
- Received July 16, 2020
- Accepted August 28, 2020
Disclosures
EvB receives research funding from Agios Pharmaceuticals, Novartis, Bayer, Pfizer and RR Mechatronics, does consultancy for Agios Pharmaceuticals and Novartis, and is on the data safety monitoring board for Imara; RvW receives research funding from RR Mechatronics and Agios Pharmaceuticals. The other authors report no relevant conflicts of interest.
Contributions
BvD and MBr contributed to collection, analysis and interpretation of the data, and wrote the first draft of the manuscript; EvB was actively involved in collecting patient samples and carefully revised the manuscript; WvS, EN, and NV were all involved in the study design and carefully revised the manuscript; MBa, JJ and RvW were principal investigators and were involved in all aspects of the study, including design, collection, and interpretation of data, as well as revising and co-writing the manuscript.
Funding
This study was supported in part by research funding from Metakids (Grant No. 2017-075) to JJ.
Acknowledgments
The authors would like to thank Nienke van Unen for her technical support in bioinformatics.
References
- Da Costa L, Narla A, Mohandas N. An update on the pathogenesis and diagnosis of Diamond-Blackfan anemia. F1000Res. 2018; 7:F1000. https://doi.org/10.12688/f1000research.15542.1
- Grace RF, Zanella A, Neufeld EJ. Erythrocyte pyruvate kinase deficiency: 2015 status report. Am J Hematol. 2015; 90(9):825-830. https://doi.org/10.1002/ajh.24088
- Albuisson J, Murthy SE, Bandell M. Dehydrated hereditary stomatocytosis linked to gain-of-function mutations in mechanically activated PIEZO1 ion channels. Nat Commun. 2013; 4:1884. https://doi.org/10.1038/ncomms2899
- Roy NBA, Babbs C. The pathogenesis, diagnosis and management of congenital dyserythropoietic anaemia type I. Br J Haematol. 2019; 185(3):436-449. https://doi.org/10.1111/bjh.15817
- Vercellati C, Marcello AP, Fermo E, Barcellini W, Zanella A, Bianchi P. A case of hereditary spherocytosis misdiagnosed as pyruvate kinase deficient hemolytic anemia. Clin Lab. 2013; 59(3-4):421-424. https://doi.org/10.7754/Clin.Lab.2012.120905
- Steinberg-Shemer O, Keel S, Dgany O. Diamond Blackfan Anemia: a nonclassical patient with diagnosis assisted by genomic analysis. J Pediatr Hematol Oncol. 2016; 38(7):e260-262. https://doi.org/10.1097/MPH.0000000000000587
- Russo R, Andolfo I, Manna F. Multigene panel testing improves diagnosis and management of patients with hereditary anemias. Am J Hematol. 2018; 93(5):672-682. https://doi.org/10.1002/ajh.25058
- van Dooijeweert B, van Ommen CH, Smiers FJ. Pediatric Diamond-Blackfan anemia in the Netherlands: an overview of clinical characteristics and underlying molecular defects. Eur J Haematol. 2018; 100(2):163-170. https://doi.org/10.1111/ejh.12995
- Zanella A, Fermo E, Bianchi P, Chiarelli LR, Valentini G. Pyruvate kinase deficiency: the genotype-phenotype association. Blood Rev. 2007; 21(4):217-231. https://doi.org/10.1016/j.blre.2007.01.001
- Grace RF, Bianchi P, van Beers EJ. Clinical spectrum of pyruvate kinase deficiency: data from the Pyruvate Kinase Deficiency Natural History Study. Blood. 2018; 131(20):2183-2192. https://doi.org/10.1182/blood-2017-10-810796
- Bianchi P, Fermo E, Glader B. Addressing the diagnostic gaps in pyruvate kinase deficiency: Consensus recommendations on the diagnosis of pyruvate kinase deficiency. Am J Hematol. 2019; 94(1):149-161. https://doi.org/10.1002/ajh.25325
- Haijes HA, Willemsen M, Van der Ham M. Direct infusion based metabolomics identifies metabolic disease in patients' dried blood spots and plasma. Metabolites. 2019; 9(1):12. https://doi.org/10.3390/metabo9010012
- de Sain-van der Velden MGM, van der Ham M, Gerrits J. Quantification of metabolites in dried blood spots by direct infusion high resolution mass spectrometry. Anal Chim Acta. 2017; 979:45-50. https://doi.org/10.1016/j.aca.2017.04.038
- Wishart DS, Jewison T, Guo AC, Wilson M, Knox C. HMDB 3.0--The Human Metabolome Database in 2013. Nucleic Acids Res. 2013; 41(D1):D801-807. https://doi.org/10.1093/nar/gks1065
- Chong J, Wishart DS, Xia J. Using MetaboAnalyst 4.0 for comprehensive and integrative metabolomics data analysis. Curr Protoc Bioinformatics. 2019; 68(1):e86. https://doi.org/10.1002/cpbi.86
- Gromski PS, Muhamadali H, Ellis DI. A tutorial review: metabolomics and partial least squares-discriminant analysis--a marriage of convenience or a shotgun wedding. Anal Chim Acta. 2015; 879:10-23. https://doi.org/10.1016/j.aca.2015.02.012
- Bianchi P, Fermo E, Lezon-Geyda K. Genotype-phenotype correlation and molecular heterogeneity in pyruvate kinase deficiency. Am J Hematol. 2020; 95(5):472-482. https://doi.org/10.1002/ajh.25753
- Ballas SK, Mohandas N, Marton LJ, Shohet SB. Stabilization of erythrocyte membranes by polyamines. Proc Natl Acad Sci U S A. 1983; 80(7):1942-1946. https://doi.org/10.1073/pnas.80.7.1942
- Arduini A, Mancinelli G, Radatti GL, Dottori S, Molajoni F, Ramsay RR. Role of carnitine and carnitine palmitoyltransferase as integral components of the pathway for membrane phospholipid fatty acid turnover in intact human erythrocytes. J Biol Chem. 1992; 267(18):12673-12681. https://doi.org/10.1016/S0021-9258(18)42330-7
- Darghouth D, Koehl B, Madalinski G. Pathophysiology of sickle cell disease is mirrored by the red blood cell metabolome. Blood. 2011; 117(6):e57-66. https://doi.org/10.1182/blood-2010-07-299636
- Darghouth D, Koehl B, Heilier JF. Alterations of red blood cell metabolome in overhydrated hereditary stomatocytosis. Haematologica. 2011; 96(12):1861-1865. https://doi.org/10.3324/haematol.2011.045179
- Tebani A, Afonso C, Marret S, Bekri S. Omics-based strategies in precision medicine: toward a paradigm shift in inborn errors of metabolism investigations. Int J Mol Sci. 2016; 17(9):1555. https://doi.org/10.3390/ijms17091555
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.