The diagnosis of myelodysplastic syndromes (MDS) can be challenging and relies on the convergence of cytological, cytogenetic, and molecular factors. Multiparametric flow cytometry (MFC) helps diagnose MDS, especially when other features do not contribute to the decision-making process, but its usefulness remains underestimated, mostly due to a lack of standardization of cytometers. We present here an innovative model integrating artificial intelligence (AI) with MFC to improve the diagnosis and the classification of MDS. We developed a machine learning model through an elasticnet algorithm trained on a cohort of 191 patients, based only on flow cytometry parameters selected by the Boruta algorithm, to build a simple but reliable prediction score with five parameters. Our AI-assisted MDS prediction score greatly improves the sensitivity of the Ogata score while keeping an excellent specificity, validated on an external cohort of 89 patients with an Area Under the Curve of 0.935. This model allows the diagnosis of both high- and low-risk MDS with 91.8% sensitivity and 92.5% specificity. Interestingly, it highlights a progressive evolution of the score from clonal hematopoiesis of indeterminate potential (CHIP) to high-risk MDS, suggesting a linear evolution between these different stages. By significantly decreasing the overall misclassification by 52% for patients with MDS and by 31.3% for those without MDS (P=0.02), our AI-assisted prediction score outperforms the Ogata score and positions itself as a reliable tool to help diagnose MDS.
Myelodysplastic syndromes (MDS) are a heterogeneous group of myeloid neoplasms the incidence of which increases with age, with a median age at diagnosis of 75 years.1 MDS are characterized by ineffective hematopoiesis leading to peripheral cytopenia, and dysplastic features in the erythroid, myeloid, monocytic and megakaryocytic cell lineages in bone marrow (BM) and peripheral blood (PB). Besides the symptoms and complications associated with cytopenia, MDS have a time-dependent heterogeneous but life-threatening potential for malignant transformation into acute myeloid leukemia (AML).2
According to the World Health Organization (WHO) 2016 classification, cytomorphology and cytogenetics are the gold standard for MDS diagnosis.3 However, cytomorphological analysis of BM smears may be challenging and identification of myelodysplastic features requires a well-trained cytologist. This is especially important when cytogenetic analysis does not reveal any chromosomal abnormality, which happens in 50% of patients. Thus, additional diagnostic procedures such as next generation sequencing (NGS) and multiparametric flow cytometry (MFC) need to be performed to facilitate the diagnosis. Several MDS MFC-scores have already been reported, like the Ogata score4 focusing on progenitor cells, the RED-score5 analyzing nucleated red blood cells, or the integrated flow score (iFS)6 that includes aspects of most of the MFC scores. However, the widespread use of these diagnostic tools is severely limited by a lack of standardization of the procedure among the centers performing flow cytometry, especially when using different flow cytometry machines and different brands of antibodies.
Over the past few years, due to the increased sensitivity of molecular biology techniques, several definitions and classifications of pre-MDS conditions have been proposed.7 Among these pre-MDS conditions are ICUS (idiopathic cytopenia of unknown significance), CHIP (clonal hematopoiesis of indeterminate potential), and CCUS (clonal cytopenia of unknown significance). They have relied either on the presence (ICUS and CCUS) or the absence (CHIP) of cytopenia, or on the detection (CHIP and CCUS) or not (ICUS) of clonal hematopoiesis in molecular biology or cytogenetic analyses. These stages do not require any treatment and CHIP is associated with a 1% risk of transformation to MDS per year. The differential diagnosis between these pre-MDS stages may be challenging and has implications for the patient’s follow-up. Besides, it is often difficult to distinguish pre-MDS stages from low-risk MDS in the absence of marked BM dysplasia, whereas there is a huge difference in the risk of ultimate malignant transformation.8
Recently, artificial intelligence (AI) has begun to play an important role in numerous areas of medicine. Several methods (machine learning, convolutional neural networks) have been developed in hematology to address these specific problems. Interestingly, more and more feature selection algorithms are being developed, which allow us to select and focus on the most important parameters of a given pathology.
In this study, using a single 10-color tube, we sought to discriminate patients with MDS from patients without MDS based on the profile obtained with MFC in a well-characterized and multicentric cohort from hematology departments of three different centers. After feature selection using the Boruta algorithm, we established a diagnostic score to accurately distinguish between patients without MDS and patients with MDS, with or without excess blasts.
Between 2019 and 2021, patients with suspected MDS who had undergone MFC evaluation at initial diagnosis were retrospectively identified in the hematology departments of 3 different centers (Amiens, Ambroise Paré and Cochin hospitals). Peripheral blood (PB) cytopenias were defined as platelets below 150×10⁹/L (thrombocytopenia), neutrophils below 1.8×10⁹/L (neutropenia), and hemoglobin concentration below 12 g/dL or 13 g/dL (anemia) for women and men, respectively. All MDS diagnoses were made according to the 2016 World Health Organization (WHO) classification. Clinical, morphologic, immunophenotypic, molecular, and cytogenetic data were reviewed. The Revised International Prognostic Scoring System (IPSS-R) was calculated as previously described.9 We divided the total cohort into two; first, we designed a learning cohort with patients from two hospitals (Ambroise Paré and Cochin), and then we used an external validation cohort with patients from the Amiens hospital. This process is the gold standard for the development and medical application of a machine learning algorithm. This method, associated with cross-validation on a learning cohort, yields performance estimates that generalize to other hospitals and algorithms that are less prone to overfitting.10
One center used a Navios™ instrument (Beckman Coulter, Miami, FL, USA) and the two others used FACSCanto™ and FACSLyric™ instruments (Becton Dickinson, Franklin Lakes, NJ, USA). Before each series, the settings of the photomultipliers were checked with fluorescent calibration beads.
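The cytopenia definitions given above can be expressed as a simple check. The following is a minimal sketch only: the function name, argument names, and the "F"/"M" coding of sex are illustrative assumptions and are not part of the study's pipeline.

```python
def cytopenias(platelets, neutrophils, hemoglobin, sex):
    """Return the list of cytopenias present in a peripheral blood count.

    platelets and neutrophils are in 10^9/L, hemoglobin in g/dL.
    sex is "F" or "M" (hypothetical coding chosen for this sketch).
    """
    if sex not in ("F", "M"):
        raise ValueError("sex must be 'F' or 'M'")
    found = []
    if platelets < 150:
        found.append("thrombocytopenia")
    if neutrophils < 1.8:
        found.append("neutropenia")
    # Anemia threshold: 12 g/dL for women, 13 g/dL for men.
    hb_limit = 12.0 if sex == "F" else 13.0
    if hemoglobin < hb_limit:
        found.append("anemia")
    return found
```

With this sketch, a man with platelets at 100×10⁹/L and hemoglobin at 12.5 g/dL would be flagged for thrombocytopenia and anemia, whereas the same hemoglobin value in a woman would not count as anemia.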
Direct immunolabeling was performed on 50 µL of whole BM. After a 20-minute incubation, red blood cells were lysed with Versalyse™ solution (Beckman Coulter) according to the manufacturer’s instructions, and the samples were washed once in phosphate-buffered saline (PBS). At least 50,000 events were acquired. All the antibodies used are listed in Online Supplementary Tables S1-S3. As described by Della Porta et al.11 regarding the Ogata score, four parameters (1 point each when outside the normal ranges) were analyzed: 1) the percentage of CD34+ myeloid progenitors among all acquired cells (threshold for normal <2%); 2) the percentage of B-cell progenitors, defined as CD34+CD38+CD19+ cells, among all CD34+ cells (threshold for normal >5%); 3) the lymphocyte/myeloid progenitor CD45 ratio (normal range 4-7.5); and 4) the granulocyte/lymphocyte SSC peak channel ratio (threshold for normal >6). The Ogata score was positive if ≥2. Expression of CD7, CD56 and HLA-DR on blast cells was also assessed, as well as the total hematogone ratio (number of hematogones/number of CD34+ cells).
R software 4.0.5 was used for the statistical analysis: χ2 test for categorical variables, non-parametric Kruskal-Wallis test and Pearson correlation for quantitative parameters. We performed feature selection using the Boruta algorithm with 150 random forest iterations and obtained a predictive model by logistic regression penalized by an elasticnet algorithm on previously selected features.12 We used an α coefficient of penalization of 0.6 and a 10-fold cross-validation on training and test cohorts to reduce bias and optimize the threshold category; the algorithm performance was then obtained with a prediction matrix on the Amiens Hospital validation cohort. A Receiver Operating Characteristic (ROC) curve analysis was performed on this validation cohort. Finally, we used a Cochran-Mantel-Haenszel χ2 test to analyze differences between the different prediction matrices obtained by the Ogata and the elasticnet score on each group.
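The four-parameter Ogata score described above can be sketched as follows. This is an illustration rather than the authors' implementation: the function and argument names are hypothetical, and the handling of values lying exactly on a threshold is an assumption, since the text only gives the normal ranges.

```python
def ogata_score(cd34_myeloid_pct, b_progenitor_pct, cd45_ratio, ssc_ratio):
    """Count one point for each parameter outside its normal range.

    Assumption: a value exactly on a threshold is treated as abnormal
    for parameters 1, 2 and 4, and as normal for the CD45 ratio range.
    """
    points = 0
    if cd34_myeloid_pct >= 2.0:          # normal: <2% of all acquired cells
        points += 1
    if b_progenitor_pct <= 5.0:          # normal: >5% of all CD34+ cells
        points += 1
    if not (4.0 <= cd45_ratio <= 7.5):   # normal range: 4-7.5
        points += 1
    if ssc_ratio <= 6.0:                 # normal: >6
        points += 1
    return points


def ogata_positive(score):
    """The Ogata score is considered positive when it is 2 or more."""
    return score >= 2
```

For example, a sample with 3% CD34+ myeloid progenitors and 2% B-cell progenitors, but normal CD45 and SSC ratios, would score 2 points and be considered positive.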
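To make the role of the α coefficient more concrete, the sketch below writes out an elastic-net-penalized logistic loss in plain Python. It assumes the parameterization used by the R package glmnet, in which α mixes the L1 and L2 penalties and a separate parameter λ sets the overall penalty strength; the text does not state which package or which λ was used, so the names `lam`, `elastic_net_penalty` and `penalized_log_loss` are illustrative assumptions.

```python
import math


def elastic_net_penalty(coefs, lam, alpha=0.6):
    """lam * (alpha * L1 + (1 - alpha) / 2 * L2), glmnet-style (assumed)."""
    l1 = sum(abs(b) for b in coefs)
    l2 = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2)


def penalized_log_loss(intercept, coefs, X, y, lam, alpha=0.6):
    """Mean logistic loss plus the elastic-net penalty.

    X is a list of feature rows, y a list of 0/1 labels.
    The intercept is not penalized.
    """
    total = 0.0
    for row, label in zip(X, y):
        z = intercept + sum(b * x for b, x in zip(coefs, row))
        # log(1 + exp(z)) computed in a numerically stable way
        log1pexp = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += log1pexp - label * z
    return total / len(y) + elastic_net_penalty(coefs, lam, alpha)
```

Under this convention, α=1 gives a pure lasso penalty and α=0 a pure ridge penalty, so α=0.6 leans toward sparse coefficients while keeping some ridge-type shrinkage. In a cross-validated fit, λ would typically be chosen as the value minimizing this loss on held-out folds.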
Machine learning was performed on MDS and no MDS patients without CHIP (n=280). CHIP was added to the figures a posteriori to observe the behavior of the model on this pathology. Two-tailed P<0.05 was considered statistically significant.
A total of 105 (34.65%) patients without MDS (no MDS), 23 with pre-MDS (7.59%), 112 (36.96%) with MDS without excess blasts, and 63 patients (20.79%) with MDS with excess blasts (MDS-EB) were enrolled in the total cohort (Table 1). All patients with pre-MDS stages listed in the introduction were combined in the pre-MDS group because of the small number of patients in each category. Among the 105 no MDS cases, there were patients with drug toxicity (n=15, 14.28%), autoimmune disease (n=10, 9.52%), liver insufficiency (n=12, 11.43%), bone marrow metastasis (n=8, 7.61%), idiopathic thrombocytopenia (ITP) (n=12, 11.43%), infections (n=12, 11.43%), vitamin B9 or B12 deficiency (n=7, 6.67%), non-Hodgkin lymphoma (n=12, 11.43%), aplastic anemia (n=7, 6.67%), kidney failure (n=6, 5.71%), and multiple myeloma (n=4, 3.81%).
In the total cohort, there was a significant difference in median age at diagnosis between these groups (72 years for patients without malignancy, 75 for pre-MDS, 78 for MDS, and 80 for MDS-EB; P=0.016). We did not find any significant difference between MCV values, probably due to the high values in some no MDS patients (vitamin deficiency).
A positive Ogata score classified 71% of MDS patients in the MDS group, and accurately classified 81.10% of no MDS patients. It performed better for patients with MDS-EB, accurately classifying 95% of them in the MDS group, whereas it performed less well for the diagnosis of MDS without excess of blasts, both for MDS-MLD and MDS-SLD, only identifying 57.10% and 47.13% of them, respectively. Thus, the sensitivity of the Ogata score was 69.80% with 93.80% specificity, with a 95% positive predictive value (PPV) and a 63% negative predictive value (NPV).
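For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, PPV and NPV derive from the four cells of a two-by-two prediction matrix. The function name is an illustrative assumption, and the counts in the usage note are invented for the example; they are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic metrics from a 2x2 prediction matrix.

    tp: MDS patients with a positive score, fn: MDS patients with a negative score,
    tn: no MDS patients with a negative score, fp: no MDS patients with a positive score.
    """
    return {
        "sensitivity": tp / (tp + fn),  # proportion of MDS patients detected
        "specificity": tn / (tn + fp),  # proportion of no MDS patients correctly excluded
        "ppv": tp / (tp + fp),          # probability of MDS given a positive score
        "npv": tn / (tn + fn),          # probability of no MDS given a negative score
    }
```

With hypothetical counts of 90 true positives, 10 false negatives, 80 true negatives and 20 false positives, this returns a sensitivity of 0.90 and a specificity of 0.80. Note that PPV and NPV, unlike sensitivity and specificity, depend on the proportion of MDS patients in the cohort.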
As expected, the percentage of CD34+ myeloid progenitors was significantly higher in the MDS-EB group (2.76%) than in the patients with no MDS (0.71%), pre-MDS (0.63%), and MDS without excess of blasts (0.92%) (P<0.001).
We found no significant immunophenotypic aberrations, with a similar median value of CD7+ blasts (P=0.942), CD56+ blasts (P=0.551), and HLA-DR negative blasts (P=0.658) across the different groups. On the other hand, the percentage of CD7+ blast cells was significantly higher in the MDS groups (P<0.001).
To help distinguish patients with MDS from those with no MDS, we applied a Boruta feature selection algorithm on flow cytometry parameters. The qualitative granulocyte/lymphocyte SSC peak channel ratio (SSC Ogata score <6), the total hematogone ratio (number of hematogones/number of CD34+ cells), the percentage of CD34+ B-cell progenitors among all CD34+ cells (hematogone Ogata score), and the percentage of CD34+ myeloid progenitors were informative features to predict the diagnosis of MDS by MFC. On the contrary, the percentage of CD34+CD38- blast cells and the lymphocyte/myeloid progenitor CD45 ratio were found to be the least relevant features (Figure 1).
We then used an elasticnet model to evaluate the probability of MDS (with or without excess blasts) on the learning and the test cohorts (plus pre-MDS patients; Figure 2A), allowing us to propose this mathematical formula:
MDS prediction score = -1.58 + 2.928 × SSC Ogata score + 0.965 × hematogone Ogata score + 0.8907 × %CD34+ myeloid progenitors - 0.0032 × total hematogone ratio
In our learning cohort, a threshold of 0 with this formula proved ideal to distinguish patients with MDS from pre-MDS or no MDS patients using cross-validation. We then validated this model with a threshold of 0 on the Amiens Hospital validation cohort, and we obtained a prediction matrix with a sensitivity of 91.84%, a specificity of 92.48%, and positive and negative predictive values of 93.75% and 90.03%, respectively. The predictive values of our AI-assisted MDS prediction score strikingly outperform those of the Ogata score, particularly by significantly improving the sensitivity and the NPV (Table 2).
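The formula above can be written as a short function. This is a sketch under stated assumptions: the two "Ogata score" terms are taken here to be the 0/1 points of the corresponding Ogata parameters (1 when abnormal), which is how the qualitative SSC term is described but is not spelled out for the hematogone term, and the units of the total hematogone ratio are not specified in the text. The function names are hypothetical.

```python
def mds_prediction_score(ssc_ogata, hematogone_ogata,
                         cd34_myeloid_pct, total_hematogone_ratio):
    """Linear score with the coefficients reported in the text.

    ssc_ogata, hematogone_ogata: assumed 0/1 Ogata points (1 = abnormal).
    cd34_myeloid_pct: percentage of CD34+ myeloid progenitors.
    total_hematogone_ratio: hematogones / CD34+ cells (units as in the study).
    """
    return (-1.58
            + 2.928 * ssc_ogata
            + 0.965 * hematogone_ogata
            + 0.8907 * cd34_myeloid_pct
            - 0.0032 * total_hematogone_ratio)


def predicts_mds(score, threshold=0.0):
    """A score above the threshold of 0 is read as suggestive of MDS."""
    return score > threshold
```

As the coefficients show, an abnormal SSC ratio alone (+2.928) is enough to move the score above 0 when the other terms are zero, whereas an abnormal hematogone point alone (+0.965) is not.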
To validate our diagnosis threshold with normalized data, we built a cumulative proportion plot for all pathologies on the total cohort that shows representativeness of this cohort without imbalanced data (Figure 2B).
With a cut-off of 0, ROC curve analysis performed on the same validation cohort showed an Area Under the Curve of 0.935, highlighting the excellent performance of the MDS prediction score, with which only 6.47% of the patients were misclassified (Figure 2C).
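The Area Under the Curve reported above has a simple interpretation: it is the probability that a randomly chosen MDS patient receives a higher score than a randomly chosen patient without MDS. The sketch below computes it directly from that definition; the function name is an assumption, and the study itself performed this analysis in R.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 1 (MDS) or 0 (no MDS); tied scores count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

This pairwise form is quadratic in the number of patients, which is acceptable for a cohort of this size. Unlike sensitivity and specificity, the AUC does not depend on the chosen cut-off.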
In the total cohort (191 patients on learning and test cohorts plus 89 patients on external validation cohort: total n=280), our elasticnet MDS prediction score allowed for a clear distribution of MDS, MDS-EB and no MDS groups (Figure 2D). To improve the accuracy of diagnosis for these three subgroups, we refined the thresholds of our model and identified three groups: group A with an elasticnet MDS prediction score between -3 and 0, group B higher than 0 and less than 3, and group C higher than 3 (Figure 3). Group A included many patients without MDS (87.88% with no MDS; 90.90% of the no MDS patients in the global cohort were in this group: n=132). Group B had more MDS patients without EB (88.17% with MDS in this group: n=93), and group C had more MDS-EB patients or MDS without EB but with multiple abnormalities like multi-lineage dysplasia or genetic abnormalities (100% of MDS in this group: n=78) (Table 3). Patients with pre-MDS stages were equally distributed in group A and group B. Strikingly, our AI-assisted model shows a progressive evolution of the MDS prediction score from a pre-MDS condition (CHIP) to high-risk MDS, suggesting a linear evolution between these different stages (Figure 2C and Figure 4).
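The three groups described above can be assigned from the score with a few comparisons. In this sketch, the treatment of the boundary values is an assumption, since the text does not say where scores of exactly 0 or 3, or scores below -3, should fall; the function name is hypothetical.

```python
def risk_group(score):
    """Assign group A, B or C from the MDS prediction score.

    Assumptions: a score of exactly 0 goes to group A, a score of
    exactly 3 goes to group B, and scores below -3 are kept in group A.
    """
    if score > 3:
        return "C"   # mostly MDS-EB or MDS with multiple abnormalities
    if score > 0:
        return "B"   # mostly MDS without excess blasts
    return "A"       # mostly patients without MDS
```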
We then tested the accuracy of the model to classify patients according to the IPSS-R categories. In the whole cohort, IPSS-R was available for 150 MDS patients. Only 18 MDS patients (9.42% of total MDS; 13.63% of group A) were classified in group A, all others being classified in groups B and C of our model. Of these 18 patients, only one was of intermediate risk according to the IPSS-R, whereas nine were in the low-risk group and eight in the very low-risk group. Importantly, no high-risk patients were classified in group A (Table 4).
Overall, our newly developed AI-assisted MDS prediction score improved the accuracy of MDS diagnosis, by reducing the risk of misclassification of MDS without excess blasts by 52.07% (P=0.004) and an absence of MDS by 31.33% (P=0.022) compared to the Ogata score (Figure 4). Therefore, the sensitivity of this score for the subgroup of patients without excess blasts was 78.27%. These different subtypes could not be diagnosed by flow cytometry (using cytology alone) but were informative as to the ability of the algorithm to predict MDS.
In this study, we developed an innovative model combining artificial intelligence and machine learning with flow cytometry to improve the performance of MFC in diagnosing MDS.
Here, we propose an original AI-assisted prediction score for MDS to directly investigate the value of combining AI and MFC for the diagnosis of MDS. A few studies have used convolutional neural networks with gradient boosting to assess dysplasia13,14 or to distinguish aplastic anemia from MDS with very good sensitivity and specificity;15 but, until now, AI has remained underused in the diagnosis of MDS, particularly in combination with MFC. Two other studies demonstrated a link between morphology, mutational status and prognosis in MDS using machine learning techniques.16,17 Recent studies used unsupervised cluster analysis of flow cytometry to identify new subsets in pathological erythropoiesis or facilitate the diagnosis of MDS.18,19
The aim of this study was to distinguish patients with actual MDS from patients without MDS. Our AI-assisted MDS prediction score following an elasticnet model identified that most of the MDS patients have a score >0. By setting the cut-off value to 0, our diagnostic model shows high predictive value and strikingly outperforms the Ogata score by significantly increasing the sensitivity and the specificity of this test. While the original Ogata score performs well to discriminate MDS-EB (which are also usually easier to diagnose on BM smears), the great value of our score is in improving the accuracy of diagnosing MDS without excess blasts, whether they show single or multi-lineage dysplasia. With an error rate of 8% for both false positive and negative results, our MDS prediction score will increase the confidence of biologists and clinicians involved in the diagnostic procedure, especially when the presence of dysplastic features is not clear. In addition to its performance in the diagnosis of MDS, our model allowed for risk prediction, as we identified three risk groups (A, B and C) that correlate with the evolution of the disease. Nevertheless, one MDS-EB patient was classified in group A (see Figure 3). This patient had bicytopenia, 6% blasts on the bone marrow aspiration with no cytogenetic abnormalities and no mutations found, and was classified in the IPSS-R category. The patient is currently being monitored and is not receiving treatment. This is an unusual case, and the excess of blasts on bone marrow smears could be reactive (this excess was not found on MFC analysis with an Ogata score = 0). The follow-up of this patient could help us to understand his classification in group A.
In our model, patients with pre-MDS cluster equally with patients without MDS. This specific distribution argues for a continuity between the occurrence of clonal hematopoiesis and the onset of MDS, as previously suggested by several teams.7,20-25
Our prediction score included a few patients diagnosed with ICUS and CCUS; these had to be combined with CHIP patients to preserve the score’s performance. It would be interesting to further explore their behavior in our model. Furthermore, we could not discriminate between the different categories of MDS (e.g., 5q- syndrome, MDS with ring sideroblasts), and this could be the subject of further tests. It would also be interesting to know whether a positive score increases the risk of developing an overt MDS. Unfortunately, the cohort was not designed to answer this question, and more patients and a longer follow-up are required.
The model was built using the 2016 MDS WHO classification. As the recent release of the new classification might change the performance values, we aimed to reclassify our patients according to the new WHO classification. As no patient was diagnosed with MDS-EB2 with AML-defining cytogenetics, this update did not change the initial classification of the patients.
Another potential limitation of the generalizability of the model is the lack of standardization between the different centers when assessing the Ogata score. Despite published recommendations,26 centers either use different fluorochromes or different clones of antibodies. However, when using data with artificial intelligence, this variability is a real asset in helping to build the algorithm. Moreover, all of the three centers involved in the study passed the external quality controls for this panel of antibodies on normal BM samples through the Acute Leukemia French Association / French Innovative Leukemia Organization (ALFA-FILO) network. Despite using different antibodies and dyes, the MDS prediction score yielded excellent results in the validation cohort, suggesting our model could be widely used.
Flow cytometry provides faster results than most cytogenetics or molecular biology techniques and is widely available worldwide; its standardization between laboratories is, therefore, of crucial importance. It relies on the same panel as that used in the Ogata score, which is already carried out in most laboratories. At a time when cost-effectiveness is becoming increasingly important, this AI-assisted MDS prediction score enables rapid patient diagnosis and stratification to help clinicians in their quest for the best patient care.
- Received November 3, 2022
- Accepted March 7, 2023
No conflicts of interest to disclose.
TB and VC designed the research study. TB, NC, JZ, VB and LG collected and analyzed the data. DL, AC and JPM managed patients and provided clinical data. TB, VC, VB, NC, LG and AC wrote the paper. All authors approved the final version of the manuscript for publication.
The authors confirm that the data supporting the findings of this study are available within the article and its Online Supplementary Appendix.
- Zeidan AM, Shallis RM, Wang R, Davidoff A, Ma X. Epidemiology of myelodysplastic syndromes: why characterizing the beast is a prerequisite to taming it. Blood Rev. 2019; 34:1-15.
- Menssen AJ, Walter MJ. Genetics of progression from MDS to secondary leukemia. Blood. 2020; 136(1):50-60.
- Arber DA, Orazi A, Hasserjian R. The 2016 revision to the World Health Organization classification of myeloid neoplasms and acute leukemia. Blood. 2016; 127(20):2391-2405.
- Ogata K, Della Porta MG, Malcovati L. Diagnostic utility of flow cytometry in low-grade myelodysplastic syndromes: a prospective validation study. Haematologica. 2009; 94(8):1066-1074.
- Mathis S, Chapuis N, Debord C. Flow cytometric detection of dyserythropoiesis: a sensitive and powerful diagnostic tool for myelodysplastic syndromes. Leukemia. 2013; 27(10):1981-1987.
- Cremers HR, Wager TD, Yarkoni T. The relation between statistical power and inference in fMRI. PLoS One. 2017; 12(11):e0184923.
- Steensma DP, Bejar R, Jaiswal S. Clonal hematopoiesis of indeterminate potential and its distinction from myelodysplastic syndromes. Blood. 2015; 126(1):9-16.
- Sekeres MA, Taylor J. Diagnosis and treatment of myelodysplastic syndromes: a review. JAMA. 2022; 328(9):872-880.
- Greenberg PL, Tuechler H, Schanz J. Revised international prognostic scoring system for myelodysplastic syndromes. Blood. 2012; 120(12):2454-2465.
- Andaur Navarro CL, Damen JAA, Takada T. Risk of bias in studies on prediction models developed using supervised machine learning techniques: systematic review. BMJ. 2021; 375:n2281.
- Della Porta MG, Picone C, Pascutto C. Multicenter validation of a reproducible flow cytometric score for the diagnosis of low-grade myelodysplastic syndromes: results of a European LeukemiaNET study. Haematologica. 2012; 97(8):1209-1217.
- Kursa MB, Jankowski A, Rudnicki WR. Boruta – a system for feature selection. Fundam Inform. 2010; 101(4):271-285.
- Mori J, Kaji S, Kawai H. Assessment of dysplasia in bone marrow smear with convolutional neural network. Sci Rep. 2020; 10(1):14734.
- Acevedo A, Merino A, Boldú L, Molina Á, Alférez S, Rodellar J. A new convolutional neural network predictive model for the automatic recognition of hypogranulated neutrophils in myelodysplastic syndromes. Comput Biol Med. 2021; 134:104479.
- Kimura K, Tabe Y, Ai T. A novel automated image analysis system using deep convolutional neural networks can assist to differentiate MDS and AA. Sci Rep. 2019; 9(1):13385.
- Nagata Y, Zhao R, Awada H. Machine learning demonstrates that somatic mutations imprint invariant morphologic features in myelodysplastic syndromes. Blood. 2020; 136(20):2249-2262.
- Nazha A, Komrokji R, Meggendorfer M. Personalized prediction model to risk stratify patients with myelodysplastic syndromes. J Clin Oncol. 2021; 39(33):3737-3746.
- Porwit A, Violidaki D, Axler O, Lacombe F, Ehinger M, Béné MC. Unsupervised cluster analysis and subset characterization of abnormal erythropoiesis using the bioinformatic FLOW‐SELF Organizing Maps algorithm. Cytometry B Clin Cytom. 2022; 102(2):134-142.
- Duetz C, Van Gassen S, Westers TM. Computational flow cytometry as a diagnostic tool in suspected‐myelodysplastic syndromes. Cytometry A. 2021; 99(8):814-824.
- Genovese G, Kähler AK, Handsaker RE. Clonal hematopoiesis and blood-cancer risk inferred from blood DNA sequence. N Engl J Med. 2014; 371(26):2477-2487.
- Jaiswal S, Fontanillas P, Flannick J. Age-related clonal hematopoiesis associated with adverse outcomes. N Engl J Med. 2014; 371(26):2488-2498.
- Desai P, Mencia-Trinchant N, Savenkov O. Somatic mutations precede acute myeloid leukemia years before diagnosis. Nat Med. 2018; 24(7):1015-1023.
- Abelson S, Collord G, Ng SWK. Prediction of acute myeloid leukaemia risk in healthy individuals. Nature. 2018; 559(7714):400-404.
- Takahashi K, Wang F, Kantarjian H. Preleukaemic clonal haemopoiesis and risk of therapy-related myeloid neoplasms: a case-control study. Lancet Oncol. 2017; 18(1):100-111.
- Gondek LP, DeZern AE. Assessing clonal haematopoiesis: clinical burdens and benefits of diagnosing myelodysplastic syndrome precursor states. Lancet Haematol. 2020; 7(1):e73-e81.
- Westers TW, Ireland R, Kern W. Standardization of flow cytometry in myelodysplastic syndromes: a report from an international consortium and the European LeukemiaNet Working Group. Leukemia. 2012; 26(7):1730-1741.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.