Several of the complications observed in sickle cell disease (SCD) are influenced by variation in hematologic traits (HT), such as fetal hemoglobin (HbF) level and neutrophil count. Previous large-scale genome-wide association studies carried out in largely healthy individuals have identified thousands of variants associated with HT, which have then been used to develop multi-ancestry polygenic trait scores (PTS). Here, we tested whether these PTS associate with HT in SCD patients and if they can improve statistical models associated with SCD-related complications. In 2,056 SCD patients, we found that the PTS predicted less HT variance than in non-SCD individuals of African ancestry. This was particularly striking at the Duffy/DARC locus, where we observed an epistatic interaction between the SCD genotype and the Duffy null variant (rs2814778) that led to a two-fold weaker effect on neutrophil count. PTS for these HT which are measured as part of routine practice were not associated with complications in SCD. In contrast, we found that a simple PTS for HbF that includes only six variants explained a large fraction of the phenotypic variation (20.5-27.1%), associated with acute chest syndrome and stroke risk, and improved the statistical modeling of the vaso-occlusive crisis rate. Using Mendelian randomization, we found that increasing HbF by 4.8% reduces stroke risk by 39% (P=0.0006). Taken together, our results highlight the importance of validating PTS in large diseased populations before proposing their implementation in the context of precision medicine initiatives.
Sickle cell disease (SCD), the most frequent monogenic disease worldwide, is caused by mutations in the β-globin gene.1 SCD patients present a wide range of complications such as vaso-occlusive crisis (VOC), acute chest syndrome (ACS), stroke, and end-organ dysfunction, and their life expectancy is reduced when compared to the general population.1 Critically, the causes of this clinical heterogeneity are not fully understood.
Hematologic traits (HT) are among the main factors known to be associated with clinical outcomes in SCD. Fetal hemoglobin (HbF) is a major disease modifier, and is associated with a reduction in the occurrence of several complications such as VOC, ACS, and death.2-4 HbF >30% is associated with an almost complete absence of complications in SCD patients.5 However, most SCD patients have lower HbF levels while not receiving disease-modifying therapy, and for some complications, such as stroke, the risk reduction associated with HbF has not been quantified in large cohorts. Several other HT have been associated with SCD-related complications, notably elevated white blood cell (WBC) count and neutrophil count with survival,2,6,7 low hemoglobin (Hb) levels with composite severe outcomes and death,7, 8 and platelet (PLT) count with ACS.9
Polygenic trait scores (PTS) have been developed in an effort to harness the power of large-scale human genetic studies to make useful clinical predictions, and many studies have already recognized their value in the context of precision medicine initiatives.10,11 For each participant, PTS are calculated by adding up the number of phenotype-associated alleles across associated variants, each weighted by the effect of the alleles on the phenotype. One widely discussed limitation of PTS is their poor performance when tested in populations that have different ancestral backgrounds than those in which they were optimized.12 Another equally important aspect that has not been as extensively studied is how well PTS, which are normally calibrated in “healthy” individuals, perform in “diseased” individuals.13 This is important, because PTS could, in theory, be useful to stratify patients into mild or severe categories. For example, if higher WBC count is associated with an increased risk of death in SCD patients, would a PTS developed to capture WBC count variation in healthy individuals be useful to predict death in SCD patients?
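The weighted-sum definition of a PTS given above can be illustrated with a minimal sketch. The variant names, weights, and genotypes below are hypothetical and serve only to show the arithmetic; skipping variants that are missing for a participant is an assumption made here, not a choice described in the study.

```python
def polygenic_trait_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one participant.

    dosages: dict mapping variant id -> effect-allele dosage (0 to 2)
    weights: dict mapping variant id -> per-allele effect size on the trait
    Variants absent from `dosages` are skipped (an illustrative assumption).
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical weights and genotypes, for illustration only
weights = {"variant_a": 0.30, "variant_b": -0.10, "variant_c": 0.05}
participant = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_trait_score(participant, weights)
print(round(score, 2))
```

In practice the weights would come from the discovery GWAS, which is why a score calibrated in one population may not transfer to another.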
Here, we explored how SCD impacts the performance of HT PTS, and whether these PTS are clinically useful predictors of SCD-related complications. Our analyses had four aims. First, to study the performance of HT PTS in explaining HT variation in SCD patients. Second, to test if these HT PTS are associated with SCD-related complications. Third, to explore whether specific genetic variants included in the HT PTS have reduced impact (i.e., effect size) on HT variation in the context of SCD. Finally, although not one of the main goals of our study, we also aimed to carry out genome-wide association studies (GWAS) of HT in up to 1,736 SCD patients to identify strong effect variants that could influence blood-cell phenotypes in this patient population. A summary of the study design is shown in Figure 1.
Complete details of the methods used are available in the Online Supplementary Appendix.
We collected data from three SCD cohorts with SS genotype individuals: the Cooperative Study of Sickle Cell Disease (CSSCD, n=1,278),14 Genetic Modifier (GEN-MOD, n=406),15,16 and Mondor/Lyon (n=372)17 (Online Supplementary Table S1). For replication, we tested associations in the Duke University Outcome Modifying Gene (OMG) SCD cohort (n=333), which has been described elsewhere.18 We collected data according to the principles of the Declaration of Helsinki and the study was approved by the institutional ethics committees. Informed consent was provided by all study participants. For comparison, we also accessed data from non-SCD individuals of African ancestry from the BioMe cohort19 and the UK Biobank (Online Supplementary Table S1).20 To ensure that the differences found were not due to a difference in the sample size between the SCD cohorts and the UK Biobank, we down-sampled the African-ancestry UK Biobank cohort to the same number of participants (n=1,278) as in the CSSCD and repeated our analyses.
Polygenic trait scores for hematologic traits
For all HT except HbF, we used the multi-ancestry PTS obtained by the Blood-Cell Consortium to test for association with HT.19 For HbF, we derived a PTS by considering the conditional effect sizes of six variants at three loci (BCL11A, HBS1L-MYB, HBB) associated with HbF levels in SCD patients (Online Supplementary Table S2).21-23 We calculated a bootstrapped, empirical P-value to compare the variance explained by PTS in SCD and non-SCD individuals. We also tested g(HbF), a previously published 4-SNP PTS for HbF,24 in a subset of the CSSCD cohort (n=816). Finally, we analyzed the respective contribution of α-thalassemia and PTS on HT variance explained in the CSSCD cohort.
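One possible way to build a bootstrapped, empirical P-value for a difference in variance explained is sketched below. The exact procedure used in the study is described in the Online Supplementary Appendix and may differ; here we assume simple resampling with replacement within each group, and the data are simulated. The function names are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation: variance in y explained by a linear fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def bootstrap_r2_difference_p(group_a, group_b, n_boot=500, seed=1):
    """Empirical P-value: share of bootstrap replicates in which the variance
    explained in group_a is at least as large as in group_b.
    Each group is a list of (pts, trait) pairs."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        sample_a = rng.choices(group_a, k=len(group_a))
        sample_b = rng.choices(group_b, k=len(group_b))
        if r_squared(*zip(*sample_a)) >= r_squared(*zip(*sample_b)):
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_boot + 1)


def simulate(n, effect, rng):
    """Simulated (pts, trait) pairs with a chosen PTS effect; illustration only."""
    pairs = []
    for _ in range(n):
        pts = rng.gauss(0, 1)
        pairs.append((pts, effect * pts + rng.gauss(0, 1)))
    return pairs


rng = random.Random(42)
scd_like = simulate(300, 0.15, rng)      # weaker PTS effect
reference = simulate(300, 0.35, rng)     # stronger PTS effect
p_value = bootstrap_r2_difference_p(scd_like, reference)
print(p_value)
```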
Association between hematologic traits or polygenic trait scores with SCD-related clinical outcomes
In the CSSCD cohort, we tested the association between PTS and outcomes (VOC rate, ACS rate, stroke), considering only PTS for which the corresponding HT was associated with the outcome (P<0.05). We performed an analysis of deviance to determine if the PTS improves the model beyond the measured HT.
We used a two-sample Mendelian randomization (MR) approach to test whether HbF, as captured by the 6-SNP PTSHbF, causally impacts SCD-related complications (VOC rate, ACS rate, and stroke). We calculated instrument (i.e., PTS for HbF)-exposure (i.e., HbF) and instrument-outcome (i.e., complications) effects from the GEN-MOD and CSSCD cohorts, respectively. We used the multiplicative random-effect inverse variance-weighted (IVW) approach as the main method for each MR analysis.
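For readers unfamiliar with the IVW estimator, the sketch below shows the standard textbook form with a multiplicative random-effects standard error. It is not the code used in the study, and the summary statistics are hypothetical values chosen for illustration.

```python
import math


def ivw_multiplicative_random_effects(beta_exposure, beta_outcome, se_outcome):
    """Inverse variance-weighted MR estimate from per-variant summary statistics.

    Returns (estimate, standard_error). The fixed-effect standard error is
    inflated by the residual heterogeneity factor when that factor exceeds 1.
    """
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se_fixed = math.sqrt(1.0 / total_weight)
    # Cochran's Q: how much the per-variant ratio estimates disagree
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    dof = len(ratios) - 1
    phi = math.sqrt(q / dof) if dof > 0 else 1.0
    return estimate, se_fixed * max(1.0, phi)


# Hypothetical per-variant effects on the exposure and on the outcome (log-odds)
beta_exposure = [0.40, 0.30, 0.20]
beta_outcome = [-0.12, -0.09, -0.06]
se_outcome = [0.05, 0.05, 0.05]

estimate, se = ivw_multiplicative_random_effects(beta_exposure, beta_outcome, se_outcome)
print(estimate, se, math.exp(estimate))  # last value is the implied odds ratio
```

Because the standard error is never allowed to shrink below its fixed-effect value, this estimator is conservative when the instruments agree and wider when they do not.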
Genome-wide association studies of hematologic traits in SCD patients
We performed GWAS for each HT available in the three SCD cohorts separately, then performed a meta-analysis of the results.
Comparing effect sizes of hematologic trait-associated single nucleotide polymorphisms in SCD patients and non-SCD individuals
For each single nucleotide polymorphism (SNP)-HT pair considered in the multi-ancestry PTS models, we used the t-statistic to compare the effect in SCD (derived from SCD GWAS meta-analyses) and non-SCD individuals (derived from published multi-ancestry meta-analyses from the Blood-Cell Consortium).19 We computed q-values and applied a 5% false discovery rate threshold to correct for multiple testing.
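A comparison of two independently estimated effect sizes can be sketched as follows. Two simplifications are assumed here and are not taken from the study: the P-value uses a normal approximation to the t-statistic, and multiple testing is handled with Benjamini-Hochberg adjusted P-values as a stand-in for q-values.

```python
import math


def effect_difference(beta_a, se_a, beta_b, se_b):
    """Test statistic and two-sided P-value for the difference between two
    independent effect estimates (normal approximation, an assumption)."""
    t = (beta_a - beta_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical (beta_SCD, se_SCD, beta_nonSCD, se_nonSCD) tuples
pairs = [(-0.20, 0.05, -0.60, 0.02), (0.10, 0.04, 0.12, 0.01), (0.05, 0.06, 0.04, 0.02)]
pvals = [effect_difference(*pair)[1] for pair in pairs]
adjusted = bh_adjust(pvals)
print(adjusted)
```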
Performance of hematologic polygenic trait scores in SCD patients
We investigated the phenotypic variance explained by PTS of HT in SCD patients of African ancestry from three cohorts (CSSCD, GEN-MOD, Mondor/Lyon) and in non-SCD African-American individuals from BioMe not included in the discovery effort used to generate the PTS, as well as non-SCD individuals of African ancestry from the UK Biobank (using all African-ancestry participants [n=6,627] or a random subset with a sample size similar to the CSSCD [n=1,278]). In SCD participants, PTS reached significance (P<0.05) in at least one cohort for nine of the 12 HT tested (Table 1); this number remained unchanged after adjustment for multiple testing. PTS for hematocrit (Ht), Hb concentration, and lymphocyte count were not significant. The variance explained by significant PTS was 0.5-4.1% for RBC traits, 0.9-3.8% for WBC traits, and 0.7-4.2% for PLT traits. When we compared the performance of these PTS in SCD participants and non-SCD African-ancestry individuals, we found that all PTS with significant association, except the PTS for eosinophils, explained less phenotypic variance in SCD participants (Table 1, Figure 2A). One of the most striking differences was seen for WBC and neutrophil counts: the mean variance explained was 1.9% and 3.3% in SCD participants and 10.3% and 11.8% in non-SCD individuals, respectively (Figure 2A). HbF levels are an important modifier of severity in SCD. Since it is rarely measured in large non-SCD cohorts, the genetics of this trait have not been extensively studied in very large sample sizes. However, smaller GWAS in SCD patients have identified robust associations between HbF levels and genetic variants at three loci: BCL11A, HBS1L-MYB and the β-globin locus (reviewed by Lettre et al.25). Given this, we derived PTSHbF that includes six conditionally independent variants (Online Supplementary Table S2). PTSHbF was strongly associated with HbF levels in all three SCD cohorts and explained 20.5-27.1% of the variance (Table 1). 
We compared this PTS with a previously published 4-SNP model for HbF (g(HbF)).24 We used the subset of the CSSCD cohort (n=816) with genotyping or imputation data available for each SNP. Our 6-SNP PTSHbF explained 24.6% of HbF variance (P=5.7×10⁻⁵¹) whereas g(HbF) explained 20.8% of HbF variance (P=1.5×10⁻⁴²).
Finally, we analyzed the respective contribution of α-thalassemia and PTS to the HT variance explained in the CSSCD cohort (Figure 2B). Because HbF is such a strong modifier of SCD phenotype, we also assessed whether PTSHbF could contribute to the phenotypic variation of other HT. α-thalassemia was the main contributor to mean corpuscular hemoglobin (MCH) and mean corpuscular volume (MCV) but was also associated with other RBC, WBC, and PLT traits. PTSHbF explained a large fraction of the variance of several RBC traits, but was also associated with several WBC traits; this is consistent with the known beneficial effect of HbF in normalizing HT in SCD patients. PTS for the corresponding HT was the main contributor for several WBC traits and PLT count.
Taken together, although most of the PTS for HT derived in non-SCD individuals were associated with HT in SCD individuals, they did not perform as well in this patient population. By contrast, a simple PTS for HbF derived from GWAS performed in SCD patients explained a large proportion of the HbF phenotypic variance.
Associations between hematologic trait polygenic trait scores and SCD-related complications
Variation in HT has been associated with several clinical outcomes observed in SCD patients such as death,2 stroke,26-28 VOC,3 and ACS.4,9 We were able to reproduce most of these results in the genotyped subset of the CSSCD (Online Supplementary Table S3). We asked ourselves whether PTS for HT were also associated with these complications in the largest available SCD cohort (CSSCD). For these analyses, we used significant PTS (Table 1) for which the corresponding HT was also significantly associated with the complication (Online Supplementary Table S3). In total, we tested seven PTS-complication combinations, and found significant associations for PTSHbF-stroke and PTSHbF-ACS (Table 2).
We extended these analyses to determine if the PTS could improve the association of the models beyond the baseline HT measures. We reasoned that, because PTS capture HT heritable variation, they would more faithfully represent “life-long exposure” and add information that is independent of the imprecisions associated with HT measurement in the lab. The statistical models did not improve for stroke and ACS rate, but we found that PTSHbF improved the association with VOC, as described previously (Table 2).29,30 Interestingly, additional analyses revealed that PTSHbF improves the association of the model with VOC in patients with low HbF values (<10%), levels at which HbF is not associated with VOC (Online Supplementary Table S4). For high HbF values (>10%), HbF is strongly associated with VOC, and adding PTSHbF did not improve the statistical model.
Quantifying the causal impact of HbF levels on SCD complications by Mendelian randomization
Using MR, we sought to confirm the causal protective effect of high HbF levels on stroke,27,28 ACS,4 and VOC.3 Although the literature suggests links between high WBC or neutrophil counts and SCD survival,2 we did not test these potential causal relationships by MR because of limited statistical power (see Online Supplementary Appendix). We performed two-sample MR using the pseudo-independent SNP from PTSHbF. We found a causal association between HbF and stroke: a one standard deviation increase in genetically-determined HbF levels (corresponding to 4.8% of HbF) decreases the risk of stroke by 39% (odds ratio [95% confidence interval]=0.61 [0.46-0.81]; P=0.0006) (Figure 3). In this analysis, the low-frequency variant rs114398597 located between HBS1L and MYB on chromosome 6 appears as an outlier (Figure 3), but we confirmed that the MR result remained significant after its exclusion (P=3.7×10⁻⁶) (Online Supplementary Table S5). Although the direction of the effect was similar using the MR-Egger and weighted median sensitivity methods, these analyses were not statistically significant, suggesting insufficient power (Online Supplementary Table S5). We found no heterogeneity in the effect (I² = 0%) and no evidence of horizontal pleiotropy (Egger intercept, -0.09, standard error = 0.21; P=0.69). We did not find a causal association between HbF levels and ACS/VOC rates (Online Supplementary Table S5, Online Supplementary Figure S1); this may have been due to limited statistical power (Online Supplementary Appendix).
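The 39% figure follows directly from the reported odds ratio (1 - 0.61). The short sketch below reproduces that arithmetic and, purely as an illustration, rescales the odds ratio to a 1% absolute increase in HbF. The rescaling assumes that the log-odds of stroke change linearly with HbF, which is an assumption made here and not a result reported in the study.

```python
odds_ratio_per_sd = 0.61   # reported MR estimate per 1 SD of HbF
sd_in_hbf_percent = 4.8    # reported size of 1 SD, in % HbF

# Relative reduction in the odds of stroke per 1 SD increase in HbF
relative_reduction = 1.0 - odds_ratio_per_sd

# Assumption: log-odds are linear in HbF, so the odds ratio scales as a power
odds_ratio_per_percent = odds_ratio_per_sd ** (1.0 / sd_in_hbf_percent)

print(f"reduction per SD: {relative_reduction:.0%}")
print(f"odds ratio per 1% HbF (log-linear assumption): {odds_ratio_per_percent:.3f}")
```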
SCD partially masks the genetic effect of the Duffy/DARC null variant on white blood cell and neutrophil counts
To understand why the PTS under-performed in SCD patients (excluding PTSHbF), we carried out meta-analyses of GWAS results for the 11 HT in the three SCD cohorts, and compared effect sizes (βSCD) for the SNP found in the PTS (4,201 SNP-HT pairs) with the effect sizes from multi-ancestry meta-analyses (βnon-SCD).19 Across all SNP and HT, normalized effect sizes were weakly correlated when considering the same effect alleles (Pearson’s r = 0.09; P=2.4×10⁻¹⁰) (Figure 4A). Among the 273 variant-HT pairs that were nominally associated in SCD meta-analyses (P<0.05), 162 (59.3%) had a concordant direction of effect in non-SCD meta-analyses (P=0.002, binomial test).
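The concordance test above can be checked with an exact binomial calculation. The sketch assumes a two-sided test against a null concordance probability of 0.5, which is the natural null for direction of effect but is not stated explicitly in the text.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided P-value for k successes in n trials under p = 0.5.
    The null distribution is symmetric, so doubling the smaller tail is exact."""
    tail = min(k, n - k)
    lower = sum(comb(n, i) for i in range(tail + 1))
    return min(1.0, 2 * lower / 2 ** n)


concordant, total = 162, 273
fraction = concordant / total
p_value = binomial_two_sided_p(concordant, total)
print(f"{fraction:.1%} concordant, P = {p_value:.4f}")
```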
After correction for multiple testing (q-value <0.05), we found two variants significantly associated with HT in SCD patients, but with significantly different effect sizes when comparing βSCD and βnon-SCD: rs8090527 and rs2814778 (Online Supplementary Table S6, Online Supplementary Figures S2, S3). We confirmed that the effect sizes were still different when comparing βSCD to βnon-SCD derived from a down-sampled subset of African-ancestry UK Biobank individuals (P<0.05) (Online Supplementary Table S6). We did not explore the association between the intergenic rs8090527 variant and PLT count further as we were unable to replicate it in an independent SCD cohort (Online Supplementary Table S6). The second variant is the previously described Duffy/DARC null variant (rs2814778),31 which had a two-fold weaker effect on neutrophil count in SCD when compared to non-SCD individuals. Based on this observation, we wondered if the apparent poor performance of the WBC and neutrophil count PTS in SCD participants was due to the lower impact of the Duffy/DARC null variant in this patient population. In non-SCD individuals, removing the PTS variants on chromosome 1 (to ensure that the large admixture signal due to the Duffy/DARC locus does not impact the analysis) reduced the mean variance explained for WBC from 10.3% to 1.6%, and for neutrophils from 11.9% to 0.9% (Table 1, Figure 4B). When we repeated this analysis in SCD participants, the PTS for WBC was not affected (from 2.1% to 1.9%), whereas the variance explained by the neutrophil count PTS changed slightly from 3.3% to 2.2% (Figure 4B).
We next specifically focused on the association between Duffy/DARC rs2814778 and WBC or neutrophil counts in SCD and non-SCD individuals. First, consistent with the recessive inheritance of the Duffy-negative blood group, we showed that a recessive genetic model provided a better fit with the data than the standard additive model (Online Supplementary Table S7). Thus, we used a recessive model for all subsequent genetic analyses of this variant. For non-SCD individuals, the single Duffy/DARC variant explained 18.4-24.2% of the phenotypic variance (Table 3). In contrast, the associations between rs2814778 and WBC or neutrophil counts were either weak or non-significant in SCD participants, with this variant contributing only 0.9-3.3% of the variance in these HT (Table 3). To quantify the magnitude of the difference in effect sizes and provide meaningful clinical estimates, we calculated that the Duffy null genotype (homozygosity for the C-allele at rs2814778) was associated with a mean reduction of 0.76×10⁹ WBC/L (P=0.004) and 0.84×10⁹ neutrophils/L (P=1.6×10⁻⁶) in the CSSCD, and of 1.9×10⁹ WBC/L (P=3.3×10⁻¹⁶⁴) and 1.6×10⁹ neutrophils/L (P=4×10⁻¹⁹⁹) in the UK Biobank. When we considered both SCD status and the Duffy blood group, we found that: 1) SCD has the strongest impact on neutrophil count; 2) Duffy has a weaker effect on neutrophil count in SCD patients; and 3) the neutrophil PTS (without chromosome 1 variants) remains associated with neutrophil count in all groups (Figure 4C). To further illustrate how SCD modifies the effect of Duffy, we considered the neutrophil PTS (without chromosome 1 variants) quintiles and compared neutrophil count in Duffy-positive individuals with a PTS in the lowest quintile with Duffy-negative individuals with a PTS in the highest quintile (Figure 4D). Whereas in non-SCD UK Biobank participants the Duffy effect outweighs the PTS effect, the two effects are equivalent in SCD patients.
We observed a similar effect when analyzing a subset of the non-SCD African-ancestry cohort from the UK Biobank with a sample size similar to the CSSCD, suggesting that the difference in effect size is not due to a difference in sample size (Online Supplementary Table S8, Online Supplementary Figure S3). Finally, we investigated whether sickle cell trait (heterozygosity for the hemoglobin S allele) also modifies the Duffy/DARC effect on neutrophil count. We did not find a significant interaction between these two genotypes in the UK Biobank (P=0.33), indicating that the epistatic effect is specific to SCD. Taken together, our data suggest that SCD partially masks the strong effect of the Duffy/DARC null variant (rs2814778) on neutrophil count. In non-SCD individuals, the Duffy null variant is the main determinant of neutrophil count, whereas in SCD individuals, its effect is equivalent to the effect of other neutrophil count-associated common variants (as captured by the neutrophil PTS without chromosome 1).
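The recessive coding used for rs2814778 in the analyses above can be sketched as follows: only homozygotes for the C-allele are treated as carriers of the Duffy null genotype, and the effect is summarized as a difference in mean cell count. The genotypes and counts are hypothetical, and the study itself used regression models rather than this unadjusted difference in means.

```python
def recessive_code(c_allele_count):
    """1 for the Duffy null genotype (two C-alleles), 0 otherwise."""
    return 1 if c_allele_count == 2 else 0


def mean(values):
    return sum(values) / len(values)


def duffy_null_mean_difference(c_allele_counts, cell_counts):
    """Mean cell count in Duffy null individuals minus the mean in all others."""
    null = [y for g, y in zip(c_allele_counts, cell_counts) if recessive_code(g) == 1]
    other = [y for g, y in zip(c_allele_counts, cell_counts) if recessive_code(g) == 0]
    return mean(null) - mean(other)


# Hypothetical C-allele counts and neutrophil counts (x10^9/L), illustration only
c_allele_counts = [0, 1, 2, 2, 1, 2]
neutrophils = [4.0, 4.2, 2.5, 2.7, 3.8, 2.6]

difference = duffy_null_mean_difference(c_allele_counts, neutrophils)
print(round(difference, 2))
```

Under an additive model, heterozygotes would instead be assigned an intermediate effect, which is the assumption the recessive model drops.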
Genome-wide association studies of hematologic traits in SCD patients
To determine if new genetic variation could specifically modulate HT variation in SCD, we carried out GWAS for 11 blood-cell traits across all three SCD cohorts available. Given the relatively small sample size of the dataset, we restricted our analyses to variants with a minor allele frequency (MAF) >1%. We found little evidence of association, except for Ht and Hb levels (Online Supplementary Figure S4). We found 24 genome-wide significant (P<5×10⁻⁸) SNP-HT associations, including 23 at the known HbF loci and associated with Hb levels, Ht or RBC count (Online Supplementary Table S9). The last variant, rs113819343, was associated with PLT count in the SCD meta-analyses (P=1.4×10⁻⁸). This variant is common in African-ancestry individuals (MAF=5.5%, gnomAD) but rarer in European-ancestry populations (MAF=0.098%). This variant is not associated with PLT count in the multi-ancestry (n=473,895; P=0.72) or in the African-specific (n=15,171; P=0.27) meta-analyses from the Blood-Cell Consortium.19 Our attempt to replicate this association with PLT count in 333 SCD patients from the OMG cohort was unsuccessful (P=0.69), so we cannot conclude whether this association is real or a false-positive result.
In this study, we explored the utility of PTS for HT in SCD patients. Our analyses highlight several important results, including some with clinical implications. First, we found that PTS for HT derived in non-SCD individuals largely underperformed when tested in SCD patients. This emphasizes the need to derive polygenic predictors directly in SCD patients before trying to implement them into precision medicine initiatives for this patient population. This will require larger SCD cohorts with comprehensive clinical and genetic information. Second, our characterization of the weak association between the Duffy/DARC null rs2814778 variant and neutrophil count in SCD patients suggested that the weaker impact of PTS may partly be due to an epistatic effect of SCD with other HT-associated variants. These results explain the unsuccessful attempts to use the Duffy/DARC null variant as an SCD genetic modifier,32,33 and stress the importance of using all neutrophil-associated variants (ideally identified by GWAS carried out in SCD patients) as potential predictors of SCD complications. Third, a small set of six HbF-associated variants (PTSHbF) was associated with stroke and ACS rate, and improved the statistical modeling of VOC rate. These results support the addition of PTSHbF in clinical efforts to stratify SCD patients based on risk of developing complications. Finally, PTSHbF allowed us to confirm and quantify the causal protective impact of HbF increase on stroke risk reduction, a controversial point in SCD literature (see below).
Several non-mutually exclusive factors could explain why PTS for HT were not as strongly associated with HT variation in SCD patients. These patients have different baseline levels and distribution values for several HT such as Hb and WBC count. This is in part due to the direct hemolytic effect of hemoglobin S, but also to the broad consequences of SCD, such as the induction of a chronic inflammatory state that can lead to cytokine-driven higher WBC count.34,35 Moreover, the frequent intercurrent complications (e.g., VOC) experienced throughout the natural course of SCD could result in greater variability in HT values. Finally, SNP genotyping arrays do not capture all structural variants, which are the main alterations in α-thalassemia, a major determinant of MCH and MCV. We showed here that adding α-thalassemia and PTSHbF to PTS for corresponding HT greatly increased the variance explained. Thus, the performance of HT PTS should improve once comprehensive whole-genome sequencing of large SCD cohorts becomes available.
The C-allele of the Duffy/DARC null variant (rs2814778) results in erythroid-specific loss of expression of the Duffy/DARC chemokine reservoir expression gene.36 It has been known for some time that the Duffy/DARC null variant is strongly associated with lower circulating neutrophil and WBC counts in SCD and non-SCD individuals due to extravasation to tissues,31,37,38 and in particular the spleen.39 Although it is unclear why SCD partially masks the effect of this variant on neutrophil count, we may speculate that the precocious and pervasive splenic atrophy observed in SCD patients could lead to a reduced reservoir size. In addition, various Duffy/DARC variants in non-SCD and SCD individuals have been shown to affect the binding of DARC with inflammatory markers such as interleukin 8.40-42 Definitive data are still lacking, but the proinflammatory state of SCD patients may contribute to the observed discrepancy in effect.43
Whether or not the Duffy phenotype is associated with complications in SCD patients remains unclear.32,33,37,44-46 The potential consequences of the Duffy/DARC status in SCD would be linked to circulating proinflammatory cytokines and, in particular, the resulting effect on neutrophil and WBC counts, which has been implicated in several SCD complications.2,6,7 However, our data showed that the Duffy/DARC negative phenotype is not a good proxy for neutrophil count in SCD patients, in contrast to non-SCD individuals in whom it accounts for a large fraction of the phenotypic variance.
Although complex multi-ancestry PTS for general HT that include hundreds of variants performed poorly in SCD patients, we showed that a simple PTS for HbF made of six variants at three loci can capture a large fraction of the phenotypic variance in this important SCD modifier, consistent with previous reports.23,24 One major difference between the general HT and HbF is that we developed PTSHbF using GWAS data from SCD patients. Interestingly, we observed an association between PTSHbF and ACS or stroke in the large CSSCD. We also validated previous observations that a PTS for HbF can improve a statistical model for VOC rate,29,30 and further discovered that this PTSHbF was useful in the subset of patients with HbF levels <10%.
Whether the increase in HbF levels has a beneficial effect on stroke has long been unclear due to data discrepancies. The pre-hydroxyurea (HU) CSSCD (still the largest SCD cohort to date) did not find that high HbF reduces the risk of ischemic stroke or of silent infarct (changes in white matter).26,47 Of note, in the CSSCD, results for hemorrhagic stroke and “overall” stroke (considering both ischemic and hemorrhagic stroke) were not reported. A follow-up study of SCD patients identified through newborn screening did not find a protective effect of HbF.48 In contrast, historical,49 as well as more recent data, found a protective effect of elevated HbF levels on “overall” stroke.28 This protective effect was also reported for incidence of silent infarct.27,50 Furthermore, some studies investigating the association between HbF-associated SNP and stroke reported an association,51,52 while another did not.53 Several factors could have led to such inconsistent results: 1) different studies used different definitions of stroke; 2) only the CSSCD separated ischemic and hemorrhagic stroke based on computed tomography findings (with the limitations associated with this technology to define stroke sub-type); 3) retrospective studies used history of stroke as endpoint; 4) small sample sizes with limited power to detect an effect of HbF on stroke; and 5) differences between SCD studies regarding the age at which HbF levels were measured (HbF is physiologically higher in younger patients).54 Finally, HU has clearly been shown to reduce stroke risk in prospective studies,55 but given the multiple mechanisms by which the drug is beneficial in SCD,56 its protective effect cannot be solely linked to HU-mediated HbF increase.
In this study, we used the large genotyped subset of the CSSCD and combined genetic information in a PTS to show that HbF associates with stroke in SCD patients. Our MR analyses revealed that HbF levels have a causal and protective effect on stroke, although we also acknowledge that the winner’s curse could have artificially increased our estimate. While the MR-Egger and weighted median MR approaches are sensitive methods to detect horizontal pleiotropy, they are also known to have less statistical power than the multiplicative random-effect IVW method used as the primary discovery method in our analyses.57,58 Thus, the non-significant results for PTSHbF-stroke using the MR-Egger and weighted median approaches should not be interpreted as a lack of validation. What this negative result means is that we cannot exclude the possibility that horizontal pleiotropy could have impacted our HbF-stroke MR result. However, because the MR instruments were selected at clear HbF loci with known regulatory mechanisms (BCL11A, HBS1L-MYB, β-globin), we suggest that the risk of confounding due to horizontal pleiotropy is minimal. Our MR analysis suggests that the impact of HU on stroke is,55 at least in part, mediated by HbF and distinguishes the effect of HU by specifically quantifying how genetically-determined (lifelong) HbF levels modulate stroke risk.
Additional large replication cohorts were not available to our group to validate our findings and we acknowledge that this is an important limitation of our study. Specifically, replication of the association and MR results for PTSHbF and stroke would require large SCD cohorts (approx. 2,800-5,180 SCD patients; see Online Supplementary Appendix) given the reduction in stroke incidence following the implementation of successful stroke primary prevention programs.48,59 Therefore, we encourage investigators to test our models in their own SCD cohorts, and applaud efforts to create new large collaborative studies of SCD to energize the field of SCD modifier genetics.
Our findings have implications beyond SCD. While PTS have been shown to modulate the penetrance of monogenic mutations in diseases such as coronary artery disease and familial breast and colorectal cancers,60 much less is known about their effect on expressivity (or disease severity).13 This distinction is important because, although they may not cause the disease, several clinical variables and other endophenotypes that are captured by PTS can strongly modify disease severity (e.g., PTS for kidney functions in the context of hypertension, PTS for retinopathy/cataract in diabetic patients). Our analyses indicate that simply translating the genetics of polygenic traits formulated in healthy individuals to diseased populations may not provide the expected gain in risk stratification in the context of precision medicine. Fortunately, large biobanks and other cohorts should soon be able to use powerful GWAS for genetic modifiers in >10,000 patients who all suffer from the same disease.
- Received April 5, 2022
- Accepted September 28, 2022
No conflicts of interest to disclose.
TP and GL designed the study. TP, KSL and MEG performed the analyses. TP, ALPHAO, MEG, CB, AEA-K, MJT, FG, PJ, PB and GL collected the clinical and genetic data. TP and GL drafted the paper. GL supervised the study. All authors contributed to data interpretation, revised the manuscript for critical content and approved the final manuscript.
The CSSCD genetic dataset is available on the database of Genotypes and Phenotypes (dbGaP: https://www.ncbi.nlm.nih.gov/gap/), accession phs000366.v1.p1. The UK Biobank dataset is publicly available (https://www.ukbiobank.ac.uk/). The GEN-MOD, Mondor/Lyon and BioMe datasets and code supporting the current study have not been deposited in a public repository because data are not public but are available from the corresponding author on request.
This work was funded by the Canadian Institutes of Health Research (PJT #156248), Bioverativ, a Sanofi Company, and the Canada Research Chair Program (to GL). GEN-MOD samples and data collection were supported by NIH grant HL-68922. AA-K, MJT and establishment and analysis of the OMG cohort have been funded by NHLBI (R01HL68959, HL79915, HL70769, HL87681). This research has been conducted using the UK Biobank Resource under Application Number 11707.
We thank all participants for their contribution to this project. We thank Gabrielle Boucher for statistical support. TP is a recipient of a Charles Bruneau Foundation fellowship award and merit scholarship program for foreign students from the Ministry of Education and Higher Education of Quebec.
- Kato GJ, Piel FB, Reid CD. Sickle cell disease. Nat Rev Dis Primers. 2018; 4:18010. https://doi.org/10.1038/nrdp.2018.10
- Platt OS, Brambilla DJ, Rosse WF. Mortality in sickle cell disease. Life expectancy and risk factors for early death. N Engl J Med. 1994; 330(23):1639-1644. https://doi.org/10.1056/NEJM199406093302303
- Platt OS, Thorington BD, Brambilla DJ. Pain in sickle cell disease. Rates and risk factors. N Engl J Med. 1991; 325(1):11-16. https://doi.org/10.1056/NEJM199107043250103
- Castro O, Brambilla DJ, Thorington B. The acute chest syndrome in sickle cell disease: incidence and risk factors. The Cooperative Study of Sickle Cell Disease. Blood. 1994; 84(2):643-649. https://doi.org/10.1182/blood.V84.2.643.643
- Steinberg MH, Chui DH, Dover GJ, Sebastiani P, Alsultan A. Fetal hemoglobin in sickle cell anemia: a glass half full? Blood. 2014; 123(4):481-485. https://doi.org/10.1182/blood-2013-09-528067
- Sebastiani P, Nolan VG, Baldwin CT. A network model to predict the risk of death in sickle cell disease. Blood. 2007; 110(7):2727-2735. https://doi.org/10.1182/blood-2007-04-084921
- Elmariah H, Garrett ME, De Castro LM. Factors associated with survival in a contemporary adult sickle cell disease cohort. Am J Hematol. 2014; 89(5):530-535. https://doi.org/10.1002/ajh.23683
- Miller ST, Sleeper LA, Pegelow CH. Prediction of adverse outcomes in children with sickle cell disease. N Engl J Med. 2000; 342(2):83-89. https://doi.org/10.1056/NEJM200001133420203
- Vichinsky EP, Neumayr LD, Earles AN. Causes and outcomes of the acute chest syndrome in sickle cell disease. National Acute Chest Syndrome Study Group. N Engl J Med. 2000; 342(25):1855-1865. https://doi.org/10.1056/NEJM200006223422502
- Lewis CM, Vassos E. Polygenic risk scores: from research tools to clinical instruments. Genome Med. 2020; 12(1):44. https://doi.org/10.1186/s13073-020-00742-5
- Sun J, Wang Y, Folkersen L. Translating polygenic risk scores for clinical use by estimating the confidence bounds of risk prediction. Nat Commun. 2021; 12:5276. https://doi.org/10.1038/s41467-021-25014-7
- Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet. 2019; 51(4):584-591. https://doi.org/10.1038/s41588-019-0379-x
- Oetjens MT, Kelly MA, Sturm AC, Martin CL, Ledbetter DH. Quantifying the polygenic contribution to variable expressivity in eleven rare genetic disorders. Nat Commun. 2019; 10:4897. https://doi.org/10.1038/s41467-019-12869-0
- Farber MD, Koshy M, Kinney TR. Cooperative Study of Sickle Cell Disease: demographic and socioeconomic characteristics of patients and families with sickle cell disease. J Chronic Dis. 1985; 38(6):495-505. https://doi.org/10.1016/0021-9681(85)90033-5
- Bartolucci P, Brugnara C, Teixeira-Pinto A. Erythrocyte density in sickle cell syndromes is associated with specific clinical manifestations and hemolysis. Blood. 2012; 120(15):3136-3141. https://doi.org/10.1182/blood-2012-04-424184
- Ilboudo Y, Bartolucci P, Rivera A. Genome-wide association study of erythrocyte density in sickle cell disease patients. Blood Cells Mol Dis. 2017; 65:60-65. https://doi.org/10.1016/j.bcmd.2017.05.005
- Pincez T, Lee SSK, Ilboudo Y. Clonal hematopoiesis in sickle cell disease. Blood. 2021; 138(21):2148-2152. https://doi.org/10.1182/blood.2021011121
- Xu JZ, Garrett ME, Soldano KL. Clinical and metabolomic risk factors associated with rapid renal function decline in sickle cell disease. Am J Hematol. 2018; 93(12):1451-1460. https://doi.org/10.1002/ajh.25263
- Chen MH, Raffield LM, Mousas A. Trans-ethnic and ancestry-specific blood-cell genetics in 746,667 individuals from 5 global populations. Cell. 2020; 182(5):1198-1213. https://doi.org/10.1016/j.cell.2020.06.045
- Bycroft C, Freeman C, Petkova D. The UK Biobank resource with deep phenotyping and genomic data. Nature. 2018; 562(7726):203-209. https://doi.org/10.1038/s41586-018-0579-z
- Canver MC, Lessard S, Pinello L. Variant-aware saturating mutagenesis using multiple Cas9 nucleases identifies regulatory elements at trait-associated loci. Nat Genet. 2017; 49(4):625-634. https://doi.org/10.1038/ng.3793
- Bauer DE, Kamran SC, Lessard S. An erythroid enhancer of BCL11A subject to genetic variation determines fetal hemoglobin level. Science. 2013; 342(6155):253-257. https://doi.org/10.1126/science.1242088
- Galarneau G, Palmer CD, Sankaran VG, Orkin SH, Hirschhorn JN, Lettre G. Fine-mapping at three loci known to affect fetal hemoglobin levels explains additional genetic variation. Nat Genet. 2010; 42(12):1049-1051. https://doi.org/10.1038/ng.707
- Gardner K, Fulford T, Silver N. g(HbF): a genetic model of fetal hemoglobin in sickle cell disease. Blood Adv. 2018; 2(3):235-239. https://doi.org/10.1182/bloodadvances.2017009811
- Lettre G, Bauer DE. Fetal haemoglobin in sickle-cell disease: from genetic epidemiology to new therapeutic strategies. Lancet. 2016; 387(10037):2554-2564. https://doi.org/10.1016/S0140-6736(15)01341-0
- Ohene-Frempong K, Weiner SJ, Sleeper LA. Cerebrovascular accidents in sickle cell disease: rates and risk factors. Blood. 1998; 91(1):288-294.
- Calvet D, Tuilier T, Mélé N. Low fetal hemoglobin percentage is associated with silent brain lesions in adults with homozygous sickle cell disease. Blood Adv. 2017; 1(26):2503-2509. https://doi.org/10.1182/bloodadvances.2017005504
- Sommet J, Alberti C, Couque N. Clinical and haematological risk factors for cerebral macrovasculopathy in a sickle cell disease newborn cohort: a prospective study. Br J Haematol. 2016; 172(6):966-977. https://doi.org/10.1111/bjh.13916
- Rampersaud E, Kang G, Palmer LE. A polygenic score for acute vaso-occlusive pain in pediatric sickle cell disease. Blood Adv. 2021; 5(14):2839-2851. https://doi.org/10.1182/bloodadvances.2021004634
- Lettre G, Sankaran VG, Bezerra MAC. DNA polymorphisms at the BCL11A, HBS1L-MYB, and β-globin loci associate with fetal hemoglobin levels and pain crises in sickle cell disease. Proc Natl Acad Sci U S A. 2008; 105(33):11869-11874. https://doi.org/10.1073/pnas.0804799105
- Visscher PM, Reich D, Nalls MA. Reduced neutrophil count in people of African descent is due to a regulatory variant in the Duffy antigen receptor for chemokines gene. PLoS Genetics. 2009; 5(1):e1000360. https://doi.org/10.1371/journal.pgen.1000360
- Farawela HM, El-Ghamrawy M, Farhan MS, Soliman R, Yousry SM, AbdelRahman HA. Association between Duffy antigen receptor expression and disease severity in sickle cell disease patients. Hematology. 2016; 21(8):474-479. https://doi.org/10.1080/10245332.2015.1111643
- Schnog JB, Keli SO, Pieters RA, Rojer RA, Duits AJ. Duffy phenotype does not influence the clinical severity of sickle cell disease. Clin Immunol. 2000; 96(3):264-268. https://doi.org/10.1006/clim.2000.4884
- Hermand P, Azouzi S, Gautier EF. The proteome of neutrophils in sickle cell disease reveals an unexpected activation of interferon alpha signaling pathway. Haematologica. 2020; 105(12):2851-2854. https://doi.org/10.3324/haematol.2019.238295
- Nader E, Romana M, Connes P. The red blood cell-inflammation vicious circle in sickle cell disease. Front Immunol. 2020; 11:454. https://doi.org/10.3389/fimmu.2020.00454
- Tournamille C, Colin Y, Cartron JP, Le Van Kim C. Disruption of a GATA motif in the Duffy gene promoter abolishes erythroid gene expression in Duffy-negative individuals. Nat Genet. 1995; 10(2):224-228. https://doi.org/10.1038/ng0695-224
- Afenyi-Annan A, Kail M, Combs MR, Orringer EP, Ashley-Koch A, Telen MJ. Lack of Duffy antigen expression is associated with organ damage in patients with sickle cell disease. Transfusion. 2008; 48(5):917-924. https://doi.org/10.1111/j.1537-2995.2007.01622.x
- Schaefer BA, Flanagan JM, Alvarez OA. Genetic modifiers of white blood cell count, albuminuria and glomerular filtration rate in children with sickle cell anemia. PLoS One. 2016; 11(10):e0164364. https://doi.org/10.1371/journal.pone.0164364
- Duchene J, Novitzky-Basso I, Thiriot A. Atypical chemokine receptor 1 on nucleated erythroid cells regulates hematopoiesis. Nat Immunol. 2017; 18(7):753-761. https://doi.org/10.1038/ni.3763
- Nebor D, Durpes MC, Mougenel D. Association between Duffy antigen receptor for chemokines expression and levels of inflammation markers in sickle cell anemia patients. Clin Immunol. 2010; 136(1):116-122. https://doi.org/10.1016/j.clim.2010.02.023
- Moreno Velasquez I, Kumar J, Bjorkbacka H. Duffy antigen receptor genetic variant and the association with Interleukin 8 levels. Cytokine. 2015; 72(2):178-184. https://doi.org/10.1016/j.cyto.2014.12.019
- Schnabel RB, Baumert J, Barbalic M. Duffy antigen receptor for chemokines (Darc) polymorphism regulates circulating concentrations of monocyte chemoattractant protein-1 and other inflammatory mediators. Blood. 2010; 115(26):5289-5299. https://doi.org/10.1182/blood-2009-05-221382
- van Beers EJ, Yang Y, Raghavachari N. Iron, inflammation, and early death in adults with sickle cell disease. Circ Res. 2015; 116(2):298-306. https://doi.org/10.1161/CIRCRESAHA.116.304577
- Elliott L, Ashley-Koch AE, De Castro L. Genetic polymorphisms associated with priapism in sickle cell disease. Br J Haematol. 2007; 137(3):262-267. https://doi.org/10.1111/j.1365-2141.2007.06560.x
- Nolan VG, Adewoye A, Baldwin C. Sickle cell leg ulcers: associations with haemolysis and SNPs in Klotho, TEK and genes of the TGF-beta/BMP pathway. Br J Haematol. 2006; 133(5):570-578. https://doi.org/10.1111/j.1365-2141.2006.06074.x
- Drasar ER, Menzel S, Fulford T, Thein SL. The effect of Duffy antigen receptor for chemokines on severity in sickle cell disease. Haematologica. 2013; 98(8):e87-89. https://doi.org/10.3324/haematol.2013.089243
- Kinney TR, Sleeper LA, Wang WC. Silent cerebral infarcts in sickle cell anemia: a risk factor analysis. Pediatrics. 1999; 103(3):640-645. https://doi.org/10.1542/peds.103.3.640
- Bernaudin F, Verlhac S, Arnaud C. Impact of early transcranial Doppler screening and intensive therapy on cerebral vasculopathy outcome in a newborn sickle cell anemia cohort. Blood. 2011; 117(4):1130-1140. https://doi.org/10.1182/blood-2010-06-293514
- Powars DR, Schroeder WA, Weiss JN, Chan LS, Azen SP. Lack of influence of fetal hemoglobin levels or erythrocyte indices on the severity of sickle cell anemia. J Clin Invest. 1980; 65(3):732-740. https://doi.org/10.1172/JCI109720
- van der Land V, Mutsaerts HJMM, Engelen M. Risk factor analysis of cerebral white matter hyperintensities in children with sickle cell disease. Br J Haematol. 2016; 172(2):274-284. https://doi.org/10.1111/bjh.13819
- Leonardo FC, Brugnerotto AF, Domingos IF. Reduced rate of sickle-related complications in Brazilian patients carrying HbF-promoting alleles at the BCL11A and HMIP-2 loci. Br J Haematol. 2016; 173(3):456-460. https://doi.org/10.1111/bjh.13961
- Saraf SL, Akingbola TS, Shah BN. Associations of α-thalassemia and BCL11A with stroke in Nigerian, United States, and United Kingdom sickle cell anemia cohorts. Blood Adv. 2017; 1(11):693-698. https://doi.org/10.1182/bloodadvances.2017005231
- Arez AP, Wonkam A, Ngo Bitoungui VJ. Association of variants at BCL11A and HBS1L-MYB with hemoglobin F and hospitalization rates among sickle cell patients in Cameroon. PLoS One. 2014; 9(3):e92506. https://doi.org/10.1371/journal.pone.0092506
- Maier-Redelsperger M, Noguchi CT, de Montalembert M. Variation in fetal hemoglobin parameters and predicted hemoglobin S polymerization in sickle cell children in the first two years of life: Parisian Prospective Study on Sickle Cell Disease. Blood. 1994; 84(9):3182-3188. https://doi.org/10.1182/blood.V84.9.3182.bloodjournal8493182
- Ware RE, Davis BR, Schultz WH. Hydroxycarbamide versus chronic transfusion for maintenance of transcranial doppler flow velocities in children with sickle cell anaemia—TCD With Transfusions Changing to Hydroxyurea (TWiTCH): a multicentre, open-label, phase 3, non-inferiority trial. Lancet. 2016; 387(10019):661-670. https://doi.org/10.1016/S0140-6736(15)01041-7
- Platt OS. Hydroxyurea for the treatment of sickle cell anemia. N Engl J Med. 2008; 358(13):1362-1369. https://doi.org/10.1056/NEJMct0708272
- Schmidt AF, Dudbridge F. Mendelian randomization with Egger pleiotropy correction and weakly informative Bayesian priors. Int J Epidemiol. 2018; 47(4):1217-1228. https://doi.org/10.1093/ije/dyx254
- Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol. 2016; 40(4):304-314. https://doi.org/10.1002/gepi.21965
- Fullerton HJ, Adams RJ, Zhao S, Johnston SC. Declining stroke rates in Californian children with sickle cell disease. Blood. 2004; 104(2):336-339. https://doi.org/10.1182/blood-2004-02-0636
- Fahed AC, Wang M, Homburger JR. Polygenic background modifies penetrance of monogenic variants for tier 1 genomic conditions. Nat Commun. 2020; 11(1):3635. https://doi.org/10.1038/s41467-020-17374-3
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.