Abstract
Although childhood acute lymphoblastic leukemia is the most common pediatric cancer, its etiology remains poorly understood. In an attempt to replicate the findings of 2 recent genome-wide association studies in a French-Canadian cohort, we confirmed the association of 5 SNPs [rs7073837 (P=4.2 × 10⁻⁴), rs10994982 (P=3.8 × 10⁻⁴), rs10740055 (P=1.6 × 10⁻⁵), rs10821936 (P=1.7 × 10⁻⁷) and rs7089424 (P=3.6 × 10⁻⁷)] in the ARID5B gene with childhood acute lymphoblastic leukemia. We also confirmed a selective effect for B-cell acute lymphoblastic leukemia with hyperdiploidy and report a putative gender-specific effect of ARID5B SNPs on acute lymphoblastic leukemia risk in males. This study provides a strong rationale for more detailed analysis to identify the causal variants at this locus and to better understand the overall functional contribution of ARID5B to childhood acute lymphoblastic leukemia susceptibility.
Introduction
Childhood acute lymphoblastic leukemia (ALL), the leading cause of cancer-related deaths among children, is a heterogeneous disease with subtypes that differ markedly in their cellular and molecular characteristics. Advances in our understanding of the pathobiology of ALL have led to risk-targeted treatment regimes and increased long-term survival rates.1 Yet approximately 20% of patients do not respond to current treatment protocols, and over two-thirds of the survivors experience long-term treatment-related health problems.2
The etiology of pediatric ALL remains poorly understood. Initiation of leukemogenesis occurs during fetal life or in early infancy and is likely caused by multiple environmental and genetic factors.3 The assertion that ALL may have a genetic basis has long been pursued through association studies based on candidate genes; genes involved in xenobiotic metabolism,4 oxidative stress response,5 DNA repair,6 folate metabolism7 and cell-cycle regulation8 have been associated with ALL. Two recent genome-wide association studies (GWAS) have provided convincing evidence that inherited genetic variation contributes to childhood ALL predisposition.9,10 Using different genotyping platforms, Illumina Infinium HD Human 370Duo BeadChips9 and Affymetrix 500K Arrays,10 both studies found strong associations between variants at 10q21.2 (ARID5B) and 7p12.2 (IKZF1) and childhood ALL risk. Both ARID5B and IKZF1 are involved in transcriptional regulation and differentiation of B-lymphocyte progenitors. These studies also pointed to CEBPE (14q11.2), DDC (7p12.2) and OR2C3 (1q44) as potential ALL susceptibility loci and indicated that common germline variants within the 5 loci identified may be associated with specific ALL subtypes. Follow-up studies confirmed that variants at loci 10q21.2, 7p12.2 and 14q11.2 are involved in B-cell ALL.11 Variation in ARID5B was shown to contribute to ALL risk across different racial groups,12 further highlighting the importance of this gene in the etiology of childhood ALL.
We attempted to replicate 15 of the initial GWAS signals from the Papaemmanuil et al. and Trevino et al. studies in a French-Canadian cohort consisting of 284 B-precursor ALL cases and 270 healthy controls from the Quebec Childhood ALL (QcALL) study. We replicated the association of 5 SNPs within the ARID5B gene, further confirming the involvement of this gene in B-cell ALL. Our work provides a strong rationale for additional studies to identify the causal variants at this candidate risk locus. This is the first study to attempt replication of association signals from both initial GWAS in an independent population and the first to report a putative gender-specific effect of ARID5B on ALL risk in males.
Design and Methods
Study subjects
Our cohort consisted of 284 childhood B-cell ALL patients and 270 healthy controls. In addition, parental DNA was available for 203 of the probands. Study subjects were French-Canadians of European descent from the established Quebec Childhood ALL (QcALL) cohort.4,8 Briefly, incident childhood pre-B ALL cases were diagnosed in the Hematology-Oncology Unit of Sainte-Justine Hospital, Montreal, Canada, between October 1985 and November 2006. The current study sample includes 170 males and 114 females with a median age of 4.2 years. This patient cohort is representative of the childhood pre-B ALL population; patients’ clinical characteristics are shown in Table 1. Healthy controls, 152 males and 118 females with a median age of 30.1 years, consisted of a group of newborns and adults recruited through clinical departments other than the Hematology-Oncology Unit, Sainte-Justine Hospital. Peripheral blood or bone marrow (samples in remission) was collected from all participants and DNA was extracted as previously described.13 The Institutional Review Board approved the research protocol and informed consent was obtained from all participants and/or their parents.
SNP genotyping and quality control checks
SNPs were genotyped using the Luminex xMAP/Autoplex Analyser CS1000 system (Perkin Elmer, Waltham, MA, USA). The 15 selected SNPs were amplified in a single multiplexed assay and hybridized to Luminex MicroPlex-xTAG Microspheres14 for genotyping using allele-specific primer extension (ASPE). The PCR and TAG-ASPE primers are shown in the Online Supplementary Table S1; amplification and reaction conditions are available upon request. Allele calls were assessed and compiled using the Automatic Luminex Genotyping software (M. Bourgey et al., manuscript submitted, 2009). The average genotype call rate for the 15 SNPs was 97.0%. Hardy-Weinberg equilibrium (HWE) was tested using the χ² goodness-of-fit test, and PedCheck (Version 1.1) was used to identify genotype incompatibilities using the familial data;15 inconsistent case-parent trios were removed from the analysis.
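The HWE check described above can be illustrated with a minimal Python sketch of a χ² goodness-of-fit test for a biallelic SNP (1 degree of freedom). The function name `hwe_chi2` and the genotype counts in the usage line are hypothetical and are not taken from the study data; this is a generic illustration of the test, not the authors' own implementation.

```python
import math

def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Arguments are observed genotype counts (two homozygotes and the
    heterozygote). Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts, for illustration only
chi2, p_value = hwe_chi2(25, 50, 25)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A SNP would be retained under the criterion used here when the resulting P value exceeds 0.05.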
Statistical analysis
Statistical analyses were performed using STATA/IC Version 10.1 (StataCorp, College Station, TX, USA). Pearson’s χ² test or Fisher’s exact test, as appropriate, was used to compare allele/genotype/haplotype carriership in patients and controls. Crude odds ratios (ORs) were estimated using logistic regression and are given with 95% confidence intervals (CIs). Pairwise linkage disequilibrium (LD) estimates were calculated in STATA. We assessed gender-specific associations through stratified analysis comparing male cases to male controls or female cases to female controls, and the Mantel-Haenszel (MH) χ² test of homogeneity was used to test for significant risk differences between males and females. Haplotype reconstruction was performed with the FAMHAP software (Version 16), using parental data when available;16 incorporating genotype information of related individuals increases precision of haplotype reconstruction and frequency estimation.16–18 Logistic regression was used to estimate haplotype-specific odds ratios using the most common haplotype as reference, and a likelihood ratio test implemented in FAMHAP was used to test for global haplotype association with disease status. Multiple testing corrections were performed using the Benjamini-Hochberg false discovery rate (FDR) method, controlling the FDR at 5%; nominal P values are shown.
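To make the crude odds ratio concrete, the sketch below computes an allelic OR with a Wald 95% CI from a 2×2 table of allele counts. For a single binary predictor, the OR from logistic regression coincides with the cross-product ratio of the table, so this is a reasonable stand-in for the STATA analysis, although it is not the authors' code. The function name `odds_ratio_wald_ci` and the counts are hypothetical.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: risk alleles in cases,    b: non-risk alleles in cases
    c: risk alleles in controls, d: non-risk alleles in controls
    z: normal quantile (1.96 gives a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) (Woolf's formula)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts, for illustration only
or_, lo, hi = odds_ratio_wald_ci(20, 10, 10, 20)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

The Wald interval is symmetric on the log scale; exact intervals, which some packages report for sparse tables, need not be.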
Results and Discussion
We genotyped the top 10 SNPs from Papaemmanuil et al. (GWA1)9 and 5 SNPs from Trevino et al. (GWA2)10 in a French-Canadian cohort of European descent. Genotype frequencies for all 15 SNPs were in HWE (P>0.05). Risk allele frequencies were similar to those observed in the European populations of both GWAS9,10 (Online Supplementary Table S2).
Univariate analysis showed highly significant allelic associations within chromosomal region 10q21.2. The 5 SNPs from this region mapped to the ARID5B gene and were strongly associated with B-cell ALL risk in our cohort; odds ratio estimates were in the same direction and of similar strength as those previously reported (Online Supplementary Table S2). SNPs rs10994982, rs10740055, rs10821936 and rs7089424 span a 42kb region in intron 3 of ARID5B whereas SNP rs7073837 is located in intron 2. rs10821936, the strongest association signal from GWA2, was the most significant signal in our study (P=1.7×10⁻⁷). rs7089424 (P=3.6×10⁻⁷), the second-strongest association signal in our study, is in strong LD with rs10821936 (r² = 0.95). SNPs rs10994982 (GWA1) and rs10740055 (GWA2) were highly correlated (r² = 0.93) and strongly associated with childhood B-cell ALL (P=3.8×10⁻⁴ and P=1.6×10⁻⁵, respectively). rs7073837 was in moderate LD with SNP pairs rs10821936-rs7089424 and rs10994982-rs10740055 (r² of 0.65 and 0.72, respectively) and was also associated with disease (P=4.2×10⁻⁴). These 5 SNP associations withstood multiple testing corrections and remained significant after controlling for a false discovery rate of 5%. Using subtype analysis, we confirmed that ARID5B SNPs were significantly associated with B-hyperdiploid ALL (P values ≤2.0×10)9 (Online Supplementary Table S2).
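The multiple testing step can be sketched with the Benjamini-Hochberg step-up procedure. The five P values below are the nominal values reported for the ARID5B SNPs. The remaining ten are not given in the text, so they are set to 1.0 as a deliberately pessimistic placeholder, and the assumption that exactly 15 tests enter the correction is ours. The function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans (same order as p_values) marking the
    hypotheses rejected while controlling the FDR at level q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    # Find the largest rank k such that p_(k) <= (k / m) * q
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected

# Reported nominal P values for the 5 ARID5B SNPs
arid5b_p = [4.2e-4, 3.8e-4, 1.6e-5, 1.7e-7, 3.6e-7]
# Placeholder (assumption): the 10 other SNPs, set to the worst case P = 1.0
other_p = [1.0] * 10

rejected = benjamini_hochberg(arid5b_p + other_p, q=0.05)
print(sum(rejected), "of", len(rejected), "tests rejected at FDR 5%")
```

Because lowering any of the placeholder values can only enlarge the rejection set, the five ARID5B signals would remain significant under this procedure whatever the true P values of the other ten SNPs, provided 15 tests are corrected for.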
We were unable to replicate the reported associations with IKZF1, DDC, and CEBPE; nor did we find an association of OR2C3 with t(12;21)/ETV6-RUNX1 ALL (Online Supplementary Table S2). Lack of confirmation of association with chromosomal region 7p12.2 was surprising given the strong statistical association observed in both original GWAS and the convincing support of a recent follow-up study conducted in a large German case-control cohort.11 Risk allele frequencies in cases did not differ between cohorts; failure to replicate is therefore unlikely to be due to genetic heterogeneity. The most likely explanation for the lack of replication is the limited power of our study to detect loci with weaker effects. With our limited sample size, we had 80% power at the 5% level to detect a minimum odds ratio of 1.8 with risk allele frequencies (RAFs) of 20% or over and of 2.1 with RAFs of 10% or over. Lack of replication could also partially reflect the complexity underlying ALL pathogenesis; for example, molecular characterization of the disease might differ across studies. Further association studies with larger case-control samples and detailed subgroup analysis are required to investigate whether the associations between 7p12.2 (IKZF1 and DDC) and 14q11.2 (CEBPE) hold true.
To further describe the observed ARID5B associations and capture associations under various genetic models, we measured the corresponding genotype odds ratios in all samples, as well as in males and females separately. Carriers of a homozygous risk genotype at SNPs rs7073837, rs10994982 and rs10740055 had over a 2-fold increase in B-cell ALL risk. A strong allele dose-dependent effect on risk was observed at loci rs10821936 and rs7089424 (P trend = 7.4×10 and 1.7×10, respectively) (Table 2). Significant risk differences were found between males and females at loci rs10994982 and rs10740055: a 3.8-fold and 4.4-fold increase in risk was observed in male carriers of the homozygous risk genotypes, respectively, while no significant effect was observed in females (MH P values < 0.03) (Table 2). Although the effects of rs10821936 and rs7089424 were more marked in males, the gender difference was not significant at these loci (P values ≥ 0.15).
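An allele dose-dependent effect of this kind is commonly assessed with a trend test across the three genotype classes. The text does not state which trend test produced the reported P trend values, so the Cochran-Armitage test sketched below is an assumed, commonly used choice rather than the authors' method. The function name `cochran_armitage_trend` and all genotype counts are hypothetical.

```python
import math

def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage test for trend in a 2 x k table.

    cases, controls: counts per genotype class, ordered by number of
    risk alleles (0, 1, 2). Returns (chi2, p_value) with 1 df.
    """
    n = [r + s for r, s in zip(cases, controls)]   # column totals
    N = sum(n)                                     # grand total
    R = sum(cases)                                 # number of cases
    S = N - R                                      # number of controls
    sum_rt = sum(r * t for r, t in zip(cases, scores))
    sum_nt = sum(ni * t for ni, t in zip(n, scores))
    sum_nt2 = sum(ni * t * t for ni, t in zip(n, scores))
    chi2 = (N * (N * sum_rt - R * sum_nt) ** 2
            / (R * S * (N * sum_nt2 - sum_nt ** 2)))
    # Upper tail of a chi-square with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical genotype counts (0, 1, 2 risk alleles), for illustration only
chi2, p_trend = cochran_armitage_trend(cases=(10, 20, 30), controls=(30, 20, 10))
print(f"chi2 = {chi2:.2f}, P trend = {p_trend:.2e}")
```

With equally spaced scores (0, 1, 2) the test corresponds to an additive model on the number of risk alleles.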
We observed similar gender biases among hyperdiploid ALL patients for variants rs10994982 (Males AA vs. GG: OR(95%C.I.)= 6.25(2.40–17.49); Females AA vs. GG: OR(95%C.I.)= 1.12(0.33–3.96); MH P=0.017), and rs10740055 (Males CC vs. AA: OR(95%C.I.)= 6.90(2.43–22.18); Females CC vs. AA: OR(95%C.I.)= 1.66(0.49–6.14); MH P= 0.059) (data not shown). However, the wide confidence intervals caused by overstratification of the data emphasize the uncertainty of the risk estimates.
Finally, we performed multivariate haplotype analysis for ARID5B (Online Supplementary Table S3). Fifteen different haplotypes could be inferred, but only 4 haplotypes had frequencies of 0.05 or over and represented approximately 96% of the observed haplotypes in our sample. The remaining 4% of the chromosomes carried 11 minor haplotypes. We found a significant difference in the overall distribution of the 15 ARID5B-derived haplotypes between B-cell ALL cases and controls (Global χ² = 45.03, 14 degrees of freedom, P= 4.0×10⁻⁵). The most frequent haplotype among controls (50%) carried non-risk alleles at all 5 of the ARID5B loci (CGATT) whereas the complementary haplotype, formed by the risk alleles of these polymorphisms (AACCG), was the most abundant haplotype among cases (46.25%). Through haplotype-specific tests we showed that the risk haplotype AACCG was associated with a near 2-fold increase in B-cell ALL susceptibility (OR(95%CI) = 1.93(1.47–2.53), P=7.6×10) (Online Supplementary Table S3). When stratified by gender, similar results were observed in the male subgroup only. The haplotype analysis further demonstrates that the associations observed at the 5 individual ARID5B loci are not independent and likely reflect a single association signal.
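The global haplotype test statistic can be converted to a P value directly. The sketch below uses the closed-form upper tail of the χ² distribution, which exists for even degrees of freedom, and applies it to the reported statistic (χ² = 45.03, 14 degrees of freedom). The function name `chi2_sf_even_df` is hypothetical, and the calculation is a consistency check rather than a reproduction of the FAMHAP likelihood ratio test.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square with even df.

    For df = 2m: P(X > x) = exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term = 1.0    # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total

# Reported global statistic for the 15 ARID5B haplotypes
p_global = chi2_sf_even_df(45.03, 14)
print(f"P = {p_global:.1e}")
```

The result is approximately 4.0×10⁻⁵, in line with the mantissa reported for the global test.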
Our replication data confirm that ARID5B is a novel susceptibility factor for childhood B-cell ALL and corroborate previous findings of a putative selective effect for B-cell precursor ALL with hyperdiploidy. We also report a gender-specific effect of ARID5B SNPs on ALL risk in males. ARID5B plays a vital role in the regulation of embryonic development and cell growth and differentiation through tissue-specific repression of differentiation-specific gene expression.19,20 Aberrant ARID5B expression in the developing fetus could halt B-lymphocyte maturation and contribute to leukemogenesis. B-cell ALL incidence is higher in males, and although our data suggest a gender bias in the effect of ARID5B variation on disease risk, the link between ARID5B and increased risk of leukemia among males remains to be determined. Given the combined statistical significance of association of this region, re-sequencing and functional analyses are now required to identify the causal variants at the 10q21.2 locus. Better elucidation of the mechanisms through which ARID5B variants are involved in childhood ALL could be of great diagnostic value and help guide risk-directed therapy, ultimately improving disease management and outcome.
Acknowledgments
The authors are indebted to all the patients and their parents who consented to participate in this study. Funding: This study was supported by research funds provided by the Canadian Institutes for Health Research. JH is the recipient of an NSERC Canada Graduate’s scholarship. MB is the recipient of a fellowship from the Cole Foundation. DS holds the François-Karl Viau Chair in Pediatric Oncogenomics and is a scholar of the Fonds de la Recherche en Santé du Québec.
Footnotes
- The online version of this article has a Supplementary Appendix.
- Authorships and Disclosures: DS is the principal investigator and takes primary responsibility for the paper. JH, CR, EK and DS contributed to the conception and design of the study. CR performed the genotyping. JH performed the statistical analyses and wrote the paper. MB contributed to the interpretation of the data. EK was involved in critical manuscript revision and editing. All authors approved the final version.
- The information provided by the authors about contributions from persons listed as authors and in acknowledgments is available with the full text of this paper at www.haematologica.org.
- Financial and other disclosures provided by the authors using the ICMJE (www.icmje.org) Uniform Format for Disclosure of Competing Interests are also available at www.haematologica.org.
- Received January 14, 2010.
- Revision received March 12, 2010.
- Accepted March 17, 2010.
References
- Pieters R, Carroll WL. Biology and treatment of acute lymphoblastic leukemia. Pediatr Clin North Am. 2008; 55(1):ix-20. https://doi.org/10.1016/j.pcl.2007.11.002
- Mody R, Li S, Dover DC, Sallan S, Leisenring W, Oeffinger KC. Twenty-five-year follow-up among survivors of childhood acute lymphoblastic leukemia: a report from the Childhood Cancer Survivor Study. Blood. 2008; 111(12):5515-23. https://doi.org/10.1182/blood-2007-10-117150
- Greaves M. Molecular genetics, natural history and the demise of childhood leukaemia. Eur J Cancer. 1999; 35(14):1941-53. https://doi.org/10.1016/S0959-8049(99)00296-8
- Krajinovic M, Labuda D, Richer C, Karimi S, Sinnett D. Susceptibility to childhood acute lymphoblastic leukemia: influence of CYP1A1, CYP2D6, GSTM1, and GSTT1 genetic polymorphisms. Blood. 1999; 93(5):1496-501.
- Krajinovic M, Sinnett H, Richer C, Labuda D, Sinnett D. Role of NQO1, MPO and CYP2E1 genetic polymorphisms in the susceptibility to childhood acute lymphoblastic leukemia. Int J Cancer. 2002; 97(2):230-6. https://doi.org/10.1002/ijc.1589
- Batar B, Guven M, Baris S, Celkan T, Yildiz I. DNA repair gene XPD and XRCC1 polymorphisms and the risk of childhood acute lymphoblastic leukemia. Leuk Res. 2009; 33(6):759-63. https://doi.org/10.1016/j.leukres.2008.11.005
- Petra BG, Janez J, Vita D. Gene-gene interactions in the folate metabolic pathway influence the risk for acute lymphoblastic leukemia in children. Leuk Lymphoma. 2007; 48(4):786-92. https://doi.org/10.1080/10428190601187711
- Healy J, Belanger H, Beaulieu P, Lariviere M, Labuda D, Sinnett D. Promoter SNPs in G1/S checkpoint regulators and their impact on the susceptibility to childhood leukemia. Blood. 2007; 109(2):683-92. https://doi.org/10.1182/blood-2006-02-003236
- Papaemmanuil E, Hosking FJ, Vijayakrishnan J, Price A, Olver B, Sheridan E. Loci on 7p12.2, 10q21.2 and 14q11.2 are associated with risk of childhood acute lymphoblastic leukemia. Nat Genet. 2009; 41(9):1006-10. https://doi.org/10.1038/ng.430
- Trevino LR, Yang W, French D, Hunger SP, Carroll WL, Devidas M. Germline genomic variants associated with childhood acute lymphoblastic leukemia. Nat Genet. 2009; 41(9):1001-5. https://doi.org/10.1038/ng.432
- Prasad RB, Hosking FJ, Vijayakrishnan J, Papaemmanuil E, Kohler R, Greaves MF. Verification of the susceptibility loci on 7p12.2, 10q21.2 and 14q11.2 in precursor B-cell acute lymphoblastic leukemia of childhood. Blood. 2010; 115(9):1765-7. https://doi.org/10.1182/blood-2009-09-241513
- Yang W, Trevino LR, Yang JJ, Scheet P, Pui CH, Evans WE. ARID5B SNP rs10821936 is associated with risk of childhood acute lymphoblastic leukemia in blacks and contributes to racial differences in leukemia incidence. Leukemia. 2010; 24(4):894-6. https://doi.org/10.1038/leu.2009.277
- Baccichet A, Qualman SK, Sinnett D. Allelic loss in childhood acute lymphoblastic leukemia. Leuk Res. 1997; 21(9):817-23. https://doi.org/10.1016/S0145-2126(97)00075-1
- Koo SH, Ong TC, Chong KT, Lee CG, Chew FT, Lee EJ. Multiplexed genotyping of ABC transporter polymorphisms with the Bioplex suspension array. Biol Proced Online. 2007; 9:27-42.
- O’Connell JR, Weeks DE. PedCheck: a program for identification of genotype incompatibilities in linkage analysis. Am J Hum Genet. 1998; 63(1):259-66. https://doi.org/10.1086/301904
- Becker T, Knapp M. Maximum-likelihood estimation of haplotype frequencies in nuclear families. Genet Epidemiol. 2004; 27(1):21-32. https://doi.org/10.1002/gepi.10323
- Schaid DJ. Relative efficiency of ambiguous vs. directly measured haplotype frequencies. Genet Epidemiol. 2002; 23(4):426-43. https://doi.org/10.1002/gepi.10184
- Schouten MT, Williams CK, Haley CS. The impact of using related individuals for haplotype reconstruction in population studies. Genetics. 2005; 171(3):1321-30. https://doi.org/10.1534/genetics.105.042762
- Huang TH, Oka T, Asai T, Okada T, Merrills BW, Gertson PN. Repression by a differentiation-specific factor of the human cytomegalovirus enhancer. Nucleic Acids Res. 1996; 24(9):1695-701. https://doi.org/10.1093/nar/24.9.1695
- Wilsker D, Patsialou A, Dallas PB, Moran E. ARID proteins: a diverse family of DNA binding proteins implicated in the control of cell growth, differentiation, and development. Cell Growth Differ. 2002; 13(3):95-106.