Background: Childhood maltreatment is associated with poor mental and physical health. However, the mechanisms of gene–environment correlations and the potential causal effects of childhood maltreatment on health are unknown. Using genetics, we aimed to delineate the sources of gene–environment correlation for childhood maltreatment and the causal relationship between childhood maltreatment and health. Methods: We did a genome-wide association study meta-analysis of childhood maltreatment using data from the UK Biobank (n=143 473), Psychiatric Genomics Consortium (n=26 290), Avon Longitudinal Study of Parents and Children (n=8346), Adolescent Brain Cognitive Development Study (n=5400), and Generation R (n=1905). We included individuals who had phenotypic and genetic data available. We investigated single nucleotide polymorphism heritability and genetic correlations among different subtypes, operationalisations, and reports of childhood maltreatment. Family-based and population-based polygenic score analyses were done to elucidate gene–environment correlation mechanisms. We used genetic correlation and Mendelian randomisation analyses to identify shared genetics and test causal relationships between childhood maltreatment and mental and physical health conditions. Findings: Our meta-analysis of genome-wide association studies (N=185 414) identified 14 independent loci associated with childhood maltreatment (13 novel). We identified high genetic overlap (genetic correlations 0·24–1·00) among different maltreatment operationalisations, subtypes, and reporting methods. Within-family analyses provided some support for active and reactive gene–environment correlation but did not show the absence of passive gene–environment correlation. 
Robust Mendelian randomisation suggested a potential causal role of childhood maltreatment in depression (unidirectional), as well as both schizophrenia and ADHD (bidirectional), but not in physical health conditions (coronary artery disease, type 2 diabetes) or inflammation (C-reactive protein concentration). Interpretation: Childhood maltreatment has a heritable component, with substantial genetic correlations among different operationalisations, subtypes, and retrospective and prospective reports of childhood maltreatment. Family-based analyses point to a role of active and reactive gene–environment correlation, with equivocal support for passive correlation. Mendelian randomisation supports a (primarily bidirectional) causal role of childhood maltreatment on mental health, but not on physical health conditions. Our study identifies research avenues to inform the prevention of childhood maltreatment and its long-term effects. Funding: Wellcome Trust, UK Medical Research Council, Horizon 2020, National Institute of Mental Health, and National Institute for Health Research Biomedical Research Centre.
In this GWAS meta-analysis, we had access to individual-level data from 159 124 participants from four datasets: 143 473 participants from the UK Biobank (birth year 1936–70)28; 5400 from the Adolescent Brain Cognitive Development Study (birth year 2006–08)29; 8346 from the Avon Longitudinal Study of Parents and Children (ALSPAC; birth year 1991–92);30, 31 and 1905 from the Generation R Study (birth year 2002–06).32, 33 In addition, we obtained summary GWAS statistics for the Psychiatric Genomics Consortium (PGC_26K) dataset (n=26 290 from 18 cohorts9). All participants (N=185 414) were primarily of European genetic ancestry and provided information on childhood maltreatment. Further details are shown in the appendix (pp 5–7, 61–63). Ethical approval was obtained from the Human Biology Research Ethics Committee, University of Cambridge (Cambridge, UK), and from the local cohort ethics committees. We define childhood maltreatment here as consisting of emotional, sexual, and physical abuse, and emotional and physical neglect. All retrospective reports of childhood maltreatment were self-reported, whereas most prospective reports of childhood maltreatment were reported by a parent or caregiver. Participants in the UK Biobank completed the retrospectively reported five-item Childhood Trauma Screener.34 This assessment consists of one question for each of the five trauma subtypes, each ranging from 0 (never true) to 4 (very often true), with total scores ranging from 0–20. Participants in the PGC_26K completed different retrospective questionnaires on childhood maltreatment, which included questions only on sexual, physical, and emotional abuse. Total scores differed between the questionnaires, but all were coded continuously. In the Adolescent Brain Cognitive Development Study, parent-completed information on prospectively reported childhood maltreatment was used. 
This assessment comprised 13 questions on childhood maltreatment from the Kiddie Schedule for Affective Disorders and Schizophrenia-PTSD35 and the Children’s Report of Parental Behavior Inventory,36 with scores ranging from 0–13. We used continuous scores that were rank-based inverse normal transformed for the analyses. In ALSPAC, childhood maltreatment was prospectively recorded using multiple questionnaires at multiple instances (majority parent-report, several self-report), detailed elsewhere.16 Scores were binarised with any instance of childhood maltreatment indicated as 1 and no report of childhood maltreatment indicated as 0, in line with previous analyses.16 In Generation R, childhood maltreatment was prospectively measured using mother-completed questionnaires including items on the Life Event and Difficulty Schedule,37 with scores ranging from 0–2, which were continuously coded. Further details are provided in the appendix (pp 7–10, 27, 61–63). With the UK Biobank data, we investigated heritability of and genetic correlations among different subtypes and operationalisations of childhood maltreatment (appendix pp 7–10). For subtype analyses, we binarised the five phenotypes (sexual, emotional, and physical abuse, and emotional and physical neglect) due to skewness in the phenotypes. We defined four operationalisations of childhood maltreatment: a log-transformed sum-score of childhood maltreatment; a binary maltreatment score (0 vs any); a binary severe maltreatment score, in which scores 2–4 for each individual item were recorded as 1; and a binary severe childhood abuse score, which is the same as above but restricted to the three abuse items only. For each GWAS, age and sex were included as covariates. The primary GWAS in the UK Biobank was done using the BOLT-LMM, version 2.3.4, algorithm,38 with batch included as an additional covariate. 
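The four UK Biobank operationalisations described above can be sketched as follows. This is a minimal illustration, not the study's code: the item names, the `operationalise` helper, and the use of log(x + 1) for the log-transformed sum-score are assumptions, and the severe scores are read here as "any relevant item scored 2–4".

```python
import math

# Hypothetical item names for the five Childhood Trauma Screener questions.
ABUSE_ITEMS = ("emotional_abuse", "physical_abuse", "sexual_abuse")
NEGLECT_ITEMS = ("emotional_neglect", "physical_neglect")
ALL_ITEMS = ABUSE_ITEMS + NEGLECT_ITEMS


def operationalise(items):
    """Derive the four operationalisations from five items scored 0-4.

    `items` maps each name in ALL_ITEMS to an integer score.
    """
    for name in ALL_ITEMS:
        if not 0 <= items[name] <= 4:
            raise ValueError(f"{name} must be between 0 and 4")
    total = sum(items[name] for name in ALL_ITEMS)  # ranges from 0 to 20
    return {
        # Assumption: log(x + 1), so that a total of 0 stays defined.
        "log_sum_score": math.log(total + 1),
        # Binary maltreatment score: 0 vs any.
        "any_maltreatment": int(total > 0),
        # Assumption: severe = at least one item scored 2-4.
        "severe_maltreatment": int(any(items[n] >= 2 for n in ALL_ITEMS)),
        # Same rule, restricted to the three abuse items.
        "severe_abuse": int(any(items[n] >= 2 for n in ABUSE_ITEMS)),
    }


example = operationalise({
    "emotional_abuse": 1,
    "physical_abuse": 0,
    "sexual_abuse": 0,
    "emotional_neglect": 3,
    "physical_neglect": 0,
})
```

In this example the participant counts as maltreated and severely maltreated (through emotional neglect), but not as severely abused, because no abuse item reaches a score of 2.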
GWAS in ALSPAC was done using BOLT-LMM, version 2.3.4,38 GWAS in the Adolescent Brain Cognitive Development Study was done using FastGWA (using GCTA version 1.93.2),39 and we used linear regression using Plink,40 version 2.0, for Generation R. The linear mixed-effects models used in the Adolescent Brain Cognitive Development Study, UK Biobank, and ALSPAC account for both population stratification and relatedness. Nevertheless, we included genetic principal components (up to 20) to accelerate the model identification process.38 In Generation R, we included five genetic principal components as covariates to account for population stratification (appendix pp 10–13). We confirmed that there was no evidence of inflation in the test statistics of the GWAS by using the linkage disequilibrium score regression (LDSC)-based intercept.41 Sample-size weighted meta-analysis was done in METAL, version 2011-03-25.42 We did three meta-analyses. First, we meta-analysed the UK Biobank and the PGC_26K datasets to obtain a GWAS of retrospectively reported childhood maltreatment (GWASretrospective). Next, we meta-analysed the Adolescent Brain Cognitive Development Study, ALSPAC, and Generation R datasets to obtain a GWAS of prospectively reported childhood maltreatment (GWASprospective). Finally, we meta-analysed all five datasets to obtain a GWAS of childhood maltreatment (GWASchildhoodmaltreatment). Independent significant loci were identified at a GWAS threshold of p<5 × 10⁻⁸, after clumping (r²=0·1, 1000 kb), using the linkage disequilibrium weights generated from the European subset of the 1000 Genomes phase 3 dataset43 in Plink, version 1.9. Functional annotation of top loci was done using expression quantitative trait loci data from GTEx,44 BRAINEAC,45 CommonMind Consortium,46 and PsychEncode,47 Hi-C data, positional mapping, and associations with other health-related phenotypes, all using FUMA.48 Gene identification was done using MAGMA,49 with the summary GWAS statistics.
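For readers unfamiliar with sample-size weighted meta-analysis, the scheme combines per-cohort Z scores with weights equal to the square root of each cohort's sample size. The sketch below illustrates that formula for a single variant; the function name and the Z scores are hypothetical, and only the cohort sample sizes are taken from the text.

```python
import math


def sample_size_weighted_meta(z_scores, sample_sizes):
    """Combine per-cohort Z scores for one variant.

    Z = sum(w_i * z_i) / sqrt(sum(w_i ** 2)), with w_i = sqrt(N_i).
    Returns the combined Z score and its two-sided p value.
    """
    if len(z_scores) != len(sample_sizes):
        raise ValueError("one Z score is needed per cohort")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z = numerator / denominator
    # Two-sided p value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Cohort sizes from the text; the Z scores are made up for illustration.
cohort_sizes = [143473, 26290, 8346, 5400, 1905]
hypothetical_z = [4.1, 1.9, 0.7, 0.4, -0.2]
combined_z, combined_p = sample_size_weighted_meta(hypothetical_z, cohort_sizes)
```

Because the weights depend only on sample size, this scheme does not require effect sizes to be on the same scale across cohorts, which is convenient when phenotypes are coded differently (continuous, binarised, or transformed).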
We identified significantly associated genes after Bonferroni correction. Enrichment for tissues and cell types was done using MAGMA and LDSC-specifically expressed genes (SEG),50 using summary GWAS statistics (appendix p 14). Specifically, using LDSC-SEG, we investigated enrichment for tissue-specific chromatin marks (ENCODE51 and Roadmap Epigenomics Project52) and gene expression (GTEx). We corrected the chromatin-mark analyses for multiple testing using Benjamini-Hochberg false discovery rate correction, to account for the correlated nature of the variables tested, and the gene-expression analyses using Bonferroni correction. Using MAGMA, we investigated enrichment for genes with tissue-type (GTEx44) and cell-type specific expression in neuronal cell types (PsychEncode47), and corrected each of these analyses for multiple testing using Bonferroni correction. SNP heritability of and genetic correlations between operationalisations and subtypes of childhood maltreatment were estimated using GCTA-GREML, version 1.93.2,53 in a random subset of 19 559 unrelated individuals from the UK Biobank (grm-cutoff for relatedness=0·05). For all analyses, we included year of birth, sex, genotyping batch, and the first 20 genetic principal components as covariates. Results were corrected for multiple testing using Benjamini-Hochberg false discovery rate correction owing to the correlated nature of the phenotypes. Heritability of the meta-analysed GWAS and genetic correlations with other phenotypes were estimated using LDSC54 (appendix p 13), focusing on mental and physical health conditions, and psychological, behavioural, and anthropometric traits. We corrected for the 97 phenotypes tested using Bonferroni correction. We investigated the variance explained by polygenic score (PGS) in two cohorts: a hold-out sample from the UK Biobank, and ALSPAC. PGSs were calculated using PRSice-255 (clump r²=0·1, 250 kb; appendix pp 15–19).
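The two multiple-testing corrections used throughout this section differ in how strictly they penalise each test. The following sketch shows the standard Bonferroni and Benjamini-Hochberg adjustments; the function names and the p values are hypothetical and are not taken from the study.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p values: multiply by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values (false discovery rate).

    The p value with ascending rank k is scaled by m / k, then a running
    minimum from the largest rank downwards enforces monotonicity.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p values for four tests.
p_values = [0.01, 0.04, 0.03, 0.005]
bonferroni_adjusted = bonferroni(p_values)
fdr_adjusted = benjamini_hochberg(p_values)
```

Under Bonferroni correction for the 97 phenotypes mentioned above, an unadjusted p value would need to fall below 0·05/97 (about 5·2 × 10⁻⁴) to be declared significant at the 5% level.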
Only autosomes were included in the calculation of PGS as there is no consensus on how to handle sex chromosomes in PGS analyses.56 To identify variance explained by the GWAS of childhood maltreatment, we did PGS analyses using GWASretrospective (base sample) in a hold-out sample of 12 855 individuals from the UK Biobank (target sample). To do this, we ran a second GWAS of childhood maltreatment in the UK Biobank excluding the hold-out sample (log-transformed sum-score, n=130 618) and meta-analysed the results with the PGC_26K. PGSs were generated in 9924 unrelated individuals from this subset of 12 855 individuals at 11 p value thresholds. We included birth year, sex, genotyping batch, and the first 20 genetic principal components as covariates, and additionally Townsend Deprivation Index in a second model. We corrected for multiple testing using Benjamini-Hochberg false discovery rate correction. In ALSPAC (target sample), we regressed binarised childhood maltreatment (prospectively reported) on the PGS for GWASretrospective (base sample) at four age groups (0–17, 0–4·9, 5–10·9, 11–17 years), with age (in months), sex, and the first ten genetic principal components as covariates (maximum N=7453). We used only ten principal components as ALSPAC is a geographically, and consequently genetically, more homogeneous cohort than the UK Biobank, and previous research has found little evidence for population stratification in ALSPAC.57 We restricted the PGS analyses in ALSPAC to PGS at a p value threshold of 1 as scores calculated at this threshold explained the highest variance in the hold-out sample from the UK Biobank. We corrected for the four different timepoints using Benjamini-Hochberg false discovery rate correction.
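To make the idea of a PGS at a given p value threshold concrete, the sketch below computes a score as the sum of effect-allele dosages weighted by GWAS effect sizes, restricted to variants whose GWAS p value does not exceed the threshold. This is a generic illustration that assumes variants have already been clumped; it is not PRSice-2's implementation, and the function name and all values are hypothetical.

```python
def polygenic_score(dosages, effect_sizes, p_values, p_threshold=1.0):
    """Weighted sum of allele dosages over variants passing the threshold.

    dosages: effect-allele counts (0-2) for one individual, per variant.
    effect_sizes: GWAS effect estimates for the same variants.
    p_values: GWAS p values for the same variants.
    """
    if not (len(dosages) == len(effect_sizes) == len(p_values)):
        raise ValueError("inputs must describe the same variants")
    return sum(
        beta * dosage
        for dosage, beta, p in zip(dosages, effect_sizes, p_values)
        if p <= p_threshold
    )


# Hypothetical data for one individual and three clumped variants.
dosages = [0, 1, 2]
effect_sizes = [0.10, -0.20, 0.05]
gwas_p = [1e-9, 0.03, 0.6]

score_all = polygenic_score(dosages, effect_sizes, gwas_p, p_threshold=1.0)
score_strict = polygenic_score(dosages, effect_sizes, gwas_p, p_threshold=0.05)
```

A threshold of 1 keeps every clumped variant, which is the setting that explained the most variance in the UK Biobank hold-out sample according to the text.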
We used three methods to delineate the contribution of different gene–environment correlation mechanisms to childhood maltreatment: (1) comparing between-sibling and between-family effects; (2) polygenic transmission disequilibrium tests in two autism cohorts; and (3) investigating the variance explained by PGS for retrospective childhood maltreatment in ALSPAC after accounting for well-known familial risk factors for childhood maltreatment. To quantify the variance explained by passive gene–environment correlation and by active and reactive gene–environment correlation combined, we simultaneously investigated between-sibling and between-family effects of PGS (base sample GWASchildhoodmaltreatment)58, 59 in a hold-out sample of 12 855 individuals from the UK Biobank (including 2849 sibling pairs, target sample; appendix pp 15–19), using a mixed-effects regression model with the following equation:

$$\mathrm{CM}_{ij} = \alpha_0 + \beta_{\mathrm{bsib}}\,(\mathrm{PGS}_{ij} - \overline{\mathrm{PGS}}_j) + \beta_{\mathrm{bfam}}\,\overline{\mathrm{PGS}}_j + \gamma_j + \epsilon_{ij}$$

where $\mathrm{CM}_{ij}$ is the childhood maltreatment score of individual $i$ in family $j$; $\beta_{\mathrm{bsib}}$ is the between-sibling effect of PGS (representing reactive and active gene–environment correlation combined); $\beta_{\mathrm{bfam}}$ is the between-family effect; $\overline{\mathrm{PGS}}_j$ is the family-mean PGS; $\gamma_j$ is the family-level random intercept; and $\epsilon_{ij}$ is the residual. Covariates included were age, sex, genotyping batch, and 20 genetic principal components. Passive gene–environment correlation is estimated from the difference between $\beta_{\mathrm{bsib}}$ and $\beta_{\mathrm{bfam}}$. The SE was calculated from 10 000 bootstraps. In the UK Biobank, the family mean is the sibling mean. Between-sibling PGS analysis assumes that a proportion of the familial environment is shared between siblings, but this assumption might not always hold. An example is siblings who are discordant for neurodevelopmental conditions such as autism or ADHD. Because of the different support needs of the siblings, parental response to the two siblings will be different. This difference in familial environment between siblings indexes reactive gene–environment correlation, and to an extent active gene–environment correlation.
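The between-sibling and between-family comparison rests on splitting each individual's PGS into a family mean and a deviation from that mean. A minimal sketch of this data-preparation step is shown below; the record layout and the `split_pgs` helper are hypothetical, and fitting the mixed-effects model itself is left to a statistics package.

```python
from collections import defaultdict
from statistics import mean


def split_pgs(records):
    """Split each PGS into a family mean and a within-family deviation.

    records: iterable of (family_id, individual_id, pgs) tuples.
    Returns one dict per individual with both components, which would
    then enter the regression as two separate predictors.
    """
    records = list(records)
    by_family = defaultdict(list)
    for family_id, _, pgs in records:
        by_family[family_id].append(pgs)
    family_means = {fam: mean(values) for fam, values in by_family.items()}
    return [
        {
            "family_id": family_id,
            "individual_id": individual_id,
            "pgs_family_mean": family_means[family_id],
            "pgs_within_family": pgs - family_means[family_id],
        }
        for family_id, individual_id, pgs in records
    ]


# Hypothetical standardised PGS values for two sibling pairs.
rows = split_pgs([
    ("fam1", "a", 1.0),
    ("fam1", "b", 3.0),
    ("fam2", "c", 0.0),
    ("fam2", "d", 2.0),
])
```

The within-family deviations sum to zero inside each family, so the coefficient on that term cannot be driven by factors shared by siblings, whereas the coefficient on the family mean can.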
To quantify this difference in gene–environment correlation, we did polygenic transmission disequilibrium tests60 in two cohorts: Simons Simplex Collection61 (n=2234 autistic individuals and n=1829 non-autistic siblings) and SPARK62 (n=2957 autistic individuals and n=1567 non-autistic siblings; appendix p 19) to investigate over-transmission of PGS for childhood maltreatment to autistic individuals versus non-autistic siblings.15, 60 We use the term autistic as identity-first language is preferred by many autistic individuals. Finally, we repeated the PGS analyses in ALSPAC, as outlined earlier, after including in separate models four parental risk factors5 of childhood maltreatment (ie, smoking, alcohol consumption, depression, and parental maltreatment). The risk factors were all measured prenatally to minimise the influence of a child's behaviour on parental phenotypes, which would reflect reactive gene–environment correlation (N=5988 to N=4508; see appendix p 70 for individual sample sizes). We corrected for multiple testing using Benjamini-Hochberg false discovery rate correction for each of the risk factors tested. Incomplete attenuation of the effects of PGS after accounting for known parental risk factors was taken as support for active and reactive gene–environment correlation. We adjusted for eight phenotypes: parental depression assessed in the first trimester of pregnancy; parental depressive symptoms assessed in the second trimester; maternal depressive symptoms assessed in the third trimester and paternal depressive symptoms assessed in the second trimester; parental alcohol consumption assessed in the second trimester; parental alcohol consumption assessed in the third trimester; parental smoking assessed in the second trimester; parental smoking assessed in the third trimester; and parental history of childhood maltreatment, assessed across all trimesters.
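As background to the polygenic transmission disequilibrium test mentioned at the start of this passage, the sketch below follows the commonly used formulation: each child's PGS is compared with the mid-parent PGS, the difference is scaled by the SD of the mid-parent PGS, and a one-sample t statistic tests whether the mean deviation differs from zero. The function name and the data are hypothetical, and the sketch assumes that parental PGSs are available for each child.

```python
import math
from statistics import mean, stdev


def ptdt(child_pgs, midparent_pgs):
    """Mean pTDT deviation and its one-sample t statistic.

    deviation_i = (child_i - midparent_i) / SD(midparent PGS)
    t = mean(deviation) / (SD(deviation) / sqrt(n))
    Requires at least two trios and non-zero variance in both the
    mid-parent PGS and the deviations.
    """
    if len(child_pgs) != len(midparent_pgs):
        raise ValueError("each child needs a mid-parent PGS")
    scale = stdev(midparent_pgs)
    deviations = [(c - m) / scale for c, m in zip(child_pgs, midparent_pgs)]
    n = len(deviations)
    mean_deviation = mean(deviations)
    t_statistic = mean_deviation / (stdev(deviations) / math.sqrt(n))
    return mean_deviation, t_statistic


# Hypothetical PGS values for three parent-child trios.
mean_dev, t_stat = ptdt([1.0, 2.0, 4.0], [0.5, 1.5, 2.5])
```

A positive mean deviation indicates that children carry, on average, a higher PGS than expected from their parents. Running the test separately in autistic individuals and their non-autistic siblings allows the two groups to be compared.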
We did two-sample, bidirectional Mendelian randomisation analyses63, 64 (appendix p 20) between childhood maltreatment and selected mental health outcomes (schizophrenia,65 major depressive disorder,66 bipolar disorder,67 ADHD,68 and autism69), physical health conditions (coronary artery disease70 and type 2 diabetes71), and C-reactive protein as a marker of inflammation,72 and corrected for multiple testing using Bonferroni correction. These phenotypes have all been associated with childhood maltreatment, and their GWASs do not include the UK Biobank, reducing bias in Mendelian randomisation estimates due to sample overlap.73 Bidirectional Mendelian randomisation was done using the following methods: inverse variance-weighted Mendelian randomisation, which assumes that all SNPs are valid instruments; weighted median Mendelian randomisation, which provides valid estimates even if up to 50% of the instruments are invalid;74 Mendelian randomisation-Egger, which accounts for pleiotropy by including an intercept term in the inverse variance-weighted model;75 and Mendelian randomisation-PRESSO, which accounts for pleiotropy by detecting and removing outliers.76 We additionally did leave-one-out analysis as a sensitivity check to investigate if the effects are driven by a subset of the variants. Scripts and summary GWAS statistics are available online. The funders of the study had no role in study design, data analysis, data interpretation, or writing of the report.
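The inverse variance-weighted estimator named in the Mendelian randomisation methods above can be illustrated with a short sketch. It uses the standard fixed-effect formula, in which SNP-outcome effects are regressed on SNP-exposure effects through the origin with weights equal to the inverse variance of the SNP-outcome effects. The function name and the summary statistics are hypothetical, and the sketch does not cover the weighted median, Egger, or PRESSO methods.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance-weighted causal estimate.

    beta = sum(bx * by / se_y**2) / sum(bx**2 / se_y**2)
    se   = sqrt(1 / sum(bx**2 / se_y**2))
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)):
        raise ValueError("inputs must describe the same instruments")
    numerator = sum(
        bx * by / se ** 2
        for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(
        bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)
    )
    return numerator / denominator, math.sqrt(1.0 / denominator)


def leave_one_out(beta_exposure, beta_outcome, se_outcome):
    """IVW estimate recomputed with each instrument dropped in turn."""
    estimates = []
    for skip in range(len(beta_exposure)):
        keep = [i for i in range(len(beta_exposure)) if i != skip]
        estimate, _ = ivw_estimate(
            [beta_exposure[i] for i in keep],
            [beta_outcome[i] for i in keep],
            [se_outcome[i] for i in keep],
        )
        estimates.append(estimate)
    return estimates


# Hypothetical summary statistics for three instruments.
bx = [0.10, 0.20, 0.15]
by = [0.05, 0.10, 0.09]
se = [0.01, 0.01, 0.02]
ivw_beta, ivw_se = ivw_estimate(bx, by, se)
loo = leave_one_out(bx, by, se)
```

If one leave-one-out estimate differs markedly from the others, the overall result depends heavily on that single variant, which is the pattern the sensitivity check is designed to detect.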