Background: The African continent hosts many industrial mining projects, and many more are planned due to recent prospecting discoveries and increasing demand for various minerals to promote a low-carbon future. The extraction of natural resources in sub-Saharan Africa (SSA) represents an opportunity for economic development but also poses a threat to population health through rapid urbanisation and environmental degradation. Children could benefit from improved economic growth through various channels such as access to high-quality food, better sanitation, and clean water. However, mining can increase food insecurity and trigger local competition over safe drinking water. Child health can also be threatened by exposure to mining-related air, noise, and water pollution. To assess the impact of mines on child health, we analyse socio-demographic, health, and mining data before and after several mining projects were commissioned in SSA.
Results: Data on 90,951 children living around 81 mining sites in 23 countries in SSA were analysed for child mortality indicators, and data on 79,962 children from 59 mining areas in 18 SSA countries were analysed for diarrhoea, cough, and anthropometric indicators. No effects of the launch of new mining projects on overall under-five mortality were found (adjusted Odds Ratio (aOR): 0.88; 95% Confidence Interval (CI): 0.68–1.14). However, activation of mining projects reduced the mortality risk among neonates (0–30 days) by 45% (aOR: 0.55; 95% CI: 0.37–0.83) and the risk for a child to develop diarrhoeal diseases by 32% (aOR: 0.68; 95% CI: 0.51–0.90). The timing analysis of observed changes showed a significant decline in the risk of childhood diarrhoea (aOR: 0.69; 95% CI: 0.49–0.97) and in mean height-for-age z-scores (by 28 percentage points) during the prospection and construction phase, i.e., within the four years preceding the initiation of extraction activity. No effects were found for cough and weight-for-height.
Conclusion: The results presented suggest that the impacts of mining on child health vary throughout the mine’s life cycle. Mining development likely contributes positively to the income and livelihoods of the impacted communities in the initial years of mining operations, particularly the prospection and construction phase; these potential benefits are likely to be at least partially offset by food insecurity and environmental pollution during early and later mining stages, respectively. Further research is warranted to better understand these health impacts and to identify policies that can help sustain the positive initial health impacts of mining projects in the long term.
This study was conducted by combining two different georeferenced data sources: (i) socio-demographic and health data from the Demographic and Health Surveys (DHS) and (ii) mining data from the Standard & Poor’s Global Market Intelligence (S&P GMI) Mining Database [3]. Both data sets were restricted to SSA. The DHS program conducts nationally and regionally representative household surveys in over 70 low- and middle-income countries. The DHS surveys follow a two-stage cluster random sampling strategy, randomly selecting households within randomly selected enumeration areas. In most countries, DHS surveys are conducted every 4–6 years. The survey datasets are available on request on the website of the DHS program (www.dhsprogram.com). For this study, we use data from all DHS standard surveys from SSA for which geographic coordinates were available as of March 2020 (see Fig. 1, panel A). All household and child datasets were combined with the corresponding geographic data to merge with the mining data. Of note, the DHS program introduces random noise to the cluster coordinates to ensure the privacy of the respondents: in urban settings, cluster coordinates are shifted by up to 2 kilometres (km), and in rural areas, clusters are typically displaced by up to 5 km.
Fig. 1: Spatial distribution of mines (panel A) and visualisation of selected DHS clusters (panel B)
The proprietary mining dataset was accessed through a subscription to the S&P Global Market Intelligence platform (www.spglobal.com) [3]. The mining data comprises four primary indicators: geographic point location (Global Positioning System, GPS) coordinates, extracted commodities, and historic mining activities between 1980 and 2019 (e.g., mine opening and closure years). We set the year of mine activation (i.e., initiation of exploration and evaluation activities) at 10 years before the reported extraction onset, i.e., the earliest year of the operation phase with reported extraction or production.
We did this to include the prospection and construction phase of the project. We created a sub-sample of mines that opened within the period during which DHS data were available (i.e., 1986–2019). Finally, mines located closer than 20 km to another mine were excluded to avoid overlapping impact areas (see Fig. 1, panel B). Panel A of Fig. 1 shows the 81 mines analysed by primary commodity extracted (coal (N = 5), diamonds (N = 7), metals (N = 59), and other mines (N = 10)). The GPS coordinates of each DHS survey cluster and the mine point locations were used to match all surveyed households and children to one or several mines. DHS clusters within 50 km of each mine were selected. Based on previous studies showing that impacts are concentrated within 10 km of a mining project, we set the treatment group within this distance range [4, 6–8, 14, 17, 30, 31]. Hence, clusters within 10 km of the mine were classified as “impacted clusters” (or treated), while clusters at 10–50 km distance were classified as “comparison clusters” (or controls). To assess the impact of mine opening events on child health outcomes, we restricted our analysis to mines with DHS records before and after the mine opening year. Figure 1 exemplifies the selection of data around mining projects in Sierra Leone. Figure 2 summarises the overall data set construction process. Data merging was done using ArcGIS Pro (Version 2.2.4, Environmental Systems Research Institute, Redlands, CA, USA).
Fig. 2: Dataset merging strategy. Note: children can occur in multiple comparisons. See a description of the matching process below.
This is a quasi-experimental difference-in-difference (DiD) study comparing child health outcomes in areas directly surrounding mines to more distant locations from the same regions before and after mine activation [32, 33].
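The distance-based assignment of DHS clusters to impacted and comparison groups can be illustrated with a minimal sketch. This is not the ArcGIS Pro workflow used for the actual data merging; the function names `haversine_km` and `classify_cluster`, the spherical-Earth approximation, and the treatment of the exact 10 km boundary as "impacted" are assumptions made for illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def classify_cluster(distance_km, treat_km=10.0, max_km=50.0):
    """Assign a cluster to a group based on its distance to a mine.

    Hypothetical helper: clusters up to `treat_km` are "impacted",
    clusters beyond that and up to `max_km` are "comparison",
    and anything farther away is not matched to the mine.
    """
    if distance_km <= treat_km:
        return "impacted"
    if distance_km <= max_km:
        return "comparison"
    return "excluded"


# Usage with made-up coordinates (not taken from the study data).
mine = (8.50, -11.80)
cluster = (8.55, -11.75)
group = classify_cluster(haversine_km(*cluster, *mine))
```

Because DHS coordinates are randomly displaced, any such distance is itself approximate, so clusters near the 10 km threshold may be assigned to the wrong group.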
The primary parameter of interest is the interaction term between the DHS cluster’s proximity to a mine and the post-activation indicator, i.e., an indicator for observations made after the mine was activated. The interaction term estimates the additional change (improvement) in health outcomes seen in areas close to the mines relative to other areas nearby but outside the direct influence of the mines. The resulting estimates can be given a causal interpretation as long as the common trend assumption holds, i.e., as long as the treatment areas (within 10 km) and control areas (10–50 km from mines) would have experienced the same changes in health outcomes in the absence of the mining project.
In the present study, we focused on three primary groups of child health outcomes. Firstly, we analysed child mortality indicators. All DHS surveys record all children born to the mothers in the last five years and the time point of any child death. Based on the information on age at death included in the DHS data, we computed a dummy variable indicating age-specific survival status (i.e., died or alive) for neonates (0–30 days), post-neonates (1–11 months), and children (12–60 months). While we kept the original DHS definition for under-five and child mortality [34], we defined neonatal and post-neonatal mortality as death before reaching the age of 1 and 12 months, respectively. To calculate post-neonatal and child mortality rates, we only included children who had survived the first month or the first year, respectively. Missing data on children’s age at death were imputed using a hot deck approach, taking the age at death of the last child encountered in the data file with the same birth order [35]. Secondly, we analysed child morbidity indicators. The DHS datasets include morbidity data for all children under five years of age who were alive at the time of the survey. We used information on whether a child experienced diarrhoeal or cough episodes in the two weeks before the survey date.
Of note, “don’t know” responses were recoded into “missing values”. Thirdly, we analysed child anthropometric data. To compute the z-scores of height-for-age, weight-for-height, and weight-for-age, DHS surveys collect data on height (in centimetres) and weight (in kilograms) for all living children under five years of age in the household, as well as the age of the child in months. Height-for-Age (HAZ), Weight-for-Age (WAZ), and Weight-for-Height (WHZ) z-scores were then calculated using standardised reference growth curves [35].
The primary exposure variable in our analyses was the interaction of the distance to the mine (impacted and comparison clusters) and the mine’s activity status in the year of childbirth (for child mortality) or the year of the DHS survey (for morbidity and anthropometric indicators). Two variable definitions were used to determine the mine’s activity status. For the primary analyses, mine activation (including the planning, exploration, prospection, and construction activities) was assumed to occur 10 years before the launch year (year zero) of mineral extraction (hereafter referred to as “extraction onset”). Therefore, children born or surveyed less than 10 years before the extraction onset or later were considered exposed to an active mine, while children born or surveyed earlier were used as the reference group. For the secondary analysis, the active mining period was further divided into four phases corresponding to 5-year intervals. These phases were defined relative to the year of extraction onset: (i) the planning phase, 9 to 5 years before the extraction onset; (ii) the prospection and construction phase, 4 years before the extraction onset up to the extraction onset year; (iii) the early extraction phase, 1 to 5 years after the extraction onset; and (iv) the advanced extraction phase, more than 5 years after the extraction onset. The last phase was summarised in one category due to the low sample size.
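The phase definitions above can be expressed as a small classification rule. The sketch below is illustrative only: the function name `mine_phase`, the label strings, and the use of whole calendar years are assumptions, not part of the original analysis code.

```python
def mine_phase(event_year, extraction_onset_year):
    """Classify a birth or survey year relative to the extraction onset.

    `event_year` is the year of childbirth (mortality outcomes) or the
    DHS survey year (morbidity and anthropometric outcomes).
    Hypothetical helper; labels are chosen for illustration.
    """
    rel = event_year - extraction_onset_year  # years since extraction onset
    if rel <= -10:
        return "reference"                 # 10 or more years before onset
    if rel <= -5:
        return "planning"                  # 9 to 5 years before onset
    if rel <= 0:
        return "prospection_construction"  # 4 years before onset to onset year
    if rel <= 5:
        return "early_extraction"          # 1 to 5 years after onset
    return "advanced_extraction"           # more than 5 years after onset


def is_active(event_year, extraction_onset_year):
    """Dichotomous activity status used in the primary analysis."""
    return mine_phase(event_year, extraction_onset_year) != "reference"
```

For a mine with extraction onset in 2010, for example, a child surveyed in 2003 would fall into the planning phase and count as exposed to an active mine under the dichotomous definition.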
As for the dichotomous temporal categories, the time before mining activation (i.e., 10 years or more before the extraction onset) was used as the reference group.
Several covariates were included in the analysis to adjust for child, maternal, and household characteristics. Child-level covariates included sex, age in completed months, twin birth, and the child’s birth order. Child age and birth order were recoded into five and six categories, respectively. At the maternal level, the included covariates were the highest education level, maternal age in five-year groups, and the total number of children born to the woman. We merged the “higher education” and “secondary education” responses and dichotomised the number of children at a cut-off value of five and above. Lastly, the household characteristics included were wealth index quintile and household location (i.e., rural vs urban). Beyond covariates, we included mine fixed effects in all models to account for spatial variability (i.e., mine location) and year fixed effects to account for temporal variability (i.e., year of the survey and year of childbirth).
The descriptive statistics for child health outcomes and covariates were stratified by both mine activation status and the distance between the DHS cluster and the mine. Logistic maximum likelihood models were estimated for binary outcome variables (i.e., mortality, diarrhoeal, and cough episodes), and ordinary least-squares linear regression models for continuous outcome variables (i.e., anthropometric z-scores). The regressions control for child-, maternal-, and household-level factors. In addition, mine and year (childbirth year for mortality outcomes and survey year for morbidity and anthropometric outcomes) are included as fixed effects.
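A compact way to write the specification for binary outcomes described above is sketched below. The notation is introduced here for illustration and is an assumption: $Y_{icmt}$ is the outcome of child $i$ in cluster $c$ matched to mine $m$ in year $t$, $\mathrm{Near}_{cm}$ indicates a cluster within 10 km of the mine, $\mathrm{Active}_{mt}$ indicates an active mine in the relevant year, $X_{icmt}$ collects the child, maternal, and household covariates, and $\mu_m$ and $\tau_t$ are mine and year fixed effects.

```latex
\[
\operatorname{logit}\,\Pr\left(Y_{icmt}=1\right)
  = \beta_0
  + \beta_1\,\mathrm{Near}_{cm}
  + \beta_2\,\mathrm{Active}_{mt}
  + \beta_3\left(\mathrm{Near}_{cm}\times\mathrm{Active}_{mt}\right)
  + X_{icmt}^{\top}\gamma
  + \mu_m + \tau_t
\]
```

Under this notation, $\exp(\beta_3)$ corresponds to the adjusted odds ratio of the interaction term. For continuous outcomes such as z-scores, the same right-hand side would enter a linear model with an additive error term, and $\beta_3$ would be read directly as the DiD estimate.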
We assume that there are similar trends in the outcome variables across years in the absence of a causal effect induced by mine activation [6, 7, 14, 33] and that the location of the mine projects and their activity status are not systematically correlated with other factors affecting our main outcome variables [33]. We examined this assumption by plotting child health outcomes, stratified by the DHS cluster’s proximity to the mine and mine activity status, against the mine life stages.
In the main analysis, we investigated the child health impact of mine activation using the interaction between the cluster’s distance to the mine and the dichotomous activity status of the mine (i.e., active vs non-active) in the year of childbirth for the mortality analysis and in the DHS survey year for the morbidity and anthropometric analyses. This approach allowed us to compare the change in the prevalence of child health outcomes between the treatment group (interaction term takes the value one) and the control group (interaction term takes the value zero). For the secondary analysis, an alternative specification was used to investigate the child health impact throughout the mine life stages (time-varying effects of mine exposure). For this purpose, the interaction term between the cluster’s distance to the mine and the four-phased activity status of the mine (planning, prospection and construction, early extraction, and advanced extraction phases) was used. In this approach, the prevalence of child health outcomes in each treatment group (interaction term takes values between 1 and 4) is compared against a single control group (interaction term takes the value zero).
Given that mines may affect populations beyond the predefined 10 km boundary, we explored alternative exposure definitions in our sensitivity analysis. Specifically, we excluded potentially affected areas from the analyses by introducing an increasingly large buffer (i.e., 10–15 km, 10–20 km, and 10–25 km) around our treatment areas. This should also reduce misclassification concerns related to the random noise of up to 5 km added to DHS cluster coordinates.
The regression models were estimated using the statistical software Stata version 14.2 (Stata Corporation, LLC, College Station, TX, USA). Statistics are reported as Odds Ratios (OR; logistic regression) and beta coefficients (linear regression) where applicable, with 95% Confidence Intervals (95% CI) based on standard errors clustered at the survey-cluster level. P-values lower than 0.05 were considered significant.
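The reported odds ratios and confidence intervals are obtained by transforming a logistic regression coefficient and its standard error. The sketch below shows that transformation. The function names are hypothetical, the inputs in the usage lines are made-up values rather than study results, and a normal approximation with z = 1.96 is assumed. The clustered standard error itself would come from the fitted model and is not computed here.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logit coefficient and its standard error to (OR, lower, upper).

    Assumes a Wald-type interval on the log-odds scale (normal approximation).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def percent_change_in_odds(odds_ratio):
    """Express an odds ratio as a percentage change in the odds."""
    return (odds_ratio - 1.0) * 100.0


# Usage with illustrative, made-up inputs (not estimates from this study).
or_, lower, upper = odds_ratio_with_ci(beta=-0.4, se=0.15)
significant = not (lower <= 1.0 <= upper)  # CI excluding 1 implies p < 0.05
```

On this scale, an odds ratio of 0.55 corresponds to a 45% reduction in the odds of the outcome, which is how the percentage reductions quoted alongside the aORs can be read.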