Background Across low-income and middle-income countries (LMICs), one in ten deaths in children younger than 5 years is attributable to diarrhoea. The substantial between-country variation in both diarrhoea incidence and mortality is attributable to interventions that protect children, prevent infection, and treat disease. Identifying subnational regions with the highest burden and mapping associated risk factors can aid in reducing preventable childhood diarrhoea.
Methods We used Bayesian model-based geostatistics and a geolocated dataset comprising 15 072 746 children younger than 5 years from 466 surveys in 94 LMICs, in combination with findings of the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2017, to estimate posterior distributions of diarrhoea prevalence, incidence, and mortality from 2000 to 2017. From these data, we estimated the burden of diarrhoea at varying subnational levels (termed units) by spatially aggregating draws, and we investigated the drivers of subnational patterns by creating aggregated risk factor estimates.
Findings The greatest declines in diarrhoeal mortality were seen in south Asia, southeast Asia, and South America, where 54·0% (95% uncertainty interval [UI] 38·1-65·8), 17·4% (7·7-28·4), and 59·5% (34·2-86·9) of units, respectively, recorded decreases in deaths from diarrhoea greater than 10%. Although children in much of Africa remain at high risk of death due to diarrhoea, regions with the most deaths were outside Africa, with the highest mortality units located in Pakistan. Indonesia showed the greatest within-country geographical inequality; some regions had mortality rates nearly four times the average country rate. Reductions in mortality were correlated with improvements in water, sanitation, and hygiene (WASH) or reductions in child growth failure (CGF). Similarly, most high-risk areas had poor WASH, high CGF, or low oral rehydration therapy coverage.
Interpretation By co-analysing geospatial trends in diarrhoeal burden and its key risk factors, we could assess candidate drivers of subnational death reduction. Further, by doing a counterfactual analysis of the remaining disease burden using key risk factors, we identified potential intervention strategies for vulnerable populations. In view of the demands for limited resources in LMICs, accurately quantifying the burden of diarrhoea and its drivers is important for precision public health.
Diarrhoea episodes were defined as three or more loose stools over a 24-h period.4 Diarrhoea prevalence was defined as the point prevalence of children younger than 5 years with diarrhoea. Incidence was defined as the number of cases of diarrhoea in children younger than 5 years per child per year. Mortality was defined as the number of deaths among children younger than 5 years due to diarrhoea per child per year. Rates per 1000 are presented in the figures and represent prevalence, incidence, or mortality rates per child multiplied by 1000. Diarrhoea burden is used throughout this Article to refer to the combined burden of prevalence, incidence, and mortality. We included 94 LMICs in our analysis; these countries were defined according to the Socio-demographic Index (SDI), which assesses development based on education, fertility, and income.24 Where appropriate, we use designated ISO 3166-1 alpha-3 codes for countries. Our study complies with the Guidelines for Accurate and Transparent Health Estimates Reporting (GATHER) recommendations (appendix 1 pp 84–85).25
We compiled 466 household surveys (including the Demographic and Health Survey [DHS], Multiple Indicator Cluster Survey [MICS], and other country-specific surveys) from 2000 to 2017 with geocoded information from 207 021 coordinates corresponding to survey clusters and 17 954 subnational polygon boundaries. We included surveys that asked if children younger than 5 years had diarrhoea, typically within the preceding 2 weeks. Potential bias attributable to seasonal variation in diarrhoea was addressed, as described in appendix 1 (p 5). Data were vetted for representativeness at the national level and subnational level, as appropriate. Data inclusion, coverage, and validation are further described in appendix 1 (pp 3, 9).
We compiled 15 covariates that were indexed at the subnational level and could possibly be related to diarrhoea prevalence, including access to roads, ratio of child dependents (aged 0–14 years) to working-age adults (aged 15–64 years), distance from rivers or lakes, night-time lights (time-varying covariate), elevation, population ratio of women of maternal age to children, population (time-varying covariate), aridity (time-varying covariate), urban or rural (time-varying covariate), urban proportion of the location (time-varying covariate), irrigation, number of people whose daily vitamin A needs could be met, prevalence of under-5 stunting (time-varying covariate), prevalence of under-5 wasting (time-varying covariate), and diphtheria-tetanus-pertussis immunisation coverage (time-varying covariate). We also included the Healthcare Access and Quality Index,26 percentage of the population with access to improved toilet types, and percentage of the population with access to improved water sources (as defined by WHO and UNICEF’s Joint Monitoring Programme) as national-level time-varying covariates. We filtered these covariates for multicollinearity in each modelling region (appendix 1 pp 5–6) using variance inflation factor (VIF) analysis with a VIF threshold of 3.27 Covariate information, including plots of all covariates, is detailed in appendix 1 (pp 25–26, 90–96).
Prevalence data were used as inputs to a Bayesian model-based geostatistical framework. Briefly, this framework uses a spatially and temporally explicit hierarchical logistic regression model to predict prevalence.
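As an illustration of the multicollinearity filter described above, the following is a minimal Python sketch of one common stepwise variant of VIF filtering: compute the VIF of every covariate, drop the one with the largest VIF if it exceeds the threshold, and repeat. The exact procedure used in the study is given in appendix 1 and might differ. The function names `vif` and `filter_by_vif` are hypothetical, and the study itself was run in R, not Python.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (rows are observations).

    VIF_j = 1 / (1 - R2_j), where R2_j comes from regressing column j on all
    other columns plus an intercept. Assumes no column is constant.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other covariates
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out

def filter_by_vif(X, names, threshold=3.0):
    """Drop the highest-VIF covariate until every VIF is <= threshold.

    The threshold of 3 is the value quoted in the text; the stepwise
    dropping rule is an assumption made for this sketch.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return names
```

In this sketch the filter would be applied once per modelling region, passing the covariate values sampled at the data locations for that region.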
Potential interactions and non-linear relations between covariates and diarrhoea prevalence were incorporated using a stacked generalisation technique.28 Posterior distributions of all parameters and hyperparameters were estimated using R-INLA version 19.05.30.9000.29, 30 Uncertainty was calculated by taking 250 draws from the estimated posterior joint distribution of the model, and each uncertainty interval (UI) reported represents the 2·5th and 97·5th percentiles of those draws. Models were run independently in 14 geographically distinct modelling regions based on the GBD 2010 study,31 and in one country-specific model for India. Analyses were done using R version 3.5.0. Maps were produced using ArcGIS Desktop 10.6. Additional details are provided in appendix 1 (pp 6–8).
Estimated prevalence was converted into incidence using an average duration of a diarrhoea episode of 4·2 days4 (appendix 1 p 9). We converted incidence surfaces to mortality surfaces by multiplying the incidence values by country-specific and year-specific case-fatality rates (which did not vary subnationally). We calibrated our continuous estimates of prevalence, incidence, and mortality to the corresponding estimates from GBD 2017. However, we did not calibrate prevalence or incidence in South Africa because of unreasonably low estimates in this location in the GBD 2017 study. We then calculated population-weighted aggregations of the 250 draws of diarrhoea prevalence, mortality, and incidence estimates for each country, first administrative-level unit, and second administrative-level unit (hereafter referred to as unit). This calculation resulted in estimates for 24 143 units within 94 countries. Geographical inequalities were quantified as the relative difference between each unit and the respective country average.
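The chain of calculations described above (prevalence to incidence, incidence to mortality, population-weighted aggregation of draws, and percentile-based UIs) can be sketched in Python as follows. Several details are assumptions made for illustration and are not stated in this section: the steady-state relation incidence ≈ prevalence × (days per year ÷ episode duration), the use of 365·25 days per year, the mean of the draws as the point estimate, and all function names. The authoritative description is in appendix 1.

```python
import numpy as np

DAYS_PER_YEAR = 365.25   # assumption; the text does not state the year length
EPISODE_DAYS = 4.2       # average episode duration quoted in the text

def prevalence_to_incidence(prev, duration_days=EPISODE_DAYS):
    """Point prevalence -> episodes per child per year (assumed steady state)."""
    return np.asarray(prev, dtype=float) * DAYS_PER_YEAR / duration_days

def incidence_to_mortality(incidence, case_fatality_rate):
    """Incidence x country- and year-specific case-fatality rate."""
    return np.asarray(incidence, dtype=float) * case_fatality_rate

def aggregate_draws(rate_draws, population, unit_ids):
    """Population-weighted mean of pixel-level draws within each unit.

    rate_draws: array (n_pixels, n_draws); population, unit_ids: (n_pixels,).
    Returns a dict mapping unit id -> array of n_draws aggregated rates.
    """
    rate_draws = np.asarray(rate_draws, dtype=float)
    population = np.asarray(population, dtype=float)
    unit_ids = np.asarray(unit_ids)
    out = {}
    for u in np.unique(unit_ids).tolist():
        mask = unit_ids == u
        w = population[mask]
        out[u] = (rate_draws[mask] * w[:, None]).sum(axis=0) / w.sum()
    return out

def summarise(draws):
    """Mean and 95% UI (2.5th and 97.5th percentiles) of a set of draws."""
    lo, hi = np.percentile(draws, [2.5, 97.5])
    return float(np.mean(draws)), float(lo), float(hi)

def relative_difference(unit_rate, country_rate):
    """Relative difference between a unit and its country average."""
    return (unit_rate - country_rate) / country_rate
```

Aggregating draw by draw, rather than aggregating summary statistics, keeps the uncertainty of each unit consistent with the joint posterior, which is why the UI is computed only after aggregation in this sketch.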
We also estimated inequality using the Gini coefficient,32 which summarises the distribution of each indicator across the population, with a value of 0 representing perfect equality and 1 representing maximum inequality (appendix 1 p 12).
Following the GAPPD framework, we did a post-hoc counterfactual analysis using subnational estimates of risk factors according to GBD 2017, including reducing the prevalence of childhood stunting and childhood wasting (protect), increasing access to improved sanitation and improved water (prevent), and increasing ORS coverage (treat). Some known diarrhoea risk factors (eg, low coverage of rotavirus vaccine, or no or partial breastfeeding) were not included because subnational estimates are currently not available for all 94 LMICs included in this study. We used the counterfactual analysis to estimate the number of deaths averted because of changes in CGF and WASH risk factors (appendix 1 pp 61–62).
Models were validated using source-stratified five-fold cross validation. Holdout sets were created by combining randomised sets of second administrative unit cluster-level datapoints. Model performance was summarised by the bias (mean error), total variance (root-mean-square error), 95% data coverage within prediction intervals, and correlation between observed data and predictions. When possible, estimates were compared against existing estimates. All validation procedures and corresponding results are provided in appendix 1 (p 9).
The funder had no role in study design, data collection, data analysis, data interpretation, or writing of the report. RCR had full access to all data in the study and had final responsibility for the decision to submit for publication.
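For readers who want to reproduce the inequality and validation summaries named in this section, a minimal Python sketch follows. It assumes a population-weighted Gini coefficient computed from the Lorenz curve by the trapezoid rule, and a sign convention of prediction minus observation for the mean error; neither detail is specified here (see appendix 1 pp 9, 12), and the function names are hypothetical.

```python
import numpy as np

def weighted_gini(values, weights):
    """Population-weighted Gini coefficient via the Lorenz curve.

    values: indicator per unit (eg, mortality rate); weights: population.
    Assumes non-negative values that are not all zero.
    """
    v = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    order = np.argsort(v)
    v, w = v[order], w[order]
    # Cumulative population share and cumulative burden share, from (0, 0).
    cum_w = np.concatenate([[0.0], np.cumsum(w)]) / w.sum()
    cum_vw = np.concatenate([[0.0], np.cumsum(v * w)]) / (v * w).sum()
    # Trapezoid-rule area under the Lorenz curve.
    area = np.sum((cum_w[1:] - cum_w[:-1]) * (cum_vw[1:] + cum_vw[:-1]) / 2.0)
    return float(1.0 - 2.0 * area)

def validation_summary(observed, predicted, lower, upper):
    """Out-of-sample metrics for one holdout fold.

    lower and upper are the bounds of the 95% prediction interval.
    """
    obs, pred, lo, hi = (
        np.asarray(a, dtype=float) for a in (observed, predicted, lower, upper)
    )
    err = pred - obs  # sign convention is an assumption
    return {
        "bias": float(err.mean()),
        "rmse": float(np.sqrt((err ** 2).mean())),
        "coverage": float(((obs >= lo) & (obs <= hi)).mean()),
        "correlation": float(np.corrcoef(obs, pred)[0, 1]),
    }
```

With a finite number of equally weighted units, this discrete form of the Gini coefficient has a maximum of (n − 1)/n rather than exactly 1, so values close to 1 are reached only when n is large.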