Background: The underlying causes of educational inequalities in diarrhoea among under-five children (U5C) in low- and middle-income countries (LMIC) are poorly explored, operationalized, studied and understood. This paper assesses the magnitude of education-related inequalities in the development of diarrhoea and decomposes the risk factors that contribute to these inequalities among U5C in LMIC.
Methods: Secondary data on 796,150 U5C from 63,378 neighbourhoods in 57 LMIC were pooled from the Demographic and Health Surveys (DHS) conducted between 2010 and 2019. The main determinant variable in this decomposition study was mothers’ literacy level. Descriptive and inferential statistics comprising bivariable analysis and binary logistic multivariable Fairlie decomposition techniques were employed at a 5% significance level.
Results: Of the 57 countries, 6 showed a statistically significant pro-illiterate odds ratio, 14 showed pro-literate inequality, and the remaining 37 had no statistically significant education-related inequality. The countries with pro-illiterate inequalities are Burundi (OR = 1.11; 95% CI: 1.01–1.21), Cameroon (OR = 1.84; 95% CI: 1.66–2.05), Egypt (OR = 1.26; 95% CI: 1.12–1.43), Ghana (OR = 1.24; 95% CI: 1.06–1.47), Nigeria (OR = 1.80; 95% CI: 1.68–1.93), and Togo (OR = 1.21; 95% CI: 1.06–1.38). Although the factors that contribute to pro-illiterate inequality vary across the 6 countries, the largest overall contributors are household wealth status, maternal age, neighbourhood SES, birth order, toilet type, birth interval and place of residence. The widest pro-illiterate risk difference (RD) was in Cameroon (118.44/1000), while the widest pro-literate RD was in Albania (−61.90/1000).
Conclusions: The study identified educational inequalities in the prevalence of diarrhoea among children, with wide variations in the magnitude of the inequalities and in the contributions of the risk factors to pro-illiterate inequalities. This suggests that diarrhoea prevention strategies are essential in the countries with pro-illiterate inequality and should be extended to educated mothers as well, especially in the countries with pro-literate inequality. Further studies are needed to examine the contributions of structural and compositional factors associated with pro-literate inequalities in the prevalence of diarrhoea among U5C in LMIC.
We pooled Demographic and Health Surveys (DHS) data from 57 LMIC, using the most recent DHS conducted in each country within the last ten years (2010–2019) that captured information on diarrhoea experience among U5C and was available as of April 2020. The DHSs are cross-sectional, nationally representative, population-based household surveys conducted periodically across the LMIC. The DHS uses a multi-stage stratified sampling design based on the states/divisions/regions, districts and clusters peculiar to each country. In each country, the households are the sampling units and are selected from the clusters, which are the primary sampling units (PSU) [19, 20]. The DHS computes sampling weights to account for unequal selection probabilities within each cluster that result from the unequal sample sizes of the clusters. Applying the sampling weights ensured that the survey findings fully represent the target populations. A similar set of protocols, standardized questionnaires, interviewer training, supervision and implementation was used in all the countries. The full details of the sampling methodologies are available at dhsprogram.com. Among other things, the DHS collects data on child health care, including common diseases, treatments and care, for all U5C of the sampled women. In all, we extracted the data of 796,150 U5C from 63,378 neighbourhoods in 57 LMIC across the globe. The outcome variable in this study is the recent experience of diarrhoea. Diarrhoea is defined as “passage of liquid stools three or more times a day” [21, 22] and “recent experience of diarrhoea” as having any of the symptoms of diarrhoea within two weeks before the date of the interview [23]. The mothers were asked whether any of their U5C had diarrhoea within the two weeks preceding the survey. The responses were binary: Yes or No. The main determinant variable in this decomposition study is mothers’ literacy level: illiterate or literate.
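To illustrate how sampling weights enter a prevalence estimate, the following minimal Python sketch computes a weighted proportion of children with recent diarrhoea. The function name, the toy records and the weight values are hypothetical and are not drawn from any DHS file; the sketch shows only the arithmetic of weighting, not the survey's full design-based estimation.

```python
def weighted_prevalence(outcomes, weights):
    """Weighted proportion of children with the outcome (1 = yes, 0 = no)."""
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Each child contributes in proportion to his or her sampling weight.
    return sum(y * w for y, w in zip(outcomes, weights)) / total_weight


# Hypothetical toy records (not DHS data): outcome flag and sampling weight.
had_diarrhoea = [1, 0, 0, 1, 0]
sampling_weight = [0.5, 1.5, 1.0, 2.0, 1.0]

prevalence = weighted_prevalence(had_diarrhoea, sampling_weight)
unweighted = sum(had_diarrhoea) / len(had_diarrhoea)
print(f"Weighted prevalence: {prevalence:.3f}; unweighted: {unweighted:.3f}")
```

In this toy example the weighted and unweighted estimates differ because the children with diarrhoea carry different weights from the others, which is the situation the weights are meant to correct for.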
We used mothers’ reported education as a proxy for literacy in this study. Literacy, a key skill and an important measure of a population’s level of education, is the ability to both read and write a short, simple statement about one’s own life [24]. We therefore categorized mothers as illiterate (no formal education) or literate (educated, that is, able to read and write, with at least completed primary education). The independent variables consist of individual-level and neighbourhood-level factors. The individual-level factors comprise the children’s, mothers’ and households’ characteristics. Children’s characteristics: sex (male versus female), age (< 12 months and 12–59 months), weight at birth (average+, small and very small), birth interval (firstborn, =36 months) and birth order (1, 2, 3 and 4+). Mothers’ characteristics: maternal age (15–24, 25–34, 35–49), marital status (never, currently and formerly married) and employment status (working or not working). Households’ characteristics: access to media (at least one of radio, television or newspaper), source of drinking water (improved or unimproved), toilet type (improved or unimproved), cooking fuel (clean fuel or biomass) and housing materials (improved or unimproved). The clusters are the PSUs in the DHS sampling technique. Typically, people in the same cluster share similar contextual factors [19, 20]. We used the word “neighbourhood” to describe the clustering of the children within the same geographical cluster and “neighbours” for the members of the same cluster. The PSUs were identified using the most recent census in the respective countries. In this study, we considered living in rural areas and neighbourhood socioeconomic status (SES) as community-level variables. We computed the neighbourhood SES using principal component analysis of the proportions of respondents within the same neighbourhood who are from poor households and who are not currently employed.
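A neighbourhood SES score of this kind can be sketched as the first principal component of the two neighbourhood-level proportions. The Python sketch below is an illustration under stated assumptions, not the study's procedure: the function names and toy proportions are hypothetical, the inputs are standardized first, and with only two standardized variables the leading eigenvector of the correlation matrix is known in closed form, (1, ±1)/√2, so no linear algebra library is needed. A higher score here means a more disadvantaged neighbourhood, which is also an assumption about orientation.

```python
import math


def standardize(values):
    """Return z-scores using the sample standard deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    if variance == 0:
        raise ValueError("a constant variable cannot be standardized")
    sd = math.sqrt(variance)
    return [(v - mean) / sd for v in values]


def first_component_scores(prop_poor, prop_not_employed):
    """First principal component scores for two standardized variables."""
    z_poor = standardize(prop_poor)
    z_unemp = standardize(prop_not_employed)
    n = len(z_poor)
    # Correlation between the two standardized proportions.
    r = sum(a * b for a, b in zip(z_poor, z_unemp)) / (n - 1)
    # For a 2x2 correlation matrix the leading eigenvector is (1, sign(r))/sqrt(2).
    sign = 1.0 if r >= 0 else -1.0
    return [(a + sign * b) / math.sqrt(2) for a, b in zip(z_poor, z_unemp)]


# Hypothetical proportions for four neighbourhoods (not study data).
prop_poor = [0.1, 0.4, 0.8, 0.6]
prop_not_employed = [0.2, 0.3, 0.7, 0.5]

ses_scores = first_component_scores(prop_poor, prop_not_employed)
for idx, score in enumerate(ses_scores):
    print(f"Neighbourhood {idx}: SES disadvantage score = {score:.2f}")
```

In practice such scores are often cut into tertiles or quintiles before entering a regression model; whether and how that was done here is not stated in the text.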
Descriptive and inferential statistics comprising bivariable analysis and binary logistic multivariable Fairlie decomposition techniques were used for this study. The Z-test for equality of the prevalence of diarrhoea among the children of illiterate and literate mothers within each country and region was conducted and is reported in Table 1, while the chi-square test of association between the explanatory variables and the outcome variable in the two groups of children is reported in Table 2. The risk difference (RD) in having diarrhoea was measured between U5C of illiterate mothers and those of literate mothers. An RD > 0 suggests that diarrhoea is more prevalent among children born to illiterate mothers (pro-illiterate inequality), whereas a negative RD (< 0) indicates that diarrhoea is more prevalent among children born to literate mothers (pro-literate inequality). A meta-analysis of the prevalence of diarrhoea among both groups of children in each of the countries was carried out. We estimated the fixed effects as the weighted country-specific RD and the random effect as the overall RD irrespective of a child’s country (Fig. 1). Charts were used to show the distributions of the RDs (Figs. 2 and 3). A test of heterogeneity was carried out to ascertain whether the 57 countries differed in the odds ratio (OR) of having diarrhoea among children of illiterate versus literate mothers. A test of homogeneity of ORs was also carried out among the 6 countries with a significant pro-illiterate OR of having diarrhoea, to determine whether the odds of having diarrhoea in those countries are homogeneous. Heterogeneity in a meta-analysis, measured by I², refers to the variation in study outcomes between locations or countries; a low I² indicates low variability among locations.
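The quantities described above can be illustrated with a short Python sketch: the RD per 1000 children, a two-sided Z-test for equality of two proportions, and I² derived from Cochran's Q with inverse-variance weights. All function names and numbers are hypothetical, the counts are unweighted for simplicity (the study's estimates were weighted), and the Q-based I² formula is the common textbook definition, assumed here rather than confirmed by the text.

```python
import math


def risk_difference_per_1000(cases_i, n_i, cases_l, n_l):
    """RD per 1000: prevalence among group I minus prevalence among group L."""
    return 1000 * (cases_i / n_i - cases_l / n_l)


def two_proportion_z_test(cases_i, n_i, cases_l, n_l):
    """Z statistic and two-sided p-value for equality of two proportions."""
    p_i, p_l = cases_i / n_i, cases_l / n_l
    pooled = (cases_i + cases_l) / (n_i + n_l)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_i + 1 / n_l))
    z = (p_i - p_l) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


def i_squared(effects, variances):
    """I-squared (%) from Cochran's Q with inverse-variance weights."""
    weights = [1 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical counts: 200/1000 children of illiterate mothers and
# 150/1000 children of literate mothers had diarrhoea.
rd = risk_difference_per_1000(200, 1000, 150, 1000)
z, p_value = two_proportion_z_test(200, 1000, 150, 1000)
label = "pro-illiterate" if rd > 0 else "pro-literate"
print(f"RD = {rd:.1f}/1000 ({label}), z = {z:.2f}, p = {p_value:.4f}")

# Hypothetical country-level RDs (as proportions) and their variances.
i2 = i_squared([0.05, 0.01, -0.02], [0.0004, 0.0004, 0.0004])
print(f"I-squared = {i2:.1f}%")
```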
Lastly, the adjusted binary logistic regression method was applied to the 6 pro-illiterate countries to carry out a Fairlie decomposition analysis (FDA) of the factors associated with the inequality, and the results are presented in Fig. 4.
Table 1: Description of Demographic and Health Surveys data by countries, educational and diarrhoea prevalence among under-five children in LMIC, 2010–2018. **Significant at 5% chi-square test; *significant at 5% test of equality of proportions between a and b.
Table 2: Summary of pooled sample characteristics of the studied children and prevalence of diarrhoea in 57 LMIC. *Significant at 5% chi-square test.
Fig. 1: Risk difference in the prevalence of diarrhoea between children from illiterate and literate mothers by countries.
Fig. 2: Risk difference between children born to illiterate and literate mothers in the prevalence of diarrhoea by countries.
Fig. 3: Scatter plot of the rate of diarrhoea and risk difference between children born to illiterate and literate mothers in LMIC.
Fig. 4: Contributions of differences in the distribution (‘compositional effect’) of the determinants of diarrhoea to the total gap between children from illiterate and literate mothers by countries.
Multivariable decomposition was used to quantify the contributions of risk factors to the differences in the prediction of an outcome of interest between two distinct groups in multivariate models [25]. The outputs from such regression models of group differences are partitioned into two components: (i) a component attributable to compositional differences between the two groups (endowments or explained differences) and (ii) a component attributable to differences in the effects of the characteristics (coefficients or unexplained differences) [25]. The Blinder-Oaxaca Decomposition Analysis (BODA) [26–28] for linear regression models is the best known of these methods, but it is not reliable for non-linear models such as binary logistic regression [25, 29].
Other methods include the multivariate decomposition [25] and the Fairlie methods [29–33]. The Fairlie decomposition method is an extension of the BODA developed specifically for non-linear regression models, including the logit and probit models. It was first developed in 1999 [34], updated in 2007 [30] with simplifications to address path dependence, and extended in 2017 to incorporate sample weights [29]. The Fairlie method has been reported to give more reliable estimates for non-linear regression models, especially the logit and probit models [29, 30, 33, 34], and was used in this study. The decomposition analysis was carried out by calculating the difference between the predicted probability for one group (say the group of children from illiterate mothers (Group I)) using the other group’s (say the group of children from literate mothers (Group L)) regression coefficients and the predicted probability for group I using its own regression coefficients [29]. The Fairlie decomposition technique is operationalized to constrain the predicted probability between 0 and 1. In linear regression, the standard BODA of the gap between the two groups in the average value of the dependent variable, Y, can be expressed as:

$$\bar{Y}^{I} - \bar{Y}^{L} = \left[\left(\bar{X}^{I} - \bar{X}^{L}\right)\hat{\beta}^{I}\right] + \left[\bar{X}^{L}\left(\hat{\beta}^{I} - \hat{\beta}^{L}\right)\right] \tag{1}$$

where $\bar{Y}^{J}$ is the average probability of the binary outcome variable within a particular group J (J = I, L), $\bar{X}^{J}$ is a row vector of the average values of the explanatory variables and $\hat{\beta}^{J}$ is a vector of coefficient estimates for group J. The numerical details have been reported [27, 35]. Fairlie et al. showed that the alternative to the decomposition in eq. (1) for a nonlinear equation Y = F(X), where F is the logistic cumulative distribution function, can be expressed as:

$$\bar{Y}^{I} - \bar{Y}^{L} = \left[\sum_{i=1}^{N^{I}} \frac{F\left(X_{i}^{I}\hat{\beta}^{I}\right)}{N^{I}} - \sum_{i=1}^{N^{L}} \frac{F\left(X_{i}^{L}\hat{\beta}^{I}\right)}{N^{L}}\right] + \left[\sum_{i=1}^{N^{L}} \frac{F\left(X_{i}^{L}\hat{\beta}^{I}\right)}{N^{L}} - \sum_{i=1}^{N^{L}} \frac{F\left(X_{i}^{L}\hat{\beta}^{L}\right)}{N^{L}}\right] \tag{2}$$

where $N^{J}$ is the sample size for group J [34]. Unlike in BODA, where $F(X_{i}\beta) = X_{i}\beta$, $\bar{Y}$ in eq. (2) is not necessarily the same as $F(\bar{X}\hat{\beta})$. In eqs. (1, 2), the 1st term is the part of the gap in the binary outcome variable that is due to group differences in the distributions of X, and the 2nd term is the part due to differences in the group processes determining the levels of Y. The 2nd term also captures the portion of the gap in the binary outcome variable that is due to group differences in unmeasurable or unobserved endowments. The complement of eq. (2) is also valid. The total contribution is estimated as the difference between the average values of the predicted probabilities. Using coefficient estimates from a logit regression model for a pooled sample, $\hat{\beta}^{*}$ (with intercept $\hat{\alpha}^{*}$), the independent contributions of $X_{1}$ and $X_{2}$ to the gap between the groups are expressed as

$$\frac{1}{N^{L}}\sum_{i=1}^{N^{L}}\left[F\left(\hat{\alpha}^{*} + X_{1i}^{I}\hat{\beta}_{1}^{*} + X_{2i}^{I}\hat{\beta}_{2}^{*}\right) - F\left(\hat{\alpha}^{*} + X_{1i}^{L}\hat{\beta}_{1}^{*} + X_{2i}^{I}\hat{\beta}_{2}^{*}\right)\right]$$

and

$$\frac{1}{N^{L}}\sum_{i=1}^{N^{L}}\left[F\left(\hat{\alpha}^{*} + X_{1i}^{L}\hat{\beta}_{1}^{*} + X_{2i}^{I}\hat{\beta}_{2}^{*}\right) - F\left(\hat{\alpha}^{*} + X_{1i}^{L}\hat{\beta}_{1}^{*} + X_{2i}^{L}\hat{\beta}_{2}^{*}\right)\right]$$

respectively. The contribution of each variable to the gap is thus equal to the change in the average predicted probability from replacing the group L distribution with the group I distribution of that variable while holding the distributions of the other variable(s) constant. To obtain an accurate decomposition estimate, Fairlie et al. recommended replicating the decomposition on a minimum of 1000 subsamples and taking the mean of the estimates from the separate decompositions [29]. Further numerical details have been reported [29, 31, 33, 34, 36]. We invoked the “Fairlie” Ado file in STATA 16 (StataCorp, College Station, Texas, United States of America) to carry out the decomposition analysis using the generalized structure of the model. We specified random ordering of the variables, sample weights and 10,000 replications of the decomposition to obtain results that fully reflect the study population. Model fit was assessed using the Wald chi-square statistic and the log-likelihood ratio test. The R statistical software was used to draw all the Figures. All statistical tests were performed at the 5% significance level. The results of this study are presented in Tables and Figures. All our estimates were weighted.
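The variable-by-variable switching described above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the Stata routine used in the study: the function names, the pooled coefficients and the toy covariate rows are all hypothetical, sample weights are omitted, and the pooled logit coefficients are supplied directly rather than estimated. In each replication the sketch draws equal-sized samples from both groups, matches children by rank of predicted probability, switches one variable at a time from its group I values to its group L values in a random order, and records the resulting change in the mean predicted probability.

```python
import math
import random


def predict(row, intercept, coefs):
    """Logistic predicted probability for one child's covariate row."""
    linear = intercept + sum(b * x for b, x in zip(coefs, row))
    return 1.0 / (1.0 + math.exp(-linear))


def mean_prediction(rows, intercept, coefs):
    """Average predicted probability over a set of covariate rows."""
    return sum(predict(r, intercept, coefs) for r in rows) / len(rows)


def fairlie_contributions(group_i, group_l, intercept, coefs,
                          replications=200, seed=0):
    """Average contribution of each covariate to the explained gap (I minus L)."""
    rng = random.Random(seed)
    n = min(len(group_i), len(group_l))  # matched sample size
    totals = [0.0] * len(coefs)

    def ranked_sample(rows):
        # Random subsample of size n, ordered by predicted probability
        # so that children in the two groups are matched by rank.
        sample = rng.sample(rows, n)
        return sorted(sample, key=lambda r: predict(r, intercept, coefs))

    for _ in range(replications):
        sample_i = ranked_sample(group_i)
        sample_l = ranked_sample(group_l)
        current = [list(r) for r in sample_i]  # start from group I values
        order = list(range(len(coefs)))
        rng.shuffle(order)  # random ordering addresses path dependence
        for k in order:
            before = mean_prediction(current, intercept, coefs)
            for row, row_l in zip(current, sample_l):
                row[k] = row_l[k]  # switch variable k to group L values
            after = mean_prediction(current, intercept, coefs)
            totals[k] += before - after
    return [t / replications for t in totals]


# Hypothetical covariates per child: [poor household, unimproved toilet].
group_i = [[1, 1], [1, 0], [1, 1], [0, 1]]  # children of illiterate mothers
group_l = [[0, 0], [1, 0], [0, 1], [0, 0]]  # children of literate mothers
intercept, coefs = -2.0, [0.8, 0.4]          # hypothetical pooled coefficients

contributions = fairlie_contributions(group_i, group_l, intercept, coefs)
explained_gap = (mean_prediction(group_i, intercept, coefs)
                 - mean_prediction(group_l, intercept, coefs))
print("Contributions:", [round(c, 4) for c in contributions])
print("Explained gap:", round(explained_gap, 4))
```

Because the switches are sequential, the contributions within each replication add up to the explained part of the gap for the matched samples; averaging over many replications with random subsamples and random orderings reduces the dependence of the result on any single match or ordering.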