Background
Birth interval duration is an important and modifiable risk factor for adverse child and maternal health outcomes. Understanding the spatial distribution of short birth interval, an interbirth interval of less than 33 months, and its predictors is vital to prioritize and facilitate targeted interventions. However, the spatial variation of short birth interval and its underlying factors have not been investigated in Ethiopia.

Objective
This study aimed to assess the predictors of short birth interval hot spots in Ethiopia.

Methods
The study used data from the 2016 Ethiopia Demographic and Health Survey and included 8,448 women in the analysis. The spatial variation of short birth interval was first examined using hot spot analysis (Local Getis-Ord Gi* statistic). Ordinary least squares regression was used to identify factors explaining the geographic variation of short birth interval. Geographically weighted regression was used to explore the spatial variability of relationships between short birth interval and selected predictors.

Results
Statistically significant hot spots of short birth interval were found in Somali Region, Oromia Region, Southern Nations, Nationalities, and Peoples’ Region, and some parts of Afar Region. The proportions of women with no education or only primary education, with a husband who had higher education (above secondary education), and from households in the poorer or middle wealth quintile were predictors of the spatial variation of short birth interval. The predictive strength of these factors varied across the study area. The geographically weighted regression model explained about 64% of the variation in short birth interval occurrence.

Conclusion
Residing in a geographic area where a high proportion of women had either no education or only primary education, had a husband with higher education, or were from a household in the poorer or middle wealth quintile increased the risk of experiencing short birth interval. Our detailed maps of short birth interval hot spots and their predictors will assist decision makers in implementing precision public health.
The study was conducted in Ethiopia, which is located in the Horn of Africa (3°–15° N latitude and 33°–48° E longitude) [15]. The country occupies an area of 1.1 million square kilometres with an altitude that ranges from the highest peak at Ras Dashen (4,620 metres above sea level) down to the Dallol Depression, about 148 metres below sea level [32, 33]. Administratively, Ethiopia is divided into nine regions and two administrative cities [15].

This analysis was based on the 2016 Ethiopia Demographic and Health Survey (EDHS) data. The EDHS sample was derived using a stratified, two-stage cluster design where Enumeration Areas (EAs) were the sampling units for the first stage and households for the second stage. The detailed survey methodology is presented in the full EDHS report [15]. The current study included 8,448 women from 620 clusters who had reported at least two live births during the five years preceding the 2016 survey. Women who had never been married (n = 12) were not included in the study since women who have multiple births out of wedlock are unlikely to plan their births in the same way as married women. When women had more than two births in the five years preceding the survey, the interval between their two most recent births (i.e., the birth interval between the index child and the immediately preceding child) was uniformly considered for all study participants.

Global Positioning System (GPS) receivers were used to collect the location data (geographic coordinates) of each survey cluster. The GPS reading was made at the centre of each cluster. The GPS data collectors ensured the centre was relatively open, away from tall buildings, and out from under tree canopy in order to receive adequate satellite signal strength. To maintain respondents’ confidentiality, GPS latitude/longitude positions for all survey clusters were randomly displaced.
Urban clusters were displaced by a maximum of two kilometres (km) and 99% of rural clusters by a maximum of five km. The remaining 1% of the rural clusters were displaced a maximum of 10 km. The displacement was restricted to the country’s second administrative level (DHS survey region) so that the points stayed within the country [34]. In addition, the administrative polygons of Ethiopia, which were obtained from Natural Earth [35], were used to develop the map of hot and/or cold spots of short birth interval. The country’s administrative polygons reflect administrative boundaries, such as regions, zones, and districts of Ethiopia.

The outcome variable, short birth interval, was defined as an interval of less than 33 months between two successive live births [3]. Women’s birth interval data were collected by reviewing the date of birth of their biological children on the children’s birth/immunization certificates and/or by asking the women for their children’s dates of birth. Birth interval data were collected for all live-born children, irrespective of their survival status at the time of the interview. For children who had birth certificates, their mothers were asked to confirm the accuracy of the information before the date of birth was documented. This was done to avoid errors, because in some cases the date on the document may be the date when the birth was recorded and not the date when the child was born. When children did not have a birth certificate, information regarding their date of birth was obtained from their mothers. The length of the birth interval was then computed in months, and the data were made available for further analysis in this form. Further explanation about how birth interval data were collected can be found in the Demographic and Health Survey Interviewer’s Manual [36].
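The outcome definition above can be illustrated with a minimal sketch. It assumes birth dates are available at month resolution and that the interval is the difference in calendar months, which mirrors the century-month style of date arithmetic used in DHS recode files; the function names and the constant are hypothetical and not taken from the study.

```python
from datetime import date

# Threshold stated in the text: an interval of less than 33 months is "short".
SHORT_INTERVAL_MONTHS = 33


def months_between(preceding_birth: date, index_birth: date) -> int:
    """Whole calendar months between two births (day of month is ignored)."""
    return ((index_birth.year - preceding_birth.year) * 12
            + (index_birth.month - preceding_birth.month))


def is_short_birth_interval(preceding_birth: date, index_birth: date) -> bool:
    """True when the preceding birth interval is below the 33-month cut-off."""
    return months_between(preceding_birth, index_birth) < SHORT_INTERVAL_MONTHS
```

For example, births in January 2012 and September 2014 are 32 months apart and would be classified as a short interval, whereas births in January 2012 and October 2014 are 33 months apart and would not.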
The candidate explanatory variables included in the Exploratory Regression of the current study are presented online (see S1 Table). These were maternal age at first marriage, maternal age at birth of the preceding child, polygyny status, maternal education level, husband’s/partner’s education level, maternal occupation, husband’s/partner’s occupation, wealth quintile, sex of the preceding child, survival status of the preceding child, total number of children born before the index child, exposure to mass media, and perceived distance to the health facility. Variables were selected based on reviewed literature [2, 14, 20–28]. An Exploratory Regression tool, discussed below under the spatial regression analysis section, was used to identify properly specified Ordinary Least Squares (OLS) models.

Descriptive analyses were performed using Stata version 14 statistical software (StataCorp. Stata Statistical Software: Release 14. College Station, TX: StataCorp LP. 2015). The spatial analysis was performed using ArcGIS 10.3.1 (ESRI. ArcGIS Desktop: Release 10. Redlands, CA: Environmental Systems Research Institute. 2011). Before performing spatial analysis, the weighted proportions (using sample weights) of short birth interval and of the candidate explanatory variables (see S1 Table) were exported to ArcGIS. A detailed explanation of the weighting procedure can be found elsewhere [15]. Participant characteristics were described using frequencies and percentages. Pearson’s chi-squared tests were used to assess differences in short birth interval frequencies between place (urban/rural) and regions of residence.

The global Moran’s I statistic was computed to test for the presence of spatial autocorrelation. This statistic indicates whether the pattern of short birth interval in the study area is clustered, dispersed, or random.
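As an illustration of the two steps just described, the sketch below computes a sample-weighted cluster proportion and the global Moran's I index. It is not the ArcGIS implementation: it assumes projected (planar) coordinates, binary fixed-distance-band weights, and hypothetical function names, and it returns only the index, not the z-score or p-value.

```python
import math


def weighted_proportion(outcomes, sample_weights):
    """Sample-weighted proportion of a 0/1 outcome within one cluster."""
    total = sum(sample_weights)
    return sum(o * w for o, w in zip(outcomes, sample_weights)) / total


def distance_band_weights(coords, band):
    """Binary weights: 1 if two distinct clusters lie within `band`, else 0."""
    n = len(coords)
    return [[1.0 if i != j and math.dist(coords[i], coords[j]) <= band else 0.0
             for j in range(n)]
            for i in range(n)]


def morans_i(values, weights):
    """Global Moran's I: (n / S0) * sum_ij w_ij d_i d_j / sum_i d_i^2."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]           # deviations from the global mean
    s0 = sum(sum(row) for row in weights)      # sum of all spatial weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)
```

Under the null hypothesis of spatial randomness the expected value of the index is -1/(n - 1), so values well above it suggest clustering and values well below it suggest dispersion.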
When the z-score or p-value indicates statistical significance, a positive Moran’s I index value indicates a tendency toward clustering while a negative Moran’s I index value indicates a tendency toward dispersion. Based on this, a decision was made about whether to reject the null hypothesis that short birth intervals are randomly distributed across the study area [37]. The Getis-Ord General G statistic was used to measure the degree of clustering, which may be high or low. The higher (or lower) the z-score, the stronger the intensity of the clustering. A z-score near zero indicates no apparent clustering within the study area. A positive z-score indicates clustering of high values and a negative z-score indicates clustering of low values [38]. Subsequently, Incremental Spatial Autocorrelation was assessed to calculate an appropriate distance threshold for identifying spatial processes that promote clustering [39]. Hot spot analysis using local Getis-Ord Gi* statistics [40] was used to depict short birth interval variation in the study area. This statistic produces a hot and/or cold spot map using short birth interval rate as the input. It compares the local mean rate (the rates for a cluster and its nearest neighboring clusters) to the global mean rate (the rates for all clusters). A z-score and p-value are produced for each cluster, allowing assessment of the significance of differences between local and global means. A high positive z-score and a small p-value for a feature (cluster in this case) indicate a spatial clustering of high values (a hot spot). A low negative z-score and a small p-value indicate a spatial clustering of low values (a cold spot). A z-score near zero indicates no apparent spatial clustering [40–43]. 
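The comparison of local and global means described above can be sketched in a few lines. The code uses the commonly published z-score form of the Gi* statistic with binary distance-band weights; the band would in practice come from the Incremental Spatial Autocorrelation step, and the function name, the planar coordinates, and the zero-denominator handling are assumptions made for illustration.

```python
import math


def getis_ord_gi_star(values, coords, band):
    """Gi* z-score for each cluster, using binary distance-band weights.

    Each cluster counts as its own neighbour (the "star" in Gi*).
    """
    n = len(values)
    mean = sum(values) / n
    variance = max(sum(v * v for v in values) / n - mean ** 2, 0.0)
    s = math.sqrt(variance)                    # global standard deviation
    z_scores = []
    for i in range(n):
        w = [1.0 if math.dist(coords[i], coords[j]) <= band else 0.0
             for j in range(n)]
        sum_w = sum(w)
        sum_w2 = sum(wj * wj for wj in w)
        # Local weighted sum minus what the global mean would predict.
        numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt(max(n * sum_w2 - sum_w ** 2, 0.0) / (n - 1))
        # Assumption: report 0 when the statistic is undefined.
        z_scores.append(numerator / denominator if denominator > 0 else 0.0)
    return z_scores
```

With ten clusters on a line, high rates in the first three and a band that reaches only adjacent clusters, the scores are positive at the high-rate end and negative elsewhere, which is the hot spot and cold spot pattern the map displays.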
The Getis-Ord Gi* statistic is given as [42]:

$$G_i^{*} = \frac{\sum_{j=1}^{n} w_{i,j}\, x_j - \bar{X} \sum_{j=1}^{n} w_{i,j}}{S \sqrt{\dfrac{n \sum_{j=1}^{n} w_{i,j}^{2} - \left( \sum_{j=1}^{n} w_{i,j} \right)^{2}}{n - 1}}}$$

where $x_j$ is the attribute value for feature (cluster in the current study) $j$, $w_{i,j}$ is the spatial weight between features $i$ and $j$, $n$ is equal to the total number of features, and

$$\bar{X} = \frac{\sum_{j=1}^{n} x_j}{n}, \qquad S = \sqrt{\frac{\sum_{j=1}^{n} x_j^{2}}{n} - \bar{X}^{2}}$$

When estimating local Getis-Ord Gi* statistics, a False Discovery Rate (FDR) correction method was applied to account for multiple, dependent tests [44–46]. This helps to identify true clusters by estimating the number of false positives for a given confidence level and adjusting the critical p-value accordingly. Thus, statistically significant p-values are ranked from smallest (strongest) to largest (weakest), and based on the false positive estimate, the weakest are removed from the list [44, 46]. The importance of considering the FDR correction method in DHS data has been documented elsewhere [45]. The Ethiopian Polyconic Projected Coordinate System, based on the World Geodetic System 84 (WGS84) coordinate reference system (CRS), was used to produce a flattened map of the country.

After identifying short birth interval hot spots, spatial regression modeling was performed to identify predictors of the observed spatial patterns of short birth interval. Findings from ordinary least squares (OLS) regression are only reliable if the regression model satisfies all of the assumptions that are required by this method [47]. The coefficients of explanatory variables in a properly specified OLS model should be statistically significant and have either a positive or negative sign. In addition, there should not be redundancy among explanatory variables (that is, the model should be free from multicollinearity). The model should be unbiased, that is, free from heteroscedasticity and non-stationarity. The model should include key explanatory variables. The residuals should be normally distributed and free from spatial autocorrelation, revealing no spatial patterns [47–49].
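The FDR step described earlier in this section, ranking p-values and dropping the weakest, can be sketched with the Benjamini-Hochberg step-up procedure. This is one common FDR method and is used here as an assumption; the text does not state that it is the exact rule applied by the software, and the function name is hypothetical.

```python
def fdr_significant(p_values, alpha=0.05):
    """Flag p-values that remain significant after a Benjamini-Hochberg
    FDR correction. Returns one boolean per input p-value, in input order."""
    m = len(p_values)
    # Rank p-values from smallest (strongest) to largest (weakest).
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p-value meets its adjusted threshold.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    # Everything ranked at or below the cutoff is kept; weaker ones are dropped.
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            keep[idx] = True
    return keep
```

With ten cluster-level p-values of 0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212 and 0.216, five fall below an uncorrected 0.05 threshold, but only the first two remain significant after this correction.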
The OLS regression equation [50] is given as:

$$y_i = \beta_0 + \sum_{k=1}^{p} \beta_k x_{ik} + \varepsilon_i$$

where $i = 1, 2, \ldots, n$; $\beta_0, \beta_1, \beta_2, \ldots, \beta_p$ are the model parameters, $y_i$ is the outcome variable for observation $i$, $x_{ik}$ are explanatory variables, and $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are the error terms (residuals) with zero mean and homogeneous variance $\sigma^2$.

To identify a model that fulfills the assumptions of the OLS method, Exploratory Regression, a data-mining tool, was used. Similar to Stepwise Regression, Exploratory Regression identifies models with high adjusted R² values. Unlike Stepwise Regression, however, Exploratory Regression identifies models that meet all of the assumptions of the OLS method [47, 51, 52]. The model was validated using internal cross-validation. Cross-validation provides an idea of how well a model built in the training dataset predicts unknown values in a validation dataset. For a model that provides accurate predictions, the mean error should be close to 0, the root-mean-square error and average standard error should be as small as possible (this is useful when comparing models), and the root-mean-square standardized error (RMSSE) should be close to 1 [53]. The model in the current study fulfilled the above statistical requirements.

A variable that is a strong predictor in one cluster may not necessarily be a strong predictor in another cluster. This type of variation between clusters (non-stationarity) can be identified through the use of geographically weighted regression (GWR). In this context, GWR can help to answer the question: “Does the association vary across space?” Unlike OLS, which fits a single linear regression equation to all of the data in the study area, GWR creates an equation for each DHS cluster. While the equation in OLS is calibrated using data from all features (clusters in this case), GWR uses data from nearby features. Thus, the GWR coefficient takes different values for each cluster [54, 55]. Maps of the coefficients associated with each explanatory variable, which are produced using the GWR, provide guidelines for targeted interventions.
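The contrast between a single global equation and one equation per cluster can be made concrete with a minimal sketch. It assumes a single explanatory variable, planar coordinates, and a fixed Gaussian kernel with a user-chosen bandwidth; these choices and the function names are illustrative assumptions, not the settings used in the study.

```python
import math


def weighted_simple_regression(x, y, w):
    """Weighted least squares for y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def ols(x, y):
    """Global fit: every cluster contributes with equal weight."""
    return weighted_simple_regression(x, y, [1.0] * len(x))


def gwr(coords, x, y, bandwidth):
    """Local fits: one (b0, b1) pair per cluster, with nearby clusters
    weighted more heavily through a Gaussian distance-decay kernel."""
    fits = []
    for ci in coords:
        w = [math.exp(-0.5 * (math.dist(ci, cj) / bandwidth) ** 2)
             for cj in coords]
        fits.append(weighted_simple_regression(x, y, w))
    return fits
```

If the true slope is 1 in the western half of a study area and 3 in the eastern half, the global fit returns a single intermediate slope, whereas the local fits recover values close to 1 and 3 at the two ends. Mapping those local coefficients is what reveals non-stationarity.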
The GWR model [56] can be written as:

$$y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i$$

where $y_i$ are observations of the response $y$, $(u_i, v_i)$ are geographical points (longitude, latitude), $\beta_k(u_i, v_i)$ ($k = 0, 1, \ldots, p$) are unknown functions of geographic location $(u_i, v_i)$, $x_{ik}$ are explanatory variables at location $(u_i, v_i)$, $i = 1, 2, \ldots, n$, and $\varepsilon_i$ are the error terms (residuals) with zero mean and homogeneous variance $\sigma^2$. Fig 1 presents a summary of the model’s framework.

Fig 1. OLS = Ordinary Least Squares; GWR = Geographically Weighted Regression.

Ethical approval was obtained from the Human Research Ethics Committee (H-2018-0332), The University of Newcastle. The 2016 EDHS was approved by the National Research Ethics Review Committee of Ethiopia (NRERC) and ICF Macro International. Permission from The DHS Program was obtained to access the datasets.