Background Caesarean section has a significant role in reducing maternal and neonatal mortality. A linked analysis of population and health facility data is valuable to map and identify caesarean section use and associated factors. This study aimed to identify geographic variation and associated factors of caesarean delivery in Ethiopia. Method Linked data analysis of the 2016 Ethiopia Demographic and Health Survey (EDHS) and the 2014 Ethiopian Service Provision Assessment Plus (ESPA+) survey was performed. Spatial analysis was conducted to identify geographic variations and factors associated with caesarean delivery. Hierarchical Bayesian analysis was also performed to identify factors associated with caesarean delivery using the SAS MCMC procedure. Results Women’s age and education, household wealth, parity, antenatal care (ANC) visits, and distance to caesarean section facility were associated with caesarean delivery use. Women who had ≥4 ANC visits were 4.67 (95% Credible Interval (CrI): 2.17, 9.43) times more likely to have caesarean delivery compared to those who had no ANC visits. Women who had education and were from rich households were also 2.80 (95% CrI: 1.83, 4.19) and 1.80 (95% CrI: 1.08, 2.84) times more likely to have caesarean deliveries relative to women who had no education and were from poor households, respectively. A one-kilometer increase in distance to a caesarean section facility was associated with an 88% reduction in the odds of caesarean delivery (Adjusted Odds Ratio (AOR) = 0.12, 95% CrI: 0.01, 0.78). Hotspots of high caesarean section rates were observed in Addis Ababa, Dire Dawa, and the Harari region. In addition, women’s age at first childbirth and ≥4 ANC visits showed significant spatially varying relations between caesarean delivery use across Ethiopia. Conclusion Caesarean section is a lifesaving procedure, and it is essential to narrow disparities to reduce maternal and neonatal mortality and avoid unnecessary procedures.
This study used two data sets: the 2016 EDHS and the 2014 ESPA+ survey. The 2016 EDHS was used to obtain national population data. It contained population socio-demographic characteristics, health service use and morbidity/mortality data. The survey used a stratified multistage sampling technique. EDHS clusters, which were census Enumeration Areas in Ethiopia, were used as sampling frames. In addition to population information, the latitude and longitude coordinates of each EDHS clusters was recorded. Data were collected from 645 EDHS clusters [15]. In this analysis, we included 6,954 women who gave birth in the five years preceding the survey from 622 EDHS clusters. We excluded 239 women who gave birth during this period from 23 EDHS clusters since they had no geographic coordinates for data linkage. The 2014 ESPA+ survey provided national health facility data covering different aspects of the availability and readiness of health services. In addition, the data set contained the latitude and longitude coordinates of each health facility. The survey used a combination of census and simple random sampling techniques [16]. Due to their importance and limited numbers, the survey included all hospitals. Details of the sampling procedure are discussed elsewhere [12]. In this analysis, we included only hospitals proving caesarean sections. Thus, we included only 179 of the 214 hospitals that reported providing caesarean sections. This study used an Euclidean buffer link to link population and health facility data [12]. This is regarded as the method of choice for linking a census of health facilities with population data [11, 12, 17]. Details of this method are discussed elsewhere [12]. The administrative boundaries of Ethiopia, obtained from Natural Earth [18], were used. Using the ESPA+ survey, four health service environment variable scores were computed. These were the average distance to the nearest caesarean delivery facilities, comprehensive obstetric care availability score, readiness to provide comprehensive obstetric care score, and a general health facilities readiness score. Using principal component analysis, availability and readiness scores were computed for the nearest hospital providing caesarean delivery. The SCORE procedure in SAS was used to compute availability and readiness scores. The comprehensive obstetric care indices were created using the World Health Organization’s ‘Service Availability and Readiness Indicators’ [19, 20]. Details of computing these scores were discussed elsewhere [12]. Regarding general service readiness, six service readiness dimensions were used: communication equipment, external supervision, client opinion and feedback, quality assurance, emergency transport, and client latrine. The first two principal components (health facility management system and infrastructure) were used to compute two general service readiness scores [12]. For hospitals providing caesarean delivery, indices of comprehensive obstetric care availability and readiness scores were created. Two comprehensive obstetric care availability scores (basic and comprehensive components) were created using seven variables. Similarly, two comprehensive obstetric care readiness scores (equipment and supplies, and skilled personnel) were computed using nine dichotomous variables [12]. The outcome variable of this study was caesarean delivery use. A woman was considered to have used caesarean delivery if her most recent birth in the five years preceding the survey was via caesarean section. Sociodemographic and obstetric characteristics and health facility variables were considered independent variables for this study. Since this study is a multilevel analysis, all independent variables included are of three types: level 1, level 2 and level 3. Level 1 or individual-level variables include women’s age, education, and occupation, women’s autonomy in healthcare decision-making, husband’s education and occupation, household wealth, age at first childbirth, parity, antenatal care (ANC) visits, and nature of recent pregnancy (planned or unplanned). Level 2 or EDHS cluster-level variables were general service readiness, comprehensive obstetric care availability and service readiness, and average distance to the nearest caesarean delivery facility. One level 3 or region-level (number of caesarean sections providing hospitals per region) was considered in the analysis. Respondents’ educational status was grouped into two categories: had no education and had an education. Similarly, occupational status was grouped as having work and no work. Respondents who responded not working at the time of the interview or did not work in the last 12 months before the survey were grouped as having no work. The EDHS household wealth quintiles were also grouped into two categories: poor (including lowest, second and middle quintiles) and rich (including fourth and highest quintiles). ANC visits were grouped as no ANC visits, 1–3 ANC visits and ≥4 ANC visits. The EDHS survey employed a multistage cluster sampling technique: women were nested within EDHS clusters and then clusters were nested within regions. Considering this hierarchical nature of the data, a three-stage Bayesian hierarchical model was used. Women’s data within the respective EDHS clusters were also linked with the health facility data. In this model, the probability of caesarean delivery is allowed to vary both between EDHS clusters and between regions. To model the between-EDHS cluster and between-region variability, the logarithms of the odds ratios of caesarean delivery was assumed to have a normal distribution. The means of the normal distribution of log odds ratios across EDHS clusters and regions, therefore, represent the average effect in the EDHS clusters and regions, and the variances represent the variability among EDHS clusters and regions. Bayesian inference allows for the combination of existing knowledge with new information according to established rules of probability. In this study, a non-informative prior that would not have an influence on the posterior distribution was chosen [21, 22]. For each fixed effect, a non-informative prior that follows a normal distribution with large variance (σ2 = 1000) and a zero mean (μ = 0) was used. Similarly, for the random effects, non-informative priors that were set up to follow a normal distribution with zero mean and different variances were used. The variances for the random effects were set up to follow an inverse gamma distribution with a shape parameter of α = 0.01 and a scale parameter β = 0.01. The MCMC procedure (PROC MCMC) in SAS was used to estimate the Bayesian hierarchical generalized linear mixed models. The study used a simulation size of 175,000 iterations along with 25,000 burn-in iterations. One of every 25 samples was kept, which gave the Markov chain Monte Carlo (MCMC) sample size of 7,000 for computing the posterior quantities of interest. This analysis procedure enabled the identification of potential factors associated with the rate of caesarean delivery with an equal-tail 95% credible interval. The model building process was started with the unconditional model (a model containing no predictors) and more complex models were gradually built by checking improvements in model fit after each model is estimated [23]. A Deviance Information Criterion (DIC) was calculated using the posterior mean estimates of the parameters to select the best fitting model. The unconditional (empty) model was used to calculate the intra-class correlation coefficient (ICC), which estimates how much variation in the use of caesarean delivery exists between EDHS clusters (level-2 units) and between regions (level-3 units) using posterior samples. Three variance estimates were used to calculate EDHS cluster ICC and region ICC. The formula for calculating each of these ICCs is similar to the formula used for two-level ICC calculations. However, the denominator now has three elements (residual variance, cluster variance and region variance) which form the total variance in the model. The two equations for the EDHS cluster ICC (ICCC) and region ICC (ICCR) are below. The ICCC tells us how much of the total variation in caesarean delivery exists between EDHS clusters. On the other hand, ICCR indicates the total variation in caesarean delivery that exists between regions [24]. In hierarchical generalized linear mixed models, it is assumed that there is no level-1 error variance; however, to calculate the intra-class correlation coefficient, a slight modification is made. The level-1 residual variance (ε) follows a logistic distribution and is standardized with a mean of zero and variance = π23 [25]. Therefore, for three-level random intercept hierarchical generalized linear mixed model with an intercept variance of σC2 and σR2, the intra-class correlation coefficient (Rho) is given by [25]; Using posterior samples from the MCMC simulation, proportional change in variance (PCV), for EDHS clusters and regions, was also estimated to measure changes in area level variance between empty and individual level models and between successive models [26, 27]. This was computed using a mathematical equation: PCV=Vn−1−Vn−2Vn−1; where Vn−1 is the neighbourhood variance in the empty model and Vn−2 is the neighbourhood variance in the subsequent model. Based on the model fit statistics, allowing EDHS cluster-level variables to vary across regions and adding a region-level variable did not show improvement in model fit statistics. Therefore, a model with six individual-level variables women’s age, age at first childbirth, education, household wealth, parity, and the number of ANC visits, and one EDHS cluster level variable (average distance to the nearest caesarean delivery facility) was selected. In this model, individual-level variables were allowed to vary across EDHS clusters and regions. Furthermore, we did sensitivity analyses, especially on household wealth, ANC visits, women’s education and age, and multicollinearity checks. ArcGIS 10.6.1 and R version 4.2.0 were used to carry out the spatial analysis. The Ethiopian Polyconic Projected Coordinate System was used to flatten the Ethiopian map [12]. Hotspot analysis and spatial regression were performed to identify spatial clusters and factors associated with spatial variations of caesarean delivery use. The unit of spatial analyses were EDHS DHS clusters. Three procedures were followed to carry out the spatial analysis as discussed elsewhere [12]. First, the Global Moran’s I statistic, which is a global measure of spatial autocorrelation [28] was carried out. Second, the Incremental Spatial Autocorrelation was used to determine the critical distance at which there is maximum clustering [12]. The average distance (15 kilometres) at which a feature has at least one neighbour was calculated. This gave 155 kilometres at which clustering of caesarean delivery use peaked. Lastly, the Getis-Ord Gi* statistic was used to identify statistically significant spatial clusters of high caesarean delivery rates (hotspots) and low caesarean delivery rates (cold spots). A False Discovery Rate (FDR) correction method was applied to account for multiple and spatial dependence tests in Local Statistics of Spatial Association [12, 29]. Statistical significance was determined based on z-scores and p-values [12]. Furthermore, spatial regression was performed to identify key factors behind the observed spatial patterns of caesarean delivery use. Moran eigenvector spatially and non-spatially varying coefficient (M-SNVC) model was used to model spatially and non-spatially varying relations. Eigenvector spatial filtering regression model removes spatially autocorrelated residuals and improves model fit [30, 31]. M-SNVC model is also stable and quite robust against spurious correlations compared to the Moran eigenvector spatially varying coefficient model [32]. Since the spatial regression was done at the EDHS cluster level, the number of caesarean deliveries per EDHS cluster was first computed. Then we used a transformation-based generalized spatial regression using the spmoran package implemented in R [33]. Since the count data did not obey the Poisson distribution, the log-Gaussian approximation was applied first to roughly normalize the data, and then the SAL transformation to identify the most likely distribution [33]. Finally, the transformed result from the generalized spatial regression model was used in the M-SNVC model to model the spatial relationship between caesarean delivery use and explanatory variables. This study used secondary data sets: 2016 EDHS and 2014 ESPA+ collected previously where confidentiality information was maintained. Hence, as we did not collect data directly from participants, recruitment and contacting participants were not required. Ethical approval was obtained from The University of Newcastle, the Ethiopian Public Health Institute and the Measure DHS program to access the datasets.
N/A