Background: Caesarean section is recommended in situations in which vaginal birth presents a greater likelihood of adverse maternal or perinatal outcomes than normal. However, it is associated with a higher risk of complications, especially when performed without a clear medical indication. Since labour attendants have no standardised clinical method to assist in this decision, statistical tools developed based on multiple labour variables may be an alternative. The objective of this paper was to develop and evaluate the accuracy of models for caesarean section prediction using maternal and foetal characteristics collected at admission and through labour. Method: This is a secondary analysis of the World Health Organization’s Better Outcomes in Labour Difficulty prospective cohort study in two sub-Saharan African countries. Data were collected from women admitted for labour and childbirth in 13 hospitals in Nigeria as well as Uganda between 2014 and 2015. We applied logistic regression to develop different models to predict caesarean section, based on the time when intrapartum assessment was made. To evaluate discriminatory capacity of the various models, we calculated: area under the curve, diagnostic accuracy, positive predictive value, negative predictive value, sensitivity and specificity. Results: A total of 8957 pregnant women with 12.67% of caesarean births were used for model development. The model based on labour admission characteristics showed an area under the curve of 78.70%, sensitivity of 63.20%, specificity of 78.68% and accuracy of 76.62%. On the other hand, the models that applied intrapartum assessments performed better, with an area under the curve of 93.66%, sensitivity of 80.12%, specificity of 89.26% and accuracy of 88.03%. Conclusion: It is possible to predict the likelihood of intrapartum caesarean section with high accuracy based on labour characteristics and events. 
However, the accuracy of this prediction is considerably higher when based on information obtained throughout the course of labour.
We conducted a secondary analysis of the database from the Better Outcomes in Labour Difficulty (BOLD) project, a World Health Organization multicentre study aimed at accelerating the reduction of intrapartum-related maternal, foetal and neonatal mortality and morbidity. The database was collected in the prospective cohort study performed as part of the BOLD project. A methods paper presents a detailed description of the project study protocol [26]. In summary, BOLD researchers collected data from women admitted for labour care in 9 hospitals in Nigeria and 4 hospitals in Uganda from 2014 to 2015. To be selected, a hospital had to have a minimum of 1000 births per year, be the major health care facility in its region, and not be a primary care unit. Intrapartum care was provided by skilled birth attendants, with stable access to caesarean section (CS), augmentation of labour, assisted vaginal delivery and good obstetric care practices [27, 28]. The BOLD cohort considered eligible women admitted for spontaneous or induced vaginal delivery, with a single foetus, during the first stage of labour (in either the latent or the active phase), with cervical dilatation less than 7 cm. The following women were excluded: those with a diagnosis of foetal death, cervical dilatation ≥7 cm, multiple gestation, gestational age less than 34 weeks, elective or pre-labour CS, an indication for emergency CS or laparotomy on admission, failed induction of labour or false labour, as well as unemancipated minors without a legal guardian and women who were not able to give consent. Trained nurses carried out the recruitment process. The main outcome of the present analysis was the occurrence of CS, and the predictors were the maternal characteristics at admission and the intrapartum variables evaluated in the first and second stages of labour. Because CS is an objectively measured outcome, the potential for detection bias in this multicentre study is reduced.
The BOLD project also recorded the dates and times of the interventions performed, as well as maternal and perinatal outcomes. The research team used a standardized collection form developed for the BOLD project and collected the records during childbirth according to the routine obstetric care protocols of the hospitals. They calculated the sample size based on the set of maternal and perinatal outcomes defined in the BOLD project and considered 20 possible predictors. Based on initial assumptions, a minimum required sample size of 7812 women was calculated. The BOLD research team took care to avoid potential biases throughout the development of the project, in steps such as choosing the study design, developing the data collection instrument and managing the data to ensure its quality [26]. To describe the demographic characteristics of the women in the study, we present the mean and standard deviation of quantitative variables and the percentage and absolute frequencies of qualitative variables. For the study of the main hypothesis, we developed logistic regression models in which the dependent variable was CS, and the independent variables were baseline (fixed) and intrapartum (dynamic) measurements of the pregnant women. The logistic regression equation based on one independent variable is given by

$$P(\text{CS}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1)}}$$

where $\beta_0$ is the equation intercept and $\beta_1$ is the coefficient related to the independent variable $X_1$ [29]. To calculate the probability of CS for a specific woman, when the independent variable is continuous, simply replace $X_1$ with the value observed for that woman. When the independent variable is categorical, $X_1$ represents the presence of a characteristic and must be replaced by 1 or 0 if the characteristic is present or absent, respectively.
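As a hedged illustration of how a logistic regression equation turns an intercept, coefficients and observed values into a probability, the following Python sketch may help. The function name `cs_probability` and the coefficient values are hypothetical and are not taken from the fitted models; the analyses themselves were run in R.

```python
import math


def cs_probability(intercept, coefficients, values):
    """Return the probability given by a logistic regression equation.

    intercept is beta_0; coefficients and values hold the beta_i and the
    corresponding X_i (1 or 0 for a categorical characteristic).
    """
    if len(coefficients) != len(values):
        raise ValueError("one value is required per coefficient")
    linear_predictor = intercept + sum(
        b * x for b, x in zip(coefficients, values)
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Hypothetical coefficients, chosen only for illustration.
beta_0 = -2.0
beta_1 = 0.8  # categorical predictor: 1 if present, 0 if absent

p_absent = cs_probability(beta_0, [beta_1], [0])
p_present = cs_probability(beta_0, [beta_1], [1])
```

With a positive coefficient, the predicted probability is higher when the characteristic is present than when it is absent.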
If the prediction model has more than one independent variable, the equation is analogous:

$$P(\text{CS}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k)}}$$

where $k$ is the number of independent variables.

We present three types of models, which differ from one another in the moment at which the intrapartum variables used were recorded: the labour admission model, the interval models and the maximum score model. To obtain the labour admission model, we first selected variables that presented p-values lower than 5% in the chi-square and Student t-tests in the complete sample (bivariate analysis). In the bivariate analysis, we considered the admission records of fixed and dynamic variables. We categorized the following quantitative predictors: maternal body mass index (BMI), systolic blood pressure (SBP), diastolic blood pressure (DBP), maternal heart rate (MHR), axillary temperature and number of uterine contractions in 10 min. For maternal BMI, we considered the categorization according to the gestational age [30]. The MHR and blood pressure were categorized according to the Modified Early Obstetric Warning Score [31]. We used the foetal heart rate (FHR) values to define the suspected foetal distress variable (i.e. values above 160 beats per minute or below 120 beats per minute were considered as suspected foetal distress [32]). We considered axillary temperature abnormal when it was below 35.5 °C or equal to or greater than 37.5 °C. Additional file 1: Table S1 and Additional file 1: Table S2, respectively, present more details about the fixed and dynamic variables considered in the bivariate analysis step. The final admission model (Model 1) included the independent variables selected using a stepwise selection method among the variables that were significant in bivariate analysis. Stepwise selection is a method that sequentially adds to the model the variable that most improves the fit [33]. To evaluate the model improvement at each step, we considered the Akaike information criterion [34].
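The foetal heart rate and axillary temperature cut-offs stated above can be written as simple rules. The Python sketch below is only an illustration of those two thresholds; the function names are hypothetical, and the categorizations of BMI, blood pressure and MHR are not reproduced because their cut-offs are given in the cited references rather than in this text.

```python
def suspected_foetal_distress(fhr_bpm):
    """True when the foetal heart rate is above 160 or below 120 beats per minute."""
    return fhr_bpm > 160 or fhr_bpm < 120


def abnormal_axillary_temperature(temp_celsius):
    """True when the temperature is below 35.5 C or equal to or greater than 37.5 C."""
    return temp_celsius < 35.5 or temp_celsius >= 37.5


# Illustrative values only, not records from the study database.
example_flags = (
    suspected_foetal_distress(172),
    abnormal_axillary_temperature(36.8),
)
```

Note that the boundary values are treated asymmetrically for temperature: 35.5 °C counts as normal, whereas 37.5 °C counts as abnormal.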
Interval models include the same predictors as the final admission model; however, for the dynamic variables we used the last value measured within each of three time intervals counted from 4 cm of dilatation: 0 to 2 h (Model 2), 2 to 4 h (Model 3) and 4 to 6 h (Model 4). In these models, the elapsed time between admission and the first record of dilatation equal to 4 cm was inserted as an additional independent variable. For the interval models, only women with a record of a 4 cm dilatation were included. To obtain the maximum score model (Model 5), we considered the same fixed variables as in the final admission model. Regarding dynamic variables, we used bivariate analysis to select records related to conditions presented throughout labour; Additional file 1: Table S3 presents more details about these variables. We also present an abridged version of each interval model and of the maximum score model, obtained using a stepwise selection method. Figure 1 presents a schematic flow diagram including a summary of the models studied. In summary, Model 1 returns a probability of CS at admission, Models 2 to 4 return the probability of CS at 2, 4 and 6 hours after the start of the active phase, and Model 5 returns the probability of CS throughout labour.

Figure 1. Schematic flow diagram of model building

For the model estimation, we used random samples containing 70% of the records that did not present missing data. This segment was called the training sample, while the remaining 30% (test sample) was used in the validation step. Dividing the dataset is important because it allows us to evaluate the discriminatory capacity of a model on a sample other than the one used to fit it. To evaluate the discriminatory capacity, we present the ROC curve and the area under the ROC curve (AUC). The ROC curve graphically represents the values of sensitivity and specificity for an entire range of prediction cut-off points.
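The split into training and test samples and the construction of a ROC curve can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code used in the study: records are assumed to be dictionaries with `None` marking a missing value, the seed and function names are hypothetical, and the AUC is computed with the trapezoidal rule.

```python
import random


def split_complete_cases(records, train_fraction=0.7, seed=0):
    """Drop records with missing values, then split at random into train and test."""
    complete = [r for r in records if all(v is not None for v in r.values())]
    shuffled = complete[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def roc_points(outcomes, probabilities):
    """Return (1 - specificity, sensitivity) pairs, one per prediction cut-off.

    Assumes outcomes are coded 1 (CS) or 0 (no CS) and both classes occur.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(probabilities), reverse=True):
        tp = sum(1 for y, p in zip(outcomes, probabilities) if p >= cutoff and y == 1)
        fp = sum(1 for y, p in zip(outcomes, probabilities) if p >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy outcomes and predicted probabilities, for illustration only.
toy_outcomes = [1, 0, 1, 0]
toy_probabilities = [0.9, 0.8, 0.3, 0.1]
toy_auc = auc(roc_points(toy_outcomes, toy_probabilities))
```

In practice the ROC points would be computed on the test sample, using probabilities predicted by a model fitted on the training sample.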
Sensitivity is the proportion of individuals for whom the model correctly indicated CS among all those who actually underwent CS. Specificity is the proportion of individuals for whom the model indicated no CS among all those who actually did not undergo CS. The higher the discriminatory capacity of a model, the higher its AUC, which can reach a maximum of 1 (or 100%). We also present the diagnostic accuracy (DA), positive predictive value (PPV) and negative predictive value (NPV) at the cut-off point that maximizes sensitivity and specificity. DA is the proportion of individuals for whom the model is correct among all individuals. PPV and NPV are the proportions of observations for which the model is correct among those for which the model indicates CS and no CS, respectively. To evaluate the adequacy of the fitted models to the data, we used the Hosmer-Lemeshow test, which evaluates the null hypothesis that the fitted logistic model is adequate for the data. The analyses were performed using R software version 3.5.1 [35].
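The measures defined above can be computed from a two-by-two classification at a given cut-off. The Python sketch below is an illustration only; it assumes that "maximizes sensitivity and specificity" means maximizing their sum, that a woman is classified as CS when her predicted probability is at or above the cut-off, and that both outcomes occur in the sample. The function names and toy data are hypothetical.

```python
def metrics_at(outcomes, probabilities, cutoff):
    """Sensitivity, specificity, DA, PPV and NPV at one prediction cut-off."""
    tp = fp = tn = fn = 0
    for y, p in zip(outcomes, probabilities):
        predicted_cs = p >= cutoff
        if predicted_cs and y == 1:
            tp += 1
        elif predicted_cs and y == 0:
            fp += 1
        elif not predicted_cs and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "da": (tp + tn) / (tp + fp + tn + fn),
        # PPV and NPV are undefined when no one is classified in that group.
        "ppv": tp / (tp + fp) if tp + fp else float("nan"),
        "npv": tn / (tn + fn) if tn + fn else float("nan"),
    }


def best_cutoff(outcomes, probabilities):
    """Cut-off with the largest sum of sensitivity and specificity."""
    def score(cutoff):
        m = metrics_at(outcomes, probabilities, cutoff)
        return m["sensitivity"] + m["specificity"]

    return max(sorted(set(probabilities)), key=score)


# Toy outcomes (1 = CS) and predicted probabilities, for illustration only.
toy_outcomes = [1, 1, 1, 0, 0, 0, 0, 0]
toy_probabilities = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]
toy_cutoff = best_cutoff(toy_outcomes, toy_probabilities)
toy_metrics = metrics_at(toy_outcomes, toy_probabilities, toy_cutoff)
```

When several cut-offs tie on the sum, this sketch keeps the lowest one; a different tie-breaking rule would be equally defensible.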