Background
Improving the quality of facility-based births is a critical strategy for reducing the high burden of maternal and neonatal mortality and morbidity across all settings. Accurate data on childbirth care are essential for monitoring progress. In northeastern Nigeria, we assessed the validity of childbirth care indicators in a rural primary health care context, as documented by health workers and reported by women at different recall periods.

Methods
We compared birth observations (gold standard) to: (i) facility exit interviews with observed women; (ii) household follow-up interviews 9-22 months after childbirth; and (iii) health worker documentation in the maternity register. We calculated sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC) to determine individual-level reporting accuracy, and the inflation factor (IF) to determine population-level validity.

Results
Twenty-five childbirth care indicators were assessed to validate health worker documentation and women's self-reports. During exit interviews, women's recall had high validity (AUC ≥ 0.70 and 0.75 < IF < 1.25) for nine of 20 indicators assessed; six additional indicators met either the AUC or the IF criterion. During follow-up interviews, women's recall had high validity for one of 15 indicators assessed (placing the newborn skin-to-skin); two additional indicators met the IF criterion only. Health worker documentation had high validity for four of 10 indicators assessed; three additional indicators met the AUC or IF criterion.

Conclusions
In addition to standard household surveys, monitoring of facility-based childbirth care should consider drawing from and linking multiple data sources, including routine health facility data and exit interviews with recently delivered women.
Study approvals were obtained from the London School of Hygiene & Tropical Medicine (reference 14091) and the Health Research Ethics Committees for Nigeria (reference NHREC/01/01/2007) and Gombe State (reference ADM/S/658/Vol. II/66).

Gombe State, northeastern Nigeria, has high maternal and newborn mortality, at 814 per 100 000 live births and 35 per 1000 live births, respectively; nationally, maternal mortality is also estimated at 814 per 100 000 live births, and neonatal mortality at 39 per 1000 live births [3,4,14,15]. Gombe is predominantly rural, and 44% of the population have some primary school education. Most women access maternity care through public facilities. Seventy-two percent of women reported at least one antenatal care visit during their last pregnancy, and 29% gave birth in a health facility [15]. In 2018, over 70% of facility deliveries took place in rural primary health facilities [32].

Twenty-five indicators were selected, focusing on the content of childbirth care (Table 1): skilled birth attendance and companionship during labor and delivery; care for the woman (maternal background characteristics, provider practices and respectful care, clinical care); and care for the newborn (immediate postnatal care and newborn outcomes). To select these indicators, we referred to the Ending Preventable Maternal Mortality and Every Newborn Action Plan strategy documents for priority indicators to monitor progress towards the Sustainable Development Goals targets [33,34]. We also sought to complement indicators collected in the Nigeria Demographic and Health Survey (DHS) as well as earlier studies validating childbirth care indicators [14,16-20].

Table 1. Childbirth care indicators and data recording methods compared with birth observations (gold standard) for validation analyses.
*Observed women were interviewed before discharge from the facility (exit interview) and at home nine to 22 months after childbirth (follow-up interview).
Health workers documented childbirth events in facility maternity registers.
†For validation analyses, the following indicators were converted into binary variables: maternal age at delivery (adolescent births); prior parity (prior parity, four or more births); and baby's birthweight (low birthweight, <2500 g).
‡In the facility maternity register, essential newborn care is a composite indicator for (i) immediate initiation of breastfeeding and (ii) baby kept warm.

In Gombe, maternity registers defined essential newborn care as the immediate initiation of breastfeeding and the baby being kept warm within 30 minutes of birth [35]. To determine whether the maternity register provided a sufficient approximation of globally defined indicators, we compared the maternity register's essential newborn care data to being kept warm and the initiation of breastfeeding within the first hour of birth [34].

As part of an initiative to improve care in Gombe State, data were collected between 2016 and 2018, including facility-based birth observations [36]. A summary of each data recording method is provided in Figure 1; detailed descriptions follow.

Figure 1. Data recording methods, data collection rounds, and the number of women observed and interviewed.
*Of the 1889 women observed, 115 (6%) did not participate in an exit interview: 11 (0.5%) were discharged with their newborn and refused to be interviewed; 104 (5.5%) were not interviewed (21 were referred to another facility; 61 stillbirths, including 1 maternal death; 22 newborn deaths).
†A total of 445 women were followed up at home in March 2018, 9-22 months after their observed childbirth: 147 women from deliveries in June 2016; 146 from deliveries in March 2017; and 152 from deliveries in August 2017.
Starting in June 2016, five rounds of birth observations took place in 10 primary health facilities. Rounds occurred roughly every six months, and each lasted three weeks. To select the facilities for birth observations, a state-wide random sample of 107 facilities was drawn in November 2015 from approximately 500 government-owned primary health facilities. The maternity registers were reviewed to determine the volume of births in the previous six months, and the 10 facilities with the highest number of births were selected for birth observations [37]. An average of 15.7 births (standard deviation (SD) = 12.0) occurred per month in these 10 facilities, compared with a state-level average of 4.3 births (SD = 6.3) per month in primary health facilities [38].

All women attending the facility for delivery were invited to participate, excluding women admitted for monitoring before the onset of labor. Women were given a description of the study and its procedures, including the right to withdraw participation at any time. A trained observer (a local midwife, not an employee of the assigned facility) stayed in the room to continuously document labor and delivery through the first hour after birth, using a structured checklist. Labor and delivery took place in the same room, and the mother and newborn were usually kept together until discharge from the facility. Two observers and one clinical supervisor were assigned per facility to work in shifts and cover all deliveries. Although observers were trained midwives, they had no legal right to intervene in clinical care during the observation period because they were not employed at the facilities where they conducted observations. At all times, the observer prioritized the safety of the mother and newborn over data collection, and protocols were established for seeking help in the event of any life-threatening event.
Priorities for the supervisor were: (i) to ensure that consenting procedures were carried out; (ii) to observe data collection and carry out inter-rater reliability checks; (iii) to assist with queries from facility employees, clients, or families; and (iv) to collect and check digital data at the end of each day. Before each round, observers underwent four days of practical training covering unobtrusive observation, safety and confidentiality protocols, and consistency of rating between observers. Observations were recorded on a Lenovo A3300 tablet using CSPro version 7.0 (United States Census Bureau and ICF Macro, Suitland, MD, USA). Each observed woman was assigned a unique observation number to facilitate linking her information across data sets.

Following the birth observation, regardless of newborn outcome, the observer extracted data about the woman from the maternity register. Data extraction took place on the same day as the observed birth, after the first hour of birth, and data were entered directly into the tablet.

Women were usually discharged within 24 hours of delivery. Each observed woman leaving the facility with a live newborn was invited to participate in an exit interview. The exit interview covered information recorded during the observation and was harmonized with questions asked in the DHS and the Multiple Indicator Cluster Survey (MICS). Each interview was conducted in Hausa by a member of the observation team assigned to the facility. Interview questions are available in Table S1 in the Online Supplementary Document.

In addition to recall during exit interviews, we sought to understand the validity of women's recall in the context of household surveys such as the DHS and MICS. For this purpose, we conducted household-level follow-up interviews with a subset of the observed women, asking them to recall childbirth events.
To represent a range of recall periods that may be encountered during a household survey, in March 2018 we selected approximately 150 women from each of the first three rounds of birth observations, which occurred in June 2016 (22-month recall), March 2017 (15-month recall), and August 2017 (9-month recall); this selection was a simple random sample drawn from a de-identified list of women observed in each round. Each interview was conducted in Hausa, and the women were asked the same questions as in the exit interview.

To estimate the sample size, a 50% prevalence in the birth observations (gold standard) was assumed for all indicators, as we expected variability in the frequency of indicators. Sensitivity was set at 60% ± 7% precision and specificity at 70% ± 7% precision. The Type I error was set at 0.05, assuming a normal approximation to the binomial distribution. Thus, a minimum sample size of 400 was required for observed women at exit interviews, at follow-up interviews, and in the maternity register.

To combine the data from the five rounds of data collection, we tested for marginal homogeneity using Yang's chi-square test for clustered binary matched-pair data, implemented in the clust.bin.pair package in R [39,40]. Of the 45 matched pairs analyzed (see Table 1), one indicator showed evidence of clustering across time when comparing birth observations with women's self-reports at exit and follow-up interviews: birth attendant washed hands with soap before examinations. Given the number of matched pairs analyzed, we considered this sufficient evidence that the data collection rounds could be combined. Validation analyses were performed using Stata 14.2 (Stata Corp, College Station, TX, USA) [41]. Using birth observations as the gold standard, we assessed each indicator's validity at the individual and population levels.
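The sample-size calculation described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: it assumes the standard normal-approximation formula for estimating sensitivity (or specificity) to a given absolute precision, with the total sample scaled by the expected prevalence of gold-standard positives (or negatives); the function name is hypothetical, and the published minimum of 400 may include additional rounding or an allowance for non-response.

```python
import math

def sample_size(p_hat, precision, prevalence, z=1.959964):
    """Total sample size needed to estimate a proportion (sensitivity or
    specificity) within +/- `precision`, via the normal approximation to
    the binomial; `prevalence` is the fraction of subjects contributing
    to the estimate (gold-standard positives for sensitivity,
    negatives for specificity)."""
    n_cases = (z ** 2) * p_hat * (1 - p_hat) / precision ** 2
    return math.ceil(n_cases / prevalence)

# Parameters from the text: 50% gold-standard prevalence,
# sensitivity 60% +/- 7%, specificity 70% +/- 7%, alpha = 0.05.
n_se = sample_size(0.60, 0.07, prevalence=0.50)      # driven by positives
n_sp = sample_size(0.70, 0.07, prevalence=0.50)      # driven by negatives
print(n_se, n_sp, max(n_se, n_sp))
```

Under these assumptions the binding requirement is the sensitivity estimate (roughly 380 women), consistent with the minimum of 400 reported in the text.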
To measure individual-level reporting accuracy, we constructed three two-by-two tables for each indicator, comparing the birth observation to each data recording method [16,18-20,23]. Missing and "don't know" responses were excluded from the two-by-two tables. We calculated percent agreement between the birth observation and each data recording method. For two-by-two tables with at least five observations per cell, we calculated the sensitivity (true positive rate) and specificity (true negative rate) for each indicator. We quantified the area under the receiver operating characteristic curve (AUC) and estimated 95% confidence intervals (CI) assuming a binomial distribution. AUC values range from 0 to 1, with 0.5 representing a random guess and 1 representing complete accuracy. An AUC value of 0.70 or higher was chosen as the cut-off criterion for high individual-level reporting accuracy [23].

To measure population-level validity, we calculated each indicator's inflation factor (IF), the ratio of the estimated population-based survey prevalence to the gold standard's prevalence. The IF reflects the degree to which an indicator would be over- or under-estimated in a population-based survey. To estimate the population-based survey prevalence, we used the following equation [42]: estimated population survey prevalence = (gold standard prevalence × sensitivity) + [(1 − gold standard prevalence) × (1 − specificity)]. An IF value between 0.75 and 1.25 was the chosen cut-off criterion for low population-level bias [23].
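The individual- and population-level metrics above can be computed directly from a two-by-two table, as in the following sketch. This is a minimal illustration in Python rather than the authors' Stata code, and the counts are hypothetical; note that for a single binary indicator the empirical ROC curve has only one operating point, so the trapezoidal AUC reduces to the average of sensitivity and specificity.

```python
def validity_metrics(tp, fp, fn, tn):
    """Validity metrics for one indicator, from a 2x2 table comparing a
    data source (self-report or register) to the gold-standard birth
    observation: tp/fn are gold-standard positives, fp/tn negatives."""
    sens = tp / (tp + fn)                         # true positive rate
    spec = tn / (tn + fp)                         # true negative rate
    auc = (sens + spec) / 2                       # trapezoidal AUC, one binary test
    p_gold = (tp + fn) / (tp + fp + fn + tn)      # gold-standard prevalence
    # Estimated population-based survey prevalence (equation in the text)
    p_survey = p_gold * sens + (1 - p_gold) * (1 - spec)
    inflation_factor = p_survey / p_gold
    return sens, spec, auc, inflation_factor

# Hypothetical counts: 80 true positives, 10 false positives,
# 20 false negatives, 90 true negatives.
sens, spec, auc, if_ = validity_metrics(80, 10, 20, 90)
# The study's joint criterion for high validity:
high_validity = auc >= 0.70 and 0.75 < if_ < 1.25
print(sens, spec, auc, if_, high_validity)
```

With these counts, sensitivity is 0.80, specificity 0.90, AUC 0.85, and the IF 0.90, so the indicator would meet both the individual-level and the population-level criteria.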