This study tests the group-level causal relationship between the expansion of Kenya’s Safe Motherhood voucher program and changes in quality of postnatal care (PNC) provided at voucher-contracted facilities. We compare facilities accredited since program inception in 2006 (phase I) and facilities accredited since 2010–2011 (phase II) relative to comparable non-voucher facilities. PNC quality is assessed using observed clinical content processes, as well as client-reported outcome measures. Two-tailed unpaired t-tests are used to identify differences in mean process quality scores and client-reported outcome measures, comparing changes between intervention and comparison groups at the 2010 and 2012 data collection periods. Difference-in-differences analysis is used to estimate the reproductive health (RH) voucher program’s causal effect on quality of care by exploiting group-level differences between voucher-accredited and non-accredited facilities in 2010 and 2012. Participation in the voucher scheme since 2006 significantly improves overall quality of postnatal care by 39% (p=0.02), where quality is defined as the observable processes or components of service provision that occur during a PNC consultation. Program participation since phase I is estimated to improve the quality of observed maternal postnatal care by 86% (p=0.02), with the largest quality improvements in counseling on family planning methods (IRR 5.0; p=0.01) and return to fertility (IRR 2.6; p=0.01). Despite improvements in maternal aspects of PNC, we find a high proportion of mothers who seek PNC are not being checked by any provider after delivery. Additional strategies will be necessary to standardize provision of packaged postnatal interventions to both mother and newborn. This study addresses an important gap in the existing RH literature by using a strong evaluation design to assess RH voucher program effectiveness on quality improvement.
The study uses a quasi-experimental design to evaluate the impact of the Kenyan OBA voucher scheme on increasing access to, and quality of, selected reproductive health services by comparing voucher-accredited health facilities with non-voucher facilities in counties with similar characteristics (defined later). Data were collected at health facilities in 2010 and 2012. Fig 2 describes data collection activities. Comparison and intervention facilities were sampled from different counties in order to improve the validity of the comparison group by minimizing facility and client selection effects that would be expected to arise if comparison facilities were sampled from voucher counties. Phase I comprises facilities participating since the program’s inception in 2006; these facilities had participated in the program for at least 4 years at the time of the first round of data collection in 2010. Phase II comprises voucher facilities that began participating in the voucher scheme in 2010–2011. The 2010 data therefore represent a true pre-intervention baseline only for the Phase II voucher facilities. Quality of PNC was assessed using 2010 and 2012 data from 19 randomly selected 2006 voucher-accredited health facilities in four counties, 16 randomly selected 2010/2011 voucher-accredited health facilities in three counties, and 18 non-voucher health facilities in three comparable counties [13]. Comparison facilities were selected using a pair-wise matching sampling design; for each sampled treatment facility, a non-voucher facility in a comparison district with similar characteristics in 2010 was sampled. Facility matching characteristics included facility level, sector, staffing levels and types, urban/rural location, fees, and average client profile [32]. The evaluation design is described elsewhere [13, 32].
We conducted analyses of quantitative data collected from 53 health facilities (34 public, 19 private, non-governmental, or faith-based) in eight counties in Kenya. Health facility assessments obtained PNC client information through observations of client-provider interactions of postnatal consultations matched with client exit interviews with observed postnatal clients. During the 2010 round of data collection, a total of 394 PNC clients were observed in the 2006-accredited voucher facilities, 310 clients at 2010/2011-accredited voucher facilities, and 230 clients at comparison facilities, for a total of 934 observations in 2010. In 2012, 259 PNC observations were used for the 2006-accredited voucher facilities, 169 clients for 2010/2011-accredited voucher facilities, and 141 clients at comparison facilities, for a total of 569 post-rollout observations. We test for differences in quality improvements in PNC provision between voucher-accredited facilities and comparable non-voucher facilities. Quality of PNC is first evaluated using a composite score based on the observed technical and interpersonal content of a PNC consultation and client-reported outcomes data. Pearson’s chi-square tests were used to evaluate differences in distribution of facility characteristics and self-reported client socioeconomic status. Statistical significance of differences in client-reported outcome measures, which include receipt of maternal and newborn postnatal checkup, timing of maternal and newborn first checkup relative to delivery, and client-reported satisfaction, were evaluated using two-tailed unpaired t-tests with unequal variance. Two-tailed unpaired t-tests with unequal variance were also used to evaluate group differences in mean process quality scores comparing intervention and comparison groups. Difference-in-differences (DD) analysis is used to estimate the voucher program’s causal effect by exploiting group-level differences across two or more dimensions. 
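The unequal-variance (Welch) t-tests described above can be illustrated with a minimal sketch. The function name `welch_t` and the score lists are hypothetical and for illustration only; they are not study data, and the study's own analyses were run in Stata. The sketch computes the test statistic and the Welch-Satterthwaite degrees of freedom, and leaves the p-value lookup to a statistics package.

```python
from math import sqrt
from statistics import fmean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean; variance() divides by n - 1.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    # Difference in means scaled by the unpooled standard error.
    t_stat = (fmean(sample_a) - fmean(sample_b)) / sqrt(se2_a + se2_b)
    # Degrees of freedom shrink when group variances or sizes differ.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical process quality scores (NOT study data).
voucher_scores = [1, 2, 3, 4, 5]
comparison_scores = [2, 4, 6, 8, 10]
t_stat, df = welch_t(voucher_scores, comparison_scores)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

Because the variances are not pooled, the degrees of freedom are generally non-integer and smaller than the pooled-test value of n_a + n_b - 2.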
The DD estimator equals the average change in outcomes in the treatment group, after the average change in outcomes in the comparison group is subtracted. The DD approach adjusts for time invariant differences between the two groups, as well as time-varying influences affecting both groups equivalently. The difference-in-differences approach to isolating program effect rests upon the usual assumptions of Ordinary Least Squares (OLS), as well as the additional identification assumption of parallel trends: Internal validity rests upon the premise that changes in PNC quality over time in the group of comparison facilities are equivalent to the changes in PNC quality over time that would have been observed in the intervention facilities, had the voucher program not been implemented. DD estimators for process outcomes are estimated using negative binomial models, with each of the 16 individual and summative process quality score outcomes modeled individually. Negative binomial models are appropriate for modeling dependent variables within a discrete, restricted range, and were selected to account for over-dispersion of the count (score) process outcomes. Negative binomial estimates are presented as incidence rate ratios. We additionally report estimates using the OLS estimator in S1 Table as a robustness check. DD estimators of program impact on the dichotomous outcome measures were estimated using logistic regression models. DD estimates are shown for three model specifications, the first including data sampling time (2010 vs. 2012 post-rollout) and treatment type (phase I or phase II, vs. comparison) in addition to the DD estimator, the second adding facility- and client-level covariates, and the final adding facility-level fixed effects. Covariates in the latter two model specifications include facility type, facility sector, and mean client socioeconomic status (by quintile). 
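The DD estimator described above can be sketched from the four group-by-period means. The function name `did_from_means` and all scores below are hypothetical, chosen only to make the arithmetic easy to follow. The sketch returns the additive DD and its multiplicative analogue, a ratio of ratios. In a log-link count model that contains only the group, period, and interaction terms, the exponentiated interaction coefficient (the IRR) would be expected to correspond to this ratio; covariate-adjusted and fixed-effects estimates will generally differ.

```python
from statistics import fmean


def did_from_means(treat_pre, treat_post, comp_pre, comp_post):
    """Return (additive DD, ratio of ratios) from raw outcome lists."""
    t0, t1 = fmean(treat_pre), fmean(treat_post)
    c0, c1 = fmean(comp_pre), fmean(comp_post)
    # Change in the treatment group minus change in the comparison group.
    additive = (t1 - t0) - (c1 - c0)
    # Relative change in treatment divided by relative change in comparison.
    ratio = (t1 / t0) / (c1 / c0)
    return additive, ratio


# Hypothetical quality scores (NOT study data).
additive, ratio = did_from_means(
    treat_pre=[4, 5, 6],    # mean 5
    treat_post=[8, 9, 10],  # mean 9
    comp_pre=[4, 4, 4],     # mean 4
    comp_post=[5, 5, 5],    # mean 5
)
print(f"additive DD = {additive:.2f}, ratio of ratios = {ratio:.2f}")
```

Here the treatment group gains 4 points and the comparison group gains 1, so the additive DD is 3; the treatment group improves by a factor of 1.8 against 1.25 in the comparison group, a ratio of 1.44.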
Facility type is defined categorically by level, with the lowest level comprising dispensaries, nursing homes, and clinics, the second level health centers, and the third level hospitals and sub-district hospitals. Facility sector is defined as public, non-governmental, private, or mission/faith-based. Individual-level client socioeconomic status is included as an ordinal variable in models 2 and 3, as the voucher program limits eligibility for vouchers to individuals below a specific poverty threshold. Client-level socioeconomic quintiles were generated using principal components analysis of the following household characteristics and assets: source of drinking water, toilet type, cooking fuel type, electricity, radio, television, telephone, refrigerator, solar power, lantern, bicycle, motorcycle, car/truck, boat with motor, boat without motor, and animal- or human-drawn cart. Estimates from all three models are presented to assess sensitivity of the results to model specification. Standard errors are clustered at the health facility level, the unit of the intervention. The general difference-in-differences model can be described as:

$$y_{if} = \beta_0 + \beta_1 T_f + \beta_2 P_f + \beta_3 (T_f \times P_f) + \beta_4 X_i + \beta_5 Z_f + \varepsilon_{if}$$

where $T_f$ is a dummy variable for voucher accreditation status of facility $f$, $P_f$ is a dummy variable representing intervention time (2010 or 2012), $X_i$ is a vector of individual-level covariates, $Z_f$ is a vector of facility-level covariates, and $\beta_3$ is the difference-in-differences estimator, the OLS coefficient for the interaction of being in the voucher facility group at the post-rollout time period. $y_{if}$ represents the process and outcome indicators, each modeled separately: the 16 individual and summative process scores and 5 PNC outcome indicators that were examined as dependent variables of PNC quality for individual $i$ at facility $f$. Outcome variables were specified prior to analysis according to the definition of postnatal care quality presented in Fig 1.
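To see why the interaction coefficient is the difference-in-differences estimate, consider a short derivation. It assumes the standard linear specification $y_{if} = \beta_0 + \beta_1 T_f + \beta_2 P_f + \beta_3 (T_f \times P_f) + \varepsilon_{if}$, with the covariate terms omitted for simplicity and with $P_f = 1$ denoting the post-rollout (2012) period; both simplifications are assumptions made for this illustration.

```latex
\begin{aligned}
E[y_{if} \mid T_f = 0, P_f = 0] &= \beta_0 \\
E[y_{if} \mid T_f = 1, P_f = 0] &= \beta_0 + \beta_1 \\
E[y_{if} \mid T_f = 0, P_f = 1] &= \beta_0 + \beta_2 \\
E[y_{if} \mid T_f = 1, P_f = 1] &= \beta_0 + \beta_1 + \beta_2 + \beta_3 \\[4pt]
\text{DD} &= \bigl[(\beta_0 + \beta_1 + \beta_2 + \beta_3) - (\beta_0 + \beta_1)\bigr]
           - \bigl[(\beta_0 + \beta_2) - \beta_0\bigr] \\
          &= (\beta_2 + \beta_3) - \beta_2 = \beta_3
\end{aligned}
```

The time-invariant group difference $\beta_1$ and the common time effect $\beta_2$ both cancel, which is the sense in which the DD approach adjusts for them.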
Pre-specification of the outcomes of interest is one method to reduce the probability of finding statistical significance by chance when multiple outcomes are being tested. A second method is to report p-values adjusted for the total number of hypotheses tested. In addition to cluster-adjusted p-values, we present q-values adjusted for the false discovery rate (FDR), which is the expected proportion of statistically significant findings that are false positives within the group of outcomes being tested. False discovery rate-adjusted q-values were calculated using the method proposed by Benjamini, Krieger, and Yekutieli (2006) and operationalized by Anderson (2008) [33–34]. Analyses were performed using Stata software, version 12.1 (StataCorp, College Station, TX). Ethical approval for the evaluation was granted by Population Council’s institutional review board (IRB) No. 470 and the Kenya Medical Research Institute (KEMRI) SCC 174. Informed consent was obtained prior to all observations of PNC consultations and client exit interviews. Interviews were conducted in settings that ensured privacy and confidentiality. Data collectors received training on ethical conduct prior to data collection. All participants provided their written informed consent prior to participation in the survey, with one copy of the written consent retained by the research team and one copy retained by the participant.
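The multiple-testing adjustment can be illustrated with a minimal sketch of the two-stage step-up procedure of Benjamini, Krieger, and Yekutieli at a single fixed level q. The function names and p-values below are hypothetical, and this is not the Stata code used in the study. As we understand it, Anderson's sharpened q-values repeat such a procedure over a grid of q values and record the smallest q at which each hypothesis is rejected; that grid search is omitted here.

```python
def bh_reject_count(sorted_pvals, level):
    """Benjamini-Hochberg step-up: number of rejections at the given level."""
    m = len(sorted_pvals)
    count = 0
    for rank, p in enumerate(sorted_pvals, start=1):
        if p <= rank * level / m:
            count = rank  # keep the largest rank that passes its threshold
    return count


def two_stage_fdr(pvals, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    sorted_p = [pvals[i] for i in order]
    # Stage 1: step-up procedure at the reduced level q / (1 + q).
    q_prime = q / (1 + q)
    r1 = bh_reject_count(sorted_p, q_prime)
    if r1 == 0:
        n_reject = 0   # nothing rejected, stop
    elif r1 == m:
        n_reject = m   # everything rejected, stop
    else:
        # Stage 2: estimate the number of true nulls and rerun the
        # step-up procedure at the correspondingly relaxed level.
        m0_hat = m - r1
        n_reject = bh_reject_count(sorted_p, q_prime * m / m0_hat)
    rejected = [False] * m
    for i in order[:n_reject]:
        rejected[i] = True
    return rejected


# Hypothetical p-values for five outcomes (NOT study results).
print(two_stage_fdr([0.04, 0.001, 0.5, 0.012, 0.03], q=0.05))
```

In this example the first stage rejects two hypotheses, so three are estimated to be true nulls; the second stage then runs at a less stringent level and rejects four of the five.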