Responding to policy makers’ evaluation needs: Combining experimental and quasi-experimental approaches to estimate the impact of performance based financing in Burkina Faso

BMC Health Services Research, Volume 19, No. 1, Year 2019

Interpretação

Background: The last two decades have seen a growing recognition of the need to expand the impact evaluation toolbox from an exclusive focus on randomized controlled trials to including quasi-experimental approaches. This appears to be particularly relevant when evaluation complex health interventions embedded in real-life settings often characterized by multiple research interests, limited researcher control, concurrently implemented policies and interventions, and other internal validity-threatening circumstances. To date, however, most studies described in the literature have employed either an exclusive experimental or an exclusive quasi-experimental approach. Methods: This paper presents the case of a study design exploiting the respective advantages of both approaches by combining experimental and quasi-experimental elements to evaluate the impact of a Performance-Based Financing (PBF) intervention in Burkina Faso. Specifically, the study employed a quasi-experimental design (pretest-posttest with comparison) with a nested experimental component (randomized controlled trial). A difference-in-differences approach was used as the main analytical strategy. Discussion: We aim to illustrate a way to reconcile scientific and pragmatic concerns to generate policy-relevant evidence on the intervention’s impact, which is methodologically rigorous in its identification strategy but also considerate of the context within which the intervention took place. In particular, we highlight how we formulated our research questions, ultimately leading our design choices, on the basis of the knowledge needs expressed by the policy and implementing stakeholders. We discuss methodological weaknesses of the design arising from contextual constraints and the accommodation of various interests, and how we worked ex-post to address them to the best extent possible to ensure maximal accuracy and credibility of our findings. We hope that our case may be inspirational for other researchers wishing to undertake research in settings where field circumstances do not appear to be ideal for an impact evaluation. Trial registration: Registered with RIDIE (RIDIE-STUDY-ID-54412a964bce8) on 10/17/2014. In order to understand the intervention and the design decisions we made within the framework of our impact evaluation, we first provide some background information on the country and its health system. Burkina Faso is a landlocked country located in West Africa, with a population of 18.5 million. At the time the study was planned, the country’s GDP per capita stood at USD 1560 (2013, adjusted for purchasing power parity) placing it among the poorest countries in the world [20]. The 2014 Human Development Index ranked Burkina Faso 185 out of 188 countries [21]. In spite of substantial improvements over the course of the last few years, health indicators still largely lag behind regional averages. Life expectancy is at 58 years. Maternal and under-five mortality are estimated at 371/100,000 [22] and 102/1000 [23], respectively. Malaria, acute respiratory infections, and diarrhea still account for the largest proportion of child mortality, often coupled with an underlying situation of malnutrition, with nearly 40% of all children being classified as stunted. Health service delivery is organized in a three-tier system, with primary facilities (Centre de Santé et Promotion Sociale – CSPS) located in rural areas; district hospitals located in each district capital; and regional and national referral hospitals located in the regional capitals and in the national capital Ouagadougou [24]. Public facilities provide the vast majority of health services [25]. The health sector suffers from a generalized lack of resources. In 2013, total per capita health expenditure was estimated at 6.4% of GDP, equivalent to Purchasing Power Parity USD 109. Government expenditure amounted to 58% of total health expenditure, including contributions by development partners being estimated at 23% of this total. Private health expenditure is substantial as user charges continue to be applied across a variety of essential healthcare services, with more than 80% of all private expenditure on health not being channeled through pre-paid and pooled mechanisms [26, 27]. The poor health outcomes described above are, to a large extent, the result of poor access to services, with people largely under-utilizing the care they need. The literature has consistently reported that geographical barriers, due to scarcity of health facilities, and financial barriers, due to user charges, continue to hamper access to healthcare services [28–33]. The PBF program at the core of our impact evaluation rests on the experience and knowledge acquired during the implementation of a pre-pilot PBF intervention, managed by the Ministry of Health (MoH) with financial and technical assistance from the World Bank in the period 2011–2013 in three districts (Titao, Leo, and Boulsa). Within the framework of this early PBF intervention, health facilities and the MoH entered a contractual agreement whereby the MoH would reward the provision of a defined service package according to a case-based payment modality, adjusted for quality of service provision and following verification. An independent evaluation detected a positive effect of the intervention across maternal healthcare services [34]. Program data showed improvements in the quality of services provided [35]. In light of the pre-pilot experiences, the MoH, again with financial and technical assistance from the World Bank, decided to scale up the PBF intervention to an additional 12 districts in early 2014. In doing so, it was decided to test the combination of PBF with other policies under consideration for scale-up or integration into the national health system, notably community-based health insurance (CBHI) and user fee exemption for the ultra-poor [36, 37]. To understand this decision, it is important to locate the PBF program in Burkina Faso within the broader context of PBF programs supported by the World Bank over the course of the past decade through the Health Results Innovation Trust Fund (HRITF), including impact evaluations in over 20 countries, of which the majority in Sub-Saharan Africa. Emerging preliminary findings across the existing pilots suggested PBF’s potential to improve service provision, but pointed at PBF’s inability to reduce inequality in service access, specifically when implemented exclusively as a supply-side intervention [38]. It is also in the light of these considerations that the MoH and its development partners opted to implement PBF in conjunction with a series of equity measures, aimed at maximizing the potential of PBF to act as a catalyst for equity changes. It is at this point, building on the knowledge generated in other settings and looking at the specific need to address equity gaps in the country, that the World Bank realized the potential to use the case of Burkina Faso to test novel financing, purchasing and targeting mechanisms, combining elements of supply and demand side interventions into a single program. It is also at this point that knowledge generation was conceptualized as an intrinsic component of the PBF program implementation and that the decision to contract an independent academic institution to carry out the impact evaluation was made. It ought to be noted, as described in detail in the sections that follow, that the research team was not involved in key intervention design decisions (i.e. selection of intervention districts, test of equity measures), but did contribute to shaping the evaluation design as well as to elaborating the intervention design (i.e. exact design of the equity measures). Similarly to the pre-pilot, the primary objective of the PBF program was to improve the utilization and quality of maternal and child health (MCH) services, in particular among vulnerable populations such as the ultra-poor. Effectively, however, the PBF benefit package was very comprehensive and comprised a broad range of primary- and secondary-level care services beyond MCH, including also general adult curative consultations and HIV and tuberculosis services [39]. To address the abovementioned equity concerns, the MoH decided to implement four different models of PBF, three of which included special provisions to improve access to care for the ultra-poor by explicitly alleviating the financial burden imposed by user charges. In detail, the following four PBF models were implemented (Table 1): “Intervention design” PBF1: Standard PBF. PBF contracts between the MoH and the health facilities defined the services purchased by PBF (quantity), quality targets, and payment modalities. External reviewers assessed facility reports on quantity indicators on a monthly basis. Based on these verified results, contracted facilities received case-based payments for the services delivered, in addition to all preexisting financing. PBF unit prices were calculated a priori by the implementation team, on the basis of the relative cost and frequency of the services provided. Service quality was assessed with comprehensive quality checklists, verified on a quarterly basis by the District Health Management Teams. Facilities received an additional bonus calculated on the basis of quantity outcomes and service quality if they achieve a quality score of initially at least 50%, later 60%. PBF payments were intended to supplement other facility revenue in the complex pre-existing mixed payment system [40], to fund expenditure, increase savings, and allow bonus payments to staff. Facilities were initially completely autonomous in their spending decisions. In 2017, ceilings as to how much could be disbursed to staff were introduced in response to undesirable practices in some facilities. Facilities were further provided with criteria for the distribution of bonuses among staff, including cadre, position, seniority, work time, and performance. PBF1 did not include any specific provision to facilitate access to care for the ultra-poor. PBF2: Standard PBF + systematic targeting and health service subsidization for the ultra-poor. PBF2 operated according to the same contractual model as PBF1, but combined it with specific provisions to facilitate access to care for extremely vulnerable individuals residing in the health facility catchment area. These provisions included: PBF3: Standard PBF + systematic targeting and subsidization for the ultra-poor + provider incentive to offer services to the ultra-poor. PBF3 operated the same PBF contracts as PBF1 and PBF2 and included the same targeting procedures and provisions to care for the ultra-poor as PBF2. The core difference relates to the unit prices applied in PBF3, in that services provided to the ultra-poor were reimbursed at a higher rate than in PBF2 (initially at around 150% of the PBF2 prices, reduced to an average value of 115% in 2016, with variation across services). The idea was to compensate for loss of income from user fees, while at the same time offering providers an additional financial incentive to attract and provide services to the ultra-poor. As in PBF2, these higher reimbursement rates only pertained to services normally offered against payment of direct user charges at point of use. PBF4: Traditional PBF + Community-based health insurance (CBHI), including targeting and subsidization for the ultra-poor. In this case, PBF1, applying the same contractual model as described above, was introduced in parallel to CBHI. The insurance scheme was rolled out with support from the NGO ASMADE, which developed a scheme based on the model that the government had envisioned for a future universal health insurance model (Régime d’Assurance Maladie Universelle) [36]. ASMADE was selected as implementing agency given its prior experience with insurance implementation. Insurance was offered to the entire population at a yearly premium of 3900 FCFA (~ 7 USD) per person. Targeting took place following procedures similar to the ones used in PBF2 and PBF3 areas and the insurance premium for the ultra-poor was fully subsidized by the program. The insurance benefit package included a wide range of primary and secondary healthcare services. Payments to providers were made by both the insurance (in place of user charges) and by the PBF program, as case-based rewards as in PBF1. Across the four PBF models, adjustments to the quantity case-based payments were made according to the remoteness of the catchment population, staffing levels, and distance from the district capital, so that remote and disadvantaged facilities received higher case-based payments than easily accessible and better-equipped facilities. This approach resulted in nine different possible prices for the services incentivized by PBF, beyond the adjustments made in PBF2 and PBF3 for services provided to the targeted ultra-poor. The PBF program was rolled out in six regions (Centre Nord, Centre Ouest, Nord, Sud Ouest, Boucle du Mouhoun, and Centre Est), purposely selected by the MoH and its development partners as having health indicators below the national median at the onset of the intervention [42]. Within each region, the MoH purposely selected two districts to receive PBF on the basis of particularly poor outcomes on four key indicators: (i) contraceptive prevalence rate; (ii) assisted deliveries; (iii) antenatal consultations; and (iv) post-natal consultations. Identifying an adequate study design for the impact evaluation required an engagement of the main stakeholders (including MoH, the World Bank, and the independent impact evaluation research team) in an iterative discussion in which policy interests were appraised against both scientific considerations and pragmatic implementation concerns. The primary interest of the technical partners, particularly the World Bank, was to test the added benefit of moving from the standard PBF model to one that combined PBF with specific equity measures. The MoH, on the other hand, was primarily interested in identifying the overall impact of introducing PBF to generate evidence to inform future decisions on health financing in the country. The most straightforward design to accommodate both interests would have been a random allocation of all health facilities to one of the four different PBF intervention models or to a control group (i.e. status quo). This design was judged unfeasible from the policy makers’ perspective for four reasons. First, policy makers feared that randomization to either PBF or control within single districts was unfeasible since districts represent the primary operational units in the Burkinabè decentralized healthcare system and could have therefore not applied different purchasing strategies across the facilities they controlled. Second, policy makers feared that randomization to PBF and control within a single district might lead to conflict since the intervention would be highly visible and people from control groups could feel left behind. In addition, results from similarly complex health interventions with full-scale randomization report on the presence of spillover effects, due to consumers’ mobility across facilities, to shared management structures at the district level, and to intra-district competition, posing a challenge to the internal validity of the evaluations design [43, 44]. Third, the implementation of a CBHI scheme appeared to be too complex an intervention to be allocated randomly across facilities in twelve districts. The level of know-how necessary to facilitate insurance implementation was absent in most districts, hence from the very onset of the discussions, the government made clear its intention to test the insurance model in conjunction with PBF exclusively in one region, Boucle du Mouhoun, where prior experience with insurance implementation was present [45, 46]. Fourth, it became apparent that the implementation of the targeting component would be quite costly and that funds would not cover its implementation across all twelve districts. Hence, the research team and the policy stakeholders agreed to employ a quasi-experimental design with a nested experimental component (Fig. 1). In practice, this meant that within each region, two additional districts, judged by policy stakeholders to be comparable in terms of health indicators and health system structures, were selected as controls. The twelve control districts received no PBF intervention at all. Within the twelve intervention districts, the four PBF packages described earlier were implemented as follows: “Study design” Within the concerned intervention districts, facilities were allocated across the different PBF models in ‘randomization ceremonies’, attended by all health facility in-charges, district health managers, and other important district and regional stakeholders to maximize transparency. During the ceremonies, after a brief introduction of the program, health facility in-charges then took turns drawing facility names from a box containing all health facility names in the respective district. Starting with a predefined PBF model, facilities were then assigned in the order in which they were drawn from the box (i.e. 1st facility: PBF1, 2nd facility: PBF2, 3rd facility: PBF3, 4th facility: PBF1, etc.). As referral facilities to all health centers in the districts, the twelve district hospitals as well as the few regional hospitals were not randomized, but rather assigned to the PBF2 intervention package and therefore reimbursed for user fee-free treatment of targeted poor. It should be noted that although secondary-level facilities were included in the intervention to ensure continuity of care, the study focus was clearly on primary-level facilities. This design accommodated the concerns related to implementation complexity raised by policy makers, while still granting the possibility to answer their research questions. As explained in greater detail later, the quasi-experimental element of the design was used to assess the overall impact of the PBF program (irrespective of the specific intervention package) vis à vis status quo health service provision. The experimental element of the design was used to assess the specific added benefit of introducing equity measures (as in PBF2, PBF3, and PBF4) alongside the implementation of standard PBF (PBF1). It should be noted that the decision not to randomize across all covered health facilities came at the cost of having the study design’s statistical power to detect the impact of PBF compared to status quo reduced, due to a low number of clusters (defined as the districts for the quasi-experimental component) [47]. This concern was discussed among stakeholders already during the design phase. An expansion of the pilot intervention to a larger number of clusters, however, was not feasible due to financial and pragmatic constraints. Researchers and policy stakeholders therefore accepted this limitation, also in light of the fact that the primary research interest was related to the experimental component. The overall objective of the impact evaluation was to assess the impact of the PBF program on health service utilization and quality of service delivery across a wide range of targeted services. As described above, the specific focus of the impact evaluation, when compared to existing studies [48–51], was on estimating the added benefit of combining PBF with equity measures (i.e. either the ultra-poor targeting and health service subsidization or the CBHI). The specific research questions fitting the abovementioned objectives were: To address the equity implications embedded in the program, for each of the abovementioned three primary research questions, we further asked: In line with the existing conceptual literature on PBF [52–54], we postulated that the standard PBF intervention (i.e. the performance contracted implemented across all four PBF intervention packages) would affect both quantity and quality of service provision (Fig. 2). Specifically, we expected the combination of increased revenues due to performance rewards, enhanced managerial autonomy fostered by PBF, and intensified supervision on service provision to motivate healthcare providers to actively engage to increase health service provision (“will do”), while at the same time empowering them with the financial and managerial means necessary to improve quality of service delivery (“can do”). In turn, we expected improved quality of service delivery to encourage communities to seek healthcare more promptly (in spite of the fact that the Burkinabè health system still largely relied on direct user payments at point of use throughout the duration of the study), further increasing quantity of service delivery. We expected the standard PBF intervention to be equity-neutral, i.e. not to favor nor disfavor any specific socio-economic segment of society. “PBF theory of change” We expected the combination of the standard PBF1 and equity measure as applied by PBF2 and PBF3 to stimulate increases in quantity of service provision both overall and specifically for the ultra-poor. Across both the PBF2 and the PBF3 intervention arms, we expected the targeted exemptions to motivate the ultra-poor to seek healthcare more promptly and we expected this increased health service utilization to narrow existing gaps in the quantity of service delivery provided across socio-economic groups. We expected the socio-economic gap in quantity of service delivery to decrease more drastically in the PBF3 intervention arm, given the additional financial incentive offered to providers to actively reach out to the ultra-poor. In addition, given that services were purchased at a higher value than their estimated cost, we expected PBF3 facilities to increase their revenues substantially more than PBF1 and PBF2 facilities and in turn, we expected these additional revenues to be deployed towards infrastructural investments, further contributing to quality improvements in PBF3 facilities. Similar to what we described above in relation to PBF2 and PBF3, we postulated that the PBF4 intervention package would stimulate larger increases in the quantity of service delivery than the standard PBF1. We expected that increased health service utilization linked to the removal of financial barriers at point of use (thanks to insurance) would result in higher quantities of service delivery [55–59]. We also expected to observe larger improvements in the quality of service delivery in PBF4 facilities compared to PBF1 facilities, because we expected the CBHI scheme to act as an additional independent purchaser, exercising pressure on healthcare providers and hence acting as an additional stimulus for quality improvement. Given that targeting was implemented in a similar fashion across PBF2, PBF3, and PBF4 catchment areas, we expected to observe similar equity impacts, although somewhat more pronounced in PBF3 due to the additional provider incentive to reach out to the targeted ultra-poor. As addressed earlier when describing the intervention design, the list of incentivized services was identical across all four intervention models and it was very comprehensive, effectively covering almost the entirety of the essential package of services at primary and secondary levels. Hence, we did not expect healthcare providers in PBF1 and PBF4 to focus their efforts on specific services, simply because the payment structure was not set in a way as to offer any incentive to privilege some services above others. In the PBF2 and PBF3 intervention arm, however, the additional payments to compensate for loss of revenues from user fees and the additional financial incentive to treat the ultra-poor in PBF3 were exclusively tied to services normally offered against payment of direct user charges at point of use. Consequently, we expected that, when treating the very poor, providers would have an incentive to provide services for which they receive both the standard PBF payment and the additional compensation related to the attended loss of income from user fees/additional financial incentive (such as curative consultations) rather than services for which they received only the standard PBF payment (such as vaccination or antenatal care (ANC)). Therefore, in PBF2 and PBF3 facilities, we expected to observe an overall greater increase in case volume as well as a more remarkable equity improvement specifically for these services for which additional compensation was offered. Across the four PBF models, we expected increases in the quantity and quality of service provision to result in improved health status, especially among women and children, given that most indicators targeted MCH services. We did not expect the intervention to produce measurable changes in mortality patterns over its short implementation period, but we did expect the improved health service delivery resulting from PBF to be able to produce changes in simpler health status indicators, such as those related to anemia and acute malnutrition. Table 2 contains the set of main outcome indicators selected for our impact evaluation. In their selection, we took into account the above-described theory of change, including indicators at the different levels, the list of incentivized services, national and international standards, as well as data availability, quality and baseline values. In alignment with the focus of the study, the set of main outcome indicators pertains exclusively to the primary level of care. Main impact evaluation outcome indicator set An equity analysis was possible for all health care utilization and health status indicators for which data stems from the household survey, but not for indicators based on HMIS data. No equity analysis was possible for quality of care or human resources indicators The impact evaluation relied on two main sources of data: i. a household survey, conducted at baseline (October 2013–March 2014) and endline (April–June 2017); ii. a facility-based survey, also conducted at baseline and endline, including different data collection tools: a health facility assessment, a health worker survey, direct provider-patient observations (ANC and curative services for children under the age of 5 (U5)), and patient exit interviews (ANC, U5 curative services, and curative services to patients aged five or older). In addition, we used data from the routine health management information system (HMIS) to triangulate results obtained with primary data, and to estimate impact on indicators for which our sampling strategy did not generate a sufficiently large sample, namely health care utilization for acute illness and ART for prevention of mother-to-child transmission. We pooled the respective six months of routine data before the primary baseline and endline data collections and treated the resulting data points the same way as the primary data in the analyses described in the following. To collect primary data, we used a slightly revised version of the data collection tool set included in the HRITF impact evaluation toolkit [60] tailored to the needs of this specific impact evaluation and to the Burkinabé context. Table 3 illustrates the content of each data collection tool employed in our impact evaluation. Data collection tools and sources Immunization status, use of growth monitoring services Data collection teams spent one day at each sampled health facility and village, collecting data partially on paper (facility-based survey) and partially electronically (household survey) at baseline, and fully electronically at endline. In order to ensure efficient data collection, field supervisors travelled ahead of their teams to observe social protocols, finalize sampling before arrival of the data collection team, and ensure availability of interview partners. An independent quality assurance control of the endline data collection process was commissioned to the Institut de Recherche en Sciences de la Santé (IRSS), Burkina Faso. Our sampling strategy seeked to cover all health facilities and their attached catchment areas included in the intervention at baseline. Ideally, we would have liked panel data for all indicators, with the same units measured at baseline and endline, to ensure a maximally robust impact estimation. However, this was only partially possible, as explained below. The health facility survey was conducted in all public and private primary- and secondary-level health facilities in the intervention districts providing essential reproductive and child care services, as well as a random sample of health facilities in the control districts for an intervention-control facility ratio of approximately 3:1. This amounted to a total of 537 primary- and 24 secondary-level facilities in the 24 intervention and control districts surveyed at both baseline and at endline. The health facility sample was therefore a fully balanced panel. Health facilities newly opened in intervention districts between baseline and endline were included in the PBF intervention, but not in the impact evaluation sample. Health facility assessments were conducted in all health facilities included in the sample. In primary care facilities, the health worker survey was applied to all clinical skilled healthcare staff available on the day of the visit. Random sampling from all staff on payroll or a census was not feasible for reasons of time and budget. Due to high staff turnover and for the same budgetary and time reasons, it was not possible to systematically reinterview baseline respondents at endline. At secondary facilities, the health worker survey was administered to a random sample of three health workers with maternal and child health service delivery responsibilities. At all facilities where ANC services were provided on the day of the interviewer team visit, five direct provider-patient observations of ANC consultations were performed. Specifically, interviewers were instructed to observe the first five consultations as random sampling was not feasible for pragmatic reasons. Similarly, five direct provider-patient observations for U5 curative consultations presenting with a new condition were performed at all health facilities included in the sample. We further performed exit interviews with the observed patients (or their caregivers) for both service categories (ANC, U5 curative services), as well as with five curative consultation patients aged five and above at each sampled health facility. As observations and exit interviews depended on the patients naturally presenting on the day of data collection, the construction of a panel at this level was not possible. One village was randomly selected from the catchment area of each of the 523 public primary health care facilities included in the intervention. Since the secondary and few private facilities do not have an own catchment area, we could not sample a specific village at their level. The same villages were visited at baseline and endline, resulting in a fully balanced panel at the village level. In each selected village, at baseline, we randomly selected 15 households among all households with at least one currently pregnant woman or at least one woman having ended a pregnancy (irrespective of outcome) within the prior 24 months. Eligible households were identified through a preliminary comprehensive listing of all households in the selected villages, conducted by the same field enumerators who later administered the survey. This sampling criterion was chosen to allow obtaining sufficiently large sample sizes for the key indicators of interest (i.e. utilization and quality of maternal and child health care services) at much lower cost than with a fully representative population sample, which would have exceeded the budget available. We are aware that such a sample, which is not fully representative of the population in the concerned districts, limits our ability to approach research questions not directly pertaining to MCH. At endline, we organized data collection at the household level in such a way as to create, at least to the extent possible, a partial panel. Hence, we first returned to baseline households at endline. If they still fulfilled the abovementioned sampling criterion, they were included in the endline survey. If not, they were replaced with the nearest household meeting the above-outlined eligibility criteria. Our expectation was that a relatively large share of baseline households would still be eligible at endline, given high fertility rates in Burkina Faso and a three-year lag period between baseline and endline. However, we could identify and include in the endline household sample only 53% of baseline households, while 38% were no longer eligible, 5% could not be traced again and 4% were still eligible, but refused to participate again. Additional analyses showed that while the partial panel introduced some selection bias to our sample (slightly higher fertility in panel households), the bias was constant across study groups, and no other observable systematic differences between panel and non-panel households and individuals were apparent. While keeping alignment with the overall HRITF analytical framework and strategy, our analytical approach had to accommodate the specific nature of our study design, research questions, and data structure. In line with what was described earlier in this manuscript, we nested an experimental component within a broader quasi-experimental design to deliberately address the knowledge needs raised by the different policy stakeholders. As we had a balanced panel only at the health facility/village level, but not at the health worker, patient, household, and household member levels, where the majority of key indicators was measured, we could not conduct observation-level panel analyses, and therefore treated the baseline and endline samples as repeated cross-sections. Important to note is that this is not necessarily a specific weakness of our study, but rather the norm in studies pertaining to health system interventions [61]. We employed a Difference-in-Differences (DID) approach to identify the impact of PBF compared to status quo and the added benefit of the equity measures in PBF2, PBF3, and PBF4 compared to the standard PBF1. In DID, the intervention effect is estimated as the difference between the baseline-endline change in the intervention group and the control group. DID thereby isolates the intervention effect from baseline differences between the study groups, as well as from secular trends over time [47]. In contrast to our nested experiment, where impact estimates can be identified through a simple difference approach, causal inference in our quasi-experimental setting makes a DID approach necessary. In our specific case, we indeed observed significant differences between the intervention and control group on certain indicators at baseline, which are likely driven by the purposive way in which intervention and control districts were selected. We also observed secular positive and negative trends on many indicators, in light of general developments in the country and diverse national efforts and policies, most notably the exemption policy targeting pregnant and lactating women and children under five (the ‘politique de gratuité’) introduced nationally in June 2016. DID relies on two main assumptions: 1) the ‘parallel trend assumption’ that intervention and control units of observations would have in fact developed in the same manner in the absence of the treatment; and 2) the ‘stable unit treatment value assumption’ that each unit of observation was clearly either exposed to the treatment or not (i.e. no spillover) and that treatment was uniform across all units assigned to it (i.e. no contamination) [47]. To the extent possible, we aimed to validate these two assumptions with additionally collected information and data. In regards to 1), using HMIS, we found uniformity of pre-intervention trends among the different study groups on the indicators pertaining to health care utilization, strengthening our confidence in the control districts as an appropriate approximation of the counterfactual. In regards to 2), we systematically collected information on other interventions on-going and newly introduced in the intervention period. While the collected information did not allow us to adjust the models directly, it helped us in the interpretation of resulting effect estimates. For each of the indicators included in our impact evaluation, we specified three different DID models to address research questions 1 to 3. All analyses described below were applied exclusively to primary-level facilities. The twelve secondary-level facilities included in our sample were excluded from analysis due to lack of comparable controls. The panel structure of the data at the facility/village level allowed us to strengthen the precision of our estimations through the inclusion of facility/village fixed effects in the models, thereby controlling for time-invariant unobserved differences across health facilities/villages. Moreover, as an additional robustness test, we also estimated all effects relying only on the partial panel subsamples at the individual and household level. Given the specific equity focus of the impact evaluation (research question 4), for all indicators relying on household data we additionally performed all analyses with the subsample that contains only units from the lowest socio-economic quintile. This reflects an Intention-to-Treat (ITT) approach, measuring changes at the population level (e.g. “How has health service utilization changed in areas where a targeting of the ultra-poor had taken place, as opposed to areas where only the standard PBF1 was implemented?”) rather than at the individual level (e.g. “How has being selected as indigent changed health service utilization behavior?”). The latter would have been very interesting to explore, but was not possible within the standard HRITF methodology for PBF impact evaluations. Specifically, it would have required a full panel at the household member level and a substantially larger sample size to capture a sufficiently high number of individuals in the demographic groups relevant to the main indicators of interest at baseline who would later be selected as ultra-poor as the project rolled out. First, to answer research question 1, we relied on the quasi-experimental design component and compared all PBF facilities/villages (pooled across all intervention arms) to control facilities/villages, leading to the following regression equation: where Ydfit is the outcome variable for individual i from facility/village f in district d at time t with t = {2014, 2017}. 2017t is a dummy variable indicating endline observations, thus coefficient β gives the time fixed effect. PBFd is a dummy variable that equals one for individuals in PBF districts and zero for individuals in control districts. αf are facility/village fixed effects, and Xit is a set of additional individual-level covariates as relevant to the respective indicator (i.e. health worker, patient, household member, and/or caregiver characteristics). ϵdfit is the error term. Following common practice, standard errors were clustered at the district level, which is the level of treatment assignment for the quasi-experimental component of the study design [47]. The coefficient δ gives the DID estimate for the effect of being located in a PBF district when compared to non-PBF districts. Second, to answer research question 2, we relied on the quasi-experimental design component, but compared single PBF intervention arms to the controls. This leads to the following regression equation: where PBFdk is are dummy variables that equal one for individuals from facilities/village in treatment arm PBFk, where k = {1,2,3,4}. Individuals from facilities/village in control districts provide the comparison group. The DID estimates δk give the effects of PBFk in comparison to status quo (control districts). The remaining equation components are equal to specification 1. Note that as in specification 1, standard errors were clustered at the district level, the level of treatment assignment. Third, to answer research question 3, we relied on the experimental design component (randomized controlled trial) nested within the quasi-experiment and compared the PBF arms with built-in equity measures (PBF2, PBF3, and PBF4) to the standard PBF (PBF1). For this exercise we had to estimate two separate regression equations; eq. 2b to compare PBF4 to PBF1 in the two districts of Boucle du Mouhoun, where the combination of insurance and PBF and PBF alone had been randomly assigned across facilities, and eq. 2a to compare PBF2 and PBF3 to PBF1 in the eight remaining districts: where Yfit is the outcome variable for individual i from facility/village f at time t with t = {2013, 2017} in the intervention districts. In this set of analyses, standard errors were clustered at health facility/village level, the level at which random assignment into the four PBF models took place. δ2 and δ3 give the DID estimate for the effect of being located in PBF2 compared to PBF1 and PBF3 compared to PBF1, respectively. δ4 gives the DiD estimate for the effect of being located in PBF4 compared to PBF1. By design, the quasi-experimental study component is challenged by a relatively low number of clusters (24 districts), as it was not possible to randomize intervention and control facilities at facility level. Too few clusters might lead to the estimation of downward-biased standard errors and, consequently, to an over-rejection of the null hypothesis that there is no program effect. Thus, there is an elevated risk of postulating significant program effects when there is actually no effect detectable with the design. There appears to be no consensus in the literature yet as to which number of clusters is sufficient, but 24 clusters are on the lower end of the spectrum of sufficiency in available simulation studies [62, 63]. Further, studies have shown that the implications of too few clusters are considerably worse when clusters are strongly imbalanced in terms of within-cluster sample sizes as it is unfortunately the case in our study design [64, 65]. The available literature proposes several robustness tests [62, 66]. In a simulation study, Cameron, Gelbach and Miller [63] investigated different recently suggested bootstrapping methods to obtain asymptotic refinement in a scenario with as few as five clusters. They found that the ‘wild bootstrap’ can considerably improve statistical inference of the coefficient estimate and produces much lower over-rejection rates of the H0 than, for instance, the common practice of bootstrapping standard errors. Following this literature, as a robustness test, we applied the ‘wild bootstrap’ to all specification 1 models. In contrast to bootstrapping standard errors, the ‘wild bootstrap’ involves a bootstrap t-procedure [67], where the Wald statistic is bootstrapped, and where the resulting distribution of the Wald statistic is used to confirm or reject inference on the original Wald statistic obtained in the DID regressions.

AI Digest

Study Justification:

– The study aims to evaluate the impact of a Performance-Based Financing (PBF) intervention in Burkina Faso.
– The study design combines experimental and quasi-experimental approaches to generate policy-relevant evidence on the intervention’s impact.
– The study addresses the need for expanding the impact evaluation toolbox beyond randomized controlled trials to include quasi-experimental approaches.
– The study considers the complex health interventions embedded in real-life settings, characterized by multiple research interests and limited researcher control.

Highlights:

– The study design combines a quasi-experimental design with a nested experimental component (randomized controlled trial).
– The study focuses on evaluating the impact of the PBF intervention on health service utilization and quality of service delivery.
– The study examines the added benefit of combining PBF with equity measures, such as targeting and subsidization for the ultra-poor and community-based health insurance.
– The study aims to provide evidence on the impact of PBF on health outcomes, particularly for women and children.

Recommendations:

– The study recommends the implementation of PBF programs with equity measures to improve access to healthcare services for vulnerable populations.
– The study suggests the integration of community-based health insurance schemes with PBF to remove financial barriers to healthcare.
– The study highlights the importance of considering context-specific factors and stakeholder input in designing and implementing PBF interventions.
– The study emphasizes the need for rigorous impact evaluations to inform future decisions on health financing and service delivery.

Key Role Players:

– Ministry of Health (MoH)
– World Bank
– Independent academic institution (research team)
– District Health Management Teams
– Health facility in-charges
– Development partners

Cost Items for Planning Recommendations:

– Funding for PBF intervention implementation
– Financial and technical assistance from development partners
– Costs for targeting and subsidization measures for the ultra-poor
– Costs for implementing community-based health insurance schemes
– Budget for research team conducting the impact evaluation
– Costs for data collection, analysis, and reporting

Força da prova

The strength of evidence for this abstract is 8 out of 10.
The evidence in the abstract is strong because it describes a study design that combines experimental and quasi-experimental approaches to evaluate the impact of a performance-based financing intervention in Burkina Faso. The study employed a quasi-experimental design with a nested experimental component and used a difference-in-differences approach as the main analytical strategy. The abstract also provides detailed information on the background, methods, and objectives of the study. To improve the evidence, the abstract could include more information on the sample size, data collection methods, and statistical analysis techniques used in the study.

Resumo

In order to understand the intervention and the design decisions we made within the framework of our impact evaluation, we first provide some background information on the country and its health system. Burkina Faso is a landlocked country located in West Africa, with a population of 18.5 million. At the time the study was planned, the country’s GDP per capita stood at USD 1560 (2013, adjusted for purchasing power parity) placing it among the poorest countries in the world [20]. The 2014 Human Development Index ranked Burkina Faso 185 out of 188 countries [21]. In spite of substantial improvements over the course of the last few years, health indicators still largely lag behind regional averages. Life expectancy is at 58 years. Maternal and under-five mortality are estimated at 371/100,000 [22] and 102/1000 [23], respectively. Malaria, acute respiratory infections, and diarrhea still account for the largest proportion of child mortality, often coupled with an underlying situation of malnutrition, with nearly 40% of all children being classified as stunted. Health service delivery is organized in a three-tier system, with primary facilities (Centre de Santé et Promotion Sociale – CSPS) located in rural areas; district hospitals located in each district capital; and regional and national referral hospitals located in the regional capitals and in the national capital Ouagadougou [24]. Public facilities provide the vast majority of health services [25]. The health sector suffers from a generalized lack of resources. In 2013, total per capita health expenditure was estimated at 6.4% of GDP, equivalent to Purchasing Power Parity USD 109. Government expenditure amounted to 58% of total health expenditure, including contributions by development partners being estimated at 23% of this total. Private health expenditure is substantial as user charges continue to be applied across a variety of essential healthcare services, with more than 80% of all private expenditure on health not being channeled through pre-paid and pooled mechanisms [26, 27]. The poor health outcomes described above are, to a large extent, the result of poor access to services, with people largely under-utilizing the care they need. The literature has consistently reported that geographical barriers, due to scarcity of health facilities, and financial barriers, due to user charges, continue to hamper access to healthcare services [28–33]. The PBF program at the core of our impact evaluation rests on the experience and knowledge acquired during the implementation of a pre-pilot PBF intervention, managed by the Ministry of Health (MoH) with financial and technical assistance from the World Bank in the period 2011–2013 in three districts (Titao, Leo, and Boulsa). Within the framework of this early PBF intervention, health facilities and the MoH entered a contractual agreement whereby the MoH would reward the provision of a defined service package according to a case-based payment modality, adjusted for quality of service provision and following verification. An independent evaluation detected a positive effect of the intervention across maternal healthcare services [34]. Program data showed improvements in the quality of services provided [35]. In light of the pre-pilot experiences, the MoH, again with financial and technical assistance from the World Bank, decided to scale up the PBF intervention to an additional 12 districts in early 2014. In doing so, it was decided to test the combination of PBF with other policies under consideration for scale-up or integration into the national health system, notably community-based health insurance (CBHI) and user fee exemption for the ultra-poor [36, 37]. To understand this decision, it is important to locate the PBF program in Burkina Faso within the broader context of PBF programs supported by the World Bank over the course of the past decade through the Health Results Innovation Trust Fund (HRITF), including impact evaluations in over 20 countries, of which the majority in Sub-Saharan Africa. Emerging preliminary findings across the existing pilots suggested PBF’s potential to improve service provision, but pointed at PBF’s inability to reduce inequality in service access, specifically when implemented exclusively as a supply-side intervention [38]. It is also in the light of these considerations that the MoH and its development partners opted to implement PBF in conjunction with a series of equity measures, aimed at maximizing the potential of PBF to act as a catalyst for equity changes. It is at this point, building on the knowledge generated in other settings and looking at the specific need to address equity gaps in the country, that the World Bank realized the potential to use the case of Burkina Faso to test novel financing, purchasing and targeting mechanisms, combining elements of supply and demand side interventions into a single program. It is also at this point that knowledge generation was conceptualized as an intrinsic component of the PBF program implementation and that the decision to contract an independent academic institution to carry out the impact evaluation was made. It ought to be noted, as described in detail in the sections that follow, that the research team was not involved in key intervention design decisions (i.e. selection of intervention districts, test of equity measures), but did contribute to shaping the evaluation design as well as to elaborating the intervention design (i.e. exact design of the equity measures). Similarly to the pre-pilot, the primary objective of the PBF program was to improve the utilization and quality of maternal and child health (MCH) services, in particular among vulnerable populations such as the ultra-poor. Effectively, however, the PBF benefit package was very comprehensive and comprised a broad range of primary- and secondary-level care services beyond MCH, including also general adult curative consultations and HIV and tuberculosis services [39]. To address the abovementioned equity concerns, the MoH decided to implement four different models of PBF, three of which included special provisions to improve access to care for the ultra-poor by explicitly alleviating the financial burden imposed by user charges. In detail, the following four PBF models were implemented (Table 1): “Intervention design” PBF1: Standard PBF. PBF contracts between the MoH and the health facilities defined the services purchased by PBF (quantity), quality targets, and payment modalities. External reviewers assessed facility reports on quantity indicators on a monthly basis. Based on these verified results, contracted facilities received case-based payments for the services delivered, in addition to all preexisting financing. PBF unit prices were calculated a priori by the implementation team, on the basis of the relative cost and frequency of the services provided. Service quality was assessed with comprehensive quality checklists, verified on a quarterly basis by the District Health Management Teams. Facilities received an additional bonus calculated on the basis of quantity outcomes and service quality if they achieve a quality score of initially at least 50%, later 60%. PBF payments were intended to supplement other facility revenue in the complex pre-existing mixed payment system [40], to fund expenditure, increase savings, and allow bonus payments to staff. Facilities were initially completely autonomous in their spending decisions. In 2017, ceilings as to how much could be disbursed to staff were introduced in response to undesirable practices in some facilities. Facilities were further provided with criteria for the distribution of bonuses among staff, including cadre, position, seniority, work time, and performance. PBF1 did not include any specific provision to facilitate access to care for the ultra-poor. PBF2: Standard PBF + systematic targeting and health service subsidization for the ultra-poor. PBF2 operated according to the same contractual model as PBF1, but combined it with specific provisions to facilitate access to care for extremely vulnerable individuals residing in the health facility catchment area. These provisions included: PBF3: Standard PBF + systematic targeting and subsidization for the ultra-poor + provider incentive to offer services to the ultra-poor. PBF3 operated the same PBF contracts as PBF1 and PBF2 and included the same targeting procedures and provisions to care for the ultra-poor as PBF2. The core difference relates to the unit prices applied in PBF3, in that services provided to the ultra-poor were reimbursed at a higher rate than in PBF2 (initially at around 150% of the PBF2 prices, reduced to an average value of 115% in 2016, with variation across services). The idea was to compensate for loss of income from user fees, while at the same time offering providers an additional financial incentive to attract and provide services to the ultra-poor. As in PBF2, these higher reimbursement rates only pertained to services normally offered against payment of direct user charges at point of use. PBF4: Traditional PBF + Community-based health insurance (CBHI), including targeting and subsidization for the ultra-poor. In this case, PBF1, applying the same contractual model as described above, was introduced in parallel to CBHI. The insurance scheme was rolled out with support from the NGO ASMADE, which developed a scheme based on the model that the government had envisioned for a future universal health insurance model (Régime d’Assurance Maladie Universelle) [36]. ASMADE was selected as implementing agency given its prior experience with insurance implementation. Insurance was offered to the entire population at a yearly premium of 3900 FCFA (~ 7 USD) per person. Targeting took place following procedures similar to the ones used in PBF2 and PBF3 areas and the insurance premium for the ultra-poor was fully subsidized by the program. The insurance benefit package included a wide range of primary and secondary healthcare services. Payments to providers were made by both the insurance (in place of user charges) and by the PBF program, as case-based rewards as in PBF1. Across the four PBF models, adjustments to the quantity case-based payments were made according to the remoteness of the catchment population, staffing levels, and distance from the district capital, so that remote and disadvantaged facilities received higher case-based payments than easily accessible and better-equipped facilities. This approach resulted in nine different possible prices for the services incentivized by PBF, beyond the adjustments made in PBF2 and PBF3 for services provided to the targeted ultra-poor. The PBF program was rolled out in six regions (Centre Nord, Centre Ouest, Nord, Sud Ouest, Boucle du Mouhoun, and Centre Est), purposely selected by the MoH and its development partners as having health indicators below the national median at the onset of the intervention [42]. Within each region, the MoH purposely selected two districts to receive PBF on the basis of particularly poor outcomes on four key indicators: (i) contraceptive prevalence rate; (ii) assisted deliveries; (iii) antenatal consultations; and (iv) post-natal consultations. Identifying an adequate study design for the impact evaluation required an engagement of the main stakeholders (including MoH, the World Bank, and the independent impact evaluation research team) in an iterative discussion in which policy interests were appraised against both scientific considerations and pragmatic implementation concerns. The primary interest of the technical partners, particularly the World Bank, was to test the added benefit of moving from the standard PBF model to one that combined PBF with specific equity measures. The MoH, on the other hand, was primarily interested in identifying the overall impact of introducing PBF to generate evidence to inform future decisions on health financing in the country. The most straightforward design to accommodate both interests would have been a random allocation of all health facilities to one of the four different PBF intervention models or to a control group (i.e. status quo). This design was judged unfeasible from the policy makers’ perspective for four reasons. First, policy makers feared that randomization to either PBF or control within single districts was unfeasible since districts represent the primary operational units in the Burkinabè decentralized healthcare system and could have therefore not applied different purchasing strategies across the facilities they controlled. Second, policy makers feared that randomization to PBF and control within a single district might lead to conflict since the intervention would be highly visible and people from control groups could feel left behind. In addition, results from similarly complex health interventions with full-scale randomization report on the presence of spillover effects, due to consumers’ mobility across facilities, to shared management structures at the district level, and to intra-district competition, posing a challenge to the internal validity of the evaluations design [43, 44]. Third, the implementation of a CBHI scheme appeared to be too complex an intervention to be allocated randomly across facilities in twelve districts. The level of know-how necessary to facilitate insurance implementation was absent in most districts, hence from the very onset of the discussions, the government made clear its intention to test the insurance model in conjunction with PBF exclusively in one region, Boucle du Mouhoun, where prior experience with insurance implementation was present [45, 46]. Fourth, it became apparent that the implementation of the targeting component would be quite costly and that funds would not cover its implementation across all twelve districts. Hence, the research team and the policy stakeholders agreed to employ a quasi-experimental design with a nested experimental component (Fig. 1). In practice, this meant that within each region, two additional districts, judged by policy stakeholders to be comparable in terms of health indicators and health system structures, were selected as controls. The twelve control districts received no PBF intervention at all. Within the twelve intervention districts, the four PBF packages described earlier were implemented as follows: “Study design” Within the concerned intervention districts, facilities were allocated across the different PBF models in ‘randomization ceremonies’, attended by all health facility in-charges, district health managers, and other important district and regional stakeholders to maximize transparency. During the ceremonies, after a brief introduction of the program, health facility in-charges then took turns drawing facility names from a box containing all health facility names in the respective district. Starting with a predefined PBF model, facilities were then assigned in the order in which they were drawn from the box (i.e. 1st facility: PBF1, 2nd facility: PBF2, 3rd facility: PBF3, 4th facility: PBF1, etc.). As referral facilities to all health centers in the districts, the twelve district hospitals as well as the few regional hospitals were not randomized, but rather assigned to the PBF2 intervention package and therefore reimbursed for user fee-free treatment of targeted poor. It should be noted that although secondary-level facilities were included in the intervention to ensure continuity of care, the study focus was clearly on primary-level facilities. This design accommodated the concerns related to implementation complexity raised by policy makers, while still granting the possibility to answer their research questions. As explained in greater detail later, the quasi-experimental element of the design was used to assess the overall impact of the PBF program (irrespective of the specific intervention package) vis à vis status quo health service provision. The experimental element of the design was used to assess the specific added benefit of introducing equity measures (as in PBF2, PBF3, and PBF4) alongside the implementation of standard PBF (PBF1). It should be noted that the decision not to randomize across all covered health facilities came at the cost of having the study design’s statistical power to detect the impact of PBF compared to status quo reduced, due to a low number of clusters (defined as the districts for the quasi-experimental component) [47]. This concern was discussed among stakeholders already during the design phase. An expansion of the pilot intervention to a larger number of clusters, however, was not feasible due to financial and pragmatic constraints. Researchers and policy stakeholders therefore accepted this limitation, also in light of the fact that the primary research interest was related to the experimental component. The overall objective of the impact evaluation was to assess the impact of the PBF program on health service utilization and quality of service delivery across a wide range of targeted services. As described above, the specific focus of the impact evaluation, when compared to existing studies [48–51], was on estimating the added benefit of combining PBF with equity measures (i.e. either the ultra-poor targeting and health service subsidization or the CBHI). The specific research questions fitting the abovementioned objectives were: To address the equity implications embedded in the program, for each of the abovementioned three primary research questions, we further asked: In line with the existing conceptual literature on PBF [52–54], we postulated that the standard PBF intervention (i.e. the performance contracted implemented across all four PBF intervention packages) would affect both quantity and quality of service provision (Fig. 2). Specifically, we expected the combination of increased revenues due to performance rewards, enhanced managerial autonomy fostered by PBF, and intensified supervision on service provision to motivate healthcare providers to actively engage to increase health service provision (“will do”), while at the same time empowering them with the financial and managerial means necessary to improve quality of service delivery (“can do”). In turn, we expected improved quality of service delivery to encourage communities to seek healthcare more promptly (in spite of the fact that the Burkinabè health system still largely relied on direct user payments at point of use throughout the duration of the study), further increasing quantity of service delivery. We expected the standard PBF intervention to be equity-neutral, i.e. not to favor nor disfavor any specific socio-economic segment of society. “PBF theory of change” We expected the combination of the standard PBF1 and equity measure as applied by PBF2 and PBF3 to stimulate increases in quantity of service provision both overall and specifically for the ultra-poor. Across both the PBF2 and the PBF3 intervention arms, we expected the targeted exemptions to motivate the ultra-poor to seek healthcare more promptly and we expected this increased health service utilization to narrow existing gaps in the quantity of service delivery provided across socio-economic groups. We expected the socio-economic gap in quantity of service delivery to decrease more drastically in the PBF3 intervention arm, given the additional financial incentive offered to providers to actively reach out to the ultra-poor. In addition, given that services were purchased at a higher value than their estimated cost, we expected PBF3 facilities to increase their revenues substantially more than PBF1 and PBF2 facilities and in turn, we expected these additional revenues to be deployed towards infrastructural investments, further contributing to quality improvements in PBF3 facilities. Similar to what we described above in relation to PBF2 and PBF3, we postulated that the PBF4 intervention package would stimulate larger increases in the quantity of service delivery than the standard PBF1. We expected that increased health service utilization linked to the removal of financial barriers at point of use (thanks to insurance) would result in higher quantities of service delivery [55–59]. We also expected to observe larger improvements in the quality of service delivery in PBF4 facilities compared to PBF1 facilities, because we expected the CBHI scheme to act as an additional independent purchaser, exercising pressure on healthcare providers and hence acting as an additional stimulus for quality improvement. Given that targeting was implemented in a similar fashion across PBF2, PBF3, and PBF4 catchment areas, we expected to observe similar equity impacts, although somewhat more pronounced in PBF3 due to the additional provider incentive to reach out to the targeted ultra-poor. As addressed earlier when describing the intervention design, the list of incentivized services was identical across all four intervention models and it was very comprehensive, effectively covering almost the entirety of the essential package of services at primary and secondary levels. Hence, we did not expect healthcare providers in PBF1 and PBF4 to focus their efforts on specific services, simply because the payment structure was not set in a way as to offer any incentive to privilege some services above others. In the PBF2 and PBF3 intervention arm, however, the additional payments to compensate for loss of revenues from user fees and the additional financial incentive to treat the ultra-poor in PBF3 were exclusively tied to services normally offered against payment of direct user charges at point of use. Consequently, we expected that, when treating the very poor, providers would have an incentive to provide services for which they receive both the standard PBF payment and the additional compensation related to the attended loss of income from user fees/additional financial incentive (such as curative consultations) rather than services for which they received only the standard PBF payment (such as vaccination or antenatal care (ANC)). Therefore, in PBF2 and PBF3 facilities, we expected to observe an overall greater increase in case volume as well as a more remarkable equity improvement specifically for these services for which additional compensation was offered. Across the four PBF models, we expected increases in the quantity and quality of service provision to result in improved health status, especially among women and children, given that most indicators targeted MCH services. We did not expect the intervention to produce measurable changes in mortality patterns over its short implementation period, but we did expect the improved health service delivery resulting from PBF to be able to produce changes in simpler health status indicators, such as those related to anemia and acute malnutrition. Table 2 contains the set of main outcome indicators selected for our impact evaluation. In their selection, we took into account the above-described theory of change, including indicators at the different levels, the list of incentivized services, national and international standards, as well as data availability, quality and baseline values. In alignment with the focus of the study, the set of main outcome indicators pertains exclusively to the primary level of care. Main impact evaluation outcome indicator set An equity analysis was possible for all health care utilization and health status indicators for which data stems from the household survey, but not for indicators based on HMIS data. No equity analysis was possible for quality of care or human resources indicators The impact evaluation relied on two main sources of data: i. a household survey, conducted at baseline (October 2013–March 2014) and endline (April–June 2017); ii. a facility-based survey, also conducted at baseline and endline, including different data collection tools: a health facility assessment, a health worker survey, direct provider-patient observations (ANC and curative services for children under the age of 5 (U5)), and patient exit interviews (ANC, U5 curative services, and curative services to patients aged five or older). In addition, we used data from the routine health management information system (HMIS) to triangulate results obtained with primary data, and to estimate impact on indicators for which our sampling strategy did not generate a sufficiently large sample, namely health care utilization for acute illness and ART for prevention of mother-to-child transmission. We pooled the respective six months of routine data before the primary baseline and endline data collections and treated the resulting data points the same way as the primary data in the analyses described in the following. To collect primary data, we used a slightly revised version of the data collection tool set included in the HRITF impact evaluation toolkit [60] tailored to the needs of this specific impact evaluation and to the Burkinabé context. Table 3 illustrates the content of each data collection tool employed in our impact evaluation. Data collection tools and sources Immunization status, use of growth monitoring services Data collection teams spent one day at each sampled health facility and village, collecting data partially on paper (facility-based survey) and partially electronically (household survey) at baseline, and fully electronically at endline. In order to ensure efficient data collection, field supervisors travelled ahead of their teams to observe social protocols, finalize sampling before arrival of the data collection team, and ensure availability of interview partners. An independent quality assurance control of the endline data collection process was commissioned to the Institut de Recherche en Sciences de la Santé (IRSS), Burkina Faso. Our sampling strategy seeked to cover all health facilities and their attached catchment areas included in the intervention at baseline. Ideally, we would have liked panel data for all indicators, with the same units measured at baseline and endline, to ensure a maximally robust impact estimation. However, this was only partially possible, as explained below. The health facility survey was conducted in all public and private primary- and secondary-level health facilities in the intervention districts providing essential reproductive and child care services, as well as a random sample of health facilities in the control districts for an intervention-control facility ratio of approximately 3:1. This amounted to a total of 537 primary- and 24 secondary-level facilities in the 24 intervention and control districts surveyed at both baseline and at endline. The health facility sample was therefore a fully balanced panel. Health facilities newly opened in intervention districts between baseline and endline were included in the PBF intervention, but not in the impact evaluation sample. Health facility assessments were conducted in all health facilities included in the sample. In primary care facilities, the health worker survey was applied to all clinical skilled healthcare staff available on the day of the visit. Random sampling from all staff on payroll or a census was not feasible for reasons of time and budget. Due to high staff turnover and for the same budgetary and time reasons, it was not possible to systematically reinterview baseline respondents at endline. At secondary facilities, the health worker survey was administered to a random sample of three health workers with maternal and child health service delivery responsibilities. At all facilities where ANC services were provided on the day of the interviewer team visit, five direct provider-patient observations of ANC consultations were performed. Specifically, interviewers were instructed to observe the first five consultations as random sampling was not feasible for pragmatic reasons. Similarly, five direct provider-patient observations for U5 curative consultations presenting with a new condition were performed at all health facilities included in the sample. We further performed exit interviews with the observed patients (or their caregivers) for both service categories (ANC, U5 curative services), as well as with five curative consultation patients aged five and above at each sampled health facility. As observations and exit interviews depended on the patients naturally presenting on the day of data collection, the construction of a panel at this level was not possible. One village was randomly selected from the catchment area of each of the 523 public primary health care facilities included in the intervention. Since the secondary and few private facilities do not have an own catchment area, we could not sample a specific village at their level. The same villages were visited at baseline and endline, resulting in a fully balanced panel at the village level. In each selected village, at baseline, we randomly selected 15 households among all households with at least one currently pregnant woman or at least one woman having ended a pregnancy (irrespective of outcome) within the prior 24 months. Eligible households were identified through a preliminary comprehensive listing of all households in the selected villages, conducted by the same field enumerators who later administered the survey. This sampling criterion was chosen to allow obtaining sufficiently large sample sizes for the key indicators of interest (i.e. utilization and quality of maternal and child health care services) at much lower cost than with a fully representative population sample, which would have exceeded the budget available. We are aware that such a sample, which is not fully representative of the population in the concerned districts, limits our ability to approach research questions not directly pertaining to MCH. At endline, we organized data collection at the household level in such a way as to create, at least to the extent possible, a partial panel. Hence, we first returned to baseline households at endline. If they still fulfilled the abovementioned sampling criterion, they were included in the endline survey. If not, they were replaced with the nearest household meeting the above-outlined eligibility criteria. Our expectation was that a relatively large share of baseline households would still be eligible at endline, given high fertility rates in Burkina Faso and a three-year lag period between baseline and endline. However, we could identify and include in the endline household sample only 53% of baseline households, while 38% were no longer eligible, 5% could not be traced again and 4% were still eligible, but refused to participate again. Additional analyses showed that while the partial panel introduced some selection bias to our sample (slightly higher fertility in panel households), the bias was constant across study groups, and no other observable systematic differences between panel and non-panel households and individuals were apparent. While keeping alignment with the overall HRITF analytical framework and strategy, our analytical approach had to accommodate the specific nature of our study design, research questions, and data structure. In line with what was described earlier in this manuscript, we nested an experimental component within a broader quasi-experimental design to deliberately address the knowledge needs raised by the different policy stakeholders. As we had a balanced panel only at the health facility/village level, but not at the health worker, patient, household, and household member levels, where the majority of key indicators was measured, we could not conduct observation-level panel analyses, and therefore treated the baseline and endline samples as repeated cross-sections. Important to note is that this is not necessarily a specific weakness of our study, but rather the norm in studies pertaining to health system interventions [61]. We employed a Difference-in-Differences (DID) approach to identify the impact of PBF compared to status quo and the added benefit of the equity measures in PBF2, PBF3, and PBF4 compared to the standard PBF1. In DID, the intervention effect is estimated as the difference between the baseline-endline change in the intervention group and the control group. DID thereby isolates the intervention effect from baseline differences between the study groups, as well as from secular trends over time [47]. In contrast to our nested experiment, where impact estimates can be identified through a simple difference approach, causal inference in our quasi-experimental setting makes a DID approach necessary. In our specific case, we indeed observed significant differences between the intervention and control group on certain indicators at baseline, which are likely driven by the purposive way in which intervention and control districts were selected. We also observed secular positive and negative trends on many indicators, in light of general developments in the country and diverse national efforts and policies, most notably the exemption policy targeting pregnant and lactating women and children under five (the ‘politique de gratuité’) introduced nationally in June 2016. DID relies on two main assumptions: 1) the ‘parallel trend assumption’ that intervention and control units of observations would have in fact developed in the same manner in the absence of the treatment; and 2) the ‘stable unit treatment value assumption’ that each unit of observation was clearly either exposed to the treatment or not (i.e. no spillover) and that treatment was uniform across all units assigned to it (i.e. no contamination) [47]. To the extent possible, we aimed to validate these two assumptions with additionally collected information and data. In regards to 1), using HMIS, we found uniformity of pre-intervention trends among the different study groups on the indicators pertaining to health care utilization, strengthening our confidence in the control districts as an appropriate approximation of the counterfactual. In regards to 2), we systematically collected information on other interventions on-going and newly introduced in the intervention period. While the collected information did not allow us to adjust the models directly, it helped us in the interpretation of resulting effect estimates. For each of the indicators included in our impact evaluation, we specified three different DID models to address research questions 1 to 3. All analyses described below were applied exclusively to primary-level facilities. The twelve secondary-level facilities included in our sample were excluded from analysis due to lack of comparable controls. The panel structure of the data at the facility/village level allowed us to strengthen the precision of our estimations through the inclusion of facility/village fixed effects in the models, thereby controlling for time-invariant unobserved differences across health facilities/villages. Moreover, as an additional robustness test, we also estimated all effects relying only on the partial panel subsamples at the individual and household level. Given the specific equity focus of the impact evaluation (research question 4), for all indicators relying on household data we additionally performed all analyses with the subsample that contains only units from the lowest socio-economic quintile. This reflects an Intention-to-Treat (ITT) approach, measuring changes at the population level (e.g. “How has health service utilization changed in areas where a targeting of the ultra-poor had taken place, as opposed to areas where only the standard PBF1 was implemented?”) rather than at the individual level (e.g. “How has being selected as indigent changed health service utilization behavior?”). The latter would have been very interesting to explore, but was not possible within the standard HRITF methodology for PBF impact evaluations. Specifically, it would have required a full panel at the household member level and a substantially larger sample size to capture a sufficiently high number of individuals in the demographic groups relevant to the main indicators of interest at baseline who would later be selected as ultra-poor as the project rolled out. First, to answer research question 1, we relied on the quasi-experimental design component and compared all PBF facilities/villages (pooled across all intervention arms) to control facilities/villages, leading to the following regression equation: where Ydfit is the outcome variable for individual i from facility/village f in district d at time t with t = {2014, 2017}. 2017t is a dummy variable indicating endline observations, thus coefficient β gives the time fixed effect. PBFd is a dummy variable that equals one for individuals in PBF districts and zero for individuals in control districts. αf are facility/village fixed effects, and Xit is a set of additional individual-level covariates as relevant to the respective indicator (i.e. health worker, patient, household member, and/or caregiver characteristics). ϵdfit is the error term. Following common practice, standard errors were clustered at the district level, which is the level of treatment assignment for the quasi-experimental component of the study design [47]. The coefficient δ gives the DID estimate for the effect of being located in a PBF district when compared to non-PBF districts. Second, to answer research question 2, we relied on the quasi-experimental design component, but compared single PBF intervention arms to the controls. This leads to the following regression equation: where PBFdk is are dummy variables that equal one for individuals from facilities/village in treatment arm PBFk, where k = {1,2,3,4}. Individuals from facilities/village in control districts provide the comparison group. The DID estimates δk give the effects of PBFk in comparison to status quo (control districts). The remaining equation components are equal to specification 1. Note that as in specification 1, standard errors were clustered at the district level, the level of treatment assignment. Third, to answer research question 3, we relied on the experimental design component (randomized controlled trial) nested within the quasi-experiment and compared the PBF arms with built-in equity measures (PBF2, PBF3, and PBF4) to the standard PBF (PBF1). For this exercise we had to estimate two separate regression equations; eq. 2b to compare PBF4 to PBF1 in the two districts of Boucle du Mouhoun, where the combination of insurance and PBF and PBF alone had been randomly assigned across facilities, and eq. 2a to compare PBF2 and PBF3 to PBF1 in the eight remaining districts: where Yfit is the outcome variable for individual i from facility/village f at time t with t = {2013, 2017} in the intervention districts. In this set of analyses, standard errors were clustered at health facility/village level, the level at which random assignment into the four PBF models took place. δ2 and δ3 give the DID estimate for the effect of being located in PBF2 compared to PBF1 and PBF3 compared to PBF1, respectively. δ4 gives the DiD estimate for the effect of being located in PBF4 compared to PBF1. By design, the quasi-experimental study component is challenged by a relatively low number of clusters (24 districts), as it was not possible to randomize intervention and control facilities at facility level. Too few clusters might lead to the estimation of downward-biased standard errors and, consequently, to an over-rejection of the null hypothesis that there is no program effect. Thus, there is an elevated risk of postulating significant program effects when there is actually no effect detectable with the design. There appears to be no consensus in the literature yet as to which number of clusters is sufficient, but 24 clusters are on the lower end of the spectrum of sufficiency in available simulation studies [62, 63]. Further, studies have shown that the implications of too few clusters are considerably worse when clusters are strongly imbalanced in terms of within-cluster sample sizes as it is unfortunately the case in our study design [64, 65]. The available literature proposes several robustness tests [62, 66]. In a simulation study, Cameron, Gelbach and Miller [63] investigated different recently suggested bootstrapping methods to obtain asymptotic refinement in a scenario with as few as five clusters. They found that the ‘wild bootstrap’ can considerably improve statistical inference of the coefficient estimate and produces much lower over-rejection rates of the H0 than, for instance, the common practice of bootstrapping standard errors. Following this literature, as a robustness test, we applied the ‘wild bootstrap’ to all specification 1 models. In contrast to bootstrapping standard errors, the ‘wild bootstrap’ involves a bootstrap t-procedure [67], where the Wald statistic is bootstrapped, and where the resulting distribution of the Wald statistic is used to confirm or reject inference on the original Wald statistic obtained in the DID regressions.

Materiais

N/A

Inovações

Based on the provided information, here are some potential innovations that could be used to improve access to maternal health:

1. Mobile Health (mHealth) Solutions: Develop mobile applications or text messaging services that provide pregnant women with important health information, reminders for prenatal visits, and access to emergency services.

2. Telemedicine: Implement telemedicine programs that allow pregnant women in remote areas to consult with healthcare professionals via video conferencing or phone calls, reducing the need for travel and improving access to medical advice.

3. Community Health Workers: Train and deploy community health workers to provide prenatal care, education, and support to pregnant women in underserved areas. These workers can help bridge the gap between communities and healthcare facilities.

4. Transportation Solutions: Develop transportation programs or partnerships to provide pregnant women with reliable and affordable transportation to healthcare facilities, especially in rural areas where access to transportation is limited.

5. Financial Incentives: Implement performance-based financing programs that provide financial incentives to healthcare providers for delivering quality maternal health services. This can help improve the quality of care and incentivize providers to reach underserved populations.

6. Public-Private Partnerships: Foster collaborations between public and private sectors to improve access to maternal health services. This can involve leveraging private sector resources, expertise, and technology to expand healthcare infrastructure and services.

7. Health Education Programs: Develop comprehensive health education programs that target pregnant women and their families, providing them with information on prenatal care, nutrition, and healthy behaviors. This can help empower women to make informed decisions about their health and improve maternal health outcomes.

8. Maternal Health Vouchers: Implement voucher programs that provide pregnant women with access to essential maternal health services, including prenatal care, delivery, and postnatal care. These vouchers can be distributed to vulnerable populations to ensure equitable access to care.

9. Task Shifting: Train and empower non-physician healthcare providers, such as nurses and midwives, to provide a wider range of maternal health services. This can help alleviate the shortage of skilled healthcare professionals and improve access to care in underserved areas.

10. Quality Improvement Initiatives: Implement quality improvement initiatives in healthcare facilities to ensure that maternal health services are delivered in a safe and effective manner. This can involve training healthcare providers, improving infrastructure, and implementing evidence-based practices.

It is important to note that the specific context and needs of the target population should be considered when implementing these innovations. Additionally, rigorous monitoring and evaluation should be conducted to assess the impact and effectiveness of these interventions.

AI Innovations Description

The recommendation to improve access to maternal health is to implement a Performance-Based Financing (PBF) intervention, as described in the study. PBF is a program that rewards health facilities for providing a defined package of services, based on the quantity and quality of care provided. This intervention has been shown to have a positive impact on maternal healthcare services in Burkina Faso.

The PBF program in Burkina Faso combines elements of both supply-side and demand-side interventions. It includes specific provisions to facilitate access to care for the ultra-poor, such as targeted subsidies and exemptions from user fees. The program also integrates community-based health insurance (CBHI) to further improve access to care.

To ensure the success of the PBF intervention, it is important to involve key stakeholders, including policy makers, implementing agencies, and researchers. The design of the intervention should be based on the specific needs and context of the country. It is also important to conduct rigorous impact evaluations to assess the effectiveness of the intervention and make evidence-based decisions for future health financing strategies.

Overall, implementing a PBF intervention with equity measures, such as targeted subsidies and CBHI, can help improve access to maternal health services and reduce inequalities in healthcare utilization.

AI Innovations Methodology

The article describes a study that aimed to evaluate the impact of a Performance-Based Financing (PBF) intervention on maternal health in Burkina Faso. The study employed a combination of experimental and quasi-experimental approaches to estimate the impact of the intervention.

The PBF intervention involved a contractual agreement between the Ministry of Health and health facilities, where the facilities were rewarded for providing a defined package of services based on quality and quantity. The intervention also included equity measures such as targeting and subsidization for the ultra-poor, as well as the integration of community-based health insurance (CBHI).

To simulate the impact of the recommendations on improving access to maternal health, the study used a Difference-in-Differences (DID) approach. This approach compared the outcomes of the intervention group (facilities/villages in PBF districts) with the control group (facilities/villages in non-PBF districts). The DID approach allowed the researchers to isolate the impact of the intervention from other factors that may have influenced the outcomes.

The study collected data through a household survey and a facility-based survey, including health facility assessments, health worker surveys, direct provider-patient observations, and patient exit interviews. The data was analyzed using regression models with fixed effects and clustering at the district or facility/village level.

The study evaluated the impact of the PBF intervention on various indicators related to health service utilization, quality of service delivery, and health outcomes. The results showed that the PBF intervention had a positive impact on health service utilization and quality of service delivery, particularly in facilities/villages with equity measures. The intervention also showed potential to reduce equity gaps in access to care for the ultra-poor.

Overall, the study demonstrated the effectiveness of the PBF intervention in improving access to maternal health services in Burkina Faso. The combination of experimental and quasi-experimental approaches allowed for a rigorous evaluation of the intervention’s impact while considering the contextual constraints and policy interests.

Autores & Coautores

Statistics:

Citations: 10

Authors: 8

Identifiers:

DOI: 10.1186/s12913-019-4558-3

Research Areas:

Community Interventions, Disparities, Food Security, Health System and Policy, Infectious Diseases, Maternal Access, Maternal and Child Health, Quality of Care, Sexual and Reproductive Health, Social Determinants

Study Countries:

Burkina Faso, Multi-Countries

Study Design:

Cross Sectional Study, Grounded Theory, Quasi Experimental Study, randomised-control-trial

Study Approach:

Quantitative

Participants Gender:

Female

Partilhar isto: