WHO recommends participatory learning and action cycles with women’s groups as a cost-effective strategy to reduce neonatal deaths. Coverage is a determinant of intervention effectiveness, but little is known about why cost-effectiveness estimates vary significantly. This article reanalyses primary cost data from six trials in India, Nepal, Bangladesh and Malawi to describe resource use, explore reasons for differences in costs and cost-effectiveness ratios, and model the cost of scale-up. Primary cost data were collated, and costing methods harmonized. Effectiveness was extracted from a meta-analysis and converted to neonatal life-years saved. Cost-effectiveness ratios were calculated from the provider perspective compared with current practice. Associations between unit costs and cost-effectiveness ratios with coverage, scale and intensity were explored. Scale-up costs and outcomes were modelled using local unit costs and the meta-analysis effect estimate for neonatal mortality. Results were expressed in 2016 international dollars. The average cost was $203 (range: $61-$537) per live birth. Start-up costs were large, and spending on staff was the main cost component. The cost per neonatal life-year saved ranged from $135 to $1627. The intervention was highly cost-effective when using income-based thresholds. Variation in cost-effectiveness across trials was strongly correlated with costs. Removing discounting of costs and life-years substantially reduced all cost-effectiveness ratios. The cost of rolling out the intervention to rural populations ranges from 1.2% to 6.3% of government health expenditure in the four countries. Our analyses demonstrate the challenges faced by economic evaluations of community-based interventions evaluated using a cluster randomized controlled trial design. Our results confirm that women’s groups are a cost-effective and potentially affordable strategy for improving birth outcomes among rural populations.
The systematic review identified seven women’s group trials in six locations across four countries: India, Nepal, Bangladesh and Malawi (Prost et al., 2013). All trials used a cluster randomized controlled design to evaluate the effectiveness of a community participatory learning and action cycle using women’s groups to reduce neonatal and maternal deaths. Although the content of the group discussions was targeted at women of reproductive age, groups were open to all women. All except Malawi-MaiKhanda implemented health service strengthening in both intervention and control areas, but otherwise the control clusters carried on with current practice. More detailed explanations of the intervention and trial characteristics can be found elsewhere: India (Tripathy et al., 2010; More et al., 2012), Nepal (Manandhar et al., 2004), Bangladesh (Azad et al., 2010; Fottrell et al., 2013) and Malawi (Colbourn et al., 2013a; Lewycka et al., 2013).

Table 1 explains which of the seven trials have had cost-effectiveness analyses previously published, and which are included in this article. Sensitivity analyses have been conducted for the two trials that published separate cost-effectiveness reports (Borghi et al., 2005; Colbourn et al., 2015). Primary cost data were not collected for the one urban trial located in Mumbai, India (More et al., 2012) and costs could not be estimated retrospectively without introducing significant bias. We therefore excluded it from this analysis.

Table 1. Summary of previously published cost-effectiveness evidence

The target population was the same in all six trials: all pregnant women living in the study area. Despite a high degree of similarity in design and implementation sufficient to warrant meta-analysis of effectiveness, there were differences between the trials including: the duration, coverage and intensity of the intervention, and the size of the targeted population. These differences are described in Table 2.
For example, in the first trial in Nepal, the intervention period was relatively short (24 months) and the intervention area population relatively small (86 704). Intervention coverage as defined in the systematic review (Prost et al., 2013) was relatively high, at 37% of pregnant women having attended at least one women’s group meeting. Intervention intensity, which can be measured using the number of women’s groups and average length of the intervention cycle, was relatively low (n = 111 groups; 10 meetings per group). By comparison, in the other trials, the intervention period was up to 12 months longer, and coverage ranged from 3% (Bangladesh I) to 51% (Malawi-MaiMwana); intensity was higher (highest in Bangladesh II at 810 groups and 24 meetings per group); and the target area population was larger (largest in Malawi-MaiKhanda at 1.2 million). The trials were of substantially different sizes: the Nepal study had c.3000 live births, compared with 100 000 in Malawi-MaiKhanda.

Table 2. Description and comparison of the interventions. The figures are based on published papers (Manandhar et al., 2004; Borghi et al., 2005; Azad et al., 2010; Tripathy et al., 2010; More et al., 2012; Colbourn et al., 2013a, 2013b, 2015; Fottrell et al., 2013; Lewycka et al., 2013; Prost et al., 2013).

The WHO recommendation (World Health Organization, 2014) focused on women’s groups to reduce neonatal mortality, as this was supported by the overall meta-analysis (Prost et al., 2013). We therefore focus our analyses of cost-effectiveness on this outcome. We calculated LYS from the reduction in the neonatal mortality rate in each trial as reported in the meta-analysis (see Table 2B in Prost et al., 2013). For Malawi-MaiKhanda, the number of recorded deaths was multiplied by 11 to adjust for the fact that only about 9% of the area was randomly selected to be under surveillance over the intervention period (Colbourn et al., 2013b).
Neonatal deaths averted were multiplied by 30.81 to generate a measure of LYS. This corresponds to assuming a standard life expectancy of 86 years, a 3% discount rate and no age weighting, as recommended in the 2010 Global Burden of Disease Study (Murray et al., 2012; World Health Organization, 2017). The original economic evaluations for these trials, which are described extensively elsewhere (Borghi et al., 2005; Tripathy et al., 2010; Fottrell et al., 2013; Lewycka et al., 2013; Colbourn et al., 2015), prospectively collected cost data from a provider perspective and applied a step-down costing methodology. For the analyses presented here, we inputted the source cost data from the individual trials into a single, standardized Excel-based tool. Data categories and the procedure for allocating costs between cost centres were harmonized across trials, as we have previously described elsewhere (Batura et al., 2014). The trial designs in Bangladesh and Malawi presented two specific costing challenges that had to be addressed to ensure comparability of estimates across countries. Supplementary Appendix S1 gives further details of how costs were identified in the original evaluations; the costing challenges specific to Bangladesh and Malawi; and the conversion of figures to 2016 international dollars (INT$, henceforth $). We calculated total, annual and unit costs using the parameters shown in Table 2. Total cost of the women’s group intervention was computed as the sum of all start-up and implementation costs over the time horizon used for each trial’s cost-effectiveness analysis. This was consistent with the original evaluations, which conservatively included the costs of all activities during the start-up period (excluding research activities), such as staff recruitment and training, securing community approval and adapting intervention delivery methods, content and materials to the local context. 
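As a check on the conversion from neonatal deaths averted to LYS described above, the sketch below reproduces the factor of 30.81. The text gives the inputs (86 years, 3% discount rate, no age weighting) but not the formula, so the continuous-time discounting formula used here is an assumption, as are the function names; the number of deaths averted in the example is hypothetical.

```python
import math


def discounted_life_years(life_expectancy: float, rate: float) -> float:
    """Discounted life-years gained per death averted at age zero.

    Assumption: continuous discounting with no age weighting,
    (1 - exp(-r * L)) / r, which equals L when r is 0.
    """
    if rate == 0:
        return life_expectancy
    return (1.0 - math.exp(-rate * life_expectancy)) / rate


def neonatal_lys(deaths_averted: float,
                 life_expectancy: float = 86.0,
                 rate: float = 0.03) -> float:
    """Neonatal life-years saved for a given number of deaths averted."""
    return deaths_averted * discounted_life_years(life_expectancy, rate)


factor = discounted_life_years(86.0, 0.03)
undiscounted = discounted_life_years(86.0, 0.0)
example_lys = neonatal_lys(100)  # 100 deaths averted is a hypothetical input

print(f"LYS per death averted at 3%: {factor:.2f}")
print(f"LYS per death averted at 0%: {undiscounted:.0f}")
```

Under this assumption the 3% factor rounds to 30.81, while with no discounting each death averted counts as the full 86 life-years.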
A share of recurrent costs during the implementation period was also included as start-up costs, to reflect the recruitment and training of replacement staff. Total cost was divided by the cost-effectiveness time horizon to compute annual total cost. Implementation cost was divided by the intervention period to compute annual implementation cost. We calculated three unit cost estimates with reference to population size and the number of women’s groups in the intervention area: cost per live birth, annual cost per person and annual cost per group. Cost per live birth was computed by dividing total cost by the number of live births during the intervention period, which represents the population of potential beneficiaries of the intervention in relation to the main outcome measure, neonatal deaths averted. Annual cost per person and annual cost per group were computed by dividing annual total cost by the total population (all ages) living in the intervention area and the number of women’s groups, respectively. The design of the trials and the characteristics of the women’s group intervention precluded the identification and measurement of resource use at the individual level, and thus the estimation of unit costs at the level of individual intervention participants (Batura et al., 2014). We explored the components of total cost by computing the proportion of total costs for each of the four data categories: staff (including programme staff, women’s group facilitators and supervisors), materials, other recurrent (items such as transportation, communication, utilities and bank charges) and capital costs. A more detailed breakdown was not possible because of differences in the level of detail in the primary cost data.
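The annual and unit cost definitions above can be sketched as follows. The class and field names are illustrative rather than taken from the costing tool. The population, number of groups and approximate number of live births echo the Nepal figures quoted earlier, but the cost totals and the three-year time horizon are hypothetical values chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass
class TrialCosts:
    total_cost: float            # start-up plus implementation costs (INT$)
    implementation_cost: float   # implementation costs only (INT$)
    time_horizon_years: float    # cost-effectiveness time horizon
    intervention_years: float    # length of the intervention period
    live_births: int             # live births during the intervention period
    population: int              # all ages, intervention area
    groups: int                  # number of women's groups

    @property
    def annual_total_cost(self) -> float:
        return self.total_cost / self.time_horizon_years

    @property
    def annual_implementation_cost(self) -> float:
        return self.implementation_cost / self.intervention_years

    @property
    def cost_per_live_birth(self) -> float:
        return self.total_cost / self.live_births

    @property
    def annual_cost_per_person(self) -> float:
        return self.annual_total_cost / self.population

    @property
    def annual_cost_per_group(self) -> float:
        return self.annual_total_cost / self.groups


# Costs and time horizon below are hypothetical, not trial results.
example = TrialCosts(
    total_cost=600_000, implementation_cost=400_000,
    time_horizon_years=3, intervention_years=2,
    live_births=3_000, population=86_704, groups=111,
)

print(f"Cost per live birth:    {example.cost_per_live_birth:.2f}")
print(f"Annual cost per person: {example.annual_cost_per_person:.2f}")
print(f"Annual cost per group:  {example.annual_cost_per_group:.2f}")
```

Note that cost per live birth uses total cost, whereas the two annual unit costs use annual total cost, so the three figures are not simple rescalings of each other.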
In particular, due to lack of disaggregated data on staff costs from all six trials, we were not able to examine variation in factors such as the number of staff involved in intervention implementation, their remuneration levels and staff productivity. The cost-effectiveness ratio was calculated in the base case as cost per neonatal LYS. We compared the estimates with income-based thresholds that have been recommended by WHO, which suggest in our case that the intervention is ‘very cost-effective’ if the cost per LYS is less than annual gross domestic product (GDP) per capita, and ‘cost-effective’ if it is less than three times per capita GDP (Commission on Macroeconomics and Health, 2001). These thresholds have since come under criticism, and alternative methods for estimating thresholds have been developed (e.g. Bertram et al., 2016; Culyer, 2016; Woods et al., 2016). We used the WHO-recommended thresholds because they are currently the most widely applied. However, we also discuss the implications of a lower threshold. The analytical methods and reporting of the cost-effectiveness results follow the Consolidated Health Economic Evaluation Reporting Standards Statement (Husereau et al., 2013). The completed checklist is provided in Supplementary Appendix S2. We explored the possible reasons for variation in cost-effectiveness ratios across countries using simple two-way scatter plots and the Pearson’s correlation coefficient. First, we examined whether cost per neonatal LYS was more strongly associated with effectiveness (the number of LYS) or with unit costs (cost per live birth). Second, we compared unit costs and the cost-effectiveness ratio with coverage, scale and intensity of the intervention. Coverage, defined as the proportion of pregnant women who report having attended at least one women’s group meeting, was previously found to be a significant determinant of effectiveness (Prost et al., 2013). 
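A minimal sketch of the two comparisons described above: classifying a cost per LYS against the income-based thresholds, and computing Pearson’s correlation coefficient between two trial-level series. The function names, the label used for ratios above both thresholds and all numbers are hypothetical; a P-value for the correlation would additionally need a t-distribution and is not computed here.

```python
import math
from typing import Sequence


def classify(cost_per_lys: float, gdp_per_capita: float) -> str:
    """Apply the income-based thresholds to a cost per life-year saved."""
    if cost_per_lys < gdp_per_capita:
        return "very cost-effective"
    if cost_per_lys < 3 * gdp_per_capita:
        return "cost-effective"
    return "above both thresholds"


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient for two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for six trials, not the study's results.
cost_per_birth = [60.0, 120.0, 150.0, 200.0, 300.0, 540.0]
cost_per_lys = [140.0, 300.0, 420.0, 500.0, 800.0, 1600.0]

print(classify(500.0, gdp_per_capita=1000.0))
print(f"r = {pearson_r(cost_per_birth, cost_per_lys):.2f}")
```

With only six trials, any such coefficient rests on very few data points, which is worth keeping in mind when reading the scatter plots.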
Scale was measured by the number of live births and the total intervention area population. Intensity was measured by the number of women’s groups. A P-value of <0.05 was used to determine significance. The cost, affordability and outcomes of national scale-up in Bangladesh, India, Malawi and Nepal were then estimated to inform national policy. Previously, the affordability of national delivery has been examined only for Malawi (Colbourn et al., 2015). Scale-up analyses assumed delivery of the intervention to the whole rural population, over a 1-year period. Cost was estimated using the average annual cost per person from the trial for that context. Since our own analyses found no conclusive evidence of economies of scale (see Results section), we assumed that cost per person is constant when the intervention is scaled up. The benefits of intervening at scale were estimated, taking the same approach as in the meta-analysis (Prost et al., 2013), but updating the population parameters with more recent values. As the effectiveness of a trial may not be maintained at scale (Hanson et al., 2015), we provide two estimates of effect at scale: an upper and a lower bound. For the upper bound, we assumed that the scaled-up intervention will have the same effectiveness as reported in the meta-analysis of high-coverage trials, i.e. a 33% reduction in neonatal mortality. To estimate a lower bound, we assumed a 30% loss of effectiveness when the intervention is implemented at scale. Supplementary Appendix S3 summarizes the population data used for these calculations and describes the methods in more detail. The base case is the ‘best’ estimate of cost-effectiveness, measured with prospective cost and effect data. The sensitivity of cost-effectiveness to changes in the assumptions and estimated parameters was formally assessed against this base case using deterministic one-way sensitivity analysis.
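The scale-up arithmetic described above can be sketched as follows. Two points are assumptions rather than statements from the text: the 30% loss of effectiveness is read as a relative loss (so the lower bound is 0.33 × 0.70), and deaths averted are approximated as rural live births × neonatal mortality rate × proportional reduction, whereas the actual method is set out in Supplementary Appendix S3. All inputs in the example are hypothetical.

```python
def effect_bounds(trial_reduction: float = 0.33,
                  loss_at_scale: float = 0.30) -> tuple[float, float]:
    """Return (lower, upper) proportional reductions in neonatal mortality."""
    upper = trial_reduction
    lower = trial_reduction * (1.0 - loss_at_scale)  # relative loss, assumed
    return lower, upper


def scale_up_cost(annual_cost_per_person: float, rural_population: int) -> float:
    """One-year cost, holding cost per person constant at scale."""
    return annual_cost_per_person * rural_population


def share_of_health_expenditure(cost: float, government_health_exp: float) -> float:
    """Scale-up cost as a fraction of government health expenditure."""
    return cost / government_health_exp


def deaths_averted(rural_live_births: int, nmr_per_1000: float,
                   reduction: float) -> float:
    """Approximate neonatal deaths averted in one year (assumed formula)."""
    return rural_live_births * nmr_per_1000 / 1000.0 * reduction


# Hypothetical country inputs, not taken from the study.
cost = scale_up_cost(annual_cost_per_person=2.0, rural_population=10_000_000)
share = share_of_health_expenditure(cost, government_health_exp=500_000_000)
lower, upper = effect_bounds()
averted_low = deaths_averted(300_000, nmr_per_1000=30.0, reduction=lower)
averted_high = deaths_averted(300_000, nmr_per_1000=30.0, reduction=upper)

print(f"Scale-up cost: {cost:,.0f} ({share:.1%} of health expenditure)")
print(f"Deaths averted: {averted_low:,.0f} to {averted_high:,.0f}")
```

Because cost per person is held constant, the cost estimate scales linearly with the rural population, and only the effect estimate differs between the two bounds.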
We first added maternal LYS to the estimated neonatal LYS to explore the resulting effect on the cost-effectiveness ratio. Maternal mortality was not included in our base case because of the lack of statistical significance in the overall meta-analysis (odds ratio 0.77, 95% confidence interval 0.48–1.23). However, limiting the base case to neonatal LYS represents a highly conservative estimate of the health effects of women’s groups. The meta-analysis found that in the four trials where at least 30% of women had attended women’s groups, the intervention had a significant effect on maternal mortality (Prost et al., 2013). We therefore used the adjusted odds ratio for maternal mortality in each trial (Prost et al., 2013), and multiplied the number of maternal deaths averted by the life expectancy that corresponds to the average age at death in each trial (between 26 and 30), to calculate maternal LYS. A 3% discount rate was applied. The meta-analysis also examined effects on stillbirths but found no evidence of a reduction. We therefore did not consider LYS from stillbirths. Second, we reduced the start-up costs of all trials by 50%. This reflects the assumption that while all trials had a relatively long start-up period (as is typical of community interventions), once an intervention has been tested in a context and standardized, it is very likely that the start-up period and associated costs would fall significantly. Third, we varied the trial-specific joint cost allocation rules that were used in the original economic evaluations. The joint cost allocation rule determines what percentage of common (shared) staff, material, capital and other recurrent costs should be allocated to the women’s group intervention as opposed to other activities, such as monitoring and evaluation, process evaluation, other interventions or research. We varied the allocated share up and down by 10 percentage points from the original allocation.
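The second and third sensitivity analyses above can be illustrated with a small one-way scenario table. This is a simplified sketch: it assumes total cost is the sum of start-up costs, directly attributable implementation costs and an allocated share of a single pool of joint costs, which is coarser than the step-down costing used in the trials. All names and numbers are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CostInputs:
    start_up: float               # start-up costs (INT$)
    direct_implementation: float  # costs attributable only to women's groups
    joint_costs: float            # shared costs subject to the allocation rule
    allocation_share: float       # fraction of joint costs allocated (0 to 1)
    lys: float                    # life-years saved


def cost_per_lys(c: CostInputs) -> float:
    """Cost-effectiveness ratio under the simplified cost structure."""
    total = c.start_up + c.direct_implementation + c.allocation_share * c.joint_costs
    return total / c.lys


def clamp(share: float) -> float:
    """Keep an allocation share within [0, 1]."""
    return min(1.0, max(0.0, share))


def one_way_scenarios(base: CostInputs) -> dict[str, CostInputs]:
    """Change one parameter at a time relative to the base case."""
    return {
        "base case": base,
        "start-up costs -50%": replace(base, start_up=0.5 * base.start_up),
        "allocation +10 pp": replace(
            base, allocation_share=clamp(base.allocation_share + 0.10)),
        "allocation -10 pp": replace(
            base, allocation_share=clamp(base.allocation_share - 0.10)),
    }


# Hypothetical base case, not trial data.
base = CostInputs(start_up=100.0, direct_implementation=200.0,
                  joint_costs=500.0, allocation_share=0.40, lys=10.0)
results = {name: cost_per_lys(s) for name, s in one_way_scenarios(base).items()}

for name, value in results.items():
    print(f"{name:<22} {value:8.2f}")
```

Each scenario changes a single parameter and leaves the others at their base-case values, which is what makes the analysis one-way.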
Fourth, we conducted a specific sensitivity analysis for the two Malawi trials that tested another intervention alongside women’s groups (see Supplementary Appendix S1 for details). The proportion of women’s group implementation costs allocated to the women’s groups only arm was varied between a 33% lower bound and a 75% upper bound. This can be interpreted as reflecting alternative scenarios regarding economies of scale and scope when two interventions are implemented in the same trial. Finally, we explored two alternatives to the 3% discount rate for both costs and outcomes (NICE International, 2014): a 0% rate for both costs and life years, and a differential scenario of 6% for costs and 3% for life years (Claxton et al., 2011). The funder of the study had no role in study design, data collection, data analysis, data interpretation or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.