Background Despite a growing body of literature on HIV service costs in sub-Saharan Africa, only a few studies have estimated the facility-level cost of prevention of mother-to-child transmission (PMTCT) services, and even fewer provide insights into the variation of PMTCT costs across facilities. In this study, we present the first empirical cost estimation of the accelerated program for the prevention of mother-to-child transmission of HIV in Zimbabwe and investigate the determinants of heterogeneity of the facility-level average cost per service. To understand such variation, we explored the association between average costs per service and supply- and demand-side characteristics, and quality of services. One aspect of the supply side we explore carefully is the scale of production, which we define as the annual number of women tested or the annual number of HIV-positive women on prophylaxis. Methods We collected rich data on the costs and PMTCT services provided by 157 health facilities out of 699 catchment areas in five provinces in Zimbabwe for 2013. In each health facility, we measured total costs and the number of women covered with PMTCT services and estimated the average cost per woman tested and the average cost per woman on either ARV prophylaxis or ART. We refer to these facility-level average costs per service as unitary costs. We also collected information on potential determinants of the variation of unitary costs. On the supply side, we gathered data on the scale of production, staff composition, and the types of antenatal and family planning services provided. On the demand side, we measured the total population of the catchment area and surveyed eligible pairs of mothers and infants about previous use of HIV testing and prenatal care, and about the HIV status of both mothers and infants. We explored the determinants of unitary cost variation using a two-stage linear regression strategy. 
Results The average annual total cost of the PMTCT program per facility was US$16,821 (median US$8,920). The average cost per pregnant woman tested was US$80 (median US$47), and the average cost per HIV-positive pregnant woman initiated on ARV prophylaxis or treatment was US$786 annually (median US$420). We found substantial heterogeneity of unitary costs across facilities regardless of facility type. The scale of production was a strong predictor of unitary cost variation across facilities, with a negative and statistically significant correlation between the two variables (p350) were recommended to receive one of two options of prophylactic regimens: Option A consisted of ARV prophylaxis starting at 14 weeks of pregnancy through 7 days postpartum; Option B consisted of an ARV preventive regimen beginning at 14 weeks of pregnancy and continuing until weaning [27]. In 2013 and 2015, WHO revised these guidelines and rolled out Option B+, which recommended that all HIV-infected pregnant women receive lifelong ARV treatment, regardless of CD4 count, starting as soon as they are diagnosed [28, 29]. The Accelerated National PMTCT Program was implemented in 2011 by Zimbabwe’s Ministry of Health and Child Care (MoHCC) based on PMTCT Option A from WHO’s 2010 guidelines. Zimbabwe then switched to Option B+ in late 2013 [27, 30, 31]. Thus, at the time covered by our surveys (2012–2013), pregnant women were not exposed to Option B+. Under Option A, HIV-positive women who were not eligible for ART received ARV prophylaxis starting at 14 weeks of pregnancy through 7 days postpartum. HIV-infected pregnant women with CD4<350, regardless of symptoms, were eligible to receive lifelong ART. The recommended ARV regimen was Zidovudine (AZT) twice daily for the mother, and prophylaxis with either AZT or Nevirapine (NVP) for six weeks after birth if the mother was not breastfeeding the infant. If the infant was breastfeeding, daily NVP infant prophylaxis continued for one week after the end of the breastfeeding period [27, 30]. 
The sampling strategy followed a two-stage process. First, we randomly selected 157 out of 699 health facilities from five provinces that provided PMTCT services for the entirety of 2013 in Zimbabwe: Harare, Mashonaland West, Mashonaland Central, Manicaland, and Matabeleland South. These provinces were selected to include three of the four largest cities in Zimbabwe, rural communities with high and low HIV prevalence, representation of both major ethnic groups in Zimbabwe (Shona and Ndebele), and areas where detailed monitoring-and-evaluation data were being collected [24]. Second, in each catchment area, defined as a neighborhood of a 10-km radius around each facility, we identified eligible pairs of mothers/caregivers aged 16 or older and infants (born 9–18 months before the survey), 21,205 pairs in total. We randomly selected a fraction of all mother-infant pairs with the aim of recruiting 50 from each catchment area to participate in the study. Overall, 9,087 mother-infant pairs participated. The sampling strategy was previously described in more detail [23, 24, 25, 32]. We estimated a sample size of 157 PMTCT clinic catchment areas. We expected to identify an average of 190 infants aged 9–18 months per PMTCT clinic catchment area, assuming 10.5 living infants aged 9–18 months per 100 households and approximately 1,800 households per catchment area. If we identified all eligible infants and enrolled one in every four, this would result in 47.5 living infants aged 9–18 months per catchment area. Our estimated sample size for the baseline community survey was approximately 7,800 infants (7,442 alive and 353 deceased) and their mothers or caregivers from 157 PMTCT catchment areas. However, those initial sample size calculations were revised upwards after incorporating new data on variability in the number of eligible mother-infant pairs and the underlying HIV prevalence across catchment areas. 
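The sample-size arithmetic in the preceding paragraph can be retraced with a short calculation. The sketch below is only a back-of-envelope check using the figures quoted in the text; the variable and function names are illustrative and are not taken from the study protocol.

```python
# Back-of-envelope check of the sample-size figures quoted in the text.
# All names below are illustrative; only the numbers come from the text.

INFANTS_PER_100_HOUSEHOLDS = 10.5   # living infants aged 9-18 months
HOUSEHOLDS_PER_CATCHMENT = 1800     # approximate households per catchment area
CATCHMENT_AREAS = 157               # sampled PMTCT clinic catchment areas
ENROLMENT_FRACTION = 1 / 4          # one in every four eligible infants
ROUNDED_INFANTS_PER_AREA = 190      # rounded figure used in the text


def expected_infants_per_area(rate_per_100, households):
    """Expected number of eligible infants in one catchment area."""
    return rate_per_100 / 100 * households


# Roughly 189, which the text rounds to 190.
infants_per_area = expected_infants_per_area(
    INFANTS_PER_100_HOUSEHOLDS, HOUSEHOLDS_PER_CATCHMENT
)

# Enrolling one in four of the rounded figure gives 47.5 infants per area.
enrolled_per_area = ROUNDED_INFANTS_PER_AREA * ENROLMENT_FRACTION

# Scaling to all sampled catchment areas.
total_living = enrolled_per_area * CATCHMENT_AREAS

print(round(infants_per_area), enrolled_per_area, total_living)
```

The product comes to about 7,460 living infants, close to but not identical with the 7,442 reported; the text does not explain the small difference, so this calculation should be read as an approximation of the reasoning rather than a reproduction of it.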
The overall size of the population recruited increased from an estimated 7,800 to 9,087. The survey team administered a short questionnaire in each of the 157 health facilities. All data were collected on paper by a trained data collector and later entered into an Access™ database by a data entry operator. Researchers conducted periodic reviews of the data for completeness and consistency. We collected data on the type of facility visited, according to Zimbabwe’s classification of health centers. Our sample included ten different types of health facilities; however, for the analysis, we grouped them into hospitals and non-hospitals (see S 1 for a detailed description of the different types of facilities). The costs of PMTCT services tend to be higher in hospitals. One important reason is that hospitals are more complex facilities, given that they provide not only outpatient but also inpatient services. Hospitals also tend to employ more specialized, and therefore higher-paid, medical staff. For these reasons, one hypothesis of our study is that hospitals are less flexible than clinics in how they allocate resources. This lack of adaptability could decrease efficiency and increase costs. We adopted the perspective of service providers in the analysis. Therefore, we did not measure patients’ expenditures, such as out-of-pocket fees or transportation expenses. Moreover, only 4% of the women in the sample reported having paid for PMTCT services. We used a retrospective microcosting approach to measure monthly quantities and prices of three essential input categories: personnel, ARV drugs, and HIV test kits. These categories comprise the largest share of the total costs of PMTCT services, according to previous studies [13, 14, 33]. We did not include other inputs involved in the provision of PMTCT, such as capital costs, training, supervision, and other recurrent costs, which typically represent less than 10% of the total costs [13, 14, 33]. 
We valued all inputs at market prices, including donations, adopting an economic rather than a financial costing approach. Given that all the units in our sample are government clinics, there is no variation in input prices or salaries across facilities. The Ministry of Health centrally establishes wages based on cadre categories and purchases all essential inputs (drugs and tests) through centralized procurement processes. Thus, the variation in unitary costs across facilities reflects differences in efficiency: the ability of clinics to produce services, given the resources at their disposal and the demand characteristics they face. We collected data on the total annual outputs produced along two steps in the PMTCT service cascade: HIV testing and ARV/ART initiation. When these data were not available at the facilities, we collected them from the district health information system. We also assessed the time allocation of personnel providing PMTCT services through interviews with five randomly selected providers per facility, in which we asked them to report the time they spent working on PMTCT on each day of the previous week. Self-reported time allocation to specific tasks has been found to overestimate effort allocated to particular services [34]. We attempted to minimize this potential bias in three ways. First, instead of asking about a “typical week” or a “typical year”, which is a common wording for these questions, we asked about the “last week”; this spares the respondent any mental calculation and makes the answer easier to recall. Second, we clarified the purpose of the questions and explained that they were not part of a performance evaluation. Finally, we assured respondents that the information they provided would be used only in statistical analyses at the facility level. 
We categorized staff into three types of nurses (primary, general, and sisters in charge) and one broad category of health personnel, which includes counselors, health promotion officers, and other support staff (S 2). "Primary care nurse" was the category with the lowest average salary and the second most common in the sample, after "general nurse." Additionally, we collected information on the scope and comprehensiveness of PMTCT and other related maternal services provided at the facilities, as a proxy for quality. Process quality, which measures the extent to which providers follow the processes outlined or explicitly listed in official guidelines, has been used previously as an indicator of the quality of health care [35]. For example, Marley et al. (2004) found that process quality is as good as clinical quality in predicting patient satisfaction in hospitals in the U.S. [36]. Rademakers (2011) found that processes followed in hospitals in the Netherlands explained most of the variation in patients' evaluation of the quality of surgery and other interventions [37]. Meehan (1997) assessed the quality of care for Medicare patients hospitalized with pneumonia and found an association between the process of care and mortality [38]. There is also evidence from less affluent countries: Das and Gertler (2007) document practice quality in six different studies in five low- and middle-income countries. They provide evidence of the large effect of process quality on health outcomes compared with that of availability or structural quality [34]. Das and Hammer (2014) argue that access to healthcare is no longer the main problem in low-income countries [35]. They provide evidence suggesting that process quality is the real issue. We collected information on the availability of services offered at the facilities as a proxy for process quality. 
Specifically, we asked which services they provided out of a comprehensive list of family planning, ANC and ART/ARV prophylaxis services recommended by the government. For example, we recorded whether contraception services were offered during antenatal visits, after labor and delivery, during postnatal care visits, and during child immunization visits, and constructed an index variable for contraception services based on the number of recommended care practices undertaken in the facility. We computed similar variables for antenatal services and ARV prophylaxis or ART (see S 2 for the complete list of recommended processes used for each indicator). We also collected data on some characteristics of the demand as proxies for size and complexity. We surveyed mothers about their HIV status and estimated the maternal HIV prevalence; the rate of mother-to-child transmission of HIV; the rate of HIV-free infant survival; and the uptake of HIV testing during pregnancy. Finally, we asked each facility for the estimated total size of the population it served. All of these variables refer to the catchment area level. The survey team administered a household-level survey in 2014 to the selected sample of 9,087 mother-infant pairs. Mothers or caregivers aged 16 years or older and their infants born between 9 and 18 months before the interview (alive or deceased) were eligible to participate in the study. The data enumerator sought informed consent to complete the questionnaire and to collect dried blood samples (DBS) for HIV testing from the mother and her eligible infant. The mother could consent to both the questionnaire and DBS collection, to only one of them, or to neither. The questionnaire was administered using a Personal Digital Assistant (PDA) and captured the mother’s demographic characteristics, her experience with ANC and HIV testing, and more specifically, her experience with ANC during the pregnancy for the eligible child [31]. 
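The service indices described above (for contraception, antenatal services, and ARV prophylaxis or ART) count how many recommended practices a facility reports offering. The following is a minimal sketch of that counting step, assuming facility responses are stored as yes/no answers; the practice labels and the function name `service_index` are hypothetical, and the actual list of recommended processes is the one given in S 2.

```python
# Minimal sketch of a count-based service index.
# The practice labels are hypothetical stand-ins for the list in S 2.

CONTRACEPTION_CONTACT_POINTS = (
    "antenatal_visit",
    "after_labor_and_delivery",
    "postnatal_care_visit",
    "child_immunization_visit",
)


def service_index(facility_responses, practices):
    """Count how many of the listed practices the facility reports offering.

    Practices missing from the responses are treated as not offered,
    which is an assumption of this sketch.
    """
    return sum(1 for p in practices if facility_responses.get(p, False))


# Illustrative facility that offers contraception at two contact points.
example_facility = {
    "antenatal_visit": True,
    "after_labor_and_delivery": False,
    "postnatal_care_visit": True,
}

contraception_index = service_index(example_facility, CONTRACEPTION_CONTACT_POINTS)
print(contraception_index)
```

The same function could be applied to a list of antenatal or ARV/ART practices to build the other two indices.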
Finally, all living biological mothers and infants provided DBS for HIV testing. We calculated the total annual costs of PMTCT services for each facility as the sum of personnel and recurrent input costs used in 2013, as follows:

$$TC_j = \sum_{i} h_{ij}\, w_i + \sum_{k} x_{jk}\, p_k$$

where $TC_j$ denotes the total costs of facility $j$. Personnel costs were estimated as the sum, over personnel types, of the number of hours $h_{ij}$ that personnel of type $i$ in facility $j$ worked on PMTCT services, multiplied by the hourly wage $w_i$ corresponding to provider type $i$. Recurrent costs (ART drugs and HIV test kits) were calculated as the sum over goods $k$ of the quantities $x_{jk}$ multiplied by their prices $p_k$. Then, the facility-level average unitary costs per output along the cascade, $AC_{jl}$, were defined as:

$$AC_{jl} = \frac{TC_j}{q_{jl}}$$

where $q_{jl}$ is the 2013 annual number of outputs produced by facility $j$ for cascade indicator $l$, with $l = 1$ for the number of pregnant women tested and $l = 2$ for the number of HIV-positive women on ARV prophylaxis or treatment. The objective of this costing approach is to explore the efficiency of the PMTCT program by looking at the heterogeneity of "unitary costs" across implementers. This approach provides insight into the efficiency of the program at the facility level and across the service cascade. We explored the facility-level variation of the unitary costs of PMTCT at two steps of the service cascade: HIV testing and ARV prophylaxis or ART. First, we present the dispersion of average unitary costs by type of facility to describe the variation across facilities. Then, we explore three groups of potential determinants of the heterogeneity in unitary costs. On the supply side, we investigated the role of scale, staff categories, and facility type. On the demand side, we included in the analysis the prevalence of maternal HIV, the rates of mother-to-child transmission of HIV and HIV-free infant survival at 9–18 months, and the levels of uptake of HIV testing during pregnancy. Finally, we analyzed the role of the quality of services. 
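The cost definitions above (hours times hourly wages plus quantities times prices, then divided by each output) can be illustrated with a short calculation. This is a sketch under stated assumptions: the function names, cadre labels, and all numbers are invented for illustration, and dividing the same total cost by each output follows our reading of the definition of the unitary cost per output.

```python
# Sketch of the facility-level cost calculation described in the text.
# All names and numbers are illustrative, not study data.

def total_cost(hours_by_cadre, hourly_wage, quantities, prices):
    """Personnel costs (hours x wage) plus recurrent costs (quantity x price)."""
    personnel = sum(hours_by_cadre[c] * hourly_wage[c] for c in hours_by_cadre)
    recurrent = sum(quantities[k] * prices[k] for k in quantities)
    return personnel + recurrent


def unitary_costs(facility_total_cost, outputs):
    """Average cost per output for each cascade step with a positive output."""
    return {
        step: facility_total_cost / count
        for step, count in outputs.items()
        if count > 0
    }


# Hypothetical facility: annual PMTCT hours and hourly wages (US$) by cadre.
hours = {"primary_care_nurse": 400, "general_nurse": 300}
wages = {"primary_care_nurse": 3, "general_nurse": 4}

# Hypothetical annual quantities and unit prices (US$) of recurrent inputs.
quantities = {"arv_course": 20, "hiv_test_kit": 200}
prices = {"arv_course": 50, "hiv_test_kit": 2}

tc = total_cost(hours, wages, quantities, prices)
ac = unitary_costs(tc, {"women_tested": 190, "women_on_arv_or_art": 19})

print(tc)  # 3800
print(ac)  # {'women_tested': 20.0, 'women_on_arv_or_art': 200.0}
```

Because wages and input prices are set centrally, differences in the resulting unitary costs across facilities would come from the hours, quantities, and outputs rather than from the price terms.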
We were interested in identifying how supply- and demand-side characteristics influence the variation of unitary costs across facilities, and in particular the role of the scale of PMTCT services. However, a simple linear regression could yield a biased parameter estimate due to endogeneity, since there may be unobservable or omitted characteristics simultaneously explaining costs and scale. To minimize this potential bias, we applied a two-stage least squares model using demand size, measured by the size of the population at the catchment-area level, as an instrumental variable (IV). With this approach, we estimated, first, the association between demand size and scale and, second, the effect of scale and other supply-side characteristics on the unitary costs of PMTCT services. The second-stage regression model is specified as follows:

$$y_i = \beta_0 + \beta_1 x_i + \mathbf{s}_i'\boldsymbol{\beta}_s + \mathbf{d}_i'\boldsymbol{\beta}_d + \mathbf{q}_i'\boldsymbol{\beta}_q + \varepsilon_i \qquad \text{(Eq 1)}$$

where $y_i$ represents the log-transformed unitary costs of PMTCT services (for each step of the cascade): the average cost per pregnant woman tested for HIV in facility $i$, or the average cost per HIV-infected pregnant woman on ARV prophylaxis or ART in facility $i$. Scale is the endogenous variable, represented by $x_i$; $\mathbf{s}_i$ is a vector of $s$ supply-side characteristics, including a binary variable for type of facility (hospital or non-hospital) and the proportion of primary care nurses relative to the other types of staff; $\mathbf{d}_i$ is a vector of $d$ demand-side characteristics; $\mathbf{q}_i$ is a vector of $q$ process quality variables; the $\beta$ terms are coefficients to be estimated; and $\varepsilon_i$ is the residual term. In the first stage,

$$x_i = \gamma_0 + \gamma_1 z_i + \mathbf{s}_i'\boldsymbol{\gamma}_s + \mathbf{d}_i'\boldsymbol{\gamma}_d + \mathbf{q}_i'\boldsymbol{\gamma}_q + \nu_i \qquad \text{(Eq 2)}$$

we regress the endogenous variable on the exogenous variables in the model, including our instrument, the log-transformed size of the population in the catchment area ($z_i$), where the $\gamma$ terms are first-stage coefficients and $\nu_i$ is the first-stage residual. We then use the fitted values of $x$ (denoted $x^{*}$) in place of $x_i$ in the second stage of the analysis (Eq 1) to obtain consistent estimates of the effect of scale on $y$. 
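To make the two-stage logic concrete, the sketch below runs both stages by hand on a tiny synthetic data set with one instrument, one endogenous regressor, and no control variables. It is a simplified illustration, not the study's model: it omits the supply, demand, and quality controls and the province fixed effects, the data are constructed so that an unobserved factor `u` drives both scale and cost, and all names are hypothetical.

```python
# Two-stage least squares by hand, with one instrument and no controls.
# Synthetic data only; names are illustrative.

def simple_ols(x, y):
    """Return (intercept, slope) of a least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, x, y):
    """First stage: x on z. Second stage: y on the fitted values of x."""
    a0, a1 = simple_ols(z, x)
    x_fitted = [a0 + a1 * zi for zi in z]
    b0, b1 = simple_ols(x_fitted, y)
    return {"first_stage": (a0, a1), "second_stage": (b0, b1)}


# z: instrument (think log population); u: unobserved factor, uncorrelated with z.
z = [-1, -1, 1, 1]
u = [-1, 1, -1, 1]

# x: endogenous regressor (think log scale); y: outcome (think log unitary cost).
# The true effect of x on y is -0.5, but u pushes both x and y upwards.
x = [1 + 2 * zi + ui for zi, ui in zip(z, u)]
y = [5 - 0.5 * xi + 3 * ui for xi, ui in zip(x, u)]

iv = two_stage_least_squares(z, x, y)
naive_intercept, naive_slope = simple_ols(x, y)

print(iv["second_stage"][1])  # -0.5, the true effect
print(naive_slope)            # 0.1, biased by the unobserved factor
```

In this constructed example the naive regression even gets the sign wrong, while the instrumented estimate recovers the true slope because the instrument is unrelated to the unobserved factor. Standard errors from a manually run second stage are not valid, so in practice a dedicated IV routine would be used for inference.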
Although the validity of an instrumental variable cannot be fully tested statistically, we computed the first-stage F-statistic and compared it with the Stock-Yogo critical values [39]. To test for endogeneity, we implemented the Durbin-Wu-Hausman test with the null hypothesis of exogeneity. We also examined the two conditions required for the instrumental variable model to be valid. First, we descriptively examined the association between the instrumental variable ‘population size’ and the endogenous variable ‘scale’ of PMTCT services production (see S3 and S4 Files). We found a meaningful and statistically significant association. Second, we explored the association between costs and the size of the population and found no statistically significant correlation, which is consistent with the instrumental variable (population size) influencing the dependent variable only through scale. We acknowledge that other channels could also link population size and costs. For example, more populated areas concentrate more skilled providers, which could translate into higher staff costs or more efficient use of resources. The correlation between the presence of higher-skilled providers and costs appears to be low and is not statistically significant. Nevertheless, we addressed this possible source of bias by controlling for staff composition in all our regressions. Overall, our results are consistent with our assumption that the area population affects costs only through scale, lending validity to our approach. In addition to the IV model, we estimated two naïve OLS models, one for each of the two stages of the PMTCT cascade. These OLS models regressed log-transformed costs on the log-transformed scale and controlled for the same supply-side characteristics and quality/completeness variables as the IV models. We used robust standard errors and province fixed effects in all regressions. 
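The first-stage F-statistic mentioned above measures how strongly the instrument predicts the endogenous variable. The sketch below computes it for the simplest case of a single instrument and no controls, where the F-statistic equals the squared t-statistic of the first-stage slope. It assumes homoskedastic errors, whereas the regressions in the text use robust standard errors, so it illustrates the idea rather than the exact statistic reported; the function name and data are hypothetical.

```python
# First-stage F-statistic for one instrument and no controls,
# under homoskedastic errors. Illustrative data and names only.

def first_stage_f(z, x):
    """F-statistic for the slope in a simple regression of x on z."""
    n = len(z)
    mean_z = sum(z) / n
    mean_x = sum(x) / n
    szz = sum((a - mean_z) ** 2 for a in z)
    szx = sum((a - mean_z) * (b - mean_x) for a, b in zip(z, x))
    slope = szx / szz
    intercept = mean_x - slope * mean_z
    residuals = [b - (intercept + slope * a) for a, b in zip(z, x)]
    # Residual variance with n - 2 degrees of freedom (intercept and slope).
    sigma2 = sum(r ** 2 for r in residuals) / (n - 2)
    return slope ** 2 * szz / sigma2


# Tiny made-up sample: z is the instrument, x the endogenous variable.
z_example = [0, 1, 2, 3]
x_example = [0, 1, 1, 2]

f_stat = first_stage_f(z_example, x_example)
print(round(f_stat, 6))  # 18.0
```

A value well above the commonly cited rule of thumb of 10, or above the relevant Stock-Yogo critical value, is usually read as evidence that the instrument is not weak.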
This research was approved by the Institutional Review Board of the University of California, Berkeley.