Background: Digital innovations have shown promise for improving maternal health service delivery. However, low- and middle-income countries are still at the adoption-utilization stage. Evidence on mobile health has been described as a black box, with gaps in theoretical explanations that account for the ecosystem of health care and their effect on adoption mechanisms. Bliss4Midwives, a modular integrated diagnostic kit to support antenatal care service delivery, was piloted for 1 year in Northern Ghana. Although both users and beneficiaries valued Bliss4Midwives, results from the pilot showed wide variations in usage behavior and duration of use across project sites.
Objective: To strengthen the design and implementation of an improved prototype, the study objectives were two-fold: to identify causal factors underlying the variation in Bliss4Midwives usage behavior and to understand how to overcome or leverage these in subsequent implementation cycles.
Methods: Using a multiple case study design, a realist evaluation of Bliss4Midwives was conducted. A total of 3 candidate program theories were developed and empirically tested in 6 health facilities grouped into low and moderate usage clusters. Quantitative and qualitative data were collected and analyzed using realist thinking to build configurations that link intervention, context, actors, and mechanisms to program outcomes, by employing inductive and deductive reasoning. Nonparametric t test was used to compare the perceived usefulness and perceived ease of use of Bliss4Midwives between usage clusters.
Results: We found no statistically significant differences between the 2 usage clusters. Low to moderate adoption of Bliss4Midwives was better explained by fear, enthusiasm, and high expectations for service delivery, especially in the absence of alternatives. Recognition from pregnant women, peers, supervisors, and the program itself was a crucial mechanism for device utilization. Other supportive mechanisms included ownership, empowerment, motivation, and adaptive responses to the device, such as realignment and negotiation. Champion users displayed high adoption-utilization behavior in contexts of participative or authoritative supervision, yet used the device inconsistently. Intervention-related factors (technical challenges, device rotation, lack of performance feedback, and refresher training), context-related factors (staff turnover, competing priorities, and workload), and individual factors (low technological self-efficacy, baseline knowledge, and internal motivation) suppressed utilization mechanisms.
Conclusions: This study shed light on the optimal conditions necessary for Bliss4Midwives to thrive in a complex social and organizational setting. Beyond usability and viability studies, advocates of innovative technologies for maternal care need to consider how implementation strategies and contextual factors, such as existing collaborations and supervision styles, trigger mechanisms that influence program outcomes. In addition to informing scale-up of the Bliss4Midwives prototype, our results highlight the need for interventions that are guided by research methods that account for complexity.
A total of 6 prototype devices were deployed in 7 predominantly rural locations: 4 facilities in the Upper East Region and 3 in the Northern Region. A total of 25 maternal health workers were trained to operate Bliss4Midwives (B4M). As the device was withdrawn from 1 facility in the second month of the intervention, the evaluation focused on 6 of the 7 health facilities: facilities A to D in the Upper East Region and facilities E and F in the Northern Region. Facility A is the antenatal care (ANC) unit of a district hospital and the first-level referral point for facilities B, C, and D, which are health centers. Facility E is an independent public health unit of a district hospital, whereas F is a health center. Facilities B and C shared a single B4M device on a rotating schedule; the other facilities had stable access to 1 device each.
We employed a multiple case study design, defining a case as 1 B4M health facility [13]. Informed by knowledge of the project, ANC volume per facility, and trend analysis on adoption (first 2 months) and utilization (continued or prolonged use over time) of the device over a 10-month period (unpublished data [10]), health facilities were classified as low (average number of screenings <15 per month), moderate (average number of screenings ≥16 and ≤40 per month), or high (average number of screenings ≥41 and ≤75 per month) adoption and utilization (Table 1). Cases were subsequently grouped into 3 usage clusters (low, moderate, and high), where usage is a composite term describing adoption and utilization (Table 2). No health facility fell under the high usage cluster, which was recognized as the ideal state. The evaluation sought to understand usage variation between the low and moderate usage clusters and to reflect on how a high usage state may be attained when implementing an improved prototype.
Table 1. Adoption and utilization per health facility.
(a) N/A: not applicable.
(b) Due to data loss and the inability to track the usage trend in facility F, we relied on cumulative usage data and reports from monitoring visits.
Table 2. Clustering of cases.
(a) As utilization covered a longer period than adoption and the total duration of use varied between facilities, cases were stepped down when defining clusters to account for this.
(b) N/A: not applicable.
Realist evaluation is a theory-based approach for opening the black box on complex interventions [14,15]. It has shown promise in unraveling explanations for complex interventions in health, international development, and technological innovation [16-18]. It involves an iterative process beginning and ending with program theories, systematically moving from the specific to the abstract, described as “climbing the ladder of abstraction” [16,19]. Realist methodology is suited for evaluating B4M because it is method neutral and can aid an in-depth understanding of the explanatory processes for program outcomes as well as the identification of implicit and explicit mechanisms underlying them.
Owing to its theoretical underpinning and applicability in real-life settings, realist methodology was applied to assess differences between the low and moderate B4M usage clusters. This involved developing and subsequently testing initial program theories using qualitative and quantitative data. Identified causal explanations underlying variation in mHealth usage between clusters were framed in configurations that showed the interrelationship between the Intervention, implementation Context, participating Actors, explanatory Mechanisms, and Outcomes; in short, ICAMO configurations. Using this analytical heuristic, 2 main layers of context may be differentiated: the broad external environment in which interventions are situated (C1) and the health system or health facility setting in which mobile technology is introduced (C2).
Mechanisms broadly refer to the reasoning and responses to the B4M intervention that underlie observed outcomes; main mechanisms (M) were differentiated from subexplanatory mechanisms (m). The initial program theories of the B4M intervention, which describe how the intervention was expected to work, were developed using a 2-pronged approach:
Initial program theories. Features and characteristics of the intervention are denoted (I); contextual factors are denoted (C1) and (C2) for the environmental and health system contexts, respectively; outcomes are denoted (O1) or (O2), representing adoption and utilization, respectively; mechanisms are identified as (M1) or (M2) following the outcomes they are linked to, with related explanatory mechanisms further depicted as (m1) or (m2); actor or user characteristics are denoted (A); and (Oa) represents additional outcomes. ANC: antenatal care; B4M: Bliss4Midwives.
Using quantitative and qualitative methods, the 3 candidate initial program theories were empirically tested. Data collection activities are presented in Multimedia Appendix 1 and summarily involved:
All interviews and meetings were conducted in English, audio recorded, and transcribed verbatim. For the data on usability, negative statements were reverse coded, and raw scores were exported to SPSS. Nonparametric t test was used to compare perceived usefulness and perceived ease of use of B4M between clusters. Interview transcripts as well as observation and field notes were analyzed using realist thinking, applying an interpretive lens to build a causal web of explanations from multiple strands of evidence [22]. Using abductive inference, we started from the main outcomes of interest (adoption and utilization) and worked backward to trace plausible underlying explanations.
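The usability scoring and cluster comparison described above can be illustrated with a short sketch. It is not the authors' SPSS procedure: the text does not name the specific nonparametric test or the response scale, so the sketch assumes 5-point items and substitutes a Mann-Whitney U test (normal approximation with tie correction, no continuity correction). The function names and all response data are hypothetical.

```python
import math
from collections import Counter

# Assumption: 5-point Likert items; the actual scale is not stated in the text.
LIKERT_MIN, LIKERT_MAX = 1, 5


def reverse_code(score):
    """Flip a negatively worded item so that a higher score is always more favorable."""
    return LIKERT_MAX + LIKERT_MIN - score


def respondent_score(answers, negative_items):
    """Mean item score for one respondent after reverse coding the negative items."""
    adjusted = [reverse_code(a) if i in negative_items else a
                for i, a in enumerate(answers)]
    return sum(adjusted) / len(adjusted)


def mann_whitney_u(x, y):
    """Return (U for x, two-sided p value) using the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # U counts the pairs in which the x value exceeds the y value; ties count half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    # Tie correction: sum of (t^3 - t) over groups of identical values.
    ties = sum(t ** 3 - t for t in Counter(list(x) + list(y)).values())
    variance = n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1)))
    if variance <= 0:  # every observation identical: no evidence of a difference
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))


# Made-up responses: 3 items per respondent, item at index 1 negatively worded.
low_cluster = [respondent_score(r, {1}) for r in [[4, 2, 5], [3, 3, 4], [4, 1, 4]]]
moderate_cluster = [respondent_score(r, {1}) for r in [[5, 1, 5], [4, 2, 4], [3, 2, 5]]]
u_stat, p_value = mann_whitney_u(low_cluster, moderate_cluster)
print(f"U = {u_stat}, p = {p_value:.3f}")
```

With groups as small as those in this study, an exact test would usually be preferable to the normal approximation; the approximation is used here only to keep the sketch dependency free.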
We queried the data for mechanisms of perceived usefulness, perceived ease of use and empowerment (self-efficacy and confidence) for adoption, and the mechanism of recognition for utilization, while being open to new configurations. A cumulative stepwise approach applying inductive and deductive reasoning was employed. First, aided by an Excel spreadsheet, we entered information on each health facility that spoke to elements of the ICAMO configuration into rows and columns, including supporting quotes. Furthermore, previous analysis has shown that over time, the intervention itself can become a new contextual layer within the study setting [9]. Nevertheless, we chose to differentiate the intervention (I) from the existing contextual factors (C1 or C2) to clarify the resources and support that are specifically introduced by B4M. As our data were closer to the project itself than to the broader environmental context (C1), we did not have sufficient strands of evidence on this level. Next, the realist thinking of “if C, then O, because M, for A” was applied to develop ICAMO configurations for each cluster. This involved grouping similar patterns and corroborating or voiding strands of preliminary evidence. Although most evidence strands manifested to varying degrees in each facility, when these were not sufficient to explain usage behavior, they were discarded from the configuration. Theory testing and refining were incremental; data from the low usage cluster were first assessed and then compared with data from the moderate usage cluster. Finally, a cross-case comparison between clusters was used to develop refined program theories.
Study approval was granted by the Navrongo Health Research Centre Institutional Review Board (approval ID: NHRCIRB18) and the EMGO+ Scientific Committee of the Amsterdam Public Health Institute (reference number: WC2017-026). Before all interviews, written consent was secured using informed consent forms.
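The tabulation step in the analysis, in which each evidence strand is entered as a row of ICAMO elements and then read as “if C, then O, because M, for A”, can be pictured with a minimal sketch. The class, field names, and example row are hypothetical placeholders chosen for illustration; they do not reproduce the study's spreadsheet or any configuration reported by the authors.

```python
from dataclasses import dataclass, field


@dataclass
class ICAMO:
    """One evidence strand, laid out like one spreadsheet row."""
    facility: str       # hypothetical facility label
    cluster: str        # usage cluster, "low" or "moderate"
    intervention: str   # I: resources and support introduced by the device
    context: str        # C2: health system or health facility context
    actors: str         # A: actor or user characteristics
    mechanism: str      # M: reasoning or response that explains the outcome
    outcome: str        # O: adoption or utilization
    quotes: list = field(default_factory=list)  # supporting interview quotes

    def statement(self):
        """Render the realist reading 'if C, then O, because M, for A'."""
        return (f"If {self.context}, then {self.outcome}, "
                f"because {self.mechanism}, for {self.actors}.")


def group_by_cluster(rows):
    """Collect rows per usage cluster so that clusters can be compared."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row.cluster, []).append(row)
    return grouped


# Placeholder row for illustration only; not a finding of the study.
example = ICAMO(
    facility="X",
    cluster="moderate",
    intervention="device with initial training",
    context="supervision is participative",
    actors="trained midwives",
    mechanism="they feel recognized",
    outcome="the device is used over time",
)
print(example.statement())
print(sorted(group_by_cluster([example])))
```

Keeping each strand as a separate row makes it straightforward to discard strands that do not explain usage behavior and to compare the remaining rows across clusters.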