Use of big data and machine learning methods in the monitoring and evaluation of digital health programs in India: An exploratory protocol


Study Justification:
– Digital health programs in India have the potential to generate “big data”
– Rigorous evaluations of digital health programs are limited, and few have included applications of machine learning
– This study aims to evaluate two digital health programs in India and illustrate possible applications of machine learning for program evaluation and improvement
Highlights:
– The study focuses on two digital health programs in India: the maternal mobile messaging service (Kilkari) and the mobile training resource for frontline health workers (Mobile Academy)
– The study aims to generate evidence on program effectiveness and improve implementation using machine learning methods
– The study has obtained the necessary approvals for data use; results are expected to be published in August/September 2019
Recommendations:
– Implement machine learning approaches to analyze routine system-generated data from the digital health programs
– Identify gaps in data quality, poor user performance, predictors for call receipt, user listening levels, and linkages between early listening and continued engagement
– Use the results to improve the performance of the digital health programs and enhance their effectiveness
Key Role Players:
– Public health practitioners
– Program implementors and evaluators
– Data entry operators
– Ministry of Health and Family Welfare of the Government of India
– Bill & Melinda Gates Foundation
– Johns Hopkins University
– University of Cape Town
– BBC Media Action
Cost Items:
– Data processing and analysis software
– Data storage and security measures
– Training and capacity building for data analysts
– Collaboration and coordination costs with key role players
– Ethical certification and approvals
– Publication and dissemination of results

The strength of evidence for this abstract is rated 7 out of 10.
The rating reflects the abstract's detailed description of the data sources, methods, and proposed analysis; however, the abstract does not report any specific results or findings. Including preliminary results or expected outcomes of the analysis would strengthen the evidence.

Background: Digital health programs, which encompass the subsectors of health information technology, mobile health, electronic health, telehealth, and telemedicine, have the potential to generate “big data.”

Objective: Our aim is to evaluate two digital health programs in India: the maternal mobile messaging service (Kilkari) and the mobile training resource for frontline health workers (Mobile Academy). We illustrate possible applications of machine learning for public health practitioners that can be applied to generate evidence on program effectiveness and improve implementation. Kilkari is an outbound service that delivers weekly gestational age-appropriate audio messages about pregnancy, childbirth, and childcare directly to families on their mobile phones, starting from the second trimester of pregnancy until the child is one year old. Mobile Academy is an Interactive Voice Response (IVR) audio training course for accredited social health activists (ASHAs) in India.

Methods: Study participants include pregnant and postpartum women (Kilkari) as well as frontline health workers (Mobile Academy) across 13 states in India. Data elements are drawn from the system-generated databases used in the routine implementation of the programs to provide users with health information. We explain the structure and elements of the extracted data and the proposed process for their linkage. We then outline the steps to be undertaken to evaluate and select final algorithms for identifying gaps in data quality, poor user performance, predictors for call receipt, user listening levels, and linkages between early listening and continued engagement.

Results: The project has obtained the necessary approvals for the use of data in accordance with global standards for handling personal data. The results are expected to be published in August/September 2019.

Conclusions: Rigorous evaluations of digital health programs are limited, and few have included applications of machine learning. By describing the steps to be undertaken in applying machine learning approaches to the analysis of routine system-generated data, we aim to demystify the use of machine learning not only in evaluating digital health education programs but in improving their performance. Whereas articles on analysis typically explain only the final model selected, here we emphasize the process, thereby illustrating to program implementors and evaluators with limited exposure to machine learning its relevance and potential use within the context of broader program implementation and evaluation.

We present the methods section in parts. We first give a detailed description of the data we plan to use as our source, including the architecture of the databases and their data elements. Program data are currently held in different databases located in Gurugram, and call data records are held in the Mobile Network Operator’s datacenter in Delhi. We then describe the data munging (ie, data wrangling) and analysis methods, including a brief description of the various machine learning algorithms under consideration.

Auxiliary nurse midwives collect and register details of pregnant women and, after delivery, of postpartum women and children born in their catchment areas. These data are captured in print registers and uploaded at the block level by data entry operators, forming the data in the pregnancy tracking databases. The data collected include personal identifiers such as geographic location, names of women, and a mandatory mobile phone number, and, where available, details of the pregnancy and childbirth. Data capture happens at two key time points: (1) the registration of the woman at the time the pregnancy is identified, and (2) following childbirth, when the details of delivery care are available. In practice, data capture may occur many days or months after the event itself (pregnancy registration or birth of the child).

Figure 1 summarizes the databases and flow of data for both Mobile Academy and Kilkari. The existing databases include the Reproductive and Child Health (RCH) database, the Mother and Child Tracking System (MCTS), the MOTECH (Mobile Technology for Community Health) database, and the IVR system’s databases, described below.

Figure 1. Summary of data flow for Kilkari and Mobile Academy.

The registration data on pregnant women and ASHAs are collected by the Ministry of Health and Family Welfare of the Government of India and the ministries of health of the states participating in the program. The data will be analyzed under a data sharing agreement between the Bill & Melinda Gates Foundation and Johns Hopkins University, the University of Cape Town, and BBC Media Action. The Institutional Review Boards of the Johns Hopkins School of Public Health, Sigma in New Delhi, India, and the University of Cape Town have provided ethical certification for the study.

For the Kilkari program, data on pregnant and postpartum women are captured in the RCH and MCTS systems, or in state-based systems that pass data to RCH or MCTS, and from there to the MOTECH system. Before the data are accepted by MOTECH, the system automatically runs validations to check that mobile numbers are in the correct format, locations match the location masters in the MOTECH database, and the last menstrual period and date of birth fall within the Kilkari timeframe. The MOTECH system uses the last menstrual period or the delivery date to determine the schedule of messages to be delivered. The MOTECH engine provides the list of phone numbers (clients) to be called each day to the IVR system, which then calls the numbers and plays the appropriate prerecorded message, stored in the IVR system’s content management system. If a call is not answered, the IVR system attempts to call again, at least 3 times every day for 4 days, until the call is answered.

For the Mobile Academy program, details on ASHAs, including their names, phone numbers, geographic location, and age, are contained in either the RCH or MCTS databases, or in state-based databases integrated with MCTS, and are used to register them to Mobile Academy.
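To make the validation step concrete, the following is a minimal Python sketch of the kind of checks described above; the field names, the phone number rule, and the timeframe threshold are illustrative assumptions, not the actual MOTECH implementation.

```python
import re
from datetime import date, timedelta

# Illustrative sketch of the kind of validation MOTECH runs before accepting
# a record; field names, the phone number rule, and the timeframe threshold
# are assumptions, not the actual MOTECH implementation.
KILKARI_MAX_WEEKS = 92  # assumption: ~40 weeks of gestation + 52 weeks of infancy

def validate_record(record: dict, location_master: set) -> list:
    """Return a list of validation errors for one incoming record."""
    errors = []
    # Mobile numbers must look like 10-digit Indian mobile numbers.
    if not re.fullmatch(r"[6-9]\d{9}", record.get("mobile", "")):
        errors.append("invalid mobile number format")
    # Location codes must match the location masters in the MOTECH database.
    if record.get("location_code") not in location_master:
        errors.append("location not in location master")
    # The last menstrual period must fall within the Kilkari timeframe.
    lmp = record.get("lmp")
    earliest = date.today() - timedelta(weeks=KILKARI_MAX_WEEKS)
    if lmp is None or not (earliest <= lmp <= date.today()):
        errors.append("LMP outside Kilkari timeframe")
    return errors
```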
The MOTECH engine captures these data on ASHAs from the RCH or MCTS databases, and following registration to Mobile Academy, ASHAs are eligible to call in to the IVR system using the same phone number provided in the RCH database. The IVR system validates the phone number against the MOTECH system and then retrieves the “bookmark” information detailing the status of the ASHA and her progress through the content expected to be covered. Based on this information, the appropriate content is delivered to the ASHA via the IVR system, and the updated data are returned to the MOTECH database.

The data from the databases (Figure 1) will be extracted onto secure, password-protected hard drives from each server’s storage. Merging the data files will be complex given the nature of identifiers across databases. An MCTS record does not have a beneficiary ID; instead, it has a “Mother” (pregnancy) ID or a “Child” ID. In other words, MCTS tracks pregnancies and births rather than women. When Kilkari first went live in October 2015, it mirrored the MCTS approach and generated subscription IDs for each pregnancy and then each birth. The newer RCH database, however, does have a unique beneficiary ID, which enables the system to track an individual woman through her multiple pregnancies and the births of her children. The architecture of the MOTECH database and Kilkari was changed in December 2016 to introduce a unique beneficiary ID, and MOTECH was then integrated with RCH in mid-2017. An additional complexity is that MOTECH used to allow multiple Kilkari subscriptions on one mobile number, on the assumption that a single phone could be shared by several women in a joint family. This feature was removed in 2017 (July 28 for RCH and October 6 for MCTS) because of the complexity it created in analyzing system-generated data. Hence, the analytic time horizon assumed in the analysis may span 2017-2018, after the MCTS-RCH integration occurred and the aforementioned changes were made. The merging of datasets will occur in India, and only de-identified data will be stored on the hard drives and used in this analysis. As part of Study Aim 1, we will examine the quality of the data for completeness, including patterns and any geographic clustering in missingness.

Analyses described in this section are being carried out as part of a larger external evaluation of Kilkari, which includes a concurrent randomized controlled trial (RCT) in the state of Madhya Pradesh (MP), inclusive of baseline surveys with pregnant and postpartum women and with ASHA workers. Once women are identified through baseline survey activities and randomized to receive Kilkari content (or no content at all), their phone numbers will be fed directly into the MOTECH database for provision of program services. For pregnant women, additional data collected as part of the baseline household surveys include demographic factors (age, education, parity, literacy), socioeconomic characteristics (household assets, conditions), health care seeking and practices, and data on digital literacy and phone access. These data can be linked to MOTECH, IVR, and call center records to provide additional data elements. Overall, these data, together with data on technology performance (receipt of messages) and user engagement (behavioral performance) with content, will help estimate exposure to Kilkari used in the assessment of causality as part of the RCT.
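As a sketch of the linkage and de-identification steps described above, the Python fragment below merges two extracts on the post-2017 unique beneficiary ID and replaces direct identifiers with salted one-way hashes before export; the column names and schema are assumptions for illustration, not the actual database layout.

```python
import hashlib
import pandas as pd

# Illustrative sketch of the linkage and de-identification described above;
# the column names (rch_beneficiary_id, mobile, name) are assumed for
# illustration and do not reflect the actual database schema.

def pseudonymize(value, salt: str) -> str:
    """Replace a direct identifier with a salted one-way hash."""
    return hashlib.sha256((salt + str(value)).encode()).hexdigest()

def link_and_deidentify(rch: pd.DataFrame, motech: pd.DataFrame,
                        salt: str) -> pd.DataFrame:
    # Post-2017 records carry a unique beneficiary ID, so linkage can key on
    # it; earlier MCTS-era records would need pregnancy or child IDs instead.
    merged = rch.merge(motech, on="rch_beneficiary_id", how="inner")
    merged["pseudo_id"] = merged["rch_beneficiary_id"].map(
        lambda v: pseudonymize(v, salt))
    # Drop direct identifiers so only de-identified data leave the server.
    return merged.drop(columns=["rch_beneficiary_id", "mobile", "name"],
                       errors="ignore")
```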
For ASHAs, baseline survey data will include similar data elements on demographics, socioeconomic status, mobile literacy, and phone access, as well as knowledge and work-related variables linked to reported motivation and satisfaction. Overall, these added data elements can be linked to IVR and call record data for the subpopulation of Mobile Academy and Kilkari users in the four districts of MP where the RCT is underway.

Descriptive statistics, including univariate plots such as histograms, will be used to understand the distribution of each variable, including skewness and outliers. Multivariate plots such as scatterplots and locally weighted scatterplot smoothing (LOWESS) lines will be used to understand the relationships between variables.

Efforts to prepare the data are divided into two parts: splitting the data into training and testing groups, and data processing. To avoid overfitting models that work well for the data in hand but fail to predict well on other datasets, the data will be split into three components, which is possible given the large size of the dataset. The training set will comprise 60% of the data, the test set 20%, and the validation set 20%. The test set will be used to test and fine-tune the accuracy of predictive models, and the final selected model will be applied to the validation dataset. We anticipate having data from 2017, 2018, and 2019; to control for time as a confounder, the subsets will be drawn by random sampling so that each time period is equally represented across them.

Data processing is the act of transforming the data from its raw format into a format usable by the machine learning models. Indications for data processing include (1) making the data easier to use, for example by creating new indicators to facilitate their use as predictors; (2) reducing the computational cost of many algorithms by decreasing the number of variables, especially correlated and collinear variables; (3) removing noise due to outliers; and (4) making the results easier to understand. The transformation of variables may be achieved by a variety of techniques, including the creation of composite indicators and Box-Cox transformations.

The most common methods by which algorithms learn about data to make predictions are supervised, unsupervised, and semisupervised learning [1]. Supervised learning trains algorithms using example input and output data previously labeled by humans. Data may be labeled, a term denoting that the outcome (or class) is known (eg, an ASHA has or has not completed the training module), or unlabeled. In contrast, unsupervised learning is concerned with uncovering structure and patterns within complex datasets based on information that is neither classified nor labeled; the algorithms learn to infer structure from unlabeled input data using clustering techniques. Semisupervised learning is a hybrid analytic technique, applied in contexts where the majority of data points are missing outcome information and yet prediction remains the goal [1]. In this program context, supervised machine learning algorithms are expected to be the primary analytic method because the analyses focus on classification using predictors and the available data are expected to be labeled.
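A minimal sketch of the 60/20/20 split described above, stratified on calendar year so that each subset is equally represented across time periods, might look as follows; the data frame `df` and its `year` column are assumptions for illustration.

```python
from sklearn.model_selection import train_test_split

# Minimal sketch of the 60/20/20 split described above, stratified on
# calendar year so each subset is equally represented across time periods;
# the data frame `df` and its `year` column are assumed for illustration.
train, holdout = train_test_split(df, test_size=0.4,
                                  stratify=df["year"], random_state=42)
test, validation = train_test_split(holdout, test_size=0.5,
                                    stratify=holdout["year"], random_state=42)
```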
Unsupervised machine learning techniques, including dimensionality reduction techniques such as principal component analysis (PCA) and K-means clustering, will be carried out as appropriate. PCA uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The first principal component has the largest possible variance and accounts for the highest proportion of the variance in the data, with each succeeding component accounting for the highest variance possible after the previous components are accounted for. K-means clustering is a way to use data to uncover natural groupings within a heterogeneous population (Table 1). To uncover patterns, the algorithm starts by assigning data points to random groups. The group centers are then calculated, and group memberships are reassigned based on the distances between each data point and the group centers. This process is repeated until group memberships no longer change between iterations [16]. In its application to Mobile Academy, K-means clustering will be used to detect patterns in ASHA engagement with training content, including training initiation and completion. Among Kilkari users, K-means clustering will be used to assess patterns in exposure to content by user characteristics based on data elements available in the RCH database, including parity, age, and geographic area.

Table 1. Sample of data elements by source for Kilkari and Mobile Academy.

Once the data have been processed, testing of algorithms will be carried out. Table 2 summarizes the algorithms proposed for training along with their intended applications to Mobile Academy and Kilkari. To determine the model with the best fit, we will explore several machine learning approaches in turn. Models will be fit on the training set, and each fitted model will be used to predict the responses for the observations in the validation set. The preferred analytic approaches will be selected based on their ability to minimize the total error of the classification, defined as the probability that a solution will classify an object under the wrong category. We describe each approach considered below in lay terminology, along with indications for use and its proposed application in the evaluations of Mobile Academy and Kilkari.

Table 2. Summary of algorithms proposed for testing and their intended application to Mobile Academy and Kilkari.

Our choice of methods will include a mix of algorithms, based on their strengths and weaknesses and the objective of the process. A comprehensive comparison of supervised learning methods is provided in the literature [17,18]. Support vector machines (SVMs) and neural networks (NNs) are expected to perform better with continuous data, while the Naïve Bayes method and decision trees perform better with discrete or categorical variables. Naïve Bayes and decision trees have good tolerance of missing values, while NNs and SVMs do not. NNs and Naïve Bayes have difficulty handling irrelevant and redundant attributes (ie, extra variables with no useful information, or variables with too many categories and too few observations), while SVMs and decision trees are insensitive to them. Variables with high correlation negatively affect the performance of both Naïve Bayes and NNs, whereas SVMs are relatively robust to correlated variables. While Naïve Bayes is robust to noise, NNs are sensitive to poorly measured variables and susceptible to overfitting.
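As an illustration of the unsupervised steps above, the sketch below standardizes a feature matrix, reduces it with PCA, and clusters the result with K-means; the feature matrix `X`, the 90% variance threshold, and the choice of four clusters are assumptions for illustration.

```python
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Sketch of the unsupervised steps described above: standardize the features,
# reduce correlated variables with PCA, then group users with K-means. The
# feature matrix `X`, the 90% variance threshold, and the choice of four
# clusters are assumptions for illustration.
X_scaled = StandardScaler().fit_transform(X)
X_reduced = PCA(n_components=0.90).fit_transform(X_scaled)  # keep 90% of variance
clusters = KMeans(n_clusters=4, n_init=10,
                  random_state=42).fit_predict(X_reduced)
```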
NNs and SVMs perform well with multidimensional data and when the relationship between predictor and outcome is nonlinear. Naïve Bayes requires less memory for both the training and validation phases, whereas NNs require large memory allocations across all phases. SVMs and NNs usually outperform other methods, while Naïve Bayes may yield less accurate results. Table 3 compares the strengths and weaknesses of different supervised machine learning methods.

Table 3. Performance comparisons of learning algorithms, modified from Kotsiantis et al [17,18] (++++ represents the best and + the worst performance).

To facilitate decision making on the optimal analytic approach, three steps will be undertaken: (1) develop the correct model for each algorithm using the training dataset, (2) apply the final model for each algorithm to the test dataset, and (3) apply the best performing algorithm to the validation dataset. In Step 1, algorithms will be run using the training dataset, comprising 60% of the total sample from across all states for which data are available. For each algorithm, iterative testing will be run to select the model that best fits the data. The emerging results will then be assessed for model fit and accuracy. Table 4 summarizes the four proposed metrics for assessing the performance of each model.

Table 4. Metrics for assessing the performance of each model. (TP: true positive; TN: true negative; FP: false positive; FN: false negative.)

To illustrate the definition of the performance metrics for Mobile Academy, we define true positives (TP) as the number of correctly classified ASHAs who have completed the training, and true negatives (TN) as the number of correctly classified ASHAs who have not completed the training. False positives (FP) are the number of ASHAs incorrectly classified as having completed the training, while false negatives (FN) are the number of ASHAs incorrectly classified as not having completed the training. Results from the performance metrics will help define the final model for each algorithm. In Step 2, the final model for each algorithm will be applied to the test dataset, which comprises approximately 20% of the total data. Using the same performance metrics, the models with the best fit and accuracy will be applied to the validation dataset as part of Step 3. Ultimately, predictions for Mobile Academy will aim to determine the probability of an ASHA finishing the course within a predetermined time frame and the likely score or performance of individual ASHAs. For Kilkari, we will determine predictors of exposure to Kilkari content based on user characteristics, and explore the effect of early listening patterns on postpartum engagement and overall exposure.
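To ground these definitions, the following sketch computes a confusion matrix and some common derived metrics for the Mobile Academy example; the label arrays and the specific metrics shown (accuracy, precision, recall, F1) are assumptions for illustration, as the protocol's Table 4 is not reproduced here.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

# Sketch of the Mobile Academy performance metrics defined above: y_true
# marks ASHAs who actually completed the training, y_pred the model's
# classification. The label arrays and the specific metrics shown are
# assumptions for illustration, as Table 4 is not reproduced here.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
print("accuracy :", accuracy_score(y_true, y_pred))   # (TP + TN) / total
print("precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("recall   :", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("F1       :", f1_score(y_true, y_pred))         # harmonic mean of the two
```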


The use of big data and machine learning methods can be an innovative approach to improve access to maternal health. In the context of the maternal mobile messaging service (Kilkari) and the mobile training resource for frontline health workers (Mobile Academy) in India, here are some potential recommendations:

1. Predictive analytics: Machine learning algorithms can be used to analyze the large amount of data generated by the digital health programs. By identifying patterns and trends in the data, predictive models can be developed to anticipate gaps in data quality, poor user performance, and predictors for call receipt. This can help improve the effectiveness of the programs and ensure that the right messages are delivered to the right users at the right time.

2. Personalized interventions: Machine learning algorithms can also be used to personalize the interventions provided through the digital health programs. By analyzing user characteristics and engagement patterns, the algorithms can tailor the content and delivery of messages to meet the specific needs and preferences of individual users. This can enhance user engagement and improve the impact of the interventions.

3. Real-time monitoring and evaluation: Big data analytics can enable real-time monitoring and evaluation of the digital health programs. By continuously analyzing the data generated by the programs, public health practitioners can quickly identify areas of improvement and make timely adjustments to the interventions. This can lead to more efficient and effective implementation of the programs.

4. Data integration and interoperability: Machine learning algorithms can help integrate and analyze data from different sources, such as the databases used in the routine implementation of the programs and the call data records from mobile network operators. By linking and analyzing these data, public health practitioners can gain a comprehensive understanding of the program’s impact and identify opportunities for improvement.

5. Data-driven decision making: By leveraging big data and machine learning, public health practitioners can make data-driven decisions to improve access to maternal health. The insights gained from the analysis of the data can inform policy and programmatic decisions, resource allocation, and targeted interventions. This can lead to more effective and efficient maternal health services.

Overall, the use of big data and machine learning methods in the monitoring and evaluation of digital health programs can provide valuable insights and innovations to improve access to maternal health.
AI Innovations Description
The recommendation to improve access to maternal health is to use big data and machine learning methods in the monitoring and evaluation of digital health programs. This can be achieved by analyzing data from digital health programs such as the maternal mobile messaging service (Kilkari) and the mobile training resource for frontline health workers (Mobile Academy) in India.

The data for analysis are drawn from system-generated databases used in the routine implementation of these programs. The data include information on pregnant and postpartum women (Kilkari) and frontline health workers (Mobile Academy). The data elements captured include personal identifiers, geographic location, pregnancy and childbirth details, and mobile phone numbers.

To analyze the data, machine learning algorithms can be applied. Supervised learning algorithms, such as support vector machines (SVM) and neural networks (NN), can be used for continuous data, while Naïve Bayes and decision trees are suitable for discrete/categorical variables. The algorithms will be trained and tested using a split of the data into training, testing, and validation sets.

The performance of the algorithms will be assessed using metrics such as true positives, true negatives, false positives, and false negatives. The best-performing algorithm will be selected based on its fit and accuracy.

The results of the analysis can provide insights into program effectiveness, identify gaps in data quality, predict user performance, and understand user engagement. This information can be used to improve the implementation of digital health programs and ultimately improve access to maternal health services.
AI Innovations Methodology
The methodology described in the text outlines a process for using big data and machine learning methods to evaluate and improve access to maternal health through digital health programs in India. Here is a brief summary of the methodology:

1. Data Source: The study uses data from two digital health programs in India – the maternal mobile messaging service (Kilkari) and the mobile training resource for frontline health workers (Mobile Academy). Data elements are drawn from system-generated databases used in the routine implementation of these programs.

2. Data Munging and Analysis: The data undergo data wrangling and analysis, including consideration of various machine learning algorithms. The data are processed to make them easier to use, reduce computational cost, remove noise, and make the results easier to understand.

3. Training and Testing: The data are split into training, testing, and validation datasets. Supervised learning algorithms are used to train the models using labeled data. Unsupervised learning techniques, such as clustering, are used to uncover patterns in the data.

4. Algorithm Selection: Several machine learning approaches are explored, including support vector machines (SVM), neural networks (NN), Naïve Bayes, and decision trees. The algorithms are evaluated based on their ability to minimize classification errors and their suitability for the specific applications in Mobile Academy and Kilkari.

5. Performance Evaluation: The performance of each model is assessed using metrics such as true positives, true negatives, false positives, and false negatives. The models with the best fit and accuracy are selected as the final models.

6. Application to Validation Dataset: The final models are applied to the validation dataset to further evaluate their performance and make predictions. For Mobile Academy, the models aim to predict the probability of frontline health workers completing the training course. For Kilkari, the models aim to determine predictors for exposure to maternal health content and explore the effect of early listening patterns on postpartum engagement.

By using this methodology, the study aims to generate evidence on the effectiveness of digital health programs and improve their implementation to enhance access to maternal health services in India.
