Maternal and Child Health - A Satellite’s Perspective (2023 DSSG)

Authors: Haojie Wang, Emily Wesel, Poojit Hegde, Mac Ya

Summary:

Currently, basic indicators of maternal and child health (MCH) and of coverage with essential MCH services (e.g., childhood vaccinations) are obtained from expensive, nationally representative household surveys. Such surveys, however, offer little insight into how MCH and MCH service coverage indicators vary between communities, because they sample only a tiny subset of a country's villages and neighborhoods (usually <2%), and their estimates tend to be outdated.

This research project explores the use of machine learning with satellite imagery and other publicly available geotagged data to estimate key MCH indicators. We use three main sources of training data: an aggregation of geotagged data collected using Google Earth Engine, features extracted from satellite imagery via the MOSAIKS API, and raw images from the Landsat satellites. Using household surveys collected by USAID’s Demographic and Health Surveys (DHS) program as our ground truth data, we train a set of regression and classification models to predict the following MCH indicators: Mean/Median BMI, Under Five Mortality Rate, Unmet Need Rate, Skilled Birth Attendant Rate, and Stunted Growth Rate. For the numerical data from Google Earth Engine and MOSAIKS, we use Microsoft Azure’s automated machine learning functionality to generate stack ensemble models. For the Landsat images, we experimented with convolutional neural networks and vision transformers (ViT).

Our regression models trained on the extracted feature datasets from Google Earth Engine and MOSAIKS perform reasonably well for several health indicators. The model for Skilled Birth Attendant Rate achieves the highest R-squared, 0.668 (R-squared values for all models are shown in Figure 2). Among our classification models, the one for Skilled Birth Attendant Rate also achieved the highest accuracy, 68%. Altogether, our research suggests that satellite imagery and other geotagged data are a promising basis for estimating MCH indicators.
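For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how R-squared and classification accuracy are conventionally computed. It is not the project's evaluation code: the function names and the small example values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    y_bar = mean(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the observed mean.
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


def accuracy(labels_true, labels_pred):
    """Fraction of predicted class labels that match the observed labels."""
    if len(labels_true) != len(labels_pred) or len(labels_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(labels_true, labels_pred))
    return correct / len(labels_true)


# Hypothetical values for illustration only (not project data):
# observed vs. predicted skilled-birth-attendant rates for four survey clusters.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.3, 0.3, 0.7, 0.7]
print("R-squared:", round(r_squared(observed, predicted), 3))

# Hypothetical class labels for the same clusters.
observed_class = ["low", "high", "high", "low"]
predicted_class = ["low", "high", "low", "low"]
print("Accuracy:", accuracy(observed_class, predicted_class))
```

An R-squared of 1 means the predictions reproduce the survey values exactly, while 0 means the model does no better than always predicting the survey mean.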

Figure 1. Predicting the percentage of births with a skilled attendant present. Ground truth (top) versus our model.

Figure 2. R-squared results of regression models predicting MCH indicators.