Sebastian Ruiz's GitHub repository for Data 150 at William and Mary
Word Count: 2108
Topic Proposal
If a person is sick or disabled and does not have access to adequate healthcare, they may be unable to work or look for work, they may be unable to take care of their children, they may resort to self-medicating with illegally obtained and potentially dangerous drugs, and their lives could very well end early. Health is a prerequisite to living and therefore a prerequisite to human development. That is why my research topic is the utility of geospatial data science methods for improving health outcomes in the developing world. Most of my sources so far examine regions in Sub-Saharan Africa. For this initial research, however, I did not focus on a specific region because I wanted to explore the different applications of data science in healthcare first. Given how important health is for human development, simulations of human development in the context of healthcare could help show where healthcare resources are most needed.
Annotations
Alegana, V. A., Wright, J., Pezzulo, C., Tatem, A. J., & Atkinson, P. M. (2017). Treatment-seeking behaviour in low- and middle-income countries estimated using a Bayesian model. BMC Medical Research Methodology, 17(1). doi:10.1186/s12874-017-0346-0
This article's objective was to explore whether categorical data, in the form of treatment-seeking behavior, can be used to predict disease treatment in developing countries. Treatment-seeking behavior describes a person's attempts to get formal healthcare to mitigate or treat their illness, and it is important for ensuring that disease does not spread in developing countries. However, it is difficult to quantify in a way that applies to all communities because it is affected by multiple economic and demographic factors that vary greatly across developing communities. As a result, treatment-seeking behavior is rarely analyzed as a predictor, even though it could prove useful for expanding access to healthcare in the developing world. To address this issue, the researchers used Bayesian models to turn categorical treatment-seeking responses into a continuous probabilistic estimate of fever treatment in Namibia. Their dataset was the 2013 Demographic and Health Survey, a nationally representative Namibian household survey. They used the geographic data attached to the individual-level treatment-seeking responses to link them to estimates of travel time to the nearest healthcare facility, and then fitted their Bayesian models with Markov Chain Monte Carlo simulations to predict the probability of fever treatment for children five years old and under. The models found that individuals who live far from healthcare facilities have a 30% chance of seeking treatment and that those who live closer are more likely to exhibit healthy treatment-seeking behavior. This article demonstrates how probabilistic data science models can turn categorical variables that would otherwise be difficult to use in developing countries into a valuable asset for prediction.
It addresses the sustainable development goal of ensuring that people live long and healthy lives by using geographic data to investigate whether distance from healthcare facilities is indicative of people actually receiving treatment. Furthermore, this article relates well to Amartya Sen's definition of human development, as it indicates that people need the freedom to access healthcare in order to live well. The researchers noted that their methodology can be extended to other categorical behaviors from survey data and to other illnesses, which bodes well for future applications.
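To make the modelling step in this annotation more concrete for myself, here is a minimal sketch of the general idea. It is not the authors' model: it is a plain Bayesian logistic regression of treatment seeking on travel time, fitted with a random-walk Metropolis sampler (one simple form of MCMC). The data are synthetic, and the function names, priors, step size, and coefficients are all assumptions chosen for illustration.

```python
import math
import random


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def log_posterior(b0, b1, data, prior_sd=5.0):
    """Log posterior of a logistic model with Normal(0, prior_sd) priors (assumed)."""
    lp = -(b0 ** 2 + b1 ** 2) / (2 * prior_sd ** 2)
    for hours, sought in data:
        z = b0 + b1 * hours
        lp += sought * z - softplus(z)  # Bernoulli log-likelihood, logit link
    return lp


def metropolis(data, n_iter=3000, step=0.15, seed=0):
    """Random-walk Metropolis; returns samples after discarding the first half."""
    rng = random.Random(seed)
    b0, b1 = 0.0, 0.0
    current = log_posterior(b0, b1, data)
    samples = []
    for _ in range(n_iter):
        c0, c1 = b0 + rng.gauss(0, step), b1 + rng.gauss(0, step)
        proposal = log_posterior(c0, c1, data)
        if rng.random() < math.exp(min(0.0, proposal - current)):
            b0, b1, current = c0, c1, proposal
        samples.append((b0, b1))
    return samples[n_iter // 2:]


def simulate(n=300, seed=1):
    """Synthetic households: treatment seeking declines with travel time (hours)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        hours = rng.uniform(0, 4)
        p = 1 / (1 + math.exp(-(1.5 - 1.0 * hours)))  # made-up true relationship
        data.append((hours, 1 if rng.random() < p else 0))
    return data


def prob_seek(hours, samples):
    """Posterior mean probability of seeking treatment at a given travel time."""
    return sum(1 / (1 + math.exp(-(b0 + b1 * hours))) for b0, b1 in samples) / len(samples)


data = simulate()
samples = metropolis(data)
near, far = prob_seek(0.25, samples), prob_seek(3.5, samples)
print(f"near a facility: {near:.2f}, far from a facility: {far:.2f}")
```

The output is a continuous probability for any travel time, which is the kind of estimate the article produces from yes/no survey answers. A real analysis would also need spatial structure and many more covariates.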
Dotse-Gborgbortsi, W., Dwomoh, D., Alegana, V., Hill, A., Tatem, A. J., & Wright, J. (2020). The influence of distance and quality on utilization of birthing services at health facilities in Eastern Region, Ghana. BMJ Global Health, 4(Suppl 5). doi:10.1136/bmjgh-2019-002020
This article sought to quantify how the distance to and quality of healthcare facilities affect the use of birthing services in the Eastern Region of Ghana. Maternal mortality rates are especially concerning in developing countries, and attendance at skilled birthing services is the best way to reduce them. The researchers noticed a gap in research that uses healthcare data to analyze the effect of distance and healthcare quality on the use of birthing services in developing countries. Therefore, they combined routine birth data, obstetric care surveys, and geospatial data on health facility locations with socioeconomic and demographic characteristics in a spatial interaction model of the movement patterns of pregnant individuals in the Eastern Region of Ghana. The article describes four data sets: HMIS records of hospital-based births from Ghana's District Health Information Management System 2, a gridded map layer of estimated pregnancies for 2015, health facility locations obtained from the HMIS, and a nationally representative sample survey of emergency obstetric and newborn care used to measure healthcare quality. They used WorldPop data to fit their spatial interaction model and produce maps of the movement patterns. The study found that an increase of as little as one kilometer in distance from a healthcare facility has a significant effect on the use of birthing services, reducing the rate by 6.7%; that women in rural areas travel on average four kilometers farther to reach their nearest healthcare facility; and that 56% bypassed their nearest healthcare facility. This article highlights the importance of placing healthcare facilities closer to populations in developing countries by demonstrating how much access to quality healthcare would improve if people had to travel less.
Additionally, the fact that most people bypassed their nearest healthcare facility shows the importance of ensuring that individuals are aware of their nearest healthcare resources and are comfortable using them. This interesting trend would not have been apparent without a study like this one. Monitoring people's travel patterns to healthcare facilities is valuable for ensuring that everyone has access to quality healthcare. This is especially important in the context of Amartya Sen's definition of human development as the elimination of unfreedoms, because a lack of access to quality healthcare often leaves people trapped by their circumstances.
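A spatial interaction model like the one in this annotation can be sketched in a few lines. The version below is my own simplified assumption, not the authors' specification: each settlement's expected births are shared among facilities in proportion to a quality score multiplied by an exponential distance decay. The settlements, distances, quality scores, and the decay parameter `beta` are all hypothetical.

```python
import math


def facility_shares(distances_km, quality, beta=0.3):
    """Share of one settlement's demand going to each facility.

    Attractiveness is quality * exp(-beta * distance); beta is an assumed
    distance-decay parameter, not a value taken from the article.
    """
    weights = [q * math.exp(-beta * d) for d, q in zip(distances_km, quality)]
    total = sum(weights)
    return [w / total for w in weights]


def predicted_flows(pregnancies, distance_matrix, quality, beta=0.3):
    """Expected number of births from each settlement at each facility."""
    return [
        [n * share for share in facility_shares(row, quality, beta)]
        for n, row in zip(pregnancies, distance_matrix)
    ]


# Hypothetical example: two settlements, three facilities.
pregnancies = [120, 80]
distance_matrix = [
    [2.0, 6.0, 15.0],  # km from settlement 0 to each facility
    [9.0, 3.0, 5.0],   # km from settlement 1 to each facility
]
quality = [1.0, 3.0, 2.0]  # made-up quality scores

flows = predicted_flows(pregnancies, distance_matrix, quality)
for i, row in enumerate(flows):
    nearest = distance_matrix[i].index(min(distance_matrix[i]))
    bypass = 1 - row[nearest] / pregnancies[i]
    print(f"settlement {i}: share bypassing nearest facility = {bypass:.2f}")
```

Because quality enters the weights, some women in the sketch travel past their nearest facility to a better one, which is one way a model like this can represent the bypassing behavior the study reports.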
Messina, J. P., Kraemer, M. U., Brady, O. J., Pigott, D. M., Shearer, F. M., Weiss, D. J., . . . Hay, S. I. (2016). Mapping global environmental suitability for Zika virus. eLife. doi:10.7554/eLife.15272
This study presents a species distribution model that maps environmental suitability for the Zika virus. Although Zika was confined to Uganda when it was first discovered, outbreaks throughout the world in recent years have sparked public concern, in part because infection during pregnancy has been linked to birth defects such as microcephaly. The virus spreads quickly because it is transmitted by mosquitoes, which is especially concerning for people in the developing world since hotspots for Zika outbreaks have been located almost exclusively in developing countries. To address these problems, the researchers used species distribution modelling to map environmental suitability for Zika and so provide a way to predict outbreaks. The model establishes an empirical, multivariate relationship between the probability of Zika cases and the environmental conditions in locations where the virus has been confirmed to occur. To measure this relationship, they used ensemble boosted regression trees. These models use data on variables associated with Zika transmission to generate thousands of regression trees that together predict which areas are suitable for the Zika virus. The ensemble models were fitted with data drawn from peer-reviewed literature, case reports, and informal online sources. The inputs include known locations of disease occurrence in humans, locations where Zika has not yet been reported, and a set of geographic and socioeconomic factors such as biome and poverty rates. After fitting their ensemble model, the researchers generated a global map of environmental suitability for Zika transmission to humans at a spatial resolution of five by five kilometers.
The study found that many countries with tropical climates are susceptible to the Zika virus, that 63% of all recent cases originate from the 2015 Zika outbreak in Latin America, and that there are an astounding 10,000 square kilometers of regions across the world that have not had Zika cases but are susceptible to the virus because they provide the right environment for it. I found this source particularly interesting because, instead of tracking the movements of people, it tracked the spread and environmental suitability of a virus. Research like this is critical to ensuring that people in developing countries who are susceptible to the Zika virus can stay healthy by being aware of the virus and taking the necessary precautions. This source addresses the sustainable development goal of good health in developing countries as well as the goal of making sure people are educated and can face upcoming challenges. This relates well to Amartya Sen's definition of development because giving people and governments knowledge of their susceptibility to the Zika virus gives them the freedom to prepare for it and the freedom of good health.
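To understand what an ensemble of boosted regression trees does, I wrote the toy sketch below. It is a heavy simplification and not the authors' implementation: each "tree" is a single split (a stump), the loss is squared error, and the ensemble is an average of models fitted to bootstrap resamples. The two predictors (temperature and rainfall), the rule that generates presence points, and all parameter values are hypothetical.

```python
import random


def fit_stump(X, residuals):
    """Find the one-feature threshold split that minimises squared error."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[f] <= t]
            right = [r for row, r in zip(X, residuals) if row[f] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, f, t, lm, rm)
    return best[1:]


def boost(X, y, rounds=20, learning_rate=0.3):
    """Gradient boosting with squared-error loss: each stump fits the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        f, t, lm, rm = fit_stump(X, residuals)
        stumps.append((f, t, lm, rm))
        preds = [p + learning_rate * (lm if row[f] <= t else rm)
                 for row, p in zip(X, preds)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, lr = model
    return base + sum(lr * (lm if row[f] <= t else rm) for f, t, lm, rm in stumps)


def fit_ensemble(X, y, members=5, seed=0):
    """Fit one boosted model per bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(members):
        idx = [rng.randrange(len(X)) for _ in X]
        models.append(boost([X[i] for i in idx], [y[i] for i in idx]))
    return models


def suitability(models, row):
    """Average the ensemble's predictions and clip to the range [0, 1]."""
    score = sum(predict(m, row) for m in models) / len(models)
    return min(1.0, max(0.0, score))


# Synthetic locations: (temperature in C, annual rainfall in mm), presence = 1.
rng = random.Random(42)
X, y = [], []
for _ in range(80):
    temp, rain = rng.uniform(5, 35), rng.uniform(0, 3000)
    X.append((temp, rain))
    y.append(1 if temp > 20 and rain > 1000 else 0)  # made-up rule

models = fit_ensemble(X, y)
warm_wet = suitability(models, (30, 2500))
cold_dry = suitability(models, (8, 200))
print(f"warm and wet: {warm_wet:.2f}, cold and dry: {cold_dry:.2f}")
```

Applying `suitability` to every cell of a gridded map is, in spirit, how a suitability surface like the one in the article could be produced, although the real model uses far more trees, deeper trees, and many more predictors.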
Sedda, L., Tatem, A. J., Morley, D. W., Atkinson, P. M., Wardrop, N. A., Pezzulo, C., . . . Rogers, D. J. (2015). Poverty, health and satellite-derived vegetation indices: Their inter-spatial relationship in West Africa. International Health, 7(2), 99-106. doi:10.1093/inthealth/ihv005
This source investigates the spatial relationship between health, poverty, and satellite-derived vegetation indices such as the normalized difference vegetation index (NDVI) in West Africa. Its goal is to model the statistical connections among these three factors. Given that poverty and poor health are pervasive problems in West Africa, this source provides valuable insight into how vegetation indices correlate with poverty and health, which governments and humanitarian organizations can use to explore the importance of clean environmental conditions and vegetation cultivation. The researchers used principal component analysis to synthesize geographic and socioeconomic variables and then applied three spatial methods (variography, factorial kriging, and cokriging) to analyze the correlations between the degree of poverty, health, and NDVI for a large part of West Africa that includes Benin, Burkina Faso, Cameroon, Cote d'Ivoire, Ghana, Mali, Niger, Nigeria, and Togo. The data on health and poverty for these countries come from the Oxford Poverty & Human Development Initiative, and the environmental variables come from the MODIS sensors on NASA's Terra and Aqua satellites. Child mortality and undernutrition were used to indicate health; years of schooling, school attendance, cooking fuel, improved sanitation, safe drinking water, electricity, flooring, and asset ownership were used to indicate poverty; and land surface temperature, NDVI, and elevation were the environmental variables. The study found that poverty and health vary inversely with NDVI in West Africa. This source demonstrates how useful satellite imagery can be for modeling poverty in developing countries and highlights how environmental factors such as climate change affect them. NDVI is an efficient predictor that can be used to make progress towards the sustainable development goals of poverty elimination and good health.
This relates to Amartya Sen's definition of human development as the elimination of unfreedoms because it shows that a healthy environment is necessary for people to have the freedom to be healthy and to lift themselves out of poverty.
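Since NDVI is central to this source, it helps to see how it is computed. The index is the standard ratio (NIR - Red) / (NIR + Red) of near-infrared and red reflectance. The sketch below applies that formula to a few reflectance values that I made up for illustration; they are not MODIS data.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red).

    Returns None when both bands are zero, since the ratio is undefined there.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Hypothetical (NIR, red) surface reflectance values for three land covers.
pixels = {
    "dense vegetation": (0.50, 0.08),
    "bare soil": (0.30, 0.25),
    "water": (0.02, 0.05),
}

for label, (nir, red) in pixels.items():
    print(f"{label}: NDVI = {ndvi(nir, red):.2f}")
```

Values close to 1 indicate dense green vegetation, values near 0 indicate bare ground, and negative values usually indicate water, which is why the index works as a compact summary of environmental conditions.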
Utazi, C. E., Thorley, J., Alegana, V. A., Ferrari, M. J., Takahashi, S., Metcalf, C. J., . . . Tatem, A. J. (2018). High resolution age-structured mapping of childhood vaccination coverage in low and middle income countries. Vaccine, 36(12), 1583-1591. doi:10.1016/j.vaccine.2018.02.020
This article models gaps in childhood vaccination coverage in developing countries. Its objective is to identify heterogeneities and gaps in childhood vaccination coverage and to investigate which variables are associated with them. Although there are successful childhood vaccination programs in developing countries, the high financial and time costs of surveys mean that their coverage levels are usually evaluated through national surveys and statistics. These large-scale summaries tend to hide heterogeneities and gaps in coverage and hence allow diseases to spread. To address this problem, this source uses geographic data to model heterogeneities and gaps in childhood vaccination coverage. It uses data from cluster-level Demographic and Health Surveys and Bayesian models implemented via Markov Chain Monte Carlo (MCMC) methods to model measles vaccination coverage in Cambodia, Mozambique, and Nigeria for children under five years old. The variables investigated to predict coverage were travel time, population density, distance to residential areas, distance to infrastructure, precipitation, poverty rates, partial sill, and spatial decay. The study found that various districts in each country have not met the World Health Organization's target of 80% vaccination coverage and that distance to residential areas, distance to infrastructure, and poverty rates are indicative of gaps in coverage. Studies like this one demonstrate that widespread use of these geostatistical data science methods enables faster and more efficient data collection and modeling than traditional methods such as a national census. In order to achieve the sustainable development goal of good health and eliminate disease, it is imperative that rural and underrepresented areas are considered so that there are no gaps in coverage.
This relates to Amartya Sen's definition of development as the elimination of unfreedoms because examining vulnerable populations is necessary so that all individuals in developing countries have the freedom of good health. Furthermore, this study is evidence that rural populations should have nearby access to quality healthcare providers.
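The idea of flagging districts that fall short of the 80% target can be illustrated with a much simpler Bayesian model than the one in the article. The sketch below swaps the authors' geostatistical MCMC model for an independent Beta-Binomial model per district, which has a closed-form posterior that can be sampled directly. The district names, survey counts, and the uniform Beta(1, 1) prior are all assumptions of mine.

```python
import random


def prob_below_target(vaccinated, surveyed, target=0.8,
                      draws=20000, seed=0, prior=(1.0, 1.0)):
    """Posterior probability that true coverage is below the target.

    With a Beta(a, b) prior and binomial survey data, the posterior is
    Beta(a + vaccinated, b + surveyed - vaccinated); we estimate the
    probability by Monte Carlo draws from that posterior.
    """
    rng = random.Random(seed)
    a = prior[0] + vaccinated
    b = prior[1] + (surveyed - vaccinated)
    below = sum(rng.betavariate(a, b) < target for _ in range(draws))
    return below / draws


# Hypothetical survey results: (children vaccinated, children surveyed).
districts = {
    "District A": (45, 50),
    "District B": (30, 50),
    "District C": (8, 10),
}

risk = {name: prob_below_target(v, n) for name, (v, n) in districts.items()}
for name, p in risk.items():
    print(f"{name}: P(coverage < 80%) = {p:.2f}")
```

District C has the same observed coverage as the target but only ten children surveyed, so its result is very uncertain. Handling that kind of uncertainty is one reason a model-based approach is more informative than comparing raw percentages, and the full geostatistical model goes further by borrowing information from neighboring areas and covariates.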
Further Investigation
Now that I have been introduced to data science applications in healthcare for developing countries, I have decided to investigate the use of geostatistical methods to model gaps in healthcare coverage in West Africa. My research showed me how pervasive the lack of healthcare access is in developing countries in West Africa today. I agree with Amartya Sen that good health is a prerequisite for individuals to better their lives. I believe that agent-based modeling can provide invaluable information for deciding where to set up quality healthcare facilities in the places that need them.