How to Enhance the Accuracy of Healthcare Predictive Analytics
By Roman Davydov
In recent years, predictive analytics has been implemented across a wide range of industries, and the healthcare sector is no exception. The global healthcare predictive analytics market is valued at $20.68 billion in 2024 and is expected to grow to $52.61 billion by 2028, according to Research and Markets.
Health professionals can use healthcare predictive analytics to generate accurate forecasts and make informed clinical decisions, providing more personalized patient treatment and care. Kaiser Permanente hospitals, for example, used predictive analytics during COVID-19 to forecast hospital admissions six weeks in advance and operate more efficiently.
However, generating accurate and meaningful healthcare forecasts can be challenging, as models must be configured and managed properly in order to drive such results. This article covers healthcare predictive analytics accuracy and provides recommendations for enhancing it.
What is model accuracy in predictive analytics?
Accuracy is one of the most common metrics for evaluating the performance of a machine learning model and is mainly used in data classification tasks. The metric measures how well the model performs by quantifying the share of its classifications that are correct. The formula for calculating accuracy is as follows:
Accuracy = the number of correct predictions/the total number of predictions
Let’s say a hospital has a machine learning model that predicts ovarian cancer based on data collected from blood tests. Specialists feed it historical data from 60 patients who once visited the hospital, some of whom later developed ovarian cancer. The model forecasts that 28 people would get cancer and 32 would stay healthy. The actual data shows that 24 people got cancer and 36 did not. These totals alone do not determine accuracy, which depends on how many individual predictions match the actual outcome for the same patient. If a patient-by-patient comparison shows that 52 of the model’s 60 predictions are correct, its accuracy is 52/60, or about 87%.
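The arithmetic in this example can be sketched in a few lines of Python. The per-patient breakdown below is hypothetical: it is one of several breakdowns consistent with the example's totals (28 predicted cases, 24 actual cases, 52 correct predictions) and is not taken from any real data.

```python
# Hypothetical per-patient breakdown (an assumption for illustration only).
true_positives = 22   # predicted cancer, developed cancer
false_positives = 6   # predicted cancer, stayed healthy
false_negatives = 2   # predicted healthy, developed cancer
true_negatives = 30   # predicted healthy, stayed healthy

total = true_positives + false_positives + false_negatives + true_negatives
correct = true_positives + true_negatives

# Accuracy = the number of correct predictions / the total number of predictions
accuracy = correct / total

print(f"predicted cancer: {true_positives + false_positives}")  # 28
print(f"actual cancer: {true_positives + false_negatives}")     # 24
print(f"accuracy: {correct}/{total} = {accuracy:.1%}")          # 52/60 = 86.7%
```

Only the correctly classified patients (true positives and true negatives) count toward accuracy, which is why the predicted and actual totals by themselves are not enough to compute it.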
4 tips to achieve better predictive model accuracy
Here are ways to achieve better predictive model accuracy.
- Optimize hyperparameters
Before putting a predictive model to work, data analysts must determine its optimal settings, which are called hyperparameters. Since hyperparameters directly determine the model’s behavior, they can significantly affect the performance and accuracy of healthcare predictive analytics.
Data analysts can use different methodologies to optimize hyperparameters and thus improve forecasting accuracy. In particular, analysts can use techniques such as random search, grid search, or Bayesian optimization, each of which can better suit different forecasting scenarios.
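To illustrate two of the techniques named above, here is a minimal sketch of grid search and random search written in plain Python. Everything in it is an assumption made for illustration: the synthetic one-feature data, the tiny k-nearest-neighbors classifier, and the two hyperparameters (k and weighted). A production setup would more likely rely on an established ML library, and Bayesian optimization is not shown.

```python
import itertools
import random

def make_toy_data(n, seed):
    """Synthetic 1-D data: label 1 is drawn around 6.0, label 0 around 4.0."""
    rng = random.Random(seed)
    return [(rng.gauss(6.0 if i % 2 else 4.0, 1.5), i % 2) for i in range(n)]

def knn_predict(train, x, k, weighted):
    """Classify x by a (optionally distance-weighted) vote of its k nearest neighbors."""
    neighbors = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = [0.0, 0.0]
    for value, label in neighbors:
        votes[label] += 1.0 / (abs(value - x) + 1e-9) if weighted else 1.0
    return 1 if votes[1] > votes[0] else 0

def validation_accuracy(train, valid, params):
    hits = sum(knn_predict(train, x, params["k"], params["weighted"]) == y
               for x, y in valid)
    return hits / len(valid)

def best_of(train, valid, candidates):
    """Return (score, params) for the best-scoring candidate."""
    scored = ((validation_accuracy(train, valid, p), p) for p in candidates)
    return max(scored, key=lambda result: result[0])

def grid_search(train, valid, grid):
    """Evaluate every combination of hyperparameter values in the grid."""
    combos = itertools.product(*grid.values())
    return best_of(train, valid, [dict(zip(grid, combo)) for combo in combos])

def random_search(train, valid, grid, n_trials, seed=0):
    """Evaluate only n_trials randomly sampled combinations."""
    rng = random.Random(seed)
    candidates = [{name: rng.choice(values) for name, values in grid.items()}
                  for _ in range(n_trials)]
    return best_of(train, valid, candidates)

train = make_toy_data(200, seed=1)
valid = make_toy_data(100, seed=2)
grid = {"k": [1, 3, 5, 9, 15, 25], "weighted": [False, True]}

grid_score, grid_params = grid_search(train, valid, grid)
random_score, random_params = random_search(train, valid, grid, n_trials=4)
print("grid search:  ", grid_params, grid_score)
print("random search:", random_params, random_score)
```

Grid search checks all 12 combinations, while random search checks only 4, so it is cheaper but may miss the best setting. Because the hyperparameters are chosen on the validation set, the final accuracy should be confirmed on a separate test set the search never saw.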
- Enhance data quality
The accuracy of forecasts is directly related to the quality of the data processed by the model, so healthcare institutions should prioritize improving it.
Preprocessing data before analyzing it is one of the ways to enhance quality. Data analysts at the institution can perform data cleaning, which involves identifying and correcting errors and inconsistencies in the data set. To make this procedure more efficient, data analysts can automate it using data cleaning tools built into a cloud storage solution (if the institution uses one). Alternatively, analysts can write their own custom data cleaning scripts (for instance, when the data requires more complex cleaning) using Python or another programming language.
Another way to improve data quality is normalization, which involves restructuring and unifying the information in a database so that it complies with normal forms and has better integrity. In machine learning preprocessing, the term also refers to rescaling numeric features to a common range, which is the step that typically affects how quickly a model learns. Normalized data is easier to process and interpret, which can enhance the learning speed and forecast accuracy of the ML model.
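A custom cleaning script of the kind mentioned above might look like the following minimal sketch. The field names (patient_id, sex, glucose_mg_dl), the alias table, and the plausibility bounds are all hypothetical and would need to be replaced with rules agreed on by the institution's clinical and data teams.

```python
SEX_ALIASES = {"f": "female", "female": "female", "m": "male", "male": "male"}
GLUCOSE_RANGE = (20.0, 1000.0)  # assumed plausibility bounds in mg/dL, illustration only

def parse_number(raw):
    """Return a float, or None when the value is missing or not numeric."""
    try:
        return float(str(raw).strip().replace(",", "."))
    except ValueError:
        return None

def clean_records(records):
    """Trim and unify fields, drop duplicate or ID-less rows, blank out bad readings."""
    cleaned, seen_ids = [], set()
    for row in records:
        patient_id = str(row.get("patient_id") or "").strip()
        if not patient_id or patient_id in seen_ids:
            continue  # skip rows without an ID and repeated IDs
        seen_ids.add(patient_id)

        glucose = parse_number(row.get("glucose_mg_dl"))
        low, high = GLUCOSE_RANGE
        if glucose is not None and not (low <= glucose <= high):
            glucose = None  # implausible reading: mark as missing rather than guess

        sex_key = str(row.get("sex") or "").strip().lower()
        cleaned.append({
            "patient_id": patient_id,
            "sex": SEX_ALIASES.get(sex_key, "unknown"),
            "glucose_mg_dl": glucose,
        })
    return cleaned

raw = [
    {"patient_id": " A17 ", "sex": "F", "glucose_mg_dl": "98,5"},
    {"patient_id": "A17", "sex": "female", "glucose_mg_dl": "98.5"},  # duplicate ID
    {"patient_id": "B02", "sex": "m ", "glucose_mg_dl": "n/a"},       # unparseable value
    {"patient_id": "C33", "sex": "", "glucose_mg_dl": "9850"},        # implausible value
    {"patient_id": "", "sex": "M", "glucose_mg_dl": "101"},           # missing ID
]
cleaned = clean_records(raw)
for record in cleaned:
    print(record)
```

The sketch marks unusable values as missing instead of deleting the whole record, so that a later step can decide whether to impute them or exclude the patient.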
- Gather additional data
Increasing the volume and variety of data processed and analyzed by a predictive model can also improve its prediction accuracy over time. EITCA Academy highlights an example of a deep learning model with an initial image recognition accuracy of 85%. When the data set was increased from 1,000 to 10,000 images, the model’s accuracy reached 92%.
This means that if an institution has a model that uses time series data to forecast future medical events, feeding it several years’ worth of data instead of data from a couple of months can help enhance forecast accuracy.
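One way to check whether more data is likely to help is to plot a learning curve: train the same model on progressively larger subsets and measure accuracy on a fixed held-out set. The sketch below assumes synthetic one-feature classification data and a deliberately simple nearest-centroid model, both invented for illustration. It does not reproduce the image recognition example above, and it uses classification rather than time series data.

```python
import random

def make_samples(n, rng):
    """Synthetic 1-D samples with alternating labels: 1 around 6.0, 0 around 4.0."""
    return [(rng.gauss(6.0 if i % 2 else 4.0, 1.5), i % 2) for i in range(n)]

def fit_centroids(train):
    """Nearest-centroid 'model': the mean feature value of each class."""
    centroids = {}
    for label in (0, 1):
        values = [x for x, y in train if y == label]
        centroids[label] = sum(values) / len(values)
    return centroids

def accuracy(centroids, test):
    """Share of test samples whose nearest centroid carries the right label."""
    hits = sum(
        min(centroids, key=lambda label: abs(centroids[label] - x)) == y
        for x, y in test
    )
    return hits / len(test)

rng = random.Random(0)
pool = make_samples(2000, rng)   # data available for training
test = make_samples(500, rng)    # fixed held-out set, never used for training

learning_curve = []
for size in (10, 100, 1000):
    model = fit_centroids(pool[:size])
    learning_curve.append((size, accuracy(model, test)))

for size, score in learning_curve:
    print(f"{size:>5} training samples -> accuracy {score:.3f}")
```

If the curve is still rising at the largest size, collecting more data is probably worthwhile. If it has flattened, better features or a different model may matter more than volume.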
- Continuously evaluate a predictive model’s accuracy
A healthcare institution should also monitor the accuracy of its ML model continuously after putting it into operation. Accuracy can vary depending on many factors, including data quality, dataset size, and the type of algorithm the model uses. For example, researchers from Google have claimed their Med-Gemini model achieved 91.1% accuracy on a medical question-answering benchmark, the best result among healthcare-oriented models at the time of the study.
An institution can evaluate the accuracy of its predictive analytics model by using the same benchmark Google used in its study, namely MedQA. This Multiple-Choice Question Answering (MCQA) dataset, compiled from US, Chinese, and Taiwanese physician licensing exams, allows researchers to evaluate a model’s ability to understand and respond to a variety of medical questions.
To run more comprehensive assessments, data analysts can also test the model against other benchmarks, such as MedExQA. The MedExQA dataset complements the common MedQA questionnaire with question-answer pairs related to clinical psychology, biomedical engineering, clinical laboratory science, speech pathology, and other topics.
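Alongside benchmark tests, the continuous monitoring recommended in this tip can be approximated with a rolling accuracy check over the most recent predictions whose real outcomes are known. The sketch below is a minimal illustration: the AccuracyMonitor class, the window size, and the alert threshold are hypothetical choices, not part of any specific product.

```python
from collections import deque

class AccuracyMonitor:
    """Tracks accuracy over the most recent `window` predictions."""

    def __init__(self, window, alert_below):
        self.outcomes = deque(maxlen=window)  # oldest results drop out automatically
        self.alert_below = alert_below

    def record(self, predicted, actual):
        """Store whether a prediction matched the outcome observed later."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Rolling accuracy, or None before any outcome has been recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_review(self):
        """True once the window is full and accuracy is below the threshold."""
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.alert_below

# Hypothetical stream of (predicted, actual) labels arriving over time.
stream = [(1, 1), (0, 0), (1, 1), (0, 0), (1, 1), (1, 0), (0, 1), (1, 1)]

monitor = AccuracyMonitor(window=5, alert_below=0.8)
alerts = []
for predicted, actual in stream:
    monitor.record(predicted, actual)
    alerts.append(monitor.needs_review())

print(alerts)
print(monitor.accuracy())
```

A sustained alert is a signal to investigate, for example by checking whether incoming data has changed, rather than proof that the model is broken. In practice the window would cover hundreds or thousands of cases, not five.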
Final thoughts
Predictive analytics technology allows physicians, lab workers, and other healthcare professionals to analyze large amounts of medical data and generate forecasts for certain events or outcomes, which can be used to improve patient care and treatment quality. However, if the model is not accurate enough, its forecasts can be misleading and result in medical errors.
Medical institutions can take multiple measures to improve the accuracy of their predictive models. It’s advised to start by optimizing a model’s hyperparameters and assessing the accuracy of the model’s current performance. If improvements are required, data analysts can run data cleaning and normalization and increase the training data volume.
In addition, institutions should implement reliable predictive analytics software designed specifically for healthcare providers. Predictive analytics experts can help an institution select suitable platform-based tools or build a solution from scratch based on the organization’s unique requirements.
Roman Davydov is Technology Observer at Itransition.