AI system can predict a positive or negative COVID-19 test result

The study shows that machine-learning models can help predict COVID-19 infections.
Brittney Grimes
A positive blood test result for coronavirus in a test tube.

Ovidiu Dugulan/iStock 

Researchers have found a new way to identify which features are most useful in predicting COVID-19 test results. The research team, from Florida Atlantic University’s (FAU) College of Engineering and Computer Science in the U.S., used AI to predict whether a COVID-19 test result would be positive or negative.

The most common techniques currently used to detect COVID-19 are blood tests, also called serology tests, and molecular tests. Because the two tests use different methods, their results can vary substantially.

“Molecular tests depend on viral load and serology tests depend on seroconversion, which is the period during which the body starts producing detectable levels of antibodies. Both of these tests are time dependent,” said Dr. Xingquan “Hill” Zhu, senior author of the study and a professor in FAU’s Department of Electrical Engineering and Computer Science.

Dr. Xingquan “Hill” Zhu (left) and Magdalyn E. Elkin, a Ph.D. student, in FAU’s Department of Electrical Engineering and Computer Science.

The research team used a type of artificial intelligence called machine learning (ML) to understand the correlation between the blood tests and molecular tests. The team used ML to discover which features are the most effective in distinguishing COVID-19 test results.

The study was published recently in the journal Smart Health.

How the two tests detect the virus differently

The two types of tests use different processes to detect infection in an individual. The molecular test measures the presence of viral SARS-CoV-2 RNA, while the serology test detects the presence of antibodies triggered by SARS-CoV-2.

The study noted that earlier work has shown that COVID-19 symptoms, along with demographic and diagnosis features, can help predict COVID-19 test outcomes. However, because serology and molecular tests work so differently, their results vary significantly, and no previous study had examined how the two types of tests correlate.

AI used to determine COVID-19 results

The researchers trained five algorithms to predict whether a COVID-19 test result would be negative or positive. They built the predictive system using symptom and demographic features, including fever, number of days post-symptom onset, age, and gender.

The study showed that AI can be used to predict COVID-19 infections by training machine-learning models on demographics and symptoms. The study identified key symptoms associated with the infection and offers a way to provide cost-effective detection, as well as rapid screening.

The research team noticed that the molecular test had the lowest positive rate because it measured the infection in its current state. The study also showed that the total number of days an individual has symptoms, such as fever or cough, played a large role in the test results.

Another difference between the molecular test and the serology test was the number of days since symptom onset. Molecular tests had post-symptom onset days of between three and eight days, while serology tests had post-symptom onset days of between five and 38 days.

The COVID-19 tests vary significantly due to changes in the participants’ immune response and viral load. This can lead to conflicting positive and negative results from two different tests in the same individual.

The study

Dr. Zhu explained the significance of the study, which could be used to screen patients for the infection. “Our results suggest that the number of days post symptomatic are highly important for a positive COVID-19 test and should be under careful consideration when screening patients,” he stated.

Researchers used COVID-19 test results from a total of 2,467 individuals, each tested with one or multiple types of COVID-19 tests. These results served as the study’s testbed.

“One unique feature of our testbed is that some donors may have multiple test results, which allowed us to analyze the relationship between serology tests versus molecular tests, and also understand consistency within each type of test,” Zhu stated. The team created a set of features for predictive modeling using the five types of machine-learning models.
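The kind of cross-checking Zhu describes can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the record layout, the donor IDs, and the function name `cross_tabulate` are all hypothetical, and the sample records are invented for demonstration only.

```python
from collections import Counter

# Hypothetical records: (donor_id, test_type, result).
# These are made-up examples, not data from the study.
records = [
    ("d1", "molecular", "positive"),
    ("d1", "serology", "positive"),
    ("d2", "molecular", "negative"),
    ("d2", "serology", "positive"),
    ("d3", "serology", "negative"),   # only one test type, so not comparable
    ("d4", "molecular", "negative"),
    ("d4", "serology", "negative"),
]

def cross_tabulate(records):
    """Count (molecular result, serology result) pairs for donors with both tests.

    Simplification: if a donor has several tests of the same type,
    the last one listed is used.
    """
    by_donor = {}
    for donor_id, test_type, result in records:
        by_donor.setdefault(donor_id, {})[test_type] = result

    table = Counter()
    for results in by_donor.values():
        if "molecular" in results and "serology" in results:
            table[(results["molecular"], results["serology"])] += 1
    return table

table = cross_tabulate(records)

# Fraction of donors whose two test types gave the same result.
agreement = sum(n for (mol, ser), n in table.items() if mol == ser) / sum(table.values())

print(dict(table))
print(f"agreement: {agreement:.2f}")
```

In this toy example, donor d2 is the interesting case: a negative molecular test alongside a positive serology test, the sort of disagreement the article attributes to timing, viral load, and immune response.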

ML models and the binning approach used in the study

The five models used were Random Forest, XGBoost, Logistic Regression, Support Vector Machine (SVM) and Neural Network.
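To give a sense of what training one of these models involves, here is a minimal sketch of logistic regression fitted by batch gradient descent in plain Python. The article does not describe the team's implementation or hyperparameters, so everything here is an assumption: the two binary features, the toy labels, the learning rate, and the function names are hypothetical and chosen only for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, x):
    """Predicted probability of a positive test for one feature vector."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)

def train_logistic_regression(X, y, lr=0.5, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(weights, bias, xi) - yi
            for j, v in enumerate(xi):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Toy data (invented): each row is [fever_bin, cough_bin]; label 1 = positive test.
# In this toy set the label simply follows the fever feature.
X = [[1, 0], [1, 1], [0, 0], [0, 1]]
y = [1, 1, 0, 0]

weights, bias = train_logistic_regression(X, y)
print(predict_proba(weights, bias, [1, 0]))  # above 0.5: predicted positive
print(predict_proba(weights, bias, [0, 1]))  # below 0.5: predicted negative
```

The other four model types would typically come from established libraries rather than hand-written code. The point of the sketch is only that each model maps a donor's feature vector to a probability of a positive test.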

“Because COVID-19 produces a wide range of symptoms and the data collection process is essentially error prone, we grouped similar symptoms into bins,” said Zhu. “Without a standardization of symptom reporting, the symptom feature space greatly increases. To combat this, we utilized this binning approach, which was able to decrease symptom feature space while keeping sample feature information,” he continued.

The binning approach is a process that distributes sorted data into bins, or smaller intervals, in order to minimize errors and to handle noisy data. “Our researchers have designed a new way to narrow down noisy symptom features for clinical interpretation and predictive modeling,” said Dr. Stella Batalama, a dean at FAU College of Engineering and Computer Science. “Such AI based predictive modeling approaches are becoming increasingly powerful to combat infectious diseases and many other aspects of health issues.”
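As a rough illustration of symptom binning, the sketch below maps free-text symptom reports onto a small, fixed set of bins, so that many differently worded reports collapse into a few features. The bin names, the keyword lists, and the function `bin_symptoms` are assumptions made for this example. The article does not list the bins the researchers actually used.

```python
# Hypothetical bins and keywords, for illustration only.
SYMPTOM_BINS = {
    "fever": ["fever", "chills", "high temperature"],
    "respiratory": ["cough", "shortness of breath", "sore throat"],
    "sensory": ["loss of taste", "loss of smell"],
}

def bin_symptoms(reported):
    """Return one binary feature per bin: 1 if any reported symptom matches it."""
    cleaned = [s.strip().lower() for s in reported]
    features = {}
    for bin_name, keywords in SYMPTOM_BINS.items():
        features[bin_name] = int(any(k in s for s in cleaned for k in keywords))
    return features

# "Dry cough" and "Chills" are worded differently from the keywords
# but still land in the respiratory and fever bins.
print(bin_symptoms(["Dry cough", "Chills"]))
```

However many distinct phrasings donors use, the feature vector here always has three entries, which is the reduction in feature space and sparsity that Zhu describes.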

The team combined the binning approach with the five ML algorithms to generate predictions with AUC scores of more than 81 percent and classification accuracy of over 76 percent. AUC is the area under the receiver operating characteristic (ROC) curve, which provides an aggregate measure of performance across all possible classification thresholds. The ROC curve plots the true positive rate against the false positive rate at each threshold.
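The two metrics can be made concrete with a short Python sketch. AUC is computed here through its standard pairwise interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counting as half. The labels and scores below are invented for illustration and are not results from the study. The function names and the 0.5 threshold are likewise assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly at a single fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Invented example: 1 = positive test, scores are model probabilities.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]

print(f"AUC: {roc_auc(labels, scores):.3f}")
print(f"accuracy: {accuracy(labels, scores):.3f}")
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why studies such as this one commonly report both.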

“Predictive modeling is complicated by many puzzling questions unanswered by research. The testbed created by our researchers is indeed novel and clearly shows correlation between different types of COVID-19 tests,” said Batalama.

Abstract: Molecular tests and serology tests are the most commonly used methods for rapid COVID-19 infection testing. The two types of tests have different mechanisms to detect infection, by measuring the presence of viral SARS-CoV-2 RNA (molecular test) or detecting the presence of antibodies triggered by the SARS-CoV-2 virus (serology test). A handful of studies have shown that symptoms, combined with demographic and/or diagnosis features, can be helpful for the prediction of COVID-19 test outcomes. However, due to the nature of the test, serology and molecular tests vary significantly. There is no existing study on the correlation between serology and molecular tests, and what type of symptoms are the key factors indicating the COVID-19 positive tests. In this study, scientists propose a machine learning based approach to study serology and molecular tests, and use features to predict test outcomes. A total of 2,467 donors, each tested using one or multiple types of COVID-19 tests, are collected as the testbed. By cross checking test types and results, researchers study correlation between serology and molecular tests. For test outcome prediction, 2,467 donors were labeled as positive or negative, by using their serology or molecular test results, and symptom features were created to represent each donor for learning. Because COVID-19 produces a wide range of symptoms and the data collection process is essentially error prone, the scientists grouped similar symptoms into bins. This decreases the feature space and sparsity. Using binned symptoms, combined with demographic features, they trained five classification algorithms to predict COVID-19 test results. Experiments show that XGBoost achieves the best performance with 76.85% accuracy and 81.4% AUC scores, demonstrating that symptoms are indeed helpful for predicting COVID-19 test outcomes. 
The study investigates the relationship between serology and molecular tests, identifies meaningful symptom features associated with COVID-19 infection, and also provides a way for rapid screening and cost-effective detection of COVID-19 infection.