Genetic sequencing can predict the severity of COVID-19 variants

Researchers at Drexel University have created a new computer model that could help health services prepare for new variants of COVID-19 using genetic sequencing.  

By analysing changes in the genetic sequence of the SARS-CoV-2 virus alongside upticks in transmission, hospitalisation and death, the new model uses machine learning to provide an early warning about new variants of the virus.

New variants are difficult to predict

Two years on from the beginning of the pandemic, scientists and public health officials are still attempting to predict how mutations of the SARS-CoV-2 virus are likely to make the infection more transmissible. Collecting and analysing the genetic data needed to identify new variants is a difficult process. Currently, most public health projections on new variants are based on observations and surveillance of the regions where they are already spreading.

“The speed with which new variants, like Omicron, have made their way around the globe means that by the time public health officials have a good handle on how vulnerable their population might be, the virus has already arrived,” said Bahrad A Sokhansanj, an assistant research professor in Drexel’s College of Engineering who led development on the project. “We’re trying to give them an early warning system – like advanced weather modelling for meteorologists – so they can quickly predict how dangerous a new variant is likely to be — and prepare accordingly.” 

The Drexel model uses a targeted analysis of the genetic sequence of the virus’s spike protein. The spike protein is the part of the virus that allows it to evade the immune system and infect healthy cells. It is also known to be the part of the virus that has mutated most frequently throughout the pandemic. The Drexel model also considers the age, sex, and geographic location of COVID-19 patients.

The research team used a machine learning algorithm called GPBoost, which is based on methods commonly used by large businesses to analyse sales data. Using textual analysis, GPBoost can home in on the areas of the virus’s genetic sequence most likely to be associated with changes in the severity of the variant.

The Drexel model compares these patterns with patient metadata (age and sex) and medical outcomes (mild cases, hospitalisations, deaths). This process allows the programme to make projections when it finds new mutations in the spike protein.  
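To make the idea concrete, the short Python sketch below shows one simple way a sequence could be treated as text and combined with patient metadata. It is an illustration under stated assumptions, not the Drexel team's pipeline: the function names, the feature layout and the two amino-acid strings are hypothetical, and the sequences are toy examples rather than real spike protein data.

```python
def substitutions(reference, variant):
    """List point substitutions such as 'A4G' (1-based position).

    Assumes both sequences are already aligned to the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref_aa}{pos}{var_aa}"
        for pos, (ref_aa, var_aa) in enumerate(zip(reference, variant), start=1)
        if ref_aa != var_aa
    ]


def feature_row(reference, variant, age, sex, region):
    """Build one row of model input: mutation flags plus patient metadata."""
    row = {f"mut_{m}": 1 for m in substitutions(reference, variant)}
    row.update({"age": age, "sex": sex, "region": region})
    return row


# Toy amino-acid strings (hypothetical, not real spike sequences).
REFERENCE = "MKTAYIAKQR"
VARIANT = "MKTGYIAKHR"

if __name__ == "__main__":
    print(substitutions(REFERENCE, VARIANT))
    print(feature_row(REFERENCE, VARIANT, age=67, sex="F", region="Region A"))
```

In a setup like this, each row would be paired with a recorded outcome (mild case, hospitalisation or death) and many such rows would be used to train a model such as GPBoost. The training step is left out here because the article does not describe it in detail.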

“When we get a sequence, we can make a prediction about the risk of severe disease from a variant before labs run experiments with animal models or cell culture, or before enough people get sick that you can collect epidemiological data. In other words, our model is more like an early warning system for emerging variants,” said Sokhansanj. 

Genetic sequencing can improve healthcare efficiency

The Drexel model’s targeted approach to predictive modelling of COVID-19 is important because the large amount of genetic sequencing data that has already been collected has strained standard analysis methods. The new model will provide a more efficient method for extracting useful information.

“Some estimates suggest that SARS-CoV-2 has only ‘explored’ as little as 30-40% of the potential space for spike mutations,” added Gail Rosen, Professor at the Drexel College of Engineering, who worked on the project. “When you consider that each mutation could impact key virus properties, like virulence and immune evasion, it seems vital to be able to quickly identify these variations and understand what they mean for those who are vulnerable to infection.” 

Until the Drexel model, scientists had predominantly used genetic sequencing to identify mutations, alongside lab experiments and epidemiological studies. However, there had been little success in linking genetic sequences to the virulence of new variants. Researchers on the Drexel model believe this is because of progressive changes in vaccination and immunity, as well as inconsistencies in how data is reported in different countries.

“The virus can and will continue to surprise us. We urgently need to expand our global capacity to sequence variants, so that we can analyse the sequences of potentially dangerous variants as soon as they show up — before they become a worldwide problem,” concluded Sokhansanj.  

