Identifying epigenetic drivers of cancer with machine learning


Scientists have developed an innovative machine learning technique that can accurately identify the epigenetic drivers of cancer.

The study, published in Cancer Discovery, was a collaboration between researchers at Weill Cornell Medicine, New York-Presbyterian, and the New York Genome Center (NYGC). The team created a machine learning method capable of identifying the gene modifications that trigger cancer.

The team examined methylation, a chemical modification of DNA that characteristically suppresses neighbouring genes. The new method can efficiently analyse thousands of DNA methylation changes in tumour cells and determine which ones are most likely to drive tumour growth.

What is methylation?

Methylation helps regulate gene activity throughout the genome. It is an epigenetic process: it works by modifying the structure of the DNA without changing the information within the genes. However, when there is excessive methylation (hypermethylation) around a tumour-suppressing gene, it can silence that gene, allowing the unchecked cell division that causes cancer.

Dr Dan Landau, the senior author of the study and an oncologist at New York-Presbyterian/Weill Cornell Medical Center, said: “If we can profile a large number of tumours with techniques like this, we can map the epigenetic changes that are contributing to tumour growth in certain cancers. Then we can use that information to improve our understanding of cancer origins, as well as to optimise treatments for individual patients.

“The challenge addressed by the new technique is similar to the one cancer researchers have faced regarding DNA mutations – how to distinguish driver mutations from more abundant passenger mutations that do not affect cancer. Though there are now sophisticated methods for making the distinction among genetic mutations, techniques for distinguishing driver methylation changes from passenger methylation changes have not been nearly as sophisticated.”

MethSig algorithm

To analyse methylation behaviour, the team developed an algorithm called MethSig, which accounts for the background rate of methylation across the genome in order to estimate when a methylation change is likely to be a cancer driver. The algorithm was then applied to DNA methylation maps from a range of tumour types and identified a small number of cancer-driver events, around a dozen per tumour, compared with thousands of passenger methylation changes. The patterns were consistent across multiple patients and tumour types, indicating a marked improvement in performance over other methods.

Additionally, several DNA methylation cancer drivers were validated by knocking out the affected gene in chronic lymphocytic leukaemia cells. Removing the gene enhanced the growth of untreated cells, supporting the conclusion that this method is more accurate than previously used techniques. The team then demonstrated the usefulness of the algorithm by applying it to a set of chronic lymphocytic leukaemia samples to predict the aggressiveness of each patient's cancer.
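The idea of comparing observed methylation against a background rate, described in this section, can be illustrated with a toy example. The sketch below is not the MethSig model, whose statistical details are not given in this article. It assumes a simple one-sided binomial test with a Bonferroni correction, and the gene names, counts and background rates are invented for illustration.

```python
from math import comb


def binomial_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def candidate_drivers(observed, n_tumours, background_rate, alpha=0.05):
    """Flag genes hypermethylated more often than the background predicts.

    observed:        gene -> number of tumours with promoter hypermethylation
    background_rate: gene -> assumed per-tumour probability of hypermethylation
                     arising by chance (a hypothetical input, not from the study)
    """
    n_tests = len(observed)
    hits = {}
    for gene, count in observed.items():
        p_value = binomial_tail(count, n_tumours, background_rate[gene])
        # Bonferroni correction: scale the p-value by the number of genes tested.
        if p_value * n_tests < alpha:
            hits[gene] = p_value
    return hits


# Invented toy data for 100 tumours.
observed = {"GENE_A": 30, "GENE_B": 6, "GENE_C": 12}
background = {"GENE_A": 0.05, "GENE_B": 0.05, "GENE_C": 0.10}

drivers = candidate_drivers(observed, n_tumours=100, background_rate=background)
print(sorted(drivers))
```

In this toy data only GENE_A is flagged: it is hypermethylated in 30 of 100 tumours against an assumed background of 5%, whereas the other two genes sit close to what their background rates would produce by chance.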

Dr Heng Pan, a senior research associate at Weill Cornell Medicine, said: “The classifier we developed using MethSig produced estimated risks for each patient, and we found that patients with higher estimated risks were more likely to have had worse outcomes.”

“Ultimately, we envision being able to map the entire landscape of cancer-driving DNA methylation changes, for different tumour types and in the contexts of different treatments, so that we can expand the scope of precision medicine beyond genetics to include also the critical dimension of epigenetic changes in cancer,” added Landau.

