Research

Reconstructing evolutionary history of cancer genomes

Reconstructing the evolutionary history of cancer is important for early detection, prognosis, and treatment. Phylogenetic inference methods have been widely used to reconstruct the evolutionary history of cancer cells from different types of markers.

Phylogeny inference from copy number profiles of multiple samples

Using copy number alterations (CNAs) as markers, we have developed CNETML, the first program to jointly infer the tree topology, node ages, and CNA rates from longitudinal samples.

We are interested in developing new methods to enrich the toolbox of cancer phylogenetics.

Stochastic modelling of somatic evolution

Linking genomics with stochastic modelling and Bayesian inference provides a powerful approach to quantify somatic evolution, which may help to predict disease progression and drug response.

Parameter inference with approximate Bayesian computation (ABC)

We have applied this approach to model CNAs and structural variants (SVs) resulting from chromosomal instability in experimental and patient data.

Our previous inferences of important prognosis-related parameters including chromosome mis-segregation rates and selection strengths were limited by the mixture of signals in bulk sequencing data.

We are interested in improving inference at higher resolution using increasingly available single-cell data.
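As an illustration of the general idea behind ABC, the following is a minimal rejection-sampling sketch in Python. It is not the model used in our work: the toy simulator, the uniform prior, the summary statistic (total number of mis-segregation events), and all names and parameter values are assumptions chosen only to keep the example small.

```python
import random


def simulate_missegregations(rate, n_divisions, n_chromosomes, rng):
    """Toy simulator (assumed): each chromosome mis-segregates
    independently with probability `rate` at every division.
    Returns the total number of mis-segregation events."""
    return sum(
        1
        for _ in range(n_divisions * n_chromosomes)
        if rng.random() < rate
    )


def abc_rejection(observed, n_divisions, n_chromosomes,
                  n_draws, tolerance, prior_max, rng):
    """ABC rejection sampling: draw a rate from a Uniform(0, prior_max)
    prior, simulate data, and keep the draw if the simulated summary
    statistic is within `tolerance` of the observed one."""
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.0, prior_max)
        simulated = simulate_missegregations(
            rate, n_divisions, n_chromosomes, rng
        )
        if abs(simulated - observed) <= tolerance:
            accepted.append(rate)
    return accepted


rng = random.Random(0)          # fixed seed for reproducibility
n_divisions, n_chromosomes = 50, 23

# "Observed" data generated from a known rate so the result can be checked.
true_rate = 0.02
observed = simulate_missegregations(true_rate, n_divisions, n_chromosomes, rng)

posterior = abc_rejection(
    observed, n_divisions, n_chromosomes,
    n_draws=2000, tolerance=3, prior_max=0.1, rng=rng,
)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} draws, posterior mean rate {posterior_mean:.4f}")
```

The accepted draws approximate the posterior distribution of the rate. A smaller tolerance gives a closer approximation at the cost of fewer accepted draws, which is one reason richer data such as single-cell profiles can help.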

Investigating intratumour heterogeneity and clonal evolution in cancer genomes

Intratumour heterogeneity (ITH) and clonal evolution often cause therapy failure and drug resistance in cancer patients. We previously analyzed genomic ITH in lung adenocarcinoma (LUAD) and hepatocellular carcinoma (HCC) patients.

The flow chart of PSiTE

To evaluate the performance of different variant callers and clonal decomposition methods, we also developed a phylogeny-guided simulator for tumour evolution (PSiTE).

We are interested in developing new methods to decompose ITH and decipher clonal evolution.

Detecting lateral gene transfer and recombination in microbial genomes

Lateral gene transfer (LGT) and recombination are common and important evolutionary processes in microbes. We have developed two machine learning methods to predict genomic islands (GIs), large genomic regions probably acquired by LGT that may contain genes related to pathogenesis and antibiotic resistance.

The flow chart of GI-Cluster

In collaboration with Dr Tim Downing, Dr Caroline Wright, and Dr Xiatian Zhu, we will apply and adapt these methods to nucleocytoplasmic large DNA viruses which cause extensive livestock diseases.

Phylogenetic networks are becoming essential to represent complex evolutionary relationships when LGT and other reticulation events are involved.

An example of a phylogenetic network

In collaboration with Prof. Louxin Zhang, we developed algorithms for two fundamental problems in phylogenetic networks: the tree containment problem (TCP) and the cluster containment problem (CCP). We are interested in solving other related problems.
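To make the cluster containment problem concrete, here is a small brute-force sketch in Python. It is not one of the algorithms mentioned above: it simply enumerates every way of keeping one parent per reticulation node (so it is exponential in the number of reticulations) and collects the leaf sets of the resulting trees. The dictionary-based network representation, the function names, and the example network are all assumptions made for illustration.

```python
from itertools import product


def softwired_clusters(children):
    """Return all clusters (leaf sets) found in at least one tree obtained
    by keeping exactly one incoming edge per reticulation node.

    `children` maps each internal node to a list of its children;
    nodes that never appear as keys are leaves."""
    parents = {}
    for parent, kids in children.items():
        for kid in kids:
            parents.setdefault(kid, []).append(parent)
    # Reticulation nodes are those with more than one parent.
    reticulations = sorted(n for n, ps in parents.items() if len(ps) > 1)
    nodes = set(children) | set(parents)

    def leaves_below(node, kept_parent):
        kids = children.get(node, [])
        if not kids:
            return frozenset([node])  # a leaf is its own cluster
        result = frozenset()
        for kid in kids:
            # Follow an edge into a reticulation only if it was kept.
            if kid not in kept_parent or kept_parent[kid] == node:
                result |= leaves_below(kid, kept_parent)
        return result

    clusters = set()
    for choice in product(*(parents[r] for r in reticulations)):
        kept_parent = dict(zip(reticulations, choice))
        for node in nodes:
            cluster = leaves_below(node, kept_parent)
            if cluster:  # skip nodes left with no leaves below them
                clusters.add(cluster)
    return clusters


def cluster_contained(children, cluster):
    """Brute-force check of whether `cluster` is displayed by the network."""
    return frozenset(cluster) in softwired_clusters(children)


# Hypothetical network: node "h" is a reticulation with parents "a" and "b".
network = {
    "r": ["a", "b"],
    "a": ["1", "h"],
    "b": ["h", "3"],
    "h": ["2"],
}
print(cluster_contained(network, {"1", "2"}))
print(cluster_contained(network, {"2", "3"}))
print(cluster_contained(network, {"1", "3"}))
```

In this example, leaf "2" can be grouped with "1" or with "3" depending on which parent of "h" is kept, but "1" and "3" never form a cluster without "2".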

Integrating multi-omics data to characterize cancer evolution and beyond

To understand the processes that shape cancer evolution and to inform treatment, it is necessary to integrate measurements of various types. High-throughput sequencing has been generating huge amounts of multi-omics data, which provide a rich resource for addressing important questions in cancer evolution. However, it is challenging to systematically integrate these heterogeneous data types. Machine learning (ML) has emerged as a promising technique for multi-omics data integration, but many challenges remain, including high-dimensional data with low sample sizes, noise and missing information, and biological interpretation. We are interested in developing new ML methods to facilitate the integration of multi-omics data.

Drug repurposing with ex vivo drug sensitivity testing and computational omics

Although many drugs and treatment options exist for cancer, treatments often fail because of intratumour heterogeneity, metastasis, and drug resistance. Developing new cancer drugs remains very expensive and time-consuming. Drug repurposing offers a cost-effective way to provide patients with affordable and effective individualized treatments.

In collaboration with Dr Mutsa Takundwa, we will develop new integrative strategies with machine learning to identify effective drug combinations for cancers of interest.

Reconstructing and analyzing cancer genome graphs

SVs often alter large genomic regions and play an important role in cancer progression. However, the complete landscape of SVs in cancer genomes has been understudied because of technical limitations, although it is gradually becoming clearer with new sequencing techniques such as long-read sequencing.

An example of reconstructed cancer genome graphs

Because SVs are extremely varied, graph-based genome representations provide a natural way to analyze them, but the utility of these graphs has not been fully exploited. We are interested in developing new approaches to better understand the patterns and mechanisms of SVs with cancer genome graphs.