How new -omics data and deep learning are accelerating biomarker development for cancer

By Angela Qu, M.D., Ph.D., Senior Vice President, Translational Medicine, Global Head of Biomarkers & Genomic Medicine

Published on: Jan 21, 2025

Biomarkers are essential to diagnosing cancer, staging its progression, and identifying which patients will benefit most from precisely targeted therapies. They allow sponsors to design faster, more efficient trials by selecting the patients most likely to respond to treatment with minimal side effects. However, there are no reliable and qualified biomarkers for many cancers. At Parexel, we use next-generation analytic tools and technologies to help clients identify diagnostic, prognostic, and predictive biomarkers that can be validated in subsequent development. We asked Angela Qu to describe how recent advances in multi-omics and computational analytics accelerate the discovery and validation of biomarkers in precision oncology.

Can you briefly describe the newest multi-omics data?

Scientists started by looking at the DNA in genes (genomics) and the expression of genes at the RNA level (transcriptomics) and protein products (proteomics). We also needed to understand how chemical markers turn genes on and off yet don’t change the DNA sequence (epigenomics). All these processes interact with the whole human metabolism (metabolomics), and we must also account for that. Together, we call these diverse data streams the “omics.” They have transformed how we search for and validate biomarkers in cancer.

Cancer cells that escape both the immune system and targeted treatments multiply unchecked, causing disease progression. Limiting their ability to do that is critical for patients and their families. Scientists have recently begun studying the genome's content and activity within a specific anatomical context (spatial genomics). This is exciting because it could be a sea change for preclinical and clinical research in precision oncology. Cancer cells are very challenging to kill because they reside in a three-dimensional, rapidly changing environment. By mapping a tumor's local microenvironment, we may better understand how malignancies evade immune cells’ surveillance and continue to access nutrient-rich surroundings in specific locations in the body.

How does multi-omics data drive biomarker discovery?

Conventional biomarkers measure the level or presence of one modality, such as the protein expression of a single gene (or a set of genes) or a specific type of circulating cell. Multi-omics increasingly enables us to construct more complex hybrid biomarkers that may diagnose disease, establish prognosis, and predict patient outcomes more accurately.

Recently, we helped a company discover a predictive biomarker for the combination therapies it is developing to treat a solid tumor. We assigned a scientist from our in-house Biomarkers and Genomic Medicine team, with oncology-specific expertise in computational biology and statistical genetics, to support the work. The project is extraordinarily complex because it combines multiple data tiers, including circulating tumor DNA (ctDNA), cytokines, tissue and liquid biopsies, and imaging data. The goal is to construct and then validate a composite biomarker (or set of biomarkers) that can differentiate responders from non-responders, since no single tier of information, such as a mutation in one gene, has yielded a predictive biomarker in this challenging cancer. Parexel’s scientist works in residence with the sponsor, analyzing their data, distilling insights, and summarizing and reporting results. This highly effective collaboration model has consistently delivered new methodologies and innovative biomarkers for multiple clients.

Many companies are working hard to find predictive biomarkers to identify patients who will respond well to combination therapies. Predicting the therapeutic impact of combining two targeted therapies in individuals with cancer is a complex task. In this case, the sponsor made a substantial investment because they intend to use the analytical methodologies refined for this project across their portfolio of targeted oncology treatments.

Are recent advances in AI/ML/DL impacting the field?

Advances in computational AI analytics, including machine learning (ML) and deep learning (DL), have allowed us to combine conventional omics with newer omics, such as spatial genomics. The ever-expanding body of omics data is so large that we could not make sense of it without these powerful new tools.

For example, ML/DL can automate the processing and classification of imaging data from histopathology, which is used to diagnose and prognose cancer patients and predict their response to therapy. Digital histopathology—the visual interpretation of cellular and tissue biology captured in slide images that have been digitized—requires pattern identification and analysis. Traditionally, this was done manually by expert pathologists. The emergence of non-traditional, composite biomarkers that incorporate imaging data would not be possible without automating this historically manual process.

Has AI/ML/DL produced concrete results yet?

Absolutely! Our computational scientists recently collaborated with a large pharmaceutical company to improve an existing methodology for histological subtyping in lung cancer. The joint scientific team trained a deep neural network using whole-slide images derived from proprietary and public data sources, including The Cancer Genome Atlas (TCGA), a publicly accessible repository housing more than two petabytes of rich genomic, epigenomic, transcriptomic, and proteomic data. The goal was to improve the accuracy, specificity, and sensitivity of the subtyping methodology. By applying AI/ML/DL-based computational pathology algorithms, we developed several promising methodological approaches that may soon help diagnose other cancers or cancer subtypes.
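
To make those three metrics concrete, here is a minimal sketch of how accuracy, sensitivity, and specificity are computed for one subtype treated as the positive class against all others. The class name, function names, and counts are hypothetical and chosen only for illustration; they are not results from the project described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Slide counts for one subtype treated as the positive class (one-vs-rest)."""
    tp: int  # slides of the subtype, correctly called as the subtype
    fp: int  # slides of other subtypes, wrongly called as the subtype
    tn: int  # slides of other subtypes, correctly not called as the subtype
    fn: int  # slides of the subtype that the classifier missed


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true subtype slides that the classifier detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of other-subtype slides that the classifier correctly rejects."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all slides that the classifier labels correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Hypothetical counts, for illustration only.
example = ConfusionCounts(tp=90, fp=20, tn=80, fn=10)
print(f"sensitivity={sensitivity(example):.2f}")  # sensitivity=0.90
print(f"specificity={specificity(example):.2f}")  # specificity=0.80
print(f"accuracy={accuracy(example):.2f}")        # accuracy=0.85
```

Reporting all three matters because a classifier can look accurate overall while still missing a clinically important share of one subtype.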

In another successful project, our Biomarkers and Genomic Medicine team worked closely with a sponsor to apply a novel DL approach to microscopy slide images of chimeric antigen receptor (CAR) T cells, which enabled automated immune cell characterization. The resulting algorithm demonstrated high predictive accuracy in differentiating cellular phenotype features. We are still refining and validating this methodology, but it holds significant promise for the discovery and development of CAR T cell therapies and cellular imaging biomarkers. There are persistent gaps in our understanding of the immune environment of cancer, including the roles of different types of immune cells. Monitoring this microenvironment could allow us to quantify a treatment’s impact on immune cells, because microscopy can be performed pre- and post-treatment and methods such as DL can characterize cellular features in far greater detail than manual review.

What is the top challenge for companies developing biomarkers?

Most oncology-focused companies are eager to incorporate biomarkers because they recognize their value in clinical development. It’s encouraging to see more oncology drug developers consider collecting and correctly storing patient samples for biomarker research. However, after completing early exploratory biomarker research, they often underestimate the effort required for the biomarker's technical, analytical, and clinical validation.

Technical validation involves confirming the assay’s ability to measure what it is supposed to measure with accuracy and reproducibility. Clinical validation requires a clinical study to demonstrate that the biomarker correlates with a response or a safety signal, depending on its intended use. To properly validate a biomarker, you need to plan early. For example, the clinical study must be sufficiently powered (enrolling enough patients) to validate the biomarker. Companies must also determine if there is a gold-standard biomarker that can be used as a comparator in the study.
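
As a rough illustration of what "sufficiently powered" means, the sketch below uses the standard normal-approximation formula for comparing two proportions to estimate how many patients are needed in each group to detect a difference in response rate between biomarker-positive and biomarker-negative patients. The function name and the response rates are assumptions made for this example; a real study would rely on a statistician and a design-specific calculation.

```python
import math
from statistics import NormalDist


def patients_per_group(p_pos: float, p_neg: float,
                       alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate patients needed per group for a two-sided test of two proportions.

    p_pos and p_neg are the assumed response rates in biomarker-positive and
    biomarker-negative patients (hypothetical planning values).
    """
    if p_pos == p_neg:
        raise ValueError("The assumed response rates must differ.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p_pos * (1 - p_pos) + p_neg * (1 - p_neg)
    n = (z_alpha + z_beta) ** 2 * variance / (p_pos - p_neg) ** 2
    return math.ceil(n)


# Hypothetical planning values: 50% response if positive, 20% if negative.
print(patients_per_group(0.50, 0.20))  # 36
```

Smaller expected differences between the groups drive the required enrollment up quickly, which is one reason validation needs to be planned before the study starts.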

Many companies are surprised by the high levels of variability and heterogeneity in biomarker research. Variability exists in patient populations, laboratory performance and qualification, testing platforms, and sample handling, among other things. These factors can impact a biomarker's performance and must be addressed during validation. For example, ensuring assay standardization across different laboratories, platforms, and study sites is a considerable challenge. The goal is to produce consistent, comparable results wherever the test is run. At Parexel, we guide sponsors on assay selection, choosing suitable laboratories, and prioritizing technical considerations during development and validation.
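
One simple way to put a number on between-laboratory variability is the coefficient of variation (CV) of results obtained when several laboratories measure the same reference sample. The sketch below is illustrative only: the laboratory names and values are hypothetical, and what counts as an acceptable CV depends on the assay and its intended use.

```python
from statistics import mean, stdev


def percent_cv(values: list[float]) -> float:
    """Coefficient of variation, as a percentage, using the sample standard deviation."""
    if len(values) < 2:
        raise ValueError("At least two measurements are needed.")
    average = mean(values)
    if average == 0:
        raise ValueError("CV is undefined when the mean is zero.")
    return 100 * stdev(values) / average


# Hypothetical mean results for the same reference sample at four laboratories.
lab_means = {"lab_a": 10.0, "lab_b": 12.0, "lab_c": 11.0, "lab_d": 9.0}
cv = percent_cv(list(lab_means.values()))
print(f"inter-laboratory CV: {cv:.1f}%")  # inter-laboratory CV: 12.3%
```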

What keeps you inspired in this work?

It’s inspiring to see that AI and other analytical advances accelerate cancer drug development and provide direct patient benefits, and we are proud to be part of this journey. You need a diagnostic biomarker for an accurate diagnosis, a prognostic biomarker to stratify the patient population, and a predictive biomarker to find patients with the best chance of a response. These biomarkers help define the sometimes exhaustive inclusion/exclusion criteria for trials of precisely targeted therapies and help guide treatment with existing therapies.

Generating new insights and solving specific problems in collaboration with sponsors are how we will fulfill the promise of precision oncology and benefit more patients. I consider it a privilege to lead and work with Parexel’s global biomarker team of about 50 talented scientists with expertise in genomics and genetics, bioanalysis, computational biology, statistical genetics, and data engineering. Working alongside this remarkable team, I am inspired and driven to share my knowledge and learn new things every day!
