Exascale Deep Learning and Simulation Enabled Precision Medicine for Cancer

The nation has recently embarked on an all-of-government approach to the problem of cancer, codified in the Obama administration’s ‘Cancer Moonshot’ initiative led by Vice President Biden. Cancer is an extremely complex disease that disrupts basic biological processes at a fundamental level, leading to renegade cells that threaten the health of an individual. To accelerate the capabilities needed to realize the promise envisioned for the Cancer Moonshot, and to establish a new paradigm for cancer research for years to come, the Department of Energy (DOE) entered into a partnership with the National Cancer Institute (NCI) of the National Institutes of Health (NIH). This partnership identified three key challenges on which the combined resources of DOE and NCI can accelerate progress: to provide a better understanding of the disease, to make effective use of the ever-growing volume and diversity of cancer-related data to build predictive models, and, ultimately, to provide guidance and support for decisions on anticipated effective treatments for individual patients.

Four DOE national laboratories are collaborating with the NCI and the NCI-supported Frederick National Laboratory for Cancer Research to address these challenges. The DOE laboratories are drawing on their strengths in HPC, machine learning, and data analytics, and coupling them to the domain strengths of the NCI, particularly in cancer biology and cancer healthcare delivery, to bring the full promise of exascale computing to the problem of cancer and precision medicine.

The first challenge—the “RAS pathway problem”—is to understand the molecular basis of key protein interactions in the RAS/RAF pathway, which is present in 30% of cancers. The second challenge—the “drug response problem”—is to develop predictive models of drug response that can be used to optimize pre-clinical drug screening and drive precision-medicine-based treatments for cancer patients. The third challenge—the “treatment strategy problem”—is to automate the analysis and extraction of information from millions of cancer patient records to determine optimal cancer treatment strategies across a range of patient lifestyles, environmental exposures, cancer types, and healthcare systems.

While each of these three challenges operates at a different scale and has its own scientific team collaborating on data acquisition, data analysis, model formulation, and scientific simulation runs, the challenges share several common threads. They are all linked by common sets of cancer types that appear at all three scales (i.e., molecular, cellular, and population); all must address significant data management and data analysis problems; and all need to integrate simulation, data analysis, and machine learning to make progress. This project focuses on the machine learning aspect of the three challenges and, in particular, on building a single scalable deep neural network code called CANDLE (CANcer Distributed Learning Environment) that can be used to address all three challenges.


In the RAS pathway problem, we guide multi-scale molecular dynamics (MD) runs through a large-scale state-space search, using unsupervised learning to determine the scope and scale of the next series of simulations based on the history of previous simulations. The scale of the deep learning in this problem comes from the size of the state space (O(10^9)) that must be navigated and the number of model parameters needed to describe each state (O(10^12)).
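As a minimal illustration of this idea, the sketch below uses plain k-means clustering (in NumPy) to group featurized simulation states and steer the next round of runs toward the least-sampled region of state space. The clustering choice, feature shapes, and all variable names here are hypothetical stand-ins, not the project's actual method, and the toy data is vastly smaller than the real state space.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=20):
    """Plain k-means; returns (centroids, labels)."""
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each state to its nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its members
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

# toy stand-in for featurized simulation states (rows = completed MD runs)
states = rng.normal(size=(200, 8))
centroids, labels = kmeans(states, k=5)

# steer the next simulations toward the least-sampled cluster
counts = np.bincount(labels, minlength=5)
next_seed = centroids[counts.argmin()]
print("cluster sizes:", counts, "-> seed next runs from cluster", counts.argmin())
```

In the real workflow the "unexplored region" criterion would come from a learned model of simulation history rather than raw cluster counts; the loop structure (featurize, summarize, choose the next batch) is the point of the sketch.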

In the drug response problem, we use supervised machine learning methods to capture the complex, non-linear relationships between the properties of drugs and the properties of tumors to predict response to treatment, and thereby develop a model that can provide treatment recommendations for a given tumor. The scale of this problem derives from the number of relevant parameters describing the properties of a drug or compound (O(10^6)), the number of measurements of important tumor molecular characteristics (O(10^7)), and the number of drug/tumor screening results (O(10^7)).
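The core supervised setup can be sketched on toy data: concatenate drug descriptors with tumor molecular features and fit a model from the joint representation to a measured response. The sketch below uses simple ridge regression in NumPy as a stand-in for the deep networks used in practice; all dimensions, names, and the synthetic response are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# toy features: drug descriptors and tumor molecular profiles
n, d_drug, d_tumor = 300, 16, 32
drug = rng.normal(size=(n, d_drug))
tumor = rng.normal(size=(n, d_tumor))
X = np.hstack([drug, tumor])  # joint (drug, tumor) representation

# synthetic linear "response" with a little noise, purely for illustration
w_true = rng.normal(size=X.shape[1])
y = X @ w_true + 0.1 * rng.normal(size=n)

# ridge regression via the normal equations: w = (X^T X + lam*I)^-1 X^T y
lam = 1.0
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

pred = X @ w
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"in-sample R^2: {r2:.3f}")
```

A deep model replaces the linear map with a learned non-linear one, but the input/output contract is the same: a joint drug/tumor feature vector in, a predicted response out.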

In the treatment strategy problem, we use semi-supervised machine learning to automatically read and encode millions of clinical reports into a form that can be computed upon. These encoded reports will be used by the national cancer surveillance program to understand the broad impact of cancer treatment practices and to drive simulations of entire cancer populations to determine optimal treatment strategies for patient cohorts. The scale of this problem is determined by the number of individual patient records (O(10^8)), the size of the medical vocabulary (O(10^5)), and the size of the structured output record (O(10^5)). When clinical images are added, the input scale jumps an additional two orders of magnitude.
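A toy version of the semi-supervised pattern: start from a handful of labeled reports, classify the rest, and iteratively pseudo-label the most confident predictions. The sketch below uses bag-of-words count vectors and a nearest-centroid classifier in NumPy purely for illustration; the actual system works on clinical text with deep models, and every name and number here is a hypothetical placeholder.

```python
import numpy as np

rng = np.random.default_rng(2)

# toy bag-of-words vectors for "reports"; two latent report classes
def make_docs(n, profile):
    return rng.poisson(profile, size=(n, len(profile))).astype(float)

c0 = np.array([5, 5, 1, 1])  # class-0 vocabulary profile
c1 = np.array([1, 1, 5, 5])  # class-1 vocabulary profile
X = np.vstack([make_docs(50, c0), make_docs(50, c1)])
true = np.array([0] * 50 + [1] * 50)

# only a handful of labeled reports; the rest are unlabeled (-1)
y = np.full(100, -1)
y[[0, 1, 50, 51]] = true[[0, 1, 50, 51]]

# self-training with a nearest-centroid classifier
for _ in range(5):
    cents = np.stack([X[y == k].mean(axis=0) for k in (0, 1)])
    d = np.linalg.norm(X[:, None, :] - cents[None, :, :], axis=2)
    pred = d.argmin(axis=1)
    conf = np.abs(d[:, 0] - d[:, 1])  # distance margin as confidence
    unl = np.flatnonzero(y == -1)
    if len(unl) == 0:
        break
    # pseudo-label the ten most confident unlabeled reports
    take = unl[np.argsort(-conf[unl])[:10]]
    y[take] = pred[take]

acc = (pred == true).mean()
print(f"accuracy on all reports: {acc:.2f}")
```

The value of the semi-supervised loop is that the small labeled seed set is amplified by the model's own confident predictions — the same leverage the project needs when hand-labeled clinical reports are scarce relative to the O(10^8) record volume.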

While each problem requires a different approach to the embedded learning problem (i.e., unsupervised, supervised, and semi-supervised), all can be supported with the same scalable deep learning code. The common requirement across all three problems is a flexible deep learning engine that a) scales to use the full system memory (petabytes), b) effectively utilizes high-performance interconnects, c) effectively exploits advanced memory hierarchies incorporating both ultra-high-bandwidth memory stacks and non-volatile memory, d) takes full advantage of floating-point accelerators for vector and matrix operations, and e) can be optimized for the DOE CORAL, APEX, and exascale systems.
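Requirements a) through d) all serve one common training pattern: synchronous data parallelism, in which each node computes a gradient on its shard of the data and an all-reduce over the interconnect averages the gradients before a shared parameter update. The sketch below shows that pattern in NumPy on an assumed toy linear model, with the all-reduce simulated as an in-process mean; it is a schematic of the communication pattern, not the CANDLE implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

# toy problem: recover w_true for a linear model y = X @ w
n_workers, n_per, dim = 4, 64, 8
w_true = rng.normal(size=dim)
shards = []
for _ in range(n_workers):
    X = rng.normal(size=(n_per, dim))
    shards.append((X, X @ w_true))  # each worker holds its own data shard

w = np.zeros(dim)  # replicated model parameters
lr = 0.05
for step in range(200):
    # each worker computes a local mean-squared-error gradient on its shard...
    grads = [2 * X.T @ (X @ w - y) / n_per for X, y in shards]
    # ...then an all-reduce averages them, and every replica applies the same update
    w -= lr * np.mean(grads, axis=0)

err = np.linalg.norm(w - w_true)
print(f"parameter error after training: {err:.4f}")
```

At scale, the `np.mean` over worker gradients becomes a collective over the interconnect (requirement b), and the per-shard data and activations are what press on memory capacity and hierarchy (requirements a and c).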


Our approach to developing CANDLE is to leverage the best open source tools and ideas developed over the last decade by the deep learning community. Toward that end, we are currently evaluating the top community-developed deep learning systems that have been released as open source to determine whether any of them offers enough leverage as a starting point for CANDLE. This evaluation will be done on model versions of our specific cancer problems and on the interim machines for CORAL (Theta at Argonne, the Summit early-access system at Oak Ridge, and the Sierra early-access system at LLNL). Our development plan calls for annual software releases, continuous scalability and performance characterization, and tracking of key metrics of performance and progress over the course of the project. The end goal of our ECP application is to enable high-performance deep learning in support of the DOE-NCI cancer project. By the end of the ECP project period, we expect to have achieved major advances in the performance of deep learning on DOE leadership architectures, and to have gained an understanding of the capabilities of deep learning to address key computational bottlenecks in cancer research.

The focus of this project is entirely on core CANDLE development, scaling CANDLE to exascale, and demonstrating CANDLE on the three cancer use cases. All software developed in this project will be made available as open source.


Principal Investigator
Rick Stevens
Associate Laboratory Director
Argonne National Laboratory
Professor of Computer Science
University of Chicago








The work performed by Argonne will be in accordance with DOE’s contract with UChicago Argonne, LLC, for the operation of Argonne (Contract No. DE‐AC02‐06‐CH11357) and the MOU between DOE and NCI for the “Exascale Deep Learning Enabled Precision Medicine for Cancer”. The ECP is executed by the DOE’s Office of Science and the National Nuclear Security Administration; it is the DOE’s contribution to President Obama’s National Strategic Computing Initiative (NSCI).