Two topics of continuing interest to me are supercomputing and cancer research. The former because I was involved in it in the past, and the latter because the time and expense it involves make me wonder if anyone is actually serious about it. For example, every scientific article I read is about treating cancer, especially developing new drugs, but doesn't address the more critical, foundational issue: why is there cancer?
In that vein, an article crossed my desk from Frontiers in Oncology: AI Meets Exascale Computing: Advancing Cancer Research With Large-Scale High-Performance Computing.
That's the message you hear about supercomputers, especially the forthcoming "exascale" computer, which is promoted by the Department of Energy as the promise to cure cancer.
Before examining what this means, let's look at a brief history of supercomputing to put this into perspective. In June 1997, I was involved with what was then the fastest supercomputer ever built, Intel's ASCI Red at Sandia Labs. It was the world's first computer to achieve one teraFLOP and beyond. A teraFLOP is a trillion floating-point calculations per second.
It used a distributed-memory MIMD (Multiple Instruction, Multiple Data) message-passing architecture. Every compute node had two 200 MHz Pentium Pro processors, each with a 16 KB level-1 cache and a 256 KB level-2 cache. These were later upgraded to two 333 MHz Pentium II OverDrive processors, each with a 32 KB level-1 cache and a 512 KB level-2 cache. According to Intel, ASCI Red was also the first large-scale supercomputer to be built entirely of commercially available components.
ASCI Red had the best reliability of any supercomputer ever built and was supercomputing's high-water mark in longevity, price, and performance.
It should not come as a surprise that ASCI Red was developed at Sandia National Laboratories. If you're not familiar with the National Labs under the direction of the Department of Energy, Sandia, Los Alamos, and Livermore are all under the directive of the National Nuclear Security Administration. Los Alamos was always the "Theoretical Division." Sandia designed and built nuclear weapons. ASCI Red was designed to simulate new nuclear weapons and to simulate the efficacy of the existing nuclear weapon stockpile because nuclear testing was banned.
Today, twenty-two years later, two IBM-built supercomputers, Summit and Sierra, installed at the Department of Energy’s Oak Ridge National Laboratory (ORNL) in Tennessee and Lawrence Livermore National Laboratory in California, respectively, hold the top two positions as the fastest supercomputers, at 148.6 petaflops for Summit and 94.6 petaflops for Sierra.
Taking up 7,000 square feet (650 sq m), Sierra has 240 computing racks and 4,320 nodes. Each node has two IBM Power9 CPUs, four Nvidia V100 GPUs, a Mellanox EDR InfiniBand interconnect, and Nvidia's NVLink interconnect. Across 24 racks of Elastic Storage Servers, Sierra has 154 petabytes of IBM's software-defined parallel file system Spectrum Scale. The 11MW system is thought to be five times as power-efficient as its predecessor, Sequoia.
In other words, Summit is about 150,000 times faster than ASCI Red was when it first reached one teraFLOP. What do you do with a computer that is 150,000 times faster than one that can design and simulate the launch and detonation of nuclear weapons?
The United States Department of Energy awarded a $325 million contract in November 2014 to IBM, Nvidia, and Mellanox. The effort resulted in the construction of Summit and Sierra. Summit is tasked with civilian scientific research and is located at the Oak Ridge National Laboratory in Tennessee. Sierra is designed for nuclear weapons simulations and is located at the Lawrence Livermore National Laboratory in California. Summit is estimated to cover the space of two basketball courts and require 136 miles of cabling. Researchers will shortly utilize Summit for diverse fields such as cosmology, medicine, and climatology.
In 2015, the project called Collaboration of Oak Ridge, Argonne, and Lawrence Livermore (CORAL) included a third supercomputer, Aurora, which was planned for installation at Argonne National Laboratory. By 2018, Aurora had been re-engineered as an exascale computing project, with completion anticipated in 2021. Two more exascale systems, Frontier and El Capitan, are to be completed shortly after that.
Putting all of this into perspective, by 2021, there will be supercomputers that operate a million times faster than a computer that was used to simulate nuclear weapons in 1997. At the moment, Sierra is a massive connected hive of 190,000 processing cores performing astrophysics, climate, and precision medicine simulations in a “shake-down cruise” while testing for bad components and other technical hiccups. But early next year, Sierra’s real work will begin. The system will be "air-gapped," meaning that it will be disconnected from any external network to prevent unauthorized access. Once that happens, it can begin the calculations it was purpose-built to carry out: simulations of nuclear weapons launches and detonations, not curing cancer. Summit, the faster one at Oak Ridge, will be providing resources for computational science.
It makes me wonder why Summit is at Oak Ridge and Sierra at Livermore. Also, I can find no information about whether or not Summit will be air-gapped.
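The ratios quoted in this piece are easy to check. Here is a minimal Python sketch, assuming ASCI Red's baseline is its one-teraFLOP milestone and that "exascale" means 10^18 floating-point calculations per second. The variable and function names are my own, chosen for illustration.

```python
# Back-of-the-envelope check of the speed ratios quoted in this article.
TERA = 10**12  # one trillion
PETA = 10**15
EXA = 10**18

# Assumption: use ASCI Red's 1997 milestone of one teraFLOP as the baseline.
asci_red = 1 * TERA
summit = 148.6 * PETA    # figure quoted above for Summit
sierra = 94.6 * PETA     # figure quoted above for Sierra
exascale = 1 * EXA       # assumption: exactly one exaFLOP


def speedup(new_rate, old_rate):
    """Return how many times faster new_rate is than old_rate."""
    return new_rate / old_rate


for name, rate in [("Summit", summit), ("Sierra", sierra), ("Exascale", exascale)]:
    print(f"{name} vs ASCI Red: about {speedup(rate, asci_red):,.0f}x")
```

Under those assumptions, Summit comes out at roughly 148,600 times ASCI Red, which is where "about 150,000" comes from, and an exascale machine at a million times.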
Back to cancer. More from the Frontiers in Oncology article mentioned above:
The application of data science in cancer research has been boosted by major advances in three primary areas: (1) Data: diversity, amount, and availability of biomedical data; (2) Advances in Artificial Intelligence (AI) and Machine Learning (ML) algorithms that enable learning from complex, large-scale data; and (3) Advances in computer architectures allowing unprecedented acceleration of simulation and machine learning algorithms.
To break that down:
- This has little to do with supercomputing. Availability of data for "omics" research is still a huge problem because it is siloed in thousands of centers, many of which are not even networked, and standards are almost non-existent.
- I am not sure what “advances” they are referring to, but less esoteric resources for large-scale ML processing are certainly available.
- Simulation? Yes. 1,000,000,000,000,000,000 floating-point calculations per second would pack quite a wallop in simulating biological models for understanding the actions of proteins, epigenetic effects of changes in methylation…how cancer works.
What I’m left wondering, though, is this: is computational biology useful when even the basics are not yet understood? More vexing than anything else is this line from the article: "the curated, population-level SEER data provide a rich information source for data-driven discovery to understand drivers of cancer outcomes in the real world."
The data from SEER is about people with cancer. Everything in cancer research, and in this report, is about treatment. I'm not suggesting there is anything wrong with improving treatment for better outcomes AND quality of life, but between therapies and early detection, everything is about dealing with cancer. It never seems to address the fundamental question: "Why is there cancer?" Why would nature provide for something so destructive?
The word count in the Frontiers in Oncology piece says it all: Drug, 17. Cause, 0. That needs to change.