Avoiding Homo Stupidicus - federated design and AI could be an answer to more than nuclear fusion

Martin Banks, August 22, 2023
Summary:
The UKAEA has set itself a 2040 target to get nuclear fusion energy at least to the point of being a real, workable prospect, and is now working with Intel, Dell and Cambridge University to develop the design environment that will be needed to make this dream a reality.

Nuclear fusion, and the abundance of energy that is foretold to spring from it, has been expected ‘within the next 10 years’ for decades now, long enough to have slipped into the world of myth and mystery. The latest projected date has it a bit further out – 17 years, or 2040 – and is being talked up as something a lot firmer than myth. I have, however, decided not to bet any money on it actually happening just yet.

I could, however, be persuaded to put a modest wager on a development that could emerge from the attempt to make fusion workable in that timescale. This has to do with the information processing and management support structure seen as being needed for the design and development of working nuclear fusion. It is currently in the early stages of development, but it holds out hope of a future design approach for many other world problems, one that might just, in the near future, get us to the point where we stop our descent into becoming Homo Stupidicus and instead become Homo Vere Sapiens.

And what connects nuclear fusion with such a thought is the coming together of a UK-based consortium of business, science and academia – namely Intel, Dell, Cambridge University and the United Kingdom Atomic Energy Authority (UKAEA) – to carry out the important first step of building the development environment needed to give UKAEA’s Culham-based STEP fusion energy programme a chance of making usable nuclear fusion a reality.

The two companies and two organisations have deep understandings of their respective areas of contribution, and, with the best will in the world, if the relationship stayed that way it would likely end in partial success in one or two of the areas that go to make up nuclear fusion. But the underlying objective is a good deal greater: to bring together three key elements – AI, the open source community and the rest of the world – to create an environment in which as many players as possible, no matter how big or small their contribution might be, can have some skin in the game.

Arguably, finding a workable solution to nuclear fusion is the Holy Grail of solving energy shortages in the medium term and the climate change crisis in the long term, and the more countries, companies, organisations and people that are involved and contributing, the more chance there is that the solution arrived at will be a good one we can all claim as 'ours’. Of course, I acknowledge the possibility that the result may be the proverbial ‘camel is a horse designed by a committee’, but I suspect that part of the role AI will play here will be to greatly reduce, if not eliminate, the risk of that happening.

So what is happening here?

The UK Atomic Energy Authority needs to get sustainable fusion energy on the grid in the 2040s. Easy enough to say, but to get there will require some serious IT muscle in order to simulate the processes involved and then design the actual, physical systems needed to make it happen. In practice, the first stage is all about developing the IT systems and software that will be needed to run the real simulation and design stages of the fusion system itself. And that first step is already out on the bleeding edge of where technology lies today.

The most recent UK Government Budget announced funding that included the development of a roadmap for exascale supercomputer systems, able to produce a simulated digital version of the planned STEP fusion system before construction of the real plant itself begins. The hope is that this will dramatically reduce the need for real-world validation on a piece-by-piece, system-by-system basis. In the long term, there is also the hope that the final simulated system can become a digital twin of the power plant itself, to be used for tasks such as operational planning, fault diagnosis, anomaly detection, and optimising goals along the future roadmap.

UKAEA’s goal is to take the engineering design process into the virtual world, in the same way that the aerospace sector has moved wind tunnels into the world of computational fluid dynamics, or the automotive sector has moved crash testing into the virtual world using finite element analysis. The challenge, though, is that a fusion reactor is an incredibly complex, strongly coupled system with many inter-dependencies, and the models that underpin the operation of fusion power plants are said to be somewhat limited in their accuracy. There are many coupling mechanisms that have to be taken into account, such as the multiple areas of physics that span the entire load assembly of such a system, from structural forces to thermal heat loads, to electromagnetism and radiation.

The other three consortium members are playing to their specific strengths, with the Cambridge Open Zettascale Lab focusing on the simulation applications and the systems management of what it is already calling an exascale system environment. Intel is in the frame for its Graphics Processing Unit (GPU) chips and associated system management tools, and Dell for its expertise in putting together very large supercomputers and data center systems.

Intel and GPUs is not a pairing that instantly springs to mind, and it prompts the question, ‘Why not NVIDIA?’. The answer is three-fold. First, the company’s Data Center GPU Max devices are seen as particularly well suited to the task in hand. Second come two new software tools: oneAPI, which is reckoned to improve the management of APIs into GPUs significantly, making applications far more scalable while reducing the time spent on their creation, and the DAOS parallel file system for solid state storage, aimed at managing the vast amounts of data produced by the simulations. Thirdly, and perhaps most important, all of it is designed to be open source and hardware agnostic, so it will run on NVIDIA and other GPU devices.

Dell will be developing on from its long-standing PowerEdge servers and systems infrastructures to build a supercomputer capable of exaflop performance – 10¹⁸ calculations per second – based on tens of thousands of GPUs, coupled with high-bandwidth NVMe solid state storage. This is where Intel’s new parallel file system, DAOS, will come into play, as traditional parallel file systems are not good at exploiting NVMe technologies. It is believed that there are significant performance advantages yet to be gained from NVMe with the right file system in place. A rough sense of the scale involved is sketched below.
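For a rough sense of what ‘tens of thousands of GPUs’ means in practice, here is a minimal back-of-envelope sketch in Python. The per-GPU sustained throughput used here is an assumed, illustrative figure, not a quoted specification for any of the hardware mentioned above.

```python
# Back-of-envelope sizing for an exaflop-class machine.
# The per-GPU figure is an assumed, illustrative value, not a specification
# for any of the hardware named in the article.

EXAFLOP = 1e18          # one exaflop: 10^18 floating point operations per second
PER_GPU_FLOPS = 20e12   # assumed sustained throughput per GPU: 20 TFLOPS

gpus_needed = EXAFLOP / PER_GPU_FLOPS
print(f"GPUs needed at 20 TFLOPS each: {gpus_needed:,.0f}")
# prints 50,000 - i.e. 'tens of thousands of GPUs', as described above
```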

The resulting system is expected to cost north of £600 million and to consume some 20 megawatts (MW) of power: an estimated £50 million a year just to plug it in.

Cambridge University’s role will be to create and develop the environment that will produce the multi-physics, multi-timescale surrogate models. These will synthesise the information extracted from fluid, plasma and materials simulations into tools that can be used for engineering design (a minimal illustration of the surrogate idea follows below). It is also targeting a new middleware layer, known as Scientific OpenStack, intended to make such supercomputers accessible to a broad range of scientists and engineers. This is seen as a key part of the democratisation of the fusion development programme.
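To make the surrogate-model idea concrete, here is a minimal Python sketch of the general pattern: fit a cheap statistical model to a handful of expensive simulation runs, then query the model instead of re-running the simulation. This is illustrative only; the `expensive_simulation` function is a hypothetical placeholder, the Gaussian process is one possible choice of surrogate, and the real multi-physics, multi-timescale models will be far more sophisticated.

```python
# Minimal surrogate-model sketch: train a cheap model on a few expensive
# simulation runs, then query it for fast predictions with uncertainty.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_simulation(x):
    """Hypothetical stand-in for a costly multi-physics simulation."""
    return np.sin(3 * x) + 0.5 * x**2

# A small number of 'simulation' runs at chosen design points
X_train = np.linspace(0.0, 2.0, 8).reshape(-1, 1)
y_train = expensive_simulation(X_train).ravel()

# Fit the surrogate to those runs
surrogate = GaussianProcessRegressor(kernel=RBF(length_scale=0.5))
surrogate.fit(X_train, y_train)

# Query the surrogate at new design points, with an uncertainty estimate
X_new = np.array([[0.7], [1.3]])
mean, std = surrogate.predict(X_new, return_std=True)
for x, m, s in zip(X_new.ravel(), mean, std):
    print(f"design x={x:.2f}: predicted output {m:.3f} +/- {s:.3f}")
```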

AI as the design ‘short circuit’

The goal is to use AI and ML to build complex systems, accessible to third-party companies and organisations, that are capable of inferring, and then simulating from those AI-derived inferences, the extremely complex problems of managed fusion energy release. Inference is important here, for engineering has historically been carried out through a process of iterative tests, design based on those tests, prototype construction, evaluation – and repeat as necessary. A sketch of how that loop can be short-circuited follows below.
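The ‘short circuit’ amounts to screening a huge number of candidate designs with a cheap trained model and only passing the most promising ones to a full, expensive simulation. The Python sketch below shows that pattern in its simplest form; every name in it is a hypothetical placeholder rather than anything from the project itself.

```python
# Sketch of surrogate-assisted design screening: rank many candidates cheaply,
# then run the full simulation only for a shortlist. All functions here are
# hypothetical placeholders, not the project's actual tools.
import numpy as np

def surrogate_score(design):
    """Hypothetical trained model returning a predicted figure of merit."""
    return -np.sum((design - 0.5) ** 2)   # toy objective for illustration

def full_simulation(design):
    """Hypothetical expensive multi-physics simulation (placeholder)."""
    return surrogate_score(design) + np.random.normal(scale=0.01)

rng = np.random.default_rng(seed=0)
candidates = rng.uniform(0.0, 1.0, size=(10_000, 4))   # 10,000 candidate designs

# Cheap pass: rank every candidate with the surrogate
scores = np.array([surrogate_score(c) for c in candidates])
shortlist = candidates[np.argsort(scores)[-5:]]          # keep the top five

# Expensive pass: full simulation only for the shortlisted designs
for design in shortlist:
    result = full_simulation(design)
    print(np.round(design, 3), f"simulated figure of merit: {result:.4f}")
```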

This is always time-consuming and expensive, especially given the currently unknowable number of multi-physics/multi-timescale inter-dependent simulations that will be needed within a 17-year timescale. That timescale also has to include working out methods to derive knowledge and insight from all the resulting data, so traditional design methods cannot be used. Indeed, it is yet to be decided what the exascale ecosystem may look like. It could be one large exascale system, with a single machine using a single technology. Alternatively, it could be a federation of large machines of different technologies in different places, which together sum to exascale capability.

My take

The idea of building a distributed cluster of exascale systems, based on common technology and open source software, maps well onto the fact that there are already many other nuclear fusion concepts being developed by a wide variety of nations and organisations. In practice, it is probably safe to say that no single development will result in an ideal solution – or even a poor but just-about-workable one.

At the moment, many of those other concepts are quite different from UKAEA’s STEP developments, but the use of common hardware and open source software across the various development teams opens up the possibility of increasing levels of design commonality as the best design options emerge, not least because many of the fundamental science and technology issues across all of them will be similar, if not exactly the same. Combining all that effort to arrive at a common solution will, perhaps, be a real pointer to Homo Stupidicus being a short-lived back alley.

And should it work, the development model created by the UKAEA team could well prove to be the way all future developments of common, global importance are defined, simulated, modelled and developed by the collaboration of teams from across the world.
