About SBI FAIR

What we do ...

The Surrogate Benchmark Initiative (SBI) project will create a community repository and FAIR data ecosystem for HPC application surrogate benchmarks, including data, code, and all relevant collateral artifacts the science and engineering community needs to use and reuse these data sets and surrogates.

Abstract

Computational science is being revolutionized by the integration of AI and simulation and, in particular, by deep learning surrogates that can replace all or part of traditional large-scale HPC computations. Surrogates can achieve remarkable performance improvements (e.g., several orders of magnitude) and so save both time and energy. The Surrogate Benchmark Initiative (SBI) project will create a community repository and FAIR data ecosystem for HPC application surrogate benchmarks, including data, code, and all relevant collateral artifacts the science and engineering community needs to use and reuse these data sets and surrogates. We intend our repositories to generate active research from both the participants in our project and the broad community of AI and domain scientists. By collaborating with MLPerf, the major industry organization in this area, and mirroring their process as much as possible, we will both increase the value of the SBI benchmarks and obtain industry involvement in them. MLPerf is a major effort with over 1400 members from over 80 member institutions (mainly from industry) and strong existing involvement of the Department of Energy laboratories through the HPC and science data MLPerf working groups. We will build tutorials around each deposited benchmark, which will allow users from a broad range of fields (shown in collaboration letters) to make new surrogates and new SBI benchmarks based on the initial set of four that we will produce in house. We will set up working groups and other community activities that will advance all issues around the surrogate concept. In particular, we will take the requirements exhibited in the benchmarks and produce general middleware to support the generation (training from HPC simulations) and use of surrogates. This will also make it easier for general users to develop new surrogates and so make the major performance increases pervasive across DOE computational science.
In these ways SBI benefits application communities and computer systems research. SBI will also support several AI research areas, which we will advance in our project. First, our benchmarks will drive research on efficient generic surrogate architectures and how they fit with different hardware systems. Second, we will research uncertainty quantification for surrogate estimates; we expect future surrogates will always come with this built in. Third, there will be important studies of the amount of training data needed to get reliable surrogates for a given accuracy choice. Finally, we have already derived some simple but effective performance models or surrogates, but these need extension as deeper uses of surrogates become understood and are exhibited in our repository depositions.

Funding

This project is funded by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, under Award No. DE-SC0023452.