Meeting Notes 10-31-2022
Minutes of SBI-FAIR October 31, 2022, Meeting
Present: Kamil Iskra, Xiaodong Yu, Peter Beckman, Deborah Penchoff, Shantenu Jha, Geoffrey Fox, Piotr Luszczek, Baixi Sun, Vikram Jadhao, Gregor von Laszewski
Updates
Virginia
Geoffrey discussed
- The transfer of the DOE grant is completed
- The Tsunami surrogate (see last meeting) is finished while the diffusion-based surrogate is still being finalized
- Rough draft of the diffusion model for cell simulations: "Generalization and Transfer Learning in a Deep Diffusion Surrogate for Mechanistic Real-World Simulations." Of interest are the study of dataset sizes from 5,000 to 400,000 and the importance of dealing with the large numeric range in computed values
- We discussed Margaret Lentz’s request for a project presentation
- Draft after SC22, with the final presentation on November 28, 1-2 pm, as finalized with Margaret
- Some integrating slides, then 4-6 slides from each team covering past work, remaining work in the grant, and what to do after the grant
- Pete reminded us not to forget FAIR!
- Geoffrey will make a plan
Argonne
- Their VLDB 2023 paper, "SOLAR: A Highly Optimized Data Loading Framework for Distributed Training of CNN-based Scientific Surrogates," was discussed
- The paper looks at the training of 3 surrogates and addresses the disk I/O overhead that dominates training performance
- They compare with the PyTorch DataLoader and with NoPFS, from the paper "Clairvoyant Prefetching for Distributed Machine Learning I/O" (arXiv:2101.08734) from Torsten Hoefler, presented at the last SC meeting. NoPFS performs optimized prefetching
- The shuffle is optimized to minimize redistribution, which leads to an improvement factor of 3.5 over NoPFS and 24 over default PyTorch
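As a rough illustration of the general prefetching idea mentioned above, the following Python sketch loads batches in a background thread while the consumer works on earlier ones. It is not the SOLAR or NoPFS implementation; the names `prefetch`, `load_batch`, and `depth` are hypothetical and chosen only for this example.

```python
import queue
import threading


def prefetch(batches, depth=2):
    """Yield items from `batches` while a background thread loads ahead.

    `depth` (hypothetical parameter) bounds how many batches are buffered.
    """
    buf = queue.Queue(maxsize=depth)
    sentinel = object()  # marks the end of the stream

    def worker():
        # Pull batches from the source and hand them to the consumer.
        for batch in batches:
            buf.put(batch)
        buf.put(sentinel)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buf.get()
        if item is sentinel:
            return
        yield item


def load_batch(i):
    # Hypothetical stand-in for a slow disk read of batch i.
    return [i * 10 + k for k in range(3)]


# The generator expression is consumed by the background thread,
# so loading overlaps with whatever the consumer does per batch.
loaded = list(prefetch(load_batch(i) for i in range(4)))
```

The bounded queue keeps memory use fixed while still hiding load latency, provided the consumer is not faster than the loader on average.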
Tennessee
Piotr reported that Cade Brown has left and they are hiring a replacement.
Rutgers
Shantenu reported
- Their team had identified 6 categories of AI enhancing HPC and was studying performance
- He returned to the topic of large language models (LLMs), which can be effective in chemistry
Indiana University
Vikram reported that
- They were continuing the study of accuracy and robustness from last time, as well as:
- Dataset size
- Ensemble issues
- Definition of speedup