Meeting Notes 06-26-2023
Meeting Notes from 06-26-2023
Minutes of SBI-FAIR June 26, 2023, Meeting
**Present: **Geoffrey Fox, Gregor von Laszewski, Przemek Porebski, Kamil Iskra, Xiaodong Yu, Baixi Sun. Vikram Jadhao, Piotr Luszczek, Shantenu Jha, Margaret Lentz
Virginia
- This was presented by Geoffrey
- He described work on new surrogates, including LHC Calorimeter, Epidemiology, Extended virtual tissue, and Earthquake
- He described work on the repository and SABATH
- This involved two existing AI models CloudMask and OSMIBench
- Shantenu Jha asked about the number of inferences per second.
- From MLCommons Science Working minutes, we find for OSMIBench
- On Summit, with 6 GPUs per node, one uses 6 instances of TensorFlow server per node. One uses batch sizes like 250K with a goal of a billion inferences per second
Argonne
- Continue to work on Second-order Optimization Framework for Deep Neural Networks with Communication Reduction
- Baixi Sun presented the details
- He introduced quantization to lower precision QSGD which gives encouraging results, although In one case quantization method failed in the eigenvalue stage
- We removed Rick Stevens from the Google Group
- Geoffrey mentioned his ongoing work on improving shuffling using Arrow vector format; he will share the paper when ready
Indiana
- Vikram gave the presentation without slides
- Continuing study of needed training set for the Ions confinement surrogate Probing Accuracy-Speedup Tradeoff in Machine Learning Surrogates for Molecular Dynamics Simulations | Journal of Chemical Theory and Computation
- New dataset to explore soft labels reflecting computational uncertainty to reduce errors
Rutgers
- Shantenu presented
- Nice paper on surrogate classes with Wes Brewer, who works with Geoffrey on OSMIBench
- Mini-apps for each of the 6 motifs that need FAIR metadata
- 5 motifs use surrogates; one generates them
- He described an interesting workshop on molecular simulations
- He noted that Aurora training trillion parameter foundation model for science
- LLMs need 10 power 8 exaflops: Need to optimize!
- Vikram noted SIMULATION INTELLIGENCE: TOWARDS A NEW GENERATION OF SCIENTIFIC METHODS
Tennessee
- Piotr presented slides
- CosmoFlow on 8 GPUs is running well
- He introduced the MiniWeatherML mini-app
- CUDA-aware pointers must be explicitly specified in the FAIR schema
- Test in PETSc leaves threaded MPI in an invalid state
- Alternative MPIX query interface varies between MPI implementations
- GPU Direct copy support is optional
- SABATH system is moving ahead with a focus on adding MPI support
- Piotr is now the PI of this project at UTK. We removed Cade Brown, Jack Dongarra, and Deborah Penchof from the Google Group
Last modified January 26, 2024: add notes (fa4a2ea)