10-13 March 2025
Sands Expo and Convention Centre
Marina Bay Sands, Singapore

Location: Room P8 – Peony Jr 4412 (Level 4)

Abstract: Large-scale ‘foundation’ AI models show great promise for scientific discovery, with encouraging results already reported in areas ranging from self-driving laboratories to hypothesis generation. But realizing this promise at scale will require unprecedented quantities of both computation to train models and multidisciplinary human effort to prepare diverse scientific data for use in model training and to construct evaluation suites to guide development. Only a small number of organizations have the resources to build models at state-of-the-art scales (e.g., trillions of parameters, trained using tens of trillions of tokens). This reality is already motivating the formation of multi-institutional teams to work together on model architecture, evaluation, and training, as well as on the collaborative building and sharing of high-quality training data sets. This workshop will highlight such collaborations, which are being catalyzed by the international Trillion Parameter Consortium (TPC). It will also cover progress in various aspects of generative AI for science and engineering, with presentations from academia, national laboratories, HPC centers, industry, institutes, and leaders from funding agencies. Finally, the workshop will introduce the structure and strategies of the TPC, with an overview of high-priority areas in which new collaborators can contribute and benefit from joining the consortium.

For any enquiries, please contact: mohamed.attia@riken.jp

Workshop URL: https://tpc.dev/tpc-workshop-at-sca-2025/

Programme:

Time Session
09:00am – 09:10am Opening Remarks

– Jens Domke, RIKEN-CCS

09:10am – 09:45am Invited Talk #1

– Satoshi Matsuoka, RIKEN-CCS

09:45am – 10:20am Invited Talk #2

– Arvind Ramanathan, Argonne National Laboratory

10:20am – 10:50am Tea Break

10:50am – 11:25am Invited Talk #3

– Speaker from Singapore (TBD)

11:25am – 12:00pm Invited Talk #4

– Speaker from Europe (TBD)

12:00pm – 01:30pm Lunch

01:30pm – 02:00pm Invited Talk #5: Status and roadmap of the TPC

– Charles Catlett, Argonne National Laboratory

02:00pm – 02:20pm Presentation: Algebraic Approaches to Combining Multiple Large Language Models

– J. de Curtò, BSC-CNS

02:20pm – 02:40pm Presentation: MERaLiON-AudioLLM: Bridging Audio and Language with Large Language Models

– Shuo Sun, A*STAR

02:40pm – 03:00pm Presentation: Automated Detection of AI Training Jobs to Enhance Security In HPC Systems

– Francesco Antici, University of Bologna

03:00pm – 03:20pm Tea Break

03:20pm – 03:40pm Presentation: Scientific Data Compression for Large Language Models

– Maximilian Sander, Technische Universität Dresden

03:40pm – 04:00pm Presentation: Advancing Autonomous Microscopy Agents with domain guided dynamic retrieval in a Virtual Foundation Model OS

– Gayathri Saranathan, Hewlett Packard Enterprise

04:00pm – 04:20pm Closing Remarks

– Mohamed Wahib, RIKEN-CCS