Location: Room P9 – Peony Jr 4411 (Level 4)
Abstract: Recent advances in Deep Learning (DL) have led to many exciting challenges and opportunities. Modern DL frameworks enable high-performance training, inference, and deployment for various types of Deep Neural Networks (DNNs). This tutorial provides an overview of recent trends in DL and the role of cutting-edge hardware architectures and interconnects in moving the field forward. We survey different DNN architectures, DL frameworks, and DL training and inference, with a special focus on parallelization strategies. We highlight challenges and opportunities for communication runtimes to exploit high-performance CPU/GPU architectures to efficiently support large-scale distributed training.
We also highlight some of our co-design efforts to utilize MPI for large-scale DNN training on cutting-edge CPU and GPU architectures available on modern HPC clusters. Throughout the tutorial, we include several hands-on exercises to enable attendees to gain first-hand experience of running distributed DL training and inference on a modern GPU cluster.
For any enquiries, please contact panda@cse.ohio-state.edu.
Workshop URL: https://nowlab.cse.ohio-state.edu/tutorials/scasia25-hidl/
Programme:
- Introduction
  - The Past, Present, and Future of Artificial Intelligence (AI)
  - Brief History and Current/Future Trends of Machine Learning (ML) and Deep Learning (DL)
  - What are Deep Neural Networks?
  - Deep Learning Frameworks
- Deep Neural Network Training
- Distributed Data-Parallel Training
  - Basic Principles and Parallelization Strategies
  - Hands-on Exercises (Data Parallelism) using PyTorch and TensorFlow
- Latest Trends in High-Performance Computing Architectures
  - HPC Hardware
  - Communication Middleware
- Advanced Distributed Training
  - State-of-the-art Approaches using CPUs and GPUs
  - Hands-on Exercises (Advanced Parallelism) using DeepSpeed
- Distributed Inference Solutions
  - Overview of DL Inference
  - Case Studies
- Open Issues and Challenges
- Conclusions and Final Q&A
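The data-parallel pattern covered in the hands-on exercises can be sketched framework-agnostically: each worker computes gradients on its own data shard, the gradients are averaged across workers (the role MPI's allreduce plays in the runtimes discussed above), and every replica applies the same update. The following is a minimal pure-Python illustration of that idea, not the tutorial's actual exercise code; all names here are hypothetical stand-ins:

```python
# Illustrative sketch of distributed data-parallel training (hypothetical,
# pure-Python stand-in for what an MPI/NCCL allreduce does in a real framework).
# Model: scalar linear regression y = w * x with mean-squared-error loss.

def grad(w, xs, ys):
    """MSE gradient dL/dw computed on one worker's data shard."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def allreduce_mean(values):
    """Stand-in for an allreduce: every worker receives the global mean."""
    return sum(values) / len(values)

# Global dataset (y = 2x), split evenly across two "workers".
xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
shards = [(xs[:2], ys[:2]), (xs[2:], ys[2:])]

w, lr = 0.0, 0.01
for _ in range(100):
    local_grads = [grad(w, sx, sy) for sx, sy in shards]  # each worker uses only its shard
    g = allreduce_mean(local_grads)                       # average gradients across workers
    w -= lr * g                                           # identical update on every replica

print(round(w, 3))  # converges toward the true weight w = 2.0
```

Because the shards are equal-sized, the averaged gradient is exactly the full-batch gradient, so the replicas stay in lockstep with single-process training; this equivalence is the basic principle behind the PyTorch and TensorFlow data-parallel exercises.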