Compilers for Machine Learning
Machine learning applications are becoming ubiquitous in large-scale production systems. With that growth, and with the scaling of data volume and model complexity, the focus on executing machine learning models efficiently has become even greater. The push for increased energy efficiency has led to the emergence of diverse heterogeneous systems and accelerator architectures. In parallel, model complexity and diversity have pushed for higher-productivity systems: more powerful programming abstractions, type systems, language embeddings, frameworks, and libraries. Compilers have historically been the bridge between programmer productivity and high-performance code, allowing the expression of programs that remain understandable and easy to port and extend, while producing high-performance code for diverse architectures. As such, compiler techniques have been increasingly incorporated into machine learning frameworks. The relationship goes both ways: given the broadening gap between high-level constructs and hardware accelerators, compilers in machine learning frameworks have also emerged as natural clients of machine learning techniques, from domain-specific heuristics to autotuning.
This workshop aims to highlight cutting-edge work and research that applies compiler techniques and algorithms to optimizing machine learning workloads. Compiler techniques affect a large part of the machine learning stack, and the workshop topics span from high-level abstract representations down to code generation for accelerators. The list of invited speakers is similarly drawn from experts across the different levels of the stack. The workshop does not have formal proceedings, and presentations will include ample time for interaction.
The workshop features 8 presentations from leading ML compiler experts from industry and academia. 7 posters will be displayed at the end of the workshop (together with the main conference's welcome and poster reception), with short talks introducing the posters during the last session.
Venue: Edinburgh International Conference Center (EICC).
Room: Carrick 1, 2.
09:15-09:20 - Opening
09:20-10:00 - Session 1 - Debunking ML for Compilers
10:00-10:20 - Break
10:20-12:20 - Session 2 - ML Compiler Construction
12:20-13:20 - Lunch
13:20-15:20 - Session 3 - Target- and domain-specific optimization
Ian Bearman, Microsoft
Scaling Triton to Multiple Platforms with Triton-Shared
15:20-15:40 - Break
15:40-16:20 - Session 4 - ML compiler infrastructure for general-purpose computing
16:20-17:20 - Session 5 - Poster Lightning Talks
Ari Rasch, Richard Schulze, Sergei Gorlatch, University of Muenster
Code Generation & Optimization for Deep-Learning Graphs via Multi-Dimensional Homomorphisms
Hongbin Zhang, Xulin Zhou, Jiuyang Liu, Zikang Liu, Linquan Wei, Yuliang Li, Taiqi Zheng, Meng Li, Hongyu Lin, Zhongyu Qin, Hanghang Cao, Jiongjia Lu, Weijia Li, Mingjie Xing, Yanjun Wu, Chinese Academy of Sciences, Huazhong University of Science and Technology, Beihang University, East China Normal University, Nanjing University
Buddy Compiler: An End-to-End AI Compiler from DSL to DSA
Jude Haris, Nicolas Bohm Agostini, Antonino Tumeo, David Kaeli, Jose Cano, University of Glasgow and Northeastern University
Data Transfer Optimizations for Host-CPU and Accelerators in AXI4MLIR
S. VenkataKeerthy, Siddharth Jain, Umesh Kalvakuntla, G Pranav Sai, Rajiv S Chitale, Eugene Brevdo, Albert Cohen, Mircea Trofin, Ramakrishna Upadrasta, IIT Hyderabad and Google
MLCompilerBridge: A Tool for Interfacing ML and Compilers
Marco Siracusa, Miquel Moreto, Barcelona Supercomputing Center
Compiling Embedding Operations in MLIR to Decoupled Access-Execute Architectures
Ludger Paehler, Aiden Grossman, Jose Monsalve-Diaz, Tal Ben-Nun, Konstantinos Parasyris, Johannes Doerfert, TUM, UC Davis, ANL, LLNL
LLamaVM: Unlocking the Power of Intermediate Representation
18:00-20:00 - Poster Reception
Ian Bearman, Microsoft - Scaling Triton to Multiple Platforms with Triton-Shared
Triton is an open-source kernel authoring language from OpenAI. It allows programmers to efficiently produce high-performance code for machine learning. While core Triton development focuses on GPU code generation for Nvidia and AMD GPUs, this is just the beginning of what Triton can do. Through the triton-shared project, the AI Compiler team at Microsoft is bringing Triton code generation to more varied platforms, including NPUs and CPUs. This talk will touch on the Triton programming language, the triton-shared project, and the MLIR compiler framework.
Organizers:
Albert Cohen, Google
Diego Caballero, Google
Gokcen Kestor, PNNL
Jacques Pienaar, Google