Future is Sparse: Methods and Tools for Sparse Computations

This workshop was co-located with Supercomputing’23 in Denver, Colorado, on November 17, 2023.


Many real-world computations involve sparse data structures in the form of sparse matrices, graphs, or sparse tensors. In the computational sciences, sparse matrices are commonly used for numerically solving partial differential equations. Likewise, many approaches to deep learning on graph representations, notably graph neural networks (GNNs), have been proposed. More general, multi-dimensional sparse tensors are currently at the heart of data-driven fields such as AI and deep learning. Achieving high performance on such sparse data structures is well known to be challenging and remains an open research problem. This workshop gathered a group of experts researching various aspects of this topic and aimed to present attendees with an overview of state-of-the-art research activities. More importantly, the workshop provided a forum for interaction among the SC participants, so that new ideas could be generated to push the state of the art forward.


November 17, 2023


Location: Rooms 401–402


8:30 am – 9:00 am Welcome & Introduction by SparCity (Didem Unat, Koç University)
9:00 am – 9:30 am John Owens, University of California, Davis
Dynamic Data Structures on the GPU.
9:30 am – 10:00 am Richard Vuduc, Georgia Institute of Technology
Consider studying all-pairs shortest paths.
10:00 am – 10:30 am Break
10:30 am – 11:00 am Hartwig Anzt, University of Tennessee
Tensor cores for matrix multiplication are on the rise – is there any hope for sparse operations?
11:00 am – 11:30 am Tal Ben-Nun, Lawrence Livermore National Laboratory
The Future of Machine Learning is Sparse.
11:30 am – 12:00 pm Panel (Moderator: Xing Cai, Simula Research Laboratory)


John Owens, University of California, Davis

Title: Dynamic Data Structures on the GPU.
Abstract: If I had to sum up the purpose of data structures in three words, I’d say they “organize sparse data”. And as we know, the future is sparse. Dynamic data structures allow updates to data structures without having to rebuild them completely. In my talk, I’ll discuss recent progress in designing and implementing dynamic data structures for GPUs, whose implementations present significant challenges. I’ll talk briefly about work in log-structured merge trees, quotient filters, linked lists, hash tables, dynamic graphs, and B-trees, but mostly about principles that we followed in building them and what we learned from the experience.

Richard Vuduc, Georgia Institute of Technology

Title: Consider studying all-pairs shortest paths.
Abstract: There is a well-known connection between the all-pairs shortest paths (APSP) problem and Gaussian elimination. In the context of this workshop, that makes APSP a good candidate to study because, for an input graph that is sparse but fully connected, the computation lies at the boundary between “sparse” and “dense” and between “regular” and “irregular” parallelism. It may, therefore, be an illuminating prototype or analogue for trends we might foresee in algorithms, programming, and system design as “dense” problems like deep learning become “sparse.”
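The connection the abstract refers to can be seen in the Floyd–Warshall algorithm, whose triple loop mirrors Gaussian elimination: the k-th pass "eliminates" vertex k as an intermediate, just as elimination removes variable k, but over the (min, +) semiring instead of the usual (+, ×) arithmetic. A minimal sketch (illustrative only, not from the talk):

```python
import math

def floyd_warshall(dist):
    """All-pairs shortest paths over the (min, +) semiring.

    `dist` is an n-by-n matrix of edge weights, with math.inf for
    missing edges and 0 on the diagonal.  The loop structure matches
    Gaussian elimination, with min playing the role of addition and
    + playing the role of multiplication.
    """
    n = len(dist)
    for k in range(n):          # "eliminate" vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
    return dist
```

For a sparse but connected input graph, the result is dense, which is what places APSP at the boundary between "sparse" and "dense" computation.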

Hartwig Anzt, University of Tennessee

Title: Tensor cores for matrix multiplication are on the rise – is there any hope for sparse operations?
Abstract: In response to demands from the machine learning community, an increasing number of hardware architectures feature tensor cores for high-performance dense matrix multiplication in low precision. This increases the peak performance and power draw – and the expectation that applications achieve higher performance. At the same time, however, the sparse linear algebra underlying many applications is generally unable to exploit these high-performance tensor cores. Is there any hope for improved performance in applications based on sparse linear algebra?

Tal Ben-Nun, Lawrence Livermore National Laboratory

Title: The Future of Machine Learning is Sparse.
Abstract: As machine learning models become prohibitively expensive to train and use, sparsity is increasingly essential for neural networks such as large language models. If performed correctly, sparsity does not come at the cost of performance, and pruned deep neural networks have shown benefits in both throughput and generalization. In this talk, we will review the many faces of sparsity in deep learning and how it can be leveraged, from input representation to specialized hardware. We will discuss the differences in sparsity structure from other HPC applications, learning schedules that enable near-lossless sparse training, and libraries that aid in introducing high-performance sparsity to existing machine learning models. Lastly, we will outline the current limitations of sparsity in deep learning and discuss future opportunities in pushing those boundaries.

Workshop Scope

Sparse computations that involve sparse matrices, graphs, or tensors are characterized by a low ratio between the volume of computational work executed and the memory traffic incurred. As memory resources increasingly "lag behind" on modern computing platforms, it is ever more challenging to secure the performance of sparse computations. For example, even the fastest supercomputers in the world from the Top500 list achieve only 3% or less of their peak performance on the conjugate gradient solver, a well-known sparse computation kernel. The challenge is even larger for real-world problems where the data access patterns are irregular. Examples include sparse matrices arising from discretizing partial differential equations over unstructured computational meshes, sparse graphs representing unevenly connected entities, and sparse tensors resulting from missing information in high-dimensional data.
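The low work-to-traffic ratio is visible in the central kernel of the conjugate gradient solver, sparse matrix-vector multiplication (SpMV). A minimal sketch using the compressed sparse row (CSR) format (illustrative only; production codes use tuned library routines):

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for a sparse matrix A stored in CSR format.

    Each nonzero costs one multiply-add (2 flops) but requires reading
    values[k], col_idx[k], and x[col_idx[k]] -- roughly a dozen bytes of
    memory traffic per 2 flops, far below the compute-to-bandwidth
    balance of modern machines.  The access to x is indirect, so its
    pattern depends on the sparsity structure and is irregular in general.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y
```

This memory-bound, irregular access pattern is why sparse kernels reach only a few percent of peak on machines balanced for dense workloads.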

Although there exist methodologies that can be used for understanding or improving the performance of some sparse computations, such as insightful architecture/application modeling and matrix reordering, these methods are typically too coarse-grained and often not easily applicable to sparse computations with irregular memory patterns. Moreover, the availability of heterogeneous systems, which include powerful accelerators, widens the design space, particularly for this type of computation. Lately, research activities on optimizing sparse computations and developing sparsity-aware tools for modern HPC hardware architectures have gained new momentum. One example is the ongoing EuroHPC-JU funded project SparCity (https://sparcity.eu/).

Sparsification is also a new trend in large-scale deep learning. The growing costs of model training have led several researchers to explore methods for reducing DNN size by pruning. Sparsification reduces both the memory footprint and the training time. This half-day workshop therefore serves as a discussion forum for the latest research developments and outlines major open problems at the intersection of HPC and AI.
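One common pruning approach is unstructured magnitude pruning: zero out the smallest-magnitude fraction of a weight matrix and keep the rest. A minimal sketch under that assumption (illustrative only; real frameworks prune iteratively during training and store the result in a compressed format):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude `sparsity` fraction of weights.

    `weights` is a dense matrix given as a list of rows; the returned
    matrix has the same shape, with the smallest entries (by absolute
    value) replaced by 0.0.  Only the surviving nonzeros would need to
    be stored and multiplied, shrinking the model's effective size.
    """
    flat = sorted(abs(w) for row in weights for w in row)
    n_prune = int(len(flat) * sparsity)
    # entries with magnitude <= threshold are pruned
    threshold = flat[n_prune - 1] if n_prune > 0 else -1.0
    return [[w if abs(w) > threshold else 0.0 for w in row]
            for row in weights]
```

After pruning, the weight matrix is sparse, and the inference cost shifts from dense to sparse linear algebra, which is exactly where the HPC techniques discussed above apply.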

The workshop aims to gather academic researchers, software developers, R&D staff from the HPC industry, and domain scientists who work with sparse computations in their daily work. The common denominator of the presenters and attendees is the wish to obtain high performance for sparse computations. The topics of interest include, but are not restricted to:

  • performance characterization and prediction, including the use of ML
  • optimization strategies covering algorithms, data structures, parallelization
  • automated code generation for general-purpose and specific hardware
  • software libraries and tools for implementing optimization strategies and real-world sparse, tensor and graph computations
  • sparsification in deep learning training and inference 
  • sparse matrix/graph/tensor datasets and generators 
  • real-world instances of sparse data
  • real-world applications of sparse computations
List of Panelists

John Owens, University of California, Davis
Richard Vuduc, Georgia Institute of Technology
Hartwig Anzt, University of Tennessee
Tal Ben-Nun, Lawrence Livermore National Laboratory
(Moderator: Xing Cai, Simula Research Laboratory)

Expected Outcome

This workshop thus provides a timely forum with the following expected outcomes:

  • Exchange of the latest research ideas and results
  • Inspiration for new ideas and strategies
  • Initiation of a dedicated yet open research community for this topic
  • Performance enhancement of sparse computations in real-world applications
  • Development of new implementation strategies that target emerging hardware architectures
  • Collaboration between HPC researchers, deep learning experts and domain scientists who leverage sparse computation in their applications
  • An interactive panel composed of experts in the field to stimulate lively discussions
  • Knowledge transfer between different fields that utilize sparse data structures
  • Industry involvement in the discussion to provide custom solutions to sparse applications