ML Performance Research Papers

On the Predictability of Pruning Across Scales (ICML 2021)

We show that the error of magnitude-pruned networks follows a scaling law, and that this law is of a fundamentally different nature than that of unpruned networks. Download the paper here.

Sparsity in Deep Learning: Pruning and Growth for Efficient Inference and Training in Neural Networks (Survey Paper, 2021)

The future of deep learning is sparse! See our overview of the field and of upcoming opportunities to gain 10-100x in performance to fuel the next AI revolution. HPC techniques will be key, because large-scale training is supercomputing. Download the paper here.

WoodFisher: Efficient Second-Order Approximation for Neural Network Compression (NeurIPS 2020)

Learn about the WoodFisher optimization method for efficient second-order approximation for neural network compression. Download the paper here.

Relaxed Scheduling for Scalable Belief Propagation (NeurIPS 2020)

Learn about efficient parallel algorithms for the key machine learning task of inference on graphical models, focusing on the fundamental belief propagation algorithm. Download the paper here.

Adaptive Gradient Quantization for Data-Parallel SGD (NeurIPS 2020)

In this paper, we introduce two adaptive quantization schemes, ALQ and AMQ. In both schemes, processors update their compression schemes in parallel by efficiently computing sufficient statistics of a parametric distribution. We improve the validation accuracy by almost 2% on CIFAR-10 and 1% on ImageNet in challenging low-cost communication setups. Download the paper here.
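For readers new to gradient quantization, the sketch below shows the general idea of compressing a gradient before communication. It is not ALQ or AMQ: it uses fixed, uniformly spaced levels with stochastic rounding, whereas the paper's schemes adapt the levels during training. The function name `quantize_stochastic` and the parameter `num_levels` are hypothetical, chosen for illustration only.

```python
import math
import random


def quantize_stochastic(grad, num_levels=4, rng=random):
    """Quantize a gradient vector onto uniformly spaced levels.

    Illustrative stand-in only: fixed uniform levels with stochastic
    rounding, NOT the adaptive ALQ/AMQ schemes from the paper.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)

    quantized = []
    for g in grad:
        # Map |g| / norm from [0, 1] onto the grid [0, num_levels].
        scaled = abs(g) / norm * num_levels
        lower = math.floor(scaled)
        # Round up with probability equal to the fractional part,
        # so the expected value of the result equals the input.
        prob_up = scaled - lower
        level = lower + (1 if rng.random() < prob_up else 0)
        quantized.append(math.copysign(level / num_levels * norm, g))
    return quantized


if __name__ == "__main__":
    rng = random.Random(0)
    gradient = [0.3, -1.2, 0.05, 0.7]
    print(quantize_stochastic(gradient, num_levels=4, rng=rng))
```

Each worker would then send only the norm, the signs, and the small integer levels instead of full-precision values. Fewer levels mean less communication but more quantization noise, which is the trade-off that adaptive schemes aim to improve.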

eBook: Pruning for Success

Get an overview of the best practices for pruning a model, and an in-depth walkthrough of the gradual magnitude pruning algorithm. Download the eBook here.
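As a rough illustration of gradual magnitude pruning, the sketch below raises a sparsity target over training steps and zeroes out the smallest-magnitude weights at each target. It assumes the commonly used cubic sparsity schedule; the eBook may recommend a different schedule or per-layer settings. The function names and default values are hypothetical, and the weights are a plain Python list rather than a real model's tensors.

```python
def sparsity_at_step(step, start_step, end_step,
                     initial_sparsity=0.0, final_sparsity=0.9):
    """Cubic schedule: sparsity rises quickly at first, then levels off."""
    if step <= start_step:
        return initial_sparsity
    if step >= end_step:
        return final_sparsity
    progress = (step - start_step) / (end_step - start_step)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** 3


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that drops the smallest-magnitude weights."""
    n_prune = int(round(sparsity * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def apply_mask(weights, mask):
    """Zero out pruned weights; kept weights are unchanged."""
    return [w * m for w, m in zip(weights, mask)]


if __name__ == "__main__":
    weights = [0.5, -0.1, 2.0, 0.05, -0.8, 0.3, -1.5, 0.02]
    for step in (0, 25, 50, 100):
        target = sparsity_at_step(step, start_step=0, end_step=100)
        mask = magnitude_mask(weights, target)
        print(step, round(target, 4), apply_mask(weights, mask))
```

In a real training loop the mask would be recomputed every few hundred steps and the remaining weights would keep training between pruning steps, which is what lets the network recover accuracy as sparsity increases.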

Inducing and Exploiting Activation Sparsity for Fast Neural Network Inference (ICML 2020)

Learn how to gain significant performance by inducing and exploiting activation sparsity for fast neural network inference. Download the paper here.

A Constructive Prediction of the Generalization Error Across Scales (ICLR 2020)

In this work, we present a functional form that approximates the generalization error well in practice. Capitalizing on the successful concept of model scaling (e.g., width, depth), we are able to simultaneously construct such a form and specify the exact models which can attain it across model/data scales. Download the paper here.