Inducing and Exploiting Activation Sparsity for Fast Neural Network Inference
In July 2020, at the International Conference on Machine Learning, we presented a paper on methods for maximizing the sparsity of the activations in a trained neural network.
We showed that this sparsity, when coupled with an efficient sparse-input convolution algorithm, can be leveraged for significant performance gains at inference time.
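To give a feel for why activation sparsity can translate into speed, here is a minimal sketch in plain Python. It is an illustration under simplifying assumptions, not the algorithm from our paper: it uses a small fully connected layer rather than a convolution, and the function names and numbers (`relu`, `sparsity`, `dense_matvec`, `sparse_input_matvec`, the toy weights) are made up for this example. The point is only that every input that is exactly zero is a multiply-add the layer can skip.

```python
def relu(values):
    """Apply ReLU elementwise; non-positive inputs become exact zeros."""
    return [v if v > 0.0 else 0.0 for v in values]


def sparsity(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0.0) / len(values)


def dense_matvec(weights, x):
    """Baseline: multiply every weight by every input, zeros included."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def sparse_input_matvec(weights, x):
    """Same result, but only nonzero inputs take part in any multiply-add."""
    nonzero = [(j, xj) for j, xj in enumerate(x) if xj != 0.0]
    return [sum(row[j] * xj for j, xj in nonzero) for row in weights]


# Hypothetical pre-activations feeding a ReLU, then a toy 2x8 weight matrix.
pre_activations = [-1.5, 0.25, -0.75, 2.0, -0.5, -3.0, 1.0, -0.25]
activations = relu(pre_activations)
weights = [
    [0.5, -1.0, 2.0, 0.25, 1.5, -0.5, 1.0, 3.0],
    [1.0, 2.0, -2.0, 0.5, 0.0, 4.0, -1.0, 0.5],
]

print("activation sparsity:", sparsity(activations))
print("dense result:       ", dense_matvec(weights, activations))
print("sparse-input result:", sparse_input_matvec(weights, activations))
```

In this toy case five of the eight activations are zero, so the sparse-input version performs three multiply-adds per output instead of eight while producing the same result. Turning that reduction in arithmetic into real wall-clock gains for convolutions takes considerably more care than this sketch suggests.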
Download our paper to learn more!
And if you want to learn more about pruning, start by checking out the first of our five-part blog series: What is Pruning in Machine Learning? (Make sure you also read part two, where we state what the best pruning approach is!)