Sparse Transferring Hugging Face Models With SparseML
Presenter: Ricky Costa
In this tutorial, we explore sparse transfer learning for transformer NLP models. Sparse transfer learning is an optimization technique in which an already-sparsified model (pruned and/or quantized) is fine-tuned on a downstream task while its sparsity is preserved. The result is a model that is smaller and faster than its dense variant, and in some cases even more accurate.
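To build intuition for what "sparsified" means here, the sketch below shows unstructured magnitude pruning in pure Python: the lowest-magnitude fraction of weights is zeroed out. This is a conceptual illustration only, not SparseML's actual pruning implementation (which operates on PyTorch tensors layer by layer and is driven by a recipe).

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the lowest-magnitude fraction zeroed.

    Illustrative sketch of unstructured magnitude pruning; `sparsity` is the
    fraction of weights to remove (e.g. 0.5 zeroes half of them).
    """
    n_prune = int(len(weights) * sparsity)
    # Indices of the n_prune smallest-magnitude weights
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.03]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)  # -> [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```

Zeroed weights need not be stored or multiplied, which is what a sparsity-aware runtime like DeepSparse exploits for speed.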
You can follow along with this Colab Notebook.
In this tutorial, we:
- Sparse transfer a pruned-quantized oBERT model from the Neural Magic SparseZoo onto the emotion dataset, replacing a dense BERT model previously finetuned on that dataset.
- Highlight the key steps of the training process.
- Benchmark the sparse and dense model variants for accuracy and speed using DeepSparse, a sparsity-aware inference runtime.
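The "quantized" half of pruned-quantized refers to storing weights as 8-bit integers instead of 32-bit floats. The sketch below illustrates symmetric int8 quantization in pure Python; it is a conceptual example only, and real quantization in models like oBERT is applied per-layer during training and export rather than like this.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] with a single shared scale.

    Illustrative sketch of symmetric int8 quantization, not the actual
    quantization scheme used by SparseML or oBERT.
    """
    scale = max(abs(v) for v in values) / 127.0
    q = [round(v / scale) for v in values]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float values from int8 codes."""
    return [x * scale for x in q]


vals = [0.5, -1.27, 0.0, 0.2]
q, scale = quantize_int8(vals)
print(q)  # -> [50, -127, 0, 20]
```

Each int8 weight takes a quarter of the memory of a float32 weight, at the cost of a small, bounded rounding error recovered by `dequantize_int8`.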