SparseML
Enable sparsity with a few lines of code, through open-source libraries, to boost neural network inference performance.
Large Models Are Inefficient
Many of the top models across the NLP and computer vision domains are difficult and expensive to use in real-world deployments. While they are accurate, they are computationally intensive, which can require inflexible hardware accelerators in production.

Small Models Are Less Accurate
Smaller models, while faster and more efficient, deliver lower accuracy on real-world data.

What if you could deliver big-model accuracy with small-model performance?
Meet SparseML
Optimize Your Models for Inference
SparseML enables you to create inference-optimized sparse models using state-of-the-art pruning and quantization algorithms. Models trained with SparseML can then be exported to ONNX and deployed with DeepSparse for GPU-class performance on CPU hardware.
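SparseML's own workflow is driven by recipes applied during training, but the core idea behind one of its techniques, magnitude pruning, can be sketched in a few lines. This is a conceptual illustration only, not SparseML's API; the function name and array shapes here are invented for the example:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights until the given
    fraction of zeros (`sparsity`) is reached."""
    k = int(weights.size * sparsity)  # number of weights to zero
    if k == 0:
        return weights.copy()
    # Threshold at the k-th smallest absolute value across the layer.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    # Keep only weights whose magnitude exceeds the threshold.
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
layer = rng.standard_normal((64, 64))   # a stand-in for one weight matrix
pruned = magnitude_prune(layer, sparsity=0.9)
print(f"sparsity: {np.mean(pruned == 0):.2f}")
```

In practice, pruning like this is applied gradually over many training steps so the network can recover accuracy as weights are removed, which is what SparseML's recipe-based training orchestrates.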
