Here are the highlights of the 1.5 product release of the DeepSparse, SparseML, and SparseZoo libraries. The full technical release notes are always available in the GitHub release index of each Neural Magic repository. Join us in the Neural Magic Community Slack if you have any questions, need assistance, or simply want to introduce yourself.
For the 1.5 release across all of our OSS libraries, we have implemented basic telemetry to measure usage for product improvements. To disable this telemetry across DeepSparse Community, SparseML, and SparseZoo, follow the instructions under Product Usage Analytics here.
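As a rough illustration, opting out from inside a Python program might look like the sketch below. The variable name NM_DISABLE_ANALYTICS is an assumption based on our recollection of the Product Usage Analytics instructions, not something stated in this post, so confirm it against the linked page before relying on it.

```python
import os

# Assumption: NM_DISABLE_ANALYTICS is the opt-out variable described in the
# Product Usage Analytics instructions; verify the exact name there.
os.environ["NM_DISABLE_ANALYTICS"] = "True"

# Set the variable before importing any of the libraries so that it is
# already in place when they load, for example:
# import deepsparse
```

Setting the variable in the shell or in the deployment environment instead of in code would apply it to every process that is started from there.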
DeepSparse 1.5 Highlights: Dense YOLOv8 Support, Pipeline Logging Improvements, and New Benchmark Sweep CLI
DeepSparse now supports dense and sparse YOLOv8 object detection models from Ultralytics. Additionally, we have expanded the built-in functions for NLP and CV pipeline logging to enable improved logging capabilities when using DeepSparse. Lastly, a new benchmark sweep capability is now included with DeepSparse 1.5, which enables sweeps of benchmarks across different settings, such as cores and batch sizes, to help evaluate models for different deployment scenarios.
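To make the idea of a sweep concrete, here is a minimal sketch of the underlying pattern: run one benchmark for every combination of settings and collect the results for comparison. It does not use the DeepSparse sweep CLI or API. The `sweep` helper and the `placeholder_benchmark` function are hypothetical names introduced for illustration, and the numbers the placeholder returns are not measurements.

```python
from itertools import product
from typing import Callable, Dict, List


def sweep(
    cores: List[int],
    batch_sizes: List[int],
    run_benchmark: Callable[[int, int], float],
) -> List[Dict[str, float]]:
    """Run one benchmark per (cores, batch size) pair and collect the results."""
    results = []
    for num_cores, batch_size in product(cores, batch_sizes):
        throughput = run_benchmark(num_cores, batch_size)
        results.append(
            {"cores": num_cores, "batch_size": batch_size, "items_per_sec": throughput}
        )
    return results


def placeholder_benchmark(num_cores: int, batch_size: int) -> float:
    # Hypothetical stand-in for a real benchmark run. The value is a
    # placeholder so the sketch is runnable; it is not a measurement.
    return float(num_cores * batch_size)


results = sweep([1, 2, 4], [1, 16], placeholder_benchmark)
best = max(results, key=lambda row: row["items_per_sec"])
print(len(results), "configurations tried; best:", best)
```

In practice the placeholder would be replaced by a call that actually benchmarks the model at the given settings, and the collected rows could be written to a file for side-by-side comparison of deployment scenarios.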
On the performance side, inference latency for unstructured sparse-quantized CNNs has improved by up to 2x. For dense CNNs, inference throughput and latency have improved by up to 20%, and for dense transformer models by up to 30%.
SparseML 1.5 Highlights: PyTorch 1.13 Support, YOLOv8 Sparsification Pipelines, Expanded Torchvision Training Pipelines, and Integrations
We have added new YOLOv8 sparsification pipelines to SparseML for the YOLOv8 object detection models from Ultralytics. We have also expanded the logging capability of the Torchvision training pipelines to integrate with Weights & Biases (WandB) and TensorBoard and to provide more informative console logging.
Additionally, we have added DataParallel and distillation support to SparseML’s Torchvision integration. The PyTorch Distillation Modifier has been refactored to support per-layer distillation.
SparseZoo 1.5 Highlights: Full UI Refresh, New YOLO Sparsified Models, and CLI Commands for Model Analysis and Deployment
We have fully redesigned the SparseZoo model hub, with an improved look and feel and faster performance that make it easier to use. The performance metrics are more intuitive, and model comparisons let you visualize the speedups that come from sparsity.
All of the sparse models SparseZoo had before are still available. If you have models you’d like to see added to the zoo and sparsified, fill out a SparseZoo Model Request form.
New models and model variants include:
We have also introduced a new sparsezoo.analyze CLI so you can easily analyze your own ONNX models for performance and sparsity metrics. Finally, a new sparsezoo.deployment_package CLI makes it easy to package models from SparseZoo for simple deployments.
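For readers new to the term, a sparsity metric is typically the fraction of a model's weights that are exactly zero, reported per layer and for the model as a whole. The sketch below computes that on small hand-written weight lists. It is not the sparsezoo.analyze implementation and does not read ONNX files; the layer names and values are made up for illustration.

```python
from typing import Dict, List


def sparsity(weights: List[float]) -> float:
    """Return the fraction of weights that are exactly zero."""
    if not weights:
        return 0.0
    zeros = sum(1 for w in weights if w == 0.0)
    return zeros / len(weights)


# Hypothetical layers with made-up weights, standing in for a real model.
layers: Dict[str, List[float]] = {
    "conv1": [0.0, 0.5, 0.0, -0.2],
    "fc": [0.0, 0.0, 0.0, 0.1],
}

per_layer = {name: sparsity(w) for name, w in layers.items()}

# Overall sparsity weights each layer by its parameter count.
all_weights = [w for layer in layers.values() for w in layer]
overall = sparsity(all_weights)

print(per_layer, overall)
```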
Sparsify Alpha Update
We've heard lots of feedback from users and are excited to let you know that the next generation of Sparsify is underway. You can expect more features and simplicity for you to build sparse models to accelerate inference at scale.
The next generation of Sparsify enables you to apply model compression techniques to accelerate inference.
As an ML model optimization product, Sparsify applies state-of-the-art sparsification algorithms, using techniques like pruning and quantization, to any neural network through a simple UI and one-command API calls.
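For a sense of what pruning and quantization do to a set of weights, here is a generic textbook-style sketch: magnitude pruning zeroes the smallest weights, and symmetric int8 quantization maps the remaining values onto integers with a single scale factor. These are simplified illustrations and not the algorithms Sparsify uses; the function names and example weights are hypothetical.

```python
from typing import List, Tuple


def magnitude_prune(weights: List[float], target_sparsity: float) -> List[float]:
    """Zero out the smallest-magnitude weights until the target fraction is zero."""
    n_prune = int(len(weights) * target_sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric quantization: map floats to integers in [-127, 127] with one scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


# Made-up example weights for illustration.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
pruned = magnitude_prune(weights, target_sparsity=0.5)
quantized, scale = quantize_int8(pruned)

# Multiplying back by the scale gives an approximation of the pruned weights.
dequantized = [q * scale for q in quantized]
print(pruned, quantized, dequantized)
```

Production sparsification usually applies these steps gradually during training so the model can recover accuracy, which is the part that a one-shot sketch like this leaves out.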
The Alpha will be released over the next few days to those who have filled out our Early Access Waitlist form. If you have not yet registered, there is still time to register and be among the first to try Sparsify Alpha!
- Neural Magic Product Team