Neural Magic 1.4 Product Release


Here are highlights of the 1.4 product release of our DeepSparse, SparseML, and SparseZoo libraries. The full technical release notes are always available in the GitHub release indexes linked from each Neural Magic repository. If you have any questions, need assistance, or simply want to say hello to our vibrant ML performance community, join us in the Neural Magic Community Slack.

DeepSparse 1.4 Highlights: Pose Estimation Support, Performance Improvements, and DeepSparse Server Prometheus Integration Enhancements

DeepSparse now supports two new models for the pose estimation use case, OpenPifPaf and ViTPose, along with full deployment pipeline support. Additionally, on the model support front, we have upgraded our YOLOv5 integration to the latest upstream for more reliable usage.

On the performance side, we have improved inference speed by up to 20% on dense FP32 BERT, up to 50% on quantized EfficientNetV1, and up to 10% on quantized EfficientNetV2.

View full DeepSparse release notes.

SparseML 1.4 Highlights: PyTorch Integration Updates and Refactors

We have added layerwise distillation support to the PyTorch DistillationModifier, so you can compress models by distilling knowledge from a larger teacher model into a smaller student model in a more fine-grained manner. The SparseML recipe template API has also been added for PyTorch to simplify creating sparsification recipes.
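To give a rough feel for what "layerwise" distillation means, here is a minimal, framework-free sketch. It is not the SparseML DistillationModifier API: the function names, the mean-squared-error choice, and the per-layer weights are all assumptions made for illustration. The idea shown is simply that the student is compared to the teacher at several intermediate layers rather than only at the final output.

```python
from typing import Optional, Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two activation vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("activation vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def layerwise_distillation_loss(
    student_layers: Sequence[Sequence[float]],
    teacher_layers: Sequence[Sequence[float]],
    layer_weights: Optional[Sequence[float]] = None,
) -> float:
    """Hypothetical helper: weighted sum of per-layer MSE terms.

    Each entry in student_layers / teacher_layers stands in for the
    activations captured at one matched layer of the two models.
    """
    if len(student_layers) != len(teacher_layers):
        raise ValueError("student and teacher must expose the same number of layers")
    if layer_weights is None:
        layer_weights = [1.0] * len(student_layers)  # assumption: equal weighting
    return sum(
        w * mse(s, t)
        for w, s, t in zip(layer_weights, student_layers, teacher_layers)
    )


# Toy activations for two matched layers (made-up numbers).
student = [[0.0, 1.0], [2.0, 2.0]]
teacher = [[0.0, 3.0], [2.0, 2.0]]
loss = layerwise_distillation_loss(student, teacher)
print(loss)
```

In a real training loop, a term like this would typically be added to the task loss so the student is nudged toward the teacher at every matched layer. For the actual configuration options, refer to the SparseML release notes.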

Additionally, we have refactored the ONNX export pipeline to standardize implementations, add functionality for more complicated models, and improve debugging support. The PyTorch QuantizationModifier has also been refactored to expand supported models and operators and to simplify the interface.

View full SparseML release notes.

SparseZoo 1.4 Highlights: Improved YOLOv5 and Initial oBERTa Models for SQuAD and GLUE Tasks

Continuing our efforts to enable new use cases and increase model performance based on user feedback, new models have been added for various datasets:

  • Sparse quantized YOLOv5 models for the m, l, and x versions with better performance
  • BERT-base, DistilBERT, and BERT-Large on the GoEmotions dataset for NLP multi-label use cases
  • Initial oBERTa models for SQuAD and GLUE for NLP use cases
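
For readers new to the multi-label use case mentioned in the list above, the following is a small sketch of how multi-label predictions are commonly decoded: each label gets its own sigmoid score and is kept if it clears a threshold, so one input can receive several labels. The logits, the three labels, and the 0.5 threshold are illustrative assumptions, not outputs or settings of the SparseZoo models.

```python
import math
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def multi_label_predict(
    logits: Sequence[float],
    labels: Sequence[str],
    threshold: float = 0.5,  # assumption: a common default, tune per task
) -> List[str]:
    """Return every label whose sigmoid score meets the threshold."""
    return [
        label
        for logit, label in zip(logits, labels)
        if sigmoid(logit) >= threshold
    ]


# Made-up logits for a tiny subset of emotion labels.
labels = ["joy", "surprise", "anger"]
logits = [2.0, 0.3, -1.5]
predicted = multi_label_predict(logits, labels)
print(predicted)
```

Unlike single-label classification, where a softmax picks exactly one class, this decoding can return zero, one, or many labels for the same text.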

View full SparseZoo release notes.

📣 Sparsify Announcement 📣

We've heard lots of feedback from users and are excited to announce that the next generation of Sparsify is underway. You can expect more features and a simpler way to build sparse models that target optimal general performance at scale.

The next generation of Sparsify can help you optimize models from scratch or sparse transfer learn onto your own data to target best-in-class inference performance on your deployment hardware.

We will share MUCH more in the coming weeks. In the meantime, sign up for our Early Access Waitlist and be the first to try Sparsify Alpha.

-Neural Magic Product Team