
Accelerate Customer Review Classification with Sparse Transformers

Classify Even Longer Customer Reviews Using Sparsity with DeepSparse

Customer review classification is crucial for customer-facing enterprises across industries such as retail, entertainment, food, and beverage. Knowing what your customers say about your product or solution helps you quickly address negative reviews and, in turn, reduce churn, providing a better customer experience. Implementing…
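
As a rough illustration of the workflow this post covers, the sketch below runs review text through a DeepSparse text-classification pipeline. The model stub, labels, and review strings are placeholders of my own, not values taken from the post.

```python
# Minimal sketch: classifying customer reviews with a sparse transformer in DeepSparse.
# The model_path below is a placeholder; swap in a real SparseZoo stub or a local ONNX model.
from deepsparse import Pipeline

classifier = Pipeline.create(
    task="text_classification",
    model_path="zoo:placeholder/sparse-review-classifier",  # hypothetical stub
)

reviews = [
    "The checkout flow was fast and the product arrived early.",
    "Support never answered my ticket and the item broke within a week.",
]

# The pipeline handles tokenization and returns predicted labels with scores.
print(classifier(sequences=reviews))
```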

Neural Magic Leverages Breakthrough Performance of 4th Gen AMD EPYC™ Processors For Machine Learning Workloads

As machine learning models grow larger and larger, the hardware demands to run them continue to grow as well. At Neural Magic, we are helping to alleviate these hardware pressures with a truly software-delivered AI solution that allows organizations to achieve extreme machine learning performance using only commodity hardware. Today, we are excited to…

Neural Magic 1.2 Product Release

Neural Magic is excited to announce and share highlights of the 1.2 release of our DeepSparse and SparseML libraries. The full technical release notes are always available in the GitHub release indexes linked from each Neural Magic repository. If you have any questions, need assistance, or simply want to say hello to our vibrant…

Build Efficient Vector Search on CPUs with Neural Magic and Weaviate

We are excited to share a recent conversation between Neural Magic and Weaviate. We have been collaborating with Weaviate to help companies scale machine learning to enterprise-grade production, since many business use cases require robust ML pipelines for information retrieval, semantic search, image similarity search, recommendations, classification, and more. Weaviate is a…
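
To make the vector-search side of this concrete, here is a hedged sketch of storing text with a precomputed embedding in Weaviate via its Python client, so similarity search runs against vectors produced by a CPU inference pipeline. The class name, text, and vector values are illustrative assumptions, not details from the post.

```python
# Illustrative sketch: inserting an object with a precomputed vector into Weaviate.
# The schema class, text, and embedding values are placeholders.
import weaviate

client = weaviate.Client("http://localhost:8080")

client.data_object.create(
    data_object={"text": "Great battery life, camera could be better."},
    class_name="Review",           # hypothetical schema class
    vector=[0.12, -0.07, 0.33],    # embedding computed elsewhere (e.g., by a CPU pipeline)
)
```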

Faster Zero-Shot Learning with Sparsity

If you have a text classification task at hand, exploring the zero-shot learning approach is a no-brainer. Zero-shot lets you classify text without retraining a model, making it easier and faster to get started. However, zero-shot is very compute-intensive because it must run inference against each candidate label. Enter sparsity to save the…
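
As a hedged sketch of why inference cost scales with the number of labels, the snippet below uses DeepSparse's zero-shot text classification task and passes candidate labels at call time; the model stub, label set, and input text are assumptions of my own, not the post's.

```python
# Rough sketch: zero-shot classification with DeepSparse. Each candidate label
# adds another inference pass, which is why zero-shot gets expensive quickly.
from deepsparse import Pipeline

zero_shot = Pipeline.create(
    task="zero_shot_text_classification",
    model_path="zoo:placeholder/sparse-nli-model",  # hypothetical stub
)

prediction = zero_shot(
    sequences="The delivery was late but the support team resolved it quickly.",
    labels=["shipping", "customer support", "pricing", "product quality"],
)
print(prediction)
```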

YOLOv5 on CPUs: Sparsifying to Achieve GPU-Level Performance and a Smaller Footprint

This YOLOv5 blog post was edited in September 2022 to reflect more recent sparsification research, software updates, better performance numbers, and easier benchmarking and transfer learning flows.

Prune and Quantize YOLOv5 for a 12x Increase in Performance and a 12x Decrease in Model Files

Neural Magic improves YOLOv5 model performance on CPUs by using state-of-the-art pruning…
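
For orientation, here is a minimal sketch of running a sparsified YOLOv5 model on CPU through a DeepSparse pipeline. The model stub and image path are placeholders, and the YOLO task requires the deepsparse computer-vision extras to be installed.

```python
# Rough sketch: CPU object detection with a pruned and quantized YOLOv5 model.
# The model stub and image path are placeholders, not values from the post.
from deepsparse import Pipeline

yolo = Pipeline.create(
    task="yolo",
    model_path="zoo:placeholder/pruned-quantized-yolov5",  # hypothetical stub
)

# Returns boxes, scores, and class labels for each input image.
results = yolo(images=["sample_image.jpg"])
print(results)
```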

BERT-Large: Prune Once for DistilBERT Inference Performance

Compress BERT-Large with pruning and quantization to create a version that maintains accuracy while beating baseline DistilBERT performance and compression.

In 2019, BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, a research paper from Google Research, introduced two versions of a transformative new NLP model: BERT-base and BERT-Large. Both were transformer-based architectures pre-trained on…
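
To show what serving such a compressed model can look like, here is a hedged sketch that loads a pruned and quantized BERT-Large through a DeepSparse question-answering pipeline. The stub is a placeholder and question answering is just one example task; the post's own benchmarks may target a different task.

```python
# Hedged sketch: inference with a pruned + quantized BERT-Large via DeepSparse.
# The model stub, question, and context are placeholders.
from deepsparse import Pipeline

qa = Pipeline.create(
    task="question_answering",
    model_path="zoo:placeholder/pruned-quantized-bert-large",  # hypothetical stub
)

answer = qa(
    question="What compression techniques were applied to the model?",
    context="The model was compressed with structured pruning followed by INT8 quantization.",
)
print(answer)
```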

Come Build the Future with Labs by Neural Magic

Architect and deploy better machine learning solutions with industry experts.

Today, Neural Magic is debuting a new offering that lets you deliver best-in-class ML solutions by leveraging the same engineering talent behind our DeepSparse Engine. Labs by Neural Magic empowers organizations to define (or refine) their AI/ML best practices. Teams will develop their methodology, success…

Deploy Sparse DistilBERT with the DeepSparse Engine on AWS SageMaker for a 7x Increase in Performance

You can now automate the deployment of a sparse transformer model with an Amazon SageMaker endpoint. At Neural Magic, we have simplified the arduous task of building the deployment infrastructure (often requiring several steps) by distilling it down to a single CLI command. This post describes the ease of building your personal SageMaker inference endpoint…
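
The post's single-command deployment flow is not reproduced here; as a companion sketch, once an endpoint exists you can query it with boto3's SageMaker runtime client. The endpoint name and JSON payload shape below are assumptions, match them to whatever your deployed inference server actually expects.

```python
# Hedged sketch: invoking an existing SageMaker inference endpoint with boto3.
# The endpoint name and payload format are placeholders.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName="sparse-distilbert-endpoint",  # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps({"sequences": ["This product exceeded my expectations."]}),
)

print(json.loads(response["Body"].read()))
```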