YOLOv3 on CPUs: Sparsifying to Achieve GPU-Level Performance

Use CPUs to decrease costs and increase deployment flexibility while still achieving GPU-class performance. In this post, we elaborate on how we used state-of-the-art pruning and quantization techniques to improve the performance of YOLOv3 on CPUs. We’ll show that by leveraging the robust YOLO training framework from Ultralytics with SparseML’s sparsification recipes it is…
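To make the pruning idea concrete, here is a minimal sketch of unstructured (magnitude) pruning using PyTorch's built-in pruning utilities. It is not the SparseML recipe workflow the post describes; the toy model and the 80% sparsity target below are hypothetical placeholders chosen only for illustration.

```python
# Illustrative only: generic unstructured (magnitude) pruning with PyTorch's
# built-in utilities, not Neural Magic's recipe-driven flow.
import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(          # stand-in for a detection backbone
    nn.Conv2d(3, 32, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 64, 3, padding=1),
)

# Zero out 80% of the smallest-magnitude weights in every conv layer.
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        prune.l1_unstructured(module, name="weight", amount=0.8)
        prune.remove(module, "weight")  # bake the mask into the weights

total = sum(p.numel() for p in model.parameters())
zeros = sum((p == 0).sum().item() for p in model.parameters())
print(f"overall sparsity: {zeros / total:.1%}")
```

In the recipe-driven flow the post refers to, the sparsity targets and schedule come from a SparseML recipe applied gradually during Ultralytics training rather than being imposed in one shot as above.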

ResNet-50 on CPUs: Sparsifying for Better Performance on CPUs

In this post, we elaborate on how we measured, on commodity cloud hardware, the throughput and latency of five ResNet-50 v1 models optimized for CPU inference. By the end of the post, you should be able to reproduce these benchmarks using tools available in the Neural Magic GitHub repo, ultimately achieving better performance for ResNet-50 on CPUs.
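As a rough illustration of the kind of measurement involved, the sketch below times per-image latency and derives throughput for an ONNX ResNet-50 on CPU using ONNX Runtime. It is a stand-in, not the Neural Magic benchmarking setup from the post; the model path, batch size, and iteration counts are assumptions.

```python
# Minimal latency/throughput measurement sketch (ONNX Runtime, CPU).
# "resnet50.onnx" is a hypothetical placeholder path.
import time
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("resnet50.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Warm up so one-time initialization costs don't skew the numbers.
for _ in range(10):
    session.run(None, {input_name: batch})

latencies = []
for _ in range(100):
    start = time.perf_counter()
    session.run(None, {input_name: batch})
    latencies.append(time.perf_counter() - start)

mean_ms = 1000 * float(np.mean(latencies))
print(f"mean latency: {mean_ms:.2f} ms, throughput: {1000 / mean_ms:.1f} images/sec")
```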

Accelerating Machine Learning Inference on CPU with VMware vSphere and Neural Magic

This blog was originally posted by Na Zhang on VMware’s Office of the CTO Blog. You can see the original copy here. Increasingly large deep learning (DL) models require a significant amount of computing, memory, and energy, all of which become a bottleneck in real-time inference where resources are limited. In this post, we detail our…

Sparsify is Open Sourced – Try it Now

Today, we are very excited to provide you with early access to Sparsify, our automated model optimization tool! As deep learning models continue to grow in size, deploying and running them performantly and accurately has required significant investments in FLOPs and system resources. Take GPT-3, for example: with over 175 billion parameters, it takes nearly…
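For a sense of scale, the snippet below does an independent back-of-envelope calculation of the memory needed just to hold the weights of a 175-billion-parameter model at common numeric precisions (this is our own arithmetic, not the figure the excerpt above trails off into).

```python
# Back-of-envelope weight-memory footprint for a GPT-3-sized model.
params = 175e9  # ~175 billion parameters

for name, bytes_per_param in [("FP32", 4), ("FP16", 2), ("INT8", 1)]:
    gib = params * bytes_per_param / 1024**3
    print(f"{name}: ~{gib:,.0f} GiB just to store the weights")
```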

Neural Magic 1.4 Product Release

We are excited to announce the Neural Magic 1.4 product release. This milestone contains new product features, an improved user experience, and stability enhancements that make it easier for our clients to achieve GPU-class performance on commodity CPUs. NEW: Introducing Sparsify BETA, experience-driven tooling to simplify the process of analyzing and optimizing…

Product Release Notes

Release 0.1.0 for the Community! February 4, 2021. As of February 2021, our products have been renamed, most have been open sourced, and their release notes can be found on GitHub: Sparsify, SparseML (formerly Neural Magic ML Tooling), SparseZoo (formerly Neural Magic Model Repo), and DeepSparse Engine (formerly Neural Magic Inference Engine). Release 1.4.0, January…

Neural Magic at NeurIPS 2020

Are you attending this year’s virtual NeurIPS conference? The Neural Magic team would love to meet you. Who is Neural Magic? After years of research at MIT, our team concluded that throwing teraflops at dense models is not sustainable. So we’ve taken the best of known research on model compression (unstructured pruning and quantization, in…

Neural Magic 1.2 Product Release

We are excited to announce the Neural Magic 1.2 product release. This product milestone contains new feature updates, an improved user experience, and stability enhancements that make it easier for our clients to achieve price-performance on commodity CPUs. Neural Magic Inference Engine: enables clients to run mission-critical deep learning models on commodity…