The full technical release notes are always found within our GitHub release indexes linked from our Docs website or the specific Neural Magic repository. SparseZoo: The latest additions to sparsezoo.neuralmagic.com! Sparse BERT masked language modeling models with example recipes for transferring to other downstream datasets; Pruned-Quantized BERT models on SQuAD (Question Answering); YOLACT for image segmentation. DeepSparse Engine Optimization… Read More Neural Magic CE 0.7, 0.8, and 0.9 Product Releases
Pruning Hugging Face BERT: Apply both pruning and layer-dropping sparsification methods to increase BERT performance anywhere from 3.3x to 14x on CPUs, depending on accuracy constraints. In this post, we go into detail on pruning Hugging Face BERT and describe how sparsification combined with the DeepSparse Engine improves BERT model performance on CPUs. We’ll… Read More Pruning Hugging Face BERT: Using Compound Sparsification for Faster CPU Inference with Better Accuracy
Neural Magic has been busy this summer on the Community Edition (CE) of our DeepSparse tools; we’re excited to share highlights of releases 0.5 and 0.6. The full technical release notes are always found within our GitHub release indexes linked from our Docs website or the specific Neural Magic repository. For user help or questions… Read More Neural Magic CE 0.5 and 0.6 Product Releases
To understand how Neural Magic's Deep Sparse technology works, it's important to quickly cover the journey of our founders. While mapping the neural connections in the brain at MIT, Neural Magic’s founders Nir Shavit and Alexander Matveev were frustrated with the many limitations imposed by GPUs. Along the way, they stopped to ask themselves a… Read More How Neural Magic's Deep Sparse Technology Works
This blog was originally posted by Na Zhang on VMware's Office of the CTO Blog. You can see the original copy here. Increasingly large deep learning (DL) models require a significant amount of computing, memory, and energy, all of which become a bottleneck in real-time inference where resources are limited. In this post, we detail our… Read More Accelerating Machine Learning Inference on CPU with VMware vSphere and Neural Magic
We are excited to announce the Neural Magic January 2021 product release. This milestone contains new product features, an improved user experience, and stability enhancements that will make it simpler for our clients to achieve GPU-class performance on commodity CPUs. NEW - Introducing Sparsify BETA: Experience-driven tooling to simplify the process of analyzing and… Read More Neural Magic January 2021 Product Release
Release 0.1.0 for the Community! February 4, 2021. As of February 2021, our products have been renamed, most have been open sourced, and their release notes can be found on GitHub! Sparsify, SparseML (formerly Neural Magic ML Tooling), SparseZoo (formerly Neural Magic Model Repo), DeepSparse Engine (formerly Neural Magic Inference Engine). Release 1.4.0 January… Read More Product Release Notes
Run computer vision models at lower cost with a suite of new tools that simplify model performance. Today, Neural Magic is announcing the release of its Inference Engine software, the NM Model Repo, and our ML Tooling. Now, data science teams can run computer vision models in production on commodity CPUs – at a fraction… Read More Neural Magic Launches High-Performance Inference Engine and Tool Suite for CPUs