Sparsify is Open Sourced – Try it Now
Today, we are excited to give you early access to Sparsify, our automated model optimization tool! As deep learning models continue to grow in size, deploying and running them with high performance and accuracy has required significant investments in compute and system resources. Take GPT-3, for example: with over 175 billion parameters, it takes nearly…
Neural Magic 1.4 Product Release
We are excited to announce the Neural Magic 1.4 product release. This milestone contains new product features, an improved user experience, and stability enhancements that make it simpler for our clients to achieve GPU-class performance on commodity CPUs. NEW – Introducing Sparsify BETA: experience-driven tooling to simplify the process of analyzing and optimizing…
Product Release Notes
Release 0.1.0 for the Community! February 4, 2021 As of February 2021, our products have been renamed, most have been open sourced, and their release notes can be found in GitHub: Sparsify, SparseML (formerly Neural Magic ML Tooling), SparseZoo (formerly Neural Magic Model Repo), and DeepSparse Engine (formerly Neural Magic Inference Engine). Release 1.4.0 January…
Using Sparse-Quantization in Inference: NeurIPS 2020
Did you know that most weights in a neural network are actually useless? In other words, most weights can be removed with little to no impact on the loss. But how and why would you optimize a deep learning model in practice? Through a combination of pruning and quantization (or “sparse-quantization”), you can drastically improve…
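To make the idea concrete, here is a minimal NumPy sketch (not Neural Magic's implementation) of the two steps: unstructured magnitude pruning, which zeroes out the smallest-magnitude weights, followed by symmetric 8-bit quantization of what remains. The matrix size and 90% sparsity target are arbitrary illustration values.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.standard_normal((64, 64)).astype(np.float32)

# Unstructured magnitude pruning: zero out the 90% smallest-magnitude weights.
sparsity = 0.9
threshold = np.quantile(np.abs(weights), sparsity)
pruned = np.where(np.abs(weights) > threshold, weights, 0.0)

# Symmetric 8-bit quantization: map the remaining weights onto int8 levels.
scale = np.abs(pruned).max() / 127.0
quantized = np.clip(np.round(pruned / scale), -127, 127).astype(np.int8)

# Dequantize to inspect the approximation error introduced by quantization.
dequantized = quantized.astype(np.float32) * scale
print(f"sparsity achieved: {np.mean(pruned == 0.0):.2%}")
print(f"max abs quantization error: {np.max(np.abs(dequantized - pruned)):.4f}")
```

The quantization error is bounded by half a quantization step (`scale / 2`), while the pruned zeros survive exactly; that combination is what makes sparse-quantized models both small and accurate.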
Neural Magic at NeurIPS 2020
Are you attending this year’s virtual NeurIPS conference? The Neural Magic team would love to meet you. Who is Neural Magic? After years of research at MIT, our team concluded that throwing teraflops at dense models is not sustainable. So we’ve taken the best of known research on model compression (unstructured pruning and quantization, in…
Neural Magic 1.2 Product Release
We are excited to announce the Neural Magic 1.2 product release. This product milestone contains new feature updates, an improved user experience, and stability enhancements that make it simpler for our clients to achieve price-performance on commodity CPUs. Neural Magic Inference Engine enables clients to run mission-critical deep learning models on commodity…
Speeding Up Memory-Bound Object Detection Models: MobileNetV2_SSD
TL;DR: Learn how to increase performance for MobileNetV2_SSD models via pruning and by decreasing post-processing time. Read time: 3 minutes, 15 seconds. In many object detection scenarios, there’s not a moment to lose. A fraction of a second can mean the difference between a self-driving car hitting a dog crossing the street or narrowly missing it.…
Neural Magic End-to-End Demo Videos
Neural Magic delivers best-in-class deep learning performance on commodity CPUs. We do this via:
- Model optimization techniques like pruning and quantization
- Smart algorithms that utilize CPU memory more effectively
To help visualize the power of Neural Magic, we recorded three short end-to-end video guides on how to install our software, prepare and run a model…
Part 4: Sparsity per Layer Hyperparameter
TL;DR: In addition to the general hyperparameters described in the previous post, the sparsity to target per layer is arguably the most critical hyperparameter you can set. Below, we explain why and show you how. Reading time: 10 minutes, 47 seconds. Welcome to Part 4 in Neural Magic’s five-part blog series on…
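As a rough illustration of per-layer sparsity targeting (a hypothetical sketch, not the recipe from the blog series), a common heuristic is to keep small, sensitive input and output layers dense and to prune larger middle layers more aggressively. The layer names, parameter counts, and thresholds below are invented for illustration.

```python
# Hypothetical parameter counts for a small convolutional network.
layer_params = {
    "conv1": 1_728,      # input layer: small and sensitive, kept dense
    "conv2": 36_864,
    "conv3": 147_456,
    "conv4": 589_824,
    "fc":    5_120,      # output layer: kept dense
}

def sparsity_for(name, n_params, dense_layers=("conv1", "fc")):
    """Pick a per-layer sparsity target: skip sensitive layers,
    prune bigger layers harder (illustrative thresholds only)."""
    if name in dense_layers:
        return 0.0
    return 0.8 if n_params < 100_000 else 0.9

targets = {name: sparsity_for(name, n) for name, n in layer_params.items()}

# Overall sparsity is dominated by the large layers, so pruning them
# aggressively removes most parameters even with some layers left dense.
total = sum(layer_params.values())
remaining = sum(n * (1 - targets[name]) for name, n in layer_params.items())
print(f"overall sparsity: {1 - remaining / total:.2%}")
```

Because parameter counts are so skewed toward the largest layers, this per-layer assignment still removes well over 85% of all weights while protecting the layers that hurt accuracy most when pruned.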
Part 3: Gradual Magnitude Pruning (GMP) Hyperparameters
TL;DR: To facilitate the GMP process when pruning a network, several hyperparameters must be defined. These include general hyperparameters such as the learning rate, pruning update frequency, and pruning schedule function, in addition to the sparsity per layer. All of these hyperparameters affect end-level recovery, loss, and performance. Reading time: 5 minutes, 5 seconds. Welcome to Part…
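One widely used pruning schedule function for GMP is the cubic ramp from Zhu & Gupta (2017), which increases sparsity quickly early in training and tapers off as it approaches the final target. A minimal sketch follows; the step counts and 90% target are arbitrary example values, not a recommendation from the post.

```python
def gmp_sparsity(step, start_step, end_step, init_sparsity=0.0, final_sparsity=0.9):
    """Cubic gradual-magnitude-pruning schedule (Zhu & Gupta, 2017):
    s(t) = s_f + (s_i - s_f) * (1 - progress)^3."""
    if step <= start_step:
        return init_sparsity
    if step >= end_step:
        return final_sparsity
    progress = (step - start_step) / (end_step - start_step)
    return final_sparsity + (init_sparsity - final_sparsity) * (1 - progress) ** 3

# Example: ramp from 0% to 90% sparsity between steps 1000 and 5000.
for step in (0, 1000, 3000, 5000, 6000):
    print(step, round(gmp_sparsity(step, 1000, 5000), 4))
```

Halfway through the ramp the schedule has already reached 78.75% sparsity, which reflects the intuition behind GMP: remove redundant weights early, when the network can easily recover, and slow down as pruning starts to bite.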