Previously Recorded Discussions

How to Compress BERT NLP Models for Efficient Inference

Learn about state-of-the-art research on compressing BERT models 10x for much more efficient deployments and a 9-29x CPU inference speedup. View the recording here.

Deep Sparse Platform Demo: Build and Deploy Accurate Deep Learning Models Faster

Get an overview of Neural Magic, along with key business use cases and applications that can be powered with the Deep Sparse Platform. See a demo of an end-to-end experience in action, starting from a Neural Magic pre-trained model in the SparseZoo, applying a private dataset with a recipe using SparseML, and deploying on CPUs with the DeepSparse Engine. View the discussion recording here.

Using “Compound Sparsification” with Hugging Face BERT for Faster CPU Inference with Better Accuracy

Learn what “compound sparsification” is, how we used it to accelerate Hugging Face BERT performance on CPUs by up to 14x, and how you can do the same with your private data. View the discussion recording here.

Date recorded: September 29, 2021
Presenter: Mark Kurtz, ML Lead, Neural Magic

Sparsifying YOLOv5 to Achieve Faster and Smaller Models

Learn how we sparsified (pruned and quantized) YOLOv5 for 10x better performance and 12x smaller model files, and how you can do the same with your private data. View the discussion recording here.

Date recorded: August 31, 2021
Presenter: Mark Kurtz, ML Lead, Neural Magic

Using Sparsification Recipes with PyTorch

Sparsification recipes make model pruning and quantization simple. This video shows what sparsification recipes are and how to use them to prune and quantize PyTorch models for smaller size and better performance. View the discussion recording here.

Date recorded: May 19, 2021
Speaker: Benjamin Fineran, Sr. ML Engineer, Neural Magic
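To make the recipe idea concrete: a sparsification recipe is a declarative schedule saying which epochs to prune over and what sparsity to reach, which the training loop then follows. The sketch below mimics that idea in plain Python with the cubic sparsity ramp commonly used in gradual magnitude pruning; the field names and function are illustrative assumptions, not SparseML's actual recipe schema or API.

```python
# Illustrative sketch of a recipe-driven pruning schedule (NOT SparseML's
# schema): a plain dict stands in for the YAML recipe, and target_sparsity
# computes how sparse a layer should be at a given epoch.

recipe = {
    "start_epoch": 0,       # epoch pruning begins
    "end_epoch": 10,        # epoch final sparsity is reached
    "init_sparsity": 0.05,  # starting fraction of zeroed weights
    "final_sparsity": 0.85, # target fraction of zeroed weights
}

def target_sparsity(epoch, r):
    """Cubic interpolation from init to final sparsity over the schedule,
    so pruning is aggressive early and gentle near the end."""
    if epoch <= r["start_epoch"]:
        return r["init_sparsity"]
    if epoch >= r["end_epoch"]:
        return r["final_sparsity"]
    progress = (epoch - r["start_epoch"]) / (r["end_epoch"] - r["start_epoch"])
    return r["final_sparsity"] + (r["init_sparsity"] - r["final_sparsity"]) * (1 - progress) ** 3

for epoch in (0, 5, 10):
    print(epoch, round(target_sparsity(epoch, recipe), 3))
```

A training loop would call something like `target_sparsity(epoch, recipe)` once per epoch and mask weights accordingly; the recipe itself stays a pure data artifact that can be shared and versioned separately from the training code.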

Introducing the Deep Sparse Platform

To help the developer community interested in accelerating machine learning performance, we’ve open-sourced our automated, recipe-driven model optimization technologies and made our CPU inference engine available for free. See our webinar recording to learn about the deep learning sparsification components that you can take advantage of immediately. View the webinar recording here.

Date recorded: April 7, 2021
Speakers: Nir Shavit, Founder, Neural Magic & Mark Kurtz, ML Lead, Neural Magic

Sparsify Demo: Optimize DL Models with Ease, for Free

Sparsify is an open-source solution with an easy-to-use interface to prune and quantize deep learning models. It allows for easy model hyperparameter tweaking to increase performance and decrease footprint, all while providing fine-grained control over loss recovery. View webinar recording here.

Date recorded: December 17, 2020
Presenters: Gaurav Rao, Head of Product & Benjamin Fineran, Machine Learning Engineer

Neural Magic Demo: Run Deep Learning on CPUs

Learn how Neural Magic decreases the total cost of ownership (TCO) of your computer vision efforts, while increasing model performance on everyday CPUs. View webinar recording here.

Date recorded: October 22, 2020
Presenters: Bryan House, Chief Commercial Officer & Gaurav Rao, Head of Product

Big Brain Burnout: What’s Wrong with AI Computing?

Hear from Neural Magic’s award-winning co-founder why we need to fundamentally rethink how we’re building products that rely on machine learning and AI. Hint: Because if our brains processed information the same way today’s machine learning products consume computing power, you could fry an egg on your head. Spoiler: It’s about memory, not raw compute. View webinar recording here.

Date recorded: September 1, 2020
Presenter: Nir Shavit, Co-Founder, Neural Magic & MIT Professor

Pruning Deep Learning Models for Success

According to a recent survey, 59% of data scientists are not optimizing their deep learning models for production, despite the performance gains that techniques like quantization and pruning can offer. Contrary to popular belief, pruning deep learning models is not that hard. In this webinar, we give an overview of pruning, as well as easy ways to prune. View webinar recording here.

Date recorded: May 28, 2020
Presenter: Mark Kurtz, Machine Learning Lead, Neural Magic
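The core technique the webinar introduces, magnitude pruning, can be sketched in a few lines of plain Python: remove the smallest-magnitude weights, since they contribute least to the output. This toy on a flat list of weights is for illustration only and is not Neural Magic's implementation.

```python
# Toy magnitude pruning: zero out the smallest-magnitude weights in a
# layer until a target sparsity (fraction of zeros) is reached.

def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with (at least) the `sparsity` fraction
    of smallest-magnitude values set to zero."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Threshold is the magnitude of the n_prune-th smallest weight;
    # everything at or below it gets zeroed.
    threshold = sorted(abs(w) for w in weights)[n_prune - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

layer = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02]
print(magnitude_prune(layer, 0.5))  # half the weights zeroed
```

In practice this is done iteratively during training (prune a little, fine-tune, repeat) rather than in one shot, which is what lets networks recover the accuracy lost to pruning.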
