NeuralFlix

Pruning Deep Learning Models for Success in Production

Presenter: Mark Kurtz

Research shows that 58% of data scientists are not optimizing their deep learning models for production, despite the significant advantages techniques like pruning and quantization can offer. Why? BECAUSE IT'S HARD!

Mark Kurtz, Machine Learning Lead at Neural Magic, demonstrates how to prune models for better inference performance. He gives an overview of pruning, including its benefits and trade-offs, shares practical ways to prune models, and showcases tools that make pruning straightforward and successful. Lastly, Mark shows how to get real performance gains out of a pruned model in production.
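
As a rough illustration of the kind of workflow the talk covers, the sketch below applies unstructured magnitude pruning with PyTorch's built-in torch.nn.utils.prune utilities and reports the resulting weight sparsity. The toy model, the choice of Linear layers, and the 50% sparsity target are illustrative assumptions, not the specific recipe from the talk; purpose-built tooling such as Neural Magic's SparseML automates this with gradual pruning schedules and recovery training.

```python
# Minimal sketch: unstructured magnitude pruning with PyTorch's built-in
# torch.nn.utils.prune utilities. The model and 50% sparsity target are
# illustrative assumptions, not the recipe used in the talk.
import torch.nn as nn
import torch.nn.utils.prune as prune

# A toy network standing in for a real model (e.g., a ResNet or BERT).
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Zero out the 50% smallest-magnitude weights in every Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        # Fold the pruning mask into the weights to make it permanent.
        prune.remove(module, "weight")

# Report the overall weight sparsity introduced by pruning.
zeros = sum((m.weight == 0).sum().item()
            for m in model.modules() if isinstance(m, nn.Linear))
total = sum(m.weight.numel()
            for m in model.modules() if isinstance(m, nn.Linear))
print(f"Weight sparsity: {zeros / total:.1%}")
```

Note that zeroed weights alone do not speed up inference; realizing the gains in production requires a runtime that exploits sparsity, which is the deployment side the talk addresses.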

More ML Research in Action Videos

Apply Second-Order Pruning Algorithms for SOTA Model Compression
Sparse Training of Neural Networks Using AC/DC
How Well Do Sparse Models Transfer?
How to Achieve the Fastest CPU Inference Performance for Object Detection YOLO Models
Workshop: How to Optimize Deep Learning Models for Production
How to Compress Your BERT NLP Models For Very Efficient Inference
Sparsifying YOLOv5 for 10x Better Performance, 12x Smaller File Size, and Cheaper Deployment
Tissue vs. Silicon: The Future of Deep Learning Hardware
