NeuralFlix

How to Achieve the Fastest CPU Inference Performance for Object Detection YOLO Models


Topics covered:

1. Object detection background, including history and current solutions

2. Easy ways to optimize/sparsify YOLOv5 models

3. Applying your own data with sparse transfer learning or sparsifying from scratch (see the SparseML sketch after this list)

4. Deploying YOLOv5 by exporting to ONNX and running inference in the DeepSparse Engine on commodity CPUs at GPU speeds (a deployment sketch follows below)

5. Future research, next steps, and open discussion

After watching this video, you’ll be able to optimize your computer vision models, adapt them to your own data with a few lines of code, and deploy them on commodity CPUs at GPU-level speeds.

More ML Research in Action Videos

Apply Second-Order Pruning Algorithms for SOTA Model Compression
Sparse Training of Neural Networks Using AC/DC
How Well Do Sparse Models Transfer?
Workshop: How to Optimize Deep Learning Models for Production
How to Compress Your BERT NLP Models For Very Efficient Inference
Sparsifying YOLOv5 for 10x Better Performance, 12x Smaller File Size, and Cheaper Deployment
Tissue vs. Silicon: The Future of Deep Learning Hardware
Pruning Deep Learning Models for Success in Production
