NeuralFlix

YOLOv5 on CPUs: Sparsifying to Achieve GPU-Level Performance and Tiny Footprint


A comparison of a sparsified (pruned and INT8-quantized) YOLOv5 object detection model running on the DeepSparse inference runtime versus ONNX Runtime.

Setup: Laptop deployment on a Lenovo Yoga with a 4-core, 1.30 GHz Intel i7-1065G7 CPU
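A side-by-side runtime comparison like this one comes down to timing the same inference call under each engine. The following is a minimal sketch of such a timing harness, not the code used in the video. The names `benchmark` and `speedup` are hypothetical, and the two `time.sleep` stand-ins are placeholders for real engine calls (for example, a DeepSparse engine run and an ONNX Runtime session run on the same YOLOv5 input), which are not shown here.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def benchmark(
    run_once: Callable[[], object],
    warmup: int = 5,
    iterations: int = 50,
    batch_size: int = 1,
) -> Dict[str, float]:
    """Time a zero-argument inference callable; report latency and throughput."""
    for _ in range(warmup):
        run_once()  # warm-up calls are executed but not timed

    latencies_ms: List[float] = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = mean(latencies_ms)
    # Throughput assumes each call processes `batch_size` items.
    items_per_sec = batch_size * 1000.0 / mean_ms if mean_ms > 0 else float("inf")
    return {
        "mean_ms": mean_ms,
        "min_ms": min(latencies_ms),
        "max_ms": max(latencies_ms),
        "items_per_sec": items_per_sec,
    }


def speedup(baseline: Dict[str, float], candidate: Dict[str, float]) -> float:
    """Ratio of mean latencies: values above 1 mean the candidate is faster."""
    return baseline["mean_ms"] / candidate["mean_ms"]


if __name__ == "__main__":
    # Placeholder workloads (assumptions): replace each lambda with a call
    # into the runtime being measured, using identical inputs for both.
    baseline_stats = benchmark(lambda: time.sleep(0.004), warmup=2, iterations=10)
    candidate_stats = benchmark(lambda: time.sleep(0.002), warmup=2, iterations=10)
    print(f"baseline:  {baseline_stats['mean_ms']:.2f} ms/call")
    print(f"candidate: {candidate_stats['mean_ms']:.2f} ms/call")
    print(f"speedup:   {speedup(baseline_stats, candidate_stats):.2f}x")
```

Warm-up calls are excluded from the measurements so that one-time costs such as model compilation or cache population do not skew the mean. For a fair comparison, both engines should receive the same input, batch size, and thread settings.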

More Neural Magic Software in Action Videos

YOLOv5 on CPUs: Sparsifying to Achieve GPU-Level Performance and Tiny Footprint
YOLOv3 on the Edge: DeepSparse Engine vs. PyTorch
State-of-the-Art NLP Compression Research in Action: Understanding Crypto Sentiment
3.5x Faster NLP BERT Using a Sparsity-Aware Inference Engine on AMD Milan-X
