A Software Architecture for the Future of ML
Sparsify (prune and quantize) your deep learning models to minimize footprint & run on CPUs at GPU speeds.
YOLOv3 540x540 Laptop Deployment: 1.30GHz Intel i7-1065G7
Unprecedented Performance: Run models on CPUs at GPU speeds. No special hardware required.
Reduce Costs: Deploy and scale models on commodity CPU servers from the cloud to the edge.
Smaller Footprint: Unlock edge possibilities by reducing model footprint by 20x.
Run Anywhere: Deploy with flexibility on premise, in the cloud, or at the edge.
Open-source, easy-to-use interface to automatically sparsify and quantize deep learning models for CPUs & GPUs.
Open-source libraries and optimization algorithms for CPUs & GPUs, enabling integration with a few lines of code.
Open-source neural network model repository for highly sparse and sparse-quantized models with matching pruning recipes for CPUs and GPUs.
Free CPU runtime that runs sparse models at GPU speeds.
Paths to Sparse Acceleration
A.) Original Dense Path
Take your dense model & run it in the DeepSparse Engine, without any changes.
B.) SparseZoo Path
Take a pre-optimized model & run it in the DeepSparse Engine, or transfer learn with your data.
C.) Sparsified Path
Sparsify and quantize your dense model with ease & run it in the DeepSparse Engine.
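For readers new to the terms, here is a rough sketch of what "sparsify and quantize" in path C means at the level of a single weight tensor. It is a toy illustration in plain Python, not Neural Magic's implementation, and it does not use the Sparsify, SparseML, or DeepSparse APIs. The function names (`magnitude_prune`, `quantize_int8`, `dequantize`) and the sample weights are hypothetical, and it assumes simple magnitude pruning plus symmetric per-tensor INT8 quantization, which may differ from what the real tools apply.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # how many weights to remove
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor quantization; returns (int values, scale)."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0  # avoid dividing by zero
    return [round(w / scale) for w in weights], scale


def dequantize(values, scale):
    """Map quantized integers back to approximate float weights."""
    return [v * scale for v in values]


# Hypothetical dense weights for illustration only.
weights = [0.9, -0.05, 0.3, 0.01, -0.7, 0.02, 0.15, -0.4]

pruned = magnitude_prune(weights, sparsity=0.5)  # half the weights become 0.0
quantized, scale = quantize_int8(pruned)         # remaining weights fit in int8
restored = dequantize(quantized, scale)          # approximate original values
```

In this sketch the zeroed weights are what a sparsity-aware runtime can skip, and the integer representation is what shrinks the stored model; the real tools apply these ideas gradually during training and per layer rather than in one shot.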
New Tutorial: Sparsifying YOLOv3 Using Recipes
Neural Magic Appoints Brian Stevens as Chief Executive Officer