Software-delivered AI Inference

Forget special hardware. Get GPU-class performance on CPUs with our sparsity-aware inference engine.

Benchmark setups: 8-core and 4-core CPUs running DeepSparse 0.12.0 at 99% of baseline accuracy (replicable with the steps below), versus NVIDIA V100 and T4 GPUs running TensorFlow (NGC container 20.06-py3) at 100% of baseline accuracy (NVIDIA-published numbers).

Try it Now

PICK A USE CASE:

NLP - Question Answering
NLP - Text Classification
NLP - Token Classification
Computer Vision - Object Detection
Computer Vision - Image Classification

1. Benchmark Your Use Case

Install DeepSparse, our sparsity-aware inference engine, and benchmark a sparse-quantized version of the Hugging Face BERT-base model to achieve a 7x speedup with 99% of the baseline accuracy.

Click here to see our Sparse Question Answering guide.
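
If you prefer to script the benchmark rather than follow the prewritten example, a minimal latency-timing sketch against the DeepSparse engine looks roughly like the following. The SparseZoo model stub, sequence length, and input layout are assumptions for illustration; use the stubs and commands from the guide above to reproduce the published numbers.

```python
# Rough latency-benchmark sketch with the DeepSparse engine.
# Assumptions: the SparseZoo stub below is illustrative, and the exported
# BERT ONNX takes three int64 token tensors of shape (batch, 128).
import time

import numpy as np
from deepsparse import compile_model

model_stub = (
    "zoo:nlp/question_answering/bert-base/pytorch/huggingface/"
    "squad/pruned_quant-aggressive_95"  # illustrative stub
)
batch_size, seq_len = 1, 128

engine = compile_model(model_stub, batch_size=batch_size)

# input_ids, attention_mask, token_type_ids (order matches the ONNX graph)
inputs = [
    np.zeros((batch_size, seq_len), dtype=np.int64),
    np.ones((batch_size, seq_len), dtype=np.int64),
    np.zeros((batch_size, seq_len), dtype=np.int64),
]

for _ in range(10):  # warmup iterations
    engine.run(inputs)

iterations = 100
start = time.perf_counter()
for _ in range(iterations):
    engine.run(inputs)
elapsed = time.perf_counter() - start

print(f"mean latency: {1000 * elapsed / iterations:.2f} ms/batch")
```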

2. Train With Your Data

Install SparseML, our open-source library, to transfer learn our sparse-quantized model to your dataset using a few lines of code.

The example given in the terminal uses a public dataset. Click here to apply it to your own data via transfer learning; a rough sketch of the underlying API is shown below.
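
The terminal example wires a sparsification recipe into a standard training loop. As a generic sketch (the model, optimizer, recipe path, and step count below are stand-ins, not the actual BERT setup), SparseML's ScheduledModifierManager modifies an existing PyTorch optimizer so the training loop itself stays unchanged:

```python
# Generic SparseML transfer-learning sketch; the model, optimizer, and
# recipe.yaml below are placeholders for your own setup.
import torch
from sparseml.pytorch.optim import ScheduledModifierManager

model = torch.nn.Linear(16, 2)  # stand-in for the sparse-quantized model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
steps_per_epoch = 100  # illustrative

manager = ScheduledModifierManager.from_yaml("recipe.yaml")
optimizer = manager.modify(model, optimizer, steps_per_epoch=steps_per_epoch)

# ... run your normal training loop here; the recipe preserves sparsity ...

manager.finalize(model)
```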

3. Deploy To Your Infrastructure

Use the DeepSparse Engine for best-in-class CPU performance.

The Python snippet below shows an example of the DeepSparse Python API for a question answering model.

Click here to learn about deployment options.
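
A minimal sketch of that snippet, assuming the Pipeline API available in recent DeepSparse releases; the SparseZoo stub is illustrative, and a local ONNX model directory can be substituted.

```python
# Question answering with the DeepSparse Pipeline API.
from deepsparse import Pipeline

qa_pipeline = Pipeline.create(
    task="question-answering",
    model_path=(
        "zoo:nlp/question_answering/bert-base/pytorch/huggingface/"
        "squad/pruned_quant-aggressive_95"  # illustrative stub
    ),
)

output = qa_pipeline(
    question="What does DeepSparse run on?",
    context="DeepSparse delivers GPU-class inference performance on CPUs.",
)
print(output.answer)
```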

1. Benchmark Your Use Case

Install DeepSparse, our sparsity-aware inference engine, and benchmark a sparse-quantized version of the Hugging Face BERT-base model to achieve an 8.1x speedup with 99% of the baseline accuracy.

Click here to see our Sparse Text Classification: Sentiment Analysis guide.

2. Train With Your Data

Install SparseML, our open-source library, to transfer learn our sparse-quantized model to your dataset using a few lines of code.

The example given in the terminal uses a public dataset. Click here to apply it to your own data via transfer learning.

3. Deploy To Your Infrastructure

Use the DeepSparse Engine for best-in-class CPU performance.

The Python snippet below shows an example of the DeepSparse Python API for a text classification model.

Click here to learn about deployment options.
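
A minimal sentiment-analysis sketch, again assuming the Pipeline API; the SST-2 model stub is illustrative.

```python
# Text classification (sentiment analysis) with the DeepSparse Pipeline API.
from deepsparse import Pipeline

classifier = Pipeline.create(
    task="text-classification",
    model_path=(
        "zoo:nlp/sentiment_analysis/bert-base/pytorch/huggingface/"
        "sst2/pruned_quant-aggressive_95"  # illustrative stub
    ),
)

# The pipeline accepts one or more sequences and returns labels with scores.
print(classifier(sequences=["The inference speedup on CPU is impressive."]))
```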

1. Benchmark Your Use Case

Install DeepSparse, our sparsity-aware inference engine, and benchmark a sparse-quantized version of the Hugging Face BERT-base model to achieve an 8.1x speedup with 99% of the baseline accuracy.

Click here to see our Sparse Token Classification: Named Entity Recognition guide.

2. Train With Your Data

Install SparseML, our open-source library, to transfer learn our sparse-quantized model to your dataset using a few lines of code.

The example given in the terminal uses a public dataset. Click here to apply it to your own data via transfer learning.

3. Deploy To Your Infrastructure

Use the DeepSparse Engine for best-in-class CPU performance.

The Python snippet below shows an example of the DeepSparse Python API for a token classification model.

Click here to learn about deployment options.
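
A minimal named-entity-recognition sketch under the same Pipeline-API assumption; the CoNLL-2003 model stub and the input field name are illustrative.

```python
# Token classification (NER) with the DeepSparse Pipeline API.
from deepsparse import Pipeline

ner_pipeline = Pipeline.create(
    task="token-classification",
    model_path=(
        "zoo:nlp/token_classification/bert-base/pytorch/huggingface/"
        "conll2003/pruned_quant-aggressive_95"  # illustrative stub
    ),
)

# Returns a per-token list of predicted entity tags with scores.
print(ner_pipeline(inputs="DeepSparse was built by Neural Magic."))
```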

1. Benchmark Your Use Case

Install DeepSparse, our sparsity-aware inference engine, and benchmark a sparse-quantized version of the YOLOv5s model to achieve an 11x speedup over PyTorch CPU.

Click here to see our YOLOv3 and YOLOv5 benchmarking example on GitHub.

2. Train With Your Data

Install SparseML, our open-source library, to transfer learn our sparse-quantized model to your dataset using a few lines of code.

Click here to apply it to your own data via transfer learning.

3. Deploy To Your Infrastructure

Use the DeepSparse Engine for best-in-class CPU performance.

Click here for our YOLOv3 and YOLOv5 DeepSparse Inference Examples on GitHub.
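
As a rough sketch of what CPU deployment looks like in Python, assuming the DeepSparse YOLO pipeline; the model stub and image path are illustrative, and the GitHub examples above are the authoritative reference.

```python
# Object detection with the DeepSparse YOLO pipeline.
from deepsparse import Pipeline

yolo_pipeline = Pipeline.create(
    task="yolo",
    model_path=(
        "zoo:cv/detection/yolov5-s/pytorch/ultralytics/"
        "coco/pruned_quant-aggressive_94"  # illustrative stub
    ),
)

# "sample.jpg" is a placeholder path to a local image.
results = yolo_pipeline(images=["sample.jpg"])
print(results)  # bounding boxes, scores, and class labels
```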

1. Benchmark Your Use Case

Install DeepSparse, our sparsity-aware inference engine, and benchmark a sparse-quantized version of the ResNet-50 model to achieve a 7x speedup over ONNX Runtime CPU with 99% of the baseline accuracy.

Click here to see our ResNet-50 benchmarking example on GitHub.

2. Train With Your Data

Install SparseML, our open-source library, to transfer learn our sparse-quantized model to your dataset using a few lines of code.

Click here to apply it to your own data via transfer learning.

3. Deploy To Your Infrastructure

Use the DeepSparse Engine for best-in-class CPU performance.

Click here for our ResNet-50 DeepSparse Inference Examples on GitHub.
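
A minimal classification sketch under the same Pipeline-API assumption; the ResNet-50 stub and image path are illustrative, and your fine-tuned model from step 2 can be substituted.

```python
# Image classification with the DeepSparse Pipeline API.
from deepsparse import Pipeline

clf = Pipeline.create(
    task="image_classification",
    model_path=(
        "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/"
        "imagenet/pruned95_quant-none"  # illustrative stub
    ),
)

# "sample.jpg" is a placeholder path to a local image.
print(clf(images=["sample.jpg"]))  # predicted labels with scores
```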


Join the Deep Sparse Community


MONTHLY NEWSLETTER

PRODUCT & MODEL UPDATES

DeepSparse Engine & AWS SageMaker

Deployed a sparse DistilBERT model on AWS SageMaker to achieve a 7x increase in inference performance