Welcome to NeuralFlix

Browse our knowledge catalog to learn how to deploy deep learning models with GPU-class performance on commodity CPUs.

Let us know if you have any questions or comments, or if you would like to set up an appointment to chat with us.

Intro to Neural Magic & Software-Delivered AI
Intro to Deep Learning Model Sparsification

Intro to SparseZoo
Intro to SparseML
Intro to DeepSparse Runtime

Accelerate NLP Tasks With Sparsity and the DeepSparse Runtime
Accelerate Image Classification Tasks With Sparsity and the DeepSparse Runtime
Accelerate Image Segmentation Tasks With Sparsity and the DeepSparse Runtime
Accelerate Object Detection Tasks With Sparsity and the DeepSparse Runtime

Sparse Training of Neural Networks Using AC/DC
How Well Do Sparse Models Transfer?
How to Achieve the Fastest CPU Inference Performance for Object Detection YOLO Models
Workshop: How to Optimize Deep Learning Models for Production
How to Compress Your BERT NLP Models For Very Efficient Inference
Tissue vs. Silicon: The Future of Deep Learning Hardware
Sparsifying YOLOv5 for 10x Better Performance, 12x Smaller File Size, and Cheaper Deployment
Pruning Deep Learning Models for Success in Production

YOLOv5 on CPUs: Sparsifying to Achieve GPU-Level Performance and Tiny Footprint
YOLOv3 on the Edge: DeepSparse Engine vs. PyTorch
State-of-the-Art NLP Compression Research in Action: Understanding Crypto Sentiment
3.5x Faster NLP BERT Using a Sparsity-Aware Inference Engine on AMD Milan-X