Real-time Recommendation Engine

Machine Learning Solutions for Personalization

Recommendation systems predict user preferences by applying machine learning to past user behavior. For example, eCommerce and retail sites can use real-time recommendations powered by Neural Magic to personalize each customer's experience, which can improve customer loyalty, conversion rates, and cross-sell and upsell opportunities.

Improving Performance of Machine Learning Recommendations

Today, when machine learning engineers run recommendation models on a CPU, they often make compromises that lower the quality of their predictions, reducing:

  • Model Size
  • Input Size
  • Accuracy


Neural Magic addresses these limitations by delivering GPU-class performance on a CPU.

Increase Speed

GPU-class performance on your existing CPU setup

Reduce Cost

Run multiple models on the same CPU instance

Improve Accuracy

Increase accuracy with bigger models and bigger data

Neural Magic currently supports recommendation models such as Deep Learning Recommendation Models (DLRM), Multilayer Perceptron (MLP) networks, and fully connected networks.

Try Neural Magic Today

The Neural Magic Inference Engine fits seamlessly into existing CI/CD pipelines, can be deployed in containers or virtual machines, and can be managed with Kubernetes like any modern software application.
