Real-time Recommendation Engine
Recommendation systems predict user preferences by applying machine learning to past user behavior. For example, e-commerce and retail sites can use real-time recommendations powered by Neural Magic to personalize each customer's experience, which can improve customer loyalty, conversion rates, and cross-sell and upsell opportunities.
Improving Performance of Machine Learning Recommendations
Today, when machine learning engineers run recommendation models on a CPU, they often make sacrifices that lower the quality of their predictions by reducing:
- Model Size
- Input Size
Neural Magic addresses these limitations by delivering GPU-class performance on a CPU.
Neural Magic currently supports recommendation models such as Deep Learning Recommendation Models (DLRM), multilayer perceptron (MLP) networks, and fully connected networks.
Try Neural Magic Today
The Neural Magic Inference Engine fits seamlessly into existing CI/CD pipelines, can be deployed in containers or virtual machines, and can be managed with Kubernetes like any modern software application.