|
The Power of LLMs: Large Language Models (LLMs) have transformed AI, enabling machines to understand and generate human-like text. These models, trained on vast datasets, excel at tasks like answering questions, summarizing content, and providing customer support. Their versatility makes them valuable across healthcare, finance, education, entertainment, and nearly all other industries. However, achieving high… Read More Deploy Llama 3 8B with vLLM
|
Neural Magic Joins MLCommons: Through research, benchmarks, and best practices, Neural Magic is committed to open standards that will guide machine learning (ML) along the path from a research field to a mature industry. In February of 2021, we open-sourced our model sparsification libraries and made our sparsity-aware inference engine freely available for community use.… Read More Neural Magic Joins MLCommons to Help Accelerate ML Innovation Through Transparency and Open Source
|
Microsoft, AMD, and Neural Magic are raising the bar for high-performance computing. With a combination of HBv3 virtual machines and our sparsity-aware inference engine, we are able to run deep learning workloads on CPUs at speeds previously reserved only for GPUs. For example, together we deliver 5x inference speedup for BERT NLP models over other… Read More Video: Azure, AMD, and Neural Magic Raise the Bar for High-Performance Computing
|
Neural Magic, the AI company building a software platform for deep learning inference, today announced a $30 million Series A funding round led by existing investor NEA with participation from Andreessen Horowitz, Amdocs, Comcast Ventures, Pillar VC, and Ridgeline Ventures. This financing brings the company’s total amount raised to $50 million. The new capital will… Read More Neural Magic Announces $30 Million Series A Funding Led by NEA
|
We are excited to announce that industry veteran Brian Stevens will be joining Neural Magic as Chief Executive Officer. Brian brings vast experience in open source, enterprise, and hyper-scale cloud to the team. Before joining Neural Magic, Brian was Vice President and CTO of Google Cloud and Executive Vice President and CTO of Red Hat.… Read More Neural Magic Appoints Brian Stevens as Chief Executive Officer
|
This blog was originally posted by Na Zhang on VMware's Office of the CTO Blog. You can see the original copy here. Increasingly large deep learning (DL) models require a significant amount of computing, memory, and energy, all of which become a bottleneck in real-time inference where resources are limited. In this post, we detail our… Read More Accelerating Machine Learning Inference on CPU with VMware vSphere and Neural Magic
|
How many deep learning models do companies typically have in production? A lot fewer than you’d think. 84% of companies had five or fewer models in production. For many teams, this process is simply too hard or too costly. We recently surveyed more than 290 machine learning engineers and data scientists to find out how… Read More Companies Lack Resources to Get Deep Learning Models into Production [Survey]
|
Everything we know about memory requirements in machine learning may be wrong. Today, when data scientists process deep learning models using a “throughput computing” device like a GPU, TPU, or similar hardware accelerator, they’re likely faced with a decision to shrink their model or input size to fit within the device’s memory limitations. Training a… Read More Challenging Memory Requirements and Performance Standards in ML
|
Nir’s take on the future of machine learning—where it’s heading and where it should be heading—can be seen as contrary to the current, prevailing wisdom.
|
The seed investment is led by Comcast Ventures and includes NEA, Andreessen Horowitz, Pillar VC, and Amdocs.