Microsoft, AMD, and Neural Magic are raising the bar for high-performance computing. With a combination of HBv3 virtual machines and our sparsity-aware inference engine, we are able to run deep learning workloads on CPUs at speeds previously reserved only for GPUs.
For example, together we deliver 5x inference speedup for BERT NLP models over other conventional approaches. More details here.
Hear more what Azure Chief Technology Officer, Mark Russinovich, has to say about the powerful combination of Microsoft, AMD, and Neural Magic (minute 1:50).