Unlock Faster and More Efficient Language Models with SparseGPT
Language models are getting larger and more complex, posing challenges for deployment and inference.
At Neural Magic, we've developed cutting-edge research to address these challenges with advanced pruning and quantization techniques. Our latest innovation, SparseGPT, extends these methods to large language models, enabling significant compression and faster inference. With SparseGPT, weights can be quantized to INT4, and up to 60% of them can be removed entirely without retraining.
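To build intuition for what "removing 60% of weights" and "quantizing to INT4" mean in practice, here is a minimal sketch of the two underlying ideas: unstructured magnitude pruning and symmetric 4-bit round-to-nearest quantization. Note this is not SparseGPT's actual algorithm (which uses more sophisticated second-order weight updates to preserve accuracy); the function names and thresholds below are illustrative assumptions only.

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.6):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning).

    Illustrative only: SparseGPT uses a smarter, second-order criterion,
    but the end result is the same kind of sparse weight matrix.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_int4(weights):
    """Symmetric round-to-nearest quantization to 4-bit integers (-8..7).

    Returns the integer codes and the scale needed to dequantize
    (approximate reconstruction: q * scale).
    """
    scale = np.max(np.abs(weights)) / 7.0  # map the largest weight to +/-7
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

# Example: prune and quantize a random weight matrix
w = np.random.randn(64, 64).astype(np.float32)
w_sparse = magnitude_prune(w, sparsity=0.6)      # ~60% of entries are now zero
q, scale = quantize_int4(w_sparse)               # 4-bit codes + one FP scale
```

A sparse, 4-bit representation like this shrinks memory footprint substantially and lets sparsity-aware runtimes skip the zeroed weights during inference.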
Join our webinar on May 25, 2023, to dive into the research, results, and intuitive approach behind SparseGPT.
Participants will gain:
- Deep understanding of SparseGPT: Learn about the cutting-edge research behind SparseGPT, including advanced pruning and quantization methods, and how they are adapted to large language models.
- Practical insights and results: See SparseGPT's results, including significant weight compression, quantization to INT4, and removal of up to 60% of weights without retraining, and understand how these optimizations lead to faster, more efficient language models.
- Applicability to your work: Discover how to apply SparseGPT in your current work and research to overcome the challenges of large language models, improve model deployment, and enable more efficient inference for your specific applications.
Speakers:
- Dan Alistarh, Research Lead at Neural Magic and Professor at the Institute of Science and Technology Austria
- Mark Kurtz, Director of ML, Neural Magic
Time: 1:00 PM EDT / 10:00 AM PDT