Rob Greenberg