Gabriele Oliaro

AI Research Intern

Gabriele Oliaro is a Research Intern at Snowflake and a Ph.D. student at Carnegie Mellon University. His research focuses on AI systems, with an emphasis on large language model (LLM) inference and fine-tuning.

Gen AI

SuffixDecoding at Production Scale with Arctic Inference and vLLM

SuffixDecoding now delivers 1.96x–3.12x end-to-end speedups in vLLM and Arctic Inference, with major CPU optimizations for fast, production-ready LLM serving.

Aurick Qiao | Gabriele Oliaro | Samyam Rajbhandari | Snowflake AI Research

Dec 02, 2025 | 9 min read