Aurick Qiao

Senior Software Engineer
Aurick Qiao is an AI researcher at Snowflake and the former CEO of Petuum Inc. He holds a Ph.D. from Carnegie Mellon University and specializes in machine learning systems, distributed computing, and large language models.

MORE POSTS FROM Aurick Qiao

Gen AI

Accelerating PyTorch Innovation at Scale: Snowflake at PyTorch Conference 2025

How Snowflake tackles four core AI challenges: scaling deep learning, training thousands of models, accelerating inference, and balancing multilingual performance.
NOV 19, 2025|7 min read
Gen AI

Smarter, Faster and Snowflake-Native: Real-Time Text2SQL Behind Snowflake Intelligence

Discover Arctic-Text2SQL-R1.5, Snowflake's new, native Text-to-SQL model built to overcome LLM latency and deliver higher accuracy for real-time conversational analytics.
NOV 04, 2025|7 min read
Gen AI

Fast Reasoning on GPT-OSS with Speculative Decoding and Arctic Inference

Snowflake AI Research boosted GPT-OSS reasoning speed by 1.7–1.8x using Arctic Inference with speculative decoding, enabling faster agentic AI.
AUG 25, 2025|4 min read

Arctic Long Sequence Training (ALST): Scalable And Efficient Training For Multi-Million Token Sequences

Snowflake's ALST enables scalable training of long-context models with up to 15 million tokens using Hugging Face and DeepSpeed, all without custom modeling code.
JUN 24, 2025|10 min read
Gen AI

Arctic Inference with Shift Parallelism: The Fastest Open Source Inference System for Enterprise AI

Built by Snowflake AI Research, Arctic Inference uses Shift Parallelism, SwiftKV, and speculative decoding to power the fastest open-source enterprise AI.
MAY 29, 2025|15 min read
Gen AI

Low-Latency and High-Throughput Inference for Long Context with Sequence Parallelism (aka Arctic Ulysses)

Ulysses, a novel sequence parallelism technique, boosts long-context LLM inference performance with 3.4x lower latency and better GPU efficiency.
APR 03, 2025|14 min read
Product and Technology

SwiftKV from Snowflake AI Research Reduces Inference Costs of Meta Llama LLMs up to 75% on Cortex AI

SwiftKV optimizes Meta Llama LLMs on Snowflake Cortex AI, reducing inference costs by up to 75% while maintaining accuracy for enterprise AI solutions.
JAN 16, 2025|5 min read