Snowflake AI Research

We are a team with extensive experience building systems and technology that have significantly reduced the cost of LLM training and inference. Much of our work is open source, giving the AI community more accessible and cost-effective LLMs. The team includes many specialists in natural language processing and search. With the help of thousands of Snowflake engineers worldwide, our technology powers enterprise AI products in Cortex AI and more. Check out what we're working on: https://www.snowflake.com/en/product/ai/ai-research/

MORE POSTS FROM Snowflake AI Research

Gen AI

Fast and More Accurate Causal Parallel Decoding Using Jacobi Forcing

Jacobi Forcing enables fast and more accurate causal parallel decoding for autoregressive transformers, offering near-AR quality and improved token throughput.
MAR 04, 2026 | 10 min read

Gen AI

Agent World Model (AWM): Infinity Synthetic Environments for Agentic Reinforcement Learning

Open-source Agent World Model generates 1,000 SQL-backed executable environments for agentic RL with benchmark-winning results.
FEB 13, 2026 | 10 min read

Gen AI

SuffixDecoding at Production Scale with Arctic Inference and vLLM

SuffixDecoding now delivers 1.96x–3.12x end-to-end speedups in vLLM and Arctic Inference with major CPU optimizations for fast, production-ready LLM serving.
DEC 02, 2025 | 9 min read

Gen AI

Accelerating PyTorch Innovation at Scale: Snowflake at PyTorch Conference 2025

How Snowflake tackles four core AI challenges: scaling deep learning, training thousands of models, accelerating inference, and balancing multilingual performance.
NOV 19, 2025 | 7 min read

Gen AI

What’s Your Agent’s GPA? A Framework for Evaluating AI Agent Reliability

Learn how the Snowflake AI Research team's Agent GPA framework achieved 95% error detection and 86% localization accuracy for measuring and debugging agent performance.
NOV 04, 2025 | 8 min read

Gen AI

Optimizing Query Execution in Cortex AISQL

How we made AISQL up to 8x faster and 70x cheaper through AI-aware planning, adaptive model cascading, and semantic join rewriting.
NOV 04, 2025 | 8 min read

Gen AI

Smarter, Faster and Snowflake-Native: Real-Time Text2SQL Behind Snowflake Intelligence

Discover Arctic-Text2SQL-R1.5, Snowflake's new, native Text-to-SQL model built to overcome LLM latency and deliver higher accuracy for real-time conversational analytics.
NOV 04, 2025 | 7 min read

Gen AI

Fast Reasoning on GPT-OSS with Speculative Decoding and Arctic Inference

Snowflake AI Research boosted GPT-OSS reasoning speed by 1.7–1.8x using Arctic Inference with speculative decoding, enabling faster agentic AI.
AUG 25, 2025 | 4 min read

Gen AI

Bridging the Gap Between LLMs and Real-World Challenges: Snowflake at ACL 2025

Snowflake AI Research shares 4 breakthrough papers accepted at ACL 2025, advancing model efficiency, document AI, text-to-SQL reliability, and LLM evaluation.
JUL 30, 2025 | 7 min read
