GEN AI

Inside Snowflake Intelligence: Five Pillars of Enterprise-Grade Agentic AI
Explore the underlying architecture, orchestration, and system-level optimizations behind Snowflake Intelligence, a production-grade agentic AI system built for enterprise reasoning.
JUN 03, 2025|13 min read

Smaller Models, Smarter SQL: Arctic-Text2SQL-R1 Tops BIRD and Wins Broadly
A deep dive into how Snowflake AI built Arctic-Text2SQL-R1 using simple rewards, strong reasoning, and a scalable approach to real-world SQL generation.
MAY 29, 2025|14 min read

Arctic Inference with Shift Parallelism: The Fastest Open Source Inference System for Enterprise AI
Built by Snowflake AI Research, Arctic Inference uses Shift Parallelism, SwiftKV, and speculative decoding to power the fastest open-source enterprise AI.
MAY 29, 2025|15 min read

Scaling vLLM for Embeddings: 16x Throughput and Cost Reduction
Learn how we increased embedding throughput 3x in Snowflake Cortex—and 16x vs. vLLM—through smarter serialization, tokenization, and GPU optimization.
MAY 29, 2025|8 min read

Announcing Claude Opus 4 and Claude Sonnet 4 on Snowflake Cortex AI
Claude Opus 4 and Sonnet 4 are now on Snowflake Cortex AI, enabling secure access to Anthropic’s latest models via LLM functions and REST APIs.
MAY 22, 2025|5 min read

Powering Natural Language Interfaces for the Web with NLWeb and Snowflake
Build chatbots for your website in minutes using NLWeb and Snowflake’s Cortex APIs—fast, secure retrieval and LLMs with no extra infrastructure.
MAY 20, 2025|4 min read

Fastest Speculative Decoding in vLLM with Arctic Inference and Arctic Training
How we enhanced speculative decoding to get 4x faster end-to-end task completion for LLM agents and up to 2.8x faster decoding for conversational, interactive and coding workloads.
MAY 01, 2025|18 min read

Evaluating Multimodal vs. Text-Based Retrieval for RAG with Snowflake Cortex
Discover how multimodal retrieval on Snowflake Cortex transforms enterprise PDF search, enhancing accuracy and speed across complex document formats.
APR 21, 2025|8 min read

Cortex Agents: Unifying Data Insights with Snowflake
Snowflake's Cortex Agents unify structured analytics and unstructured search, offering a seamless experience for complex data queries.
APR 09, 2025|7 min read

Low-Latency and High-Throughput Inference for Long Context with Sequence Parallelism (aka Arctic Ulysses)
Ulysses, a novel sequence parallelism technique, boosts long-context LLM inference performance with 3.4x lower latency and better GPU efficiency.
APR 03, 2025|14 min read

Think. Execute. Excel: Arctic Text2SQL with Execution-Guided CoT
Learn how Snowflake’s ExCoT optimizes Text2SQL with execution-guided CoT and DPO, setting a new benchmark in natural language to SQL accuracy.
APR 02, 2025|10 min read

Agentic Semantic Model Improvement: Elevating Text-to-SQL Performance
Discover how Snowflake’s agentic system enhances semantic models for text-to-SQL, reducing manual work and improving SQL query accuracy.
MAR 31, 2025|7 min read