GEN AI

Gen AI

Inside Snowflake Intelligence: Five Pillars of Enterprise-Grade Agentic AI

Explore the underlying architecture, orchestration, and system-level optimizations behind Snowflake Intelligence, a production-grade agentic AI system built for enterprise reasoning.
JUN 03, 2025|13 min read
Gen AI

Smaller Models, Smarter SQL: Arctic-Text2SQL-R1 Tops BIRD and Wins Broadly

A deep dive into how Snowflake AI built Arctic-Text2SQL-R1 using simple rewards, strong reasoning, and a scalable approach to real-world SQL generation.
MAY 29, 2025|14 min read
Gen AI

Arctic Inference with Shift Parallelism: The Fastest Open Source Inference System for Enterprise AI

Built by Snowflake AI Research, Arctic Inference uses Shift Parallelism, SwiftKV, and speculative decoding to power the fastest open-source enterprise AI.
MAY 29, 2025|15 min read
Gen AI

Scaling vLLM for Embeddings: 16x Throughput and Cost Reduction

Learn how we increased embedding throughput 3x in Snowflake Cortex—and 16x vs. vLLM—through smarter serialization, tokenization, and GPU optimization.
MAY 29, 2025|8 min read
Gen AI

Announcing Claude Opus 4 and Claude Sonnet 4 on Snowflake Cortex AI

Claude Opus 4 and Sonnet 4 are now on Snowflake Cortex AI, enabling secure access to Anthropic’s latest models via LLM functions and REST APIs.
MAY 22, 2025|5 min read
Gen AI

Powering Natural Language Interfaces for the Web with NLWeb and Snowflake

Build chatbots for your website in minutes using NLWeb and Snowflake’s Cortex APIs—fast, secure retrieval and LLMs with no extra infrastructure.
MAY 20, 2025|4 min read
Gen AI

Fastest Speculative Decoding in vLLM with Arctic Inference and Arctic Training

How we enhanced speculative decoding to get 4x faster end-to-end task completion for LLM agents and up to 2.8x faster decoding for conversational, interactive and coding workloads.
MAY 01, 2025|18 min read
Gen AI

Evaluating Multimodal vs. Text-Based Retrieval for RAG with Snowflake Cortex

Discover how multimodal retrieval on Snowflake Cortex transforms enterprise PDF search, enhancing accuracy and speed across complex document formats.
APR 21, 2025|8 min read
Gen AI

Cortex Agents: Unifying Data Insights with Snowflake

Snowflake's Cortex Agents unify structured analytics and unstructured search, offering a seamless experience for complex data queries.
APR 09, 2025|7 min read
Gen AI

Low-Latency and High-Throughput Inference for Long Context with Sequence Parallelism (aka Arctic Ulysses)

Ulysses, a novel sequence parallelism technique, boosts long-context LLM inference performance with 3.4x lower latency and better GPU efficiency.
APR 03, 2025|14 min read
Gen AI

Think. Execute. Excel: Arctic Text2SQL with Execution-Guided CoT

Learn how Snowflake’s ExCoT optimizes Text2SQL with execution-guided CoT and DPO, setting a new benchmark in natural language to SQL accuracy.
APR 02, 2025|10 min read
Gen AI

Agentic Semantic Model Improvement: Elevating Text-to-SQL Performance

Discover how Snowflake’s agentic system enhances semantic models for text-to-SQL, reducing manual work and improving SQL query accuracy.
MAR 31, 2025|7 min read
