GEN AI

Snowflake's Arctic-Extract: A Vision-Language Model for High-Fidelity Document Extraction
We’re excited to announce Arctic-Extract, a vision-language model that delivers fast, accurate, and cost-effective document AI for high-fidelity data extraction.
AUG 29, 2025|5 min read

Fast Reasoning on GPT-OSS with Speculative Decoding and Arctic Inference
Snowflake AI Research boosted GPT-OSS reasoning speed by 1.7–1.8x using Arctic Inference with speculative decoding, enabling faster agentic AI.
AUG 25, 2025|4 min read

Bridging the Gap Between LLMs and Real-World Challenges: Snowflake at ACL 2025
Snowflake AI Research shares 4 breakthrough papers accepted at ACL 2025, advancing model efficiency, document AI, text-to-SQL reliability, and LLM evaluation.
JUL 30, 2025|7 min read

Inside Snowflake Intelligence: Five Pillars of Enterprise-Grade Agentic AI
Explore the underlying architecture, orchestration, and system-level optimizations behind Snowflake Intelligence, a production-grade agentic AI system built for enterprise reasoning.
JUN 03, 2025|13 min read

Smaller Models, Smarter SQL: Arctic-Text2SQL-R1 Tops BIRD and Wins Broadly
A deep dive into how Snowflake AI built Arctic-Text2SQL-R1 using simple rewards, strong reasoning, and a scalable approach to real-world SQL generation.
MAY 29, 2025|14 min read

Arctic Inference with Shift Parallelism: The Fastest Open Source Inference System for Enterprise AI
Built by Snowflake AI Research, Arctic Inference uses Shift Parallelism, SwiftKV, and speculative decoding to power the fastest open-source enterprise AI.
MAY 29, 2025|15 min read

Scaling vLLM for Embeddings: 16x Throughput and Cost Reduction
Learn how we increased embedding throughput 3x in Snowflake Cortex—and 16x vs. vLLM—through smarter serialization, tokenization, and GPU optimization.
MAY 29, 2025|8 min read

Announcing Claude Opus 4 and Claude Sonnet 4 on Snowflake Cortex AI
Claude Opus 4 and Sonnet 4 are now on Snowflake Cortex AI, enabling secure access to Anthropic’s latest models via LLM functions and REST APIs.
MAY 22, 2025|5 min read

Powering Natural Language Interfaces for the Web with NLWeb and Snowflake
Build chatbots for your website in minutes using NLWeb and Snowflake’s Cortex APIs—fast, secure retrieval and LLMs with no extra infrastructure.
MAY 20, 2025|4 min read

Fastest Speculative Decoding in vLLM with Arctic Inference and Arctic Training
How we enhanced speculative decoding to get 4x faster end-to-end task completion for LLM agents and up to 2.8x faster decoding for conversational, interactive and coding workloads.
MAY 01, 2025|18 min read

Evaluating Multimodal vs. Text-Based Retrieval for RAG with Snowflake Cortex
Discover how multimodal retrieval on Snowflake Cortex transforms enterprise PDF search, enhancing accuracy and speed across complex document formats.
APR 21, 2025|8 min read

Cortex Agents: Unifying Data Insights with Snowflake
Snowflake's Cortex Agents unify structured analytics and unstructured search, offering a seamless experience for complex data queries.
APR 09, 2025|7 min read