Cortex REST API Usage Monitor

Priya Joseph

Overview

Duration: 2

This quickstart demonstrates how to build a Streamlit in Snowflake (SiS) application to monitor your Cortex REST API token consumption. Track usage by model, user, and time period to optimize costs and understand your AI workload patterns.

What You'll Learn

Querying SNOWFLAKE.ACCOUNT_USAGE.CORTEX_REST_API_USAGE_HISTORY
Building interactive dashboards with Streamlit in Snowflake
Visualizing token consumption patterns with Altair charts
Tracking usage by model, user, and time period

What You'll Need

A Snowflake account with Cortex REST API access
ACCOUNTADMIN role or appropriate privileges to query ACCOUNT_USAGE views
Basic familiarity with Streamlit

What You'll Build

A comprehensive dashboard showing:

Overall usage KPIs (total tokens, requests, models, users)
Token usage trends by model over time
Input vs output token breakdown
Usage breakdown by user
Daily detailed usage tables

Setup

Duration: 3

Prerequisites

Grant access to the SNOWFLAKE database for your role:

USE ROLE ACCOUNTADMIN;
GRANT IMPORTED PRIVILEGES ON DATABASE SNOWFLAKE TO ROLE <your_role>;
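
Once the grant is in place, the role can read the usage view. The following is a minimal sketch of how a dashboard might build its daily-usage query in Python. The function name `build_daily_usage_query` is hypothetical, and the column names (`START_TIME`, `MODEL_NAME`, `TOKENS`, `REQUEST_ID`) are assumptions that should be checked against the view's documentation before use.

```python
def build_daily_usage_query(days_back=30):
    """Return SQL that totals tokens and requests per day and model.

    Column names are assumed, not confirmed by this guide; verify them
    against the CORTEX_REST_API_USAGE_HISTORY view before running.
    """
    days = int(days_back)  # coerce to int so only a number reaches the SQL text
    if days <= 0:
        raise ValueError("days_back must be a positive number of days")
    return f"""
SELECT
    DATE_TRUNC('day', START_TIME) AS USAGE_DAY,
    MODEL_NAME,
    SUM(TOKENS) AS TOTAL_TOKENS,
    COUNT(DISTINCT REQUEST_ID) AS REQUEST_COUNT
FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_REST_API_USAGE_HISTORY
WHERE START_TIME >= DATEADD('day', -{days}, CURRENT_TIMESTAMP())
GROUP BY 1, 2
ORDER BY 1, 2
""".strip()


query = build_daily_usage_query(days_back=7)
print(query)
```

Inside a Streamlit in Snowflake app, a string like this would typically be passed to the active Snowpark session (for example `get_active_session().sql(query).to_pandas()`); that wiring is omitted here so the sketch runs without a Snowflake connection.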

Create the Streamlit App

Navigate to Streamlit in Snowsight
Click + Streamlit App
Name your app CORTEX_REST_API_USAGE_MONITOR
Select your warehouse and database/schema
Create a notebook, copy the code in streamlit.py into a notebook cell, and run it.

Understanding the Dashboard

Duration: 5

Overall Usage Summary

The top KPIs show:

Total Tokens: Combined input + output tokens consumed
Total Requests: Number of API calls made
Models Used: Count of distinct models accessed
Active Users: Number of users making requests
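
To make the four KPIs concrete, here is a small sketch that computes them from per-request rows in plain Python. The row layout and the sample values are made up for illustration; the actual streamlit.py may compute these in SQL or pandas instead.

```python
def summarize_usage(rows):
    """Compute the four headline KPIs from per-request rows.

    Each row is (model, user, input_tokens, output_tokens); this layout
    is an assumption for the example, not the view's actual schema.
    """
    return {
        "total_tokens": sum(inp + out for _, _, inp, out in rows),
        "total_requests": len(rows),
        "models_used": len({model for model, _, _, _ in rows}),
        "active_users": len({user for _, user, _, _ in rows}),
    }


# Illustrative data only: three requests, two models, two users.
sample_rows = [
    ("model-a", "alice", 100, 50),
    ("model-b", "bob", 200, 80),
    ("model-a", "alice", 300, 120),
]

kpis = summarize_usage(sample_rows)
print(kpis)
```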

Token Usage by Model

The stacked bar chart shows daily token consumption broken down by model, helping you identify:

Usage trends over time
Which models consume the most tokens
Peak usage days
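
The data behind a stacked bar chart of this kind is a per-day, per-model total. The sketch below shows one way to build that structure and find the peak day using only the standard library; the function names and sample rows are hypothetical, and the real app may rely on pandas and Altair for the same grouping.

```python
from collections import defaultdict


def daily_tokens_by_model(rows):
    """Group (day, model, tokens) rows into {day: {model: total_tokens}}."""
    totals = defaultdict(lambda: defaultdict(int))
    for day, model, tokens in rows:
        totals[day][model] += tokens
    # ISO date strings sort chronologically, so a plain sort orders the days.
    return {day: dict(models) for day, models in sorted(totals.items())}


def peak_day(daily):
    """Return the day whose combined token total across models is largest."""
    return max(daily, key=lambda day: sum(daily[day].values()))


# Illustrative data only.
sample_rows = [
    ("2026-03-01", "model-a", 150),
    ("2026-03-01", "model-b", 280),
    ("2026-03-02", "model-a", 420),
    ("2026-03-02", "model-a", 30),
]

daily = daily_tokens_by_model(sample_rows)
print(daily)
print("Peak day:", peak_day(daily))
```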

Daily Usage by Model

The detailed table and chart view shows daily token consumption with request counts for each model.

Input vs Output Tokens

Understanding the ratio between input and output tokens helps optimize:

Prompt engineering (reduce input tokens)
Response limits (control output tokens)
Model selection for different use cases
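
As a rough illustration of the ratio discussed above, the following sketch totals input and output tokens per model and divides one by the other. The row layout and values are invented for the example. A high ratio suggests prompts dominate consumption, while a low ratio suggests responses do.

```python
def input_output_ratio(rows):
    """Return {model: input_tokens / output_tokens} from (model, input, output) rows.

    Models with zero output tokens map to None to avoid dividing by zero.
    """
    totals = {}
    for model, inp, out in rows:
        prev_in, prev_out = totals.get(model, (0, 0))
        totals[model] = (prev_in + inp, prev_out + out)
    return {
        model: (inp / out if out else None)
        for model, (inp, out) in totals.items()
    }


# Illustrative data only.
sample_rows = [
    ("model-a", 900, 100),
    ("model-a", 300, 200),
    ("model-b", 50, 0),
]

ratios = input_output_ratio(sample_rows)
print(ratios)
```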

Usage by User

Track which users or service accounts consume the most resources for:

Chargeback reporting
Identifying optimization opportunities
Capacity planning
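
For chargeback reporting, a per-user share of total tokens is often the starting point. This is a minimal sketch under the assumption that usage has already been reduced to (user, tokens) pairs; the user names are placeholders, and mapping tokens to an actual cost depends on per-model pricing that is not covered here.

```python
def token_share_by_user(rows):
    """Return [(user, tokens, share)] sorted by tokens, largest first.

    `share` is the user's fraction of all tokens (0.0 when there is no usage).
    """
    totals = {}
    for user, tokens in rows:
        totals[user] = totals.get(user, 0) + tokens
    grand_total = sum(totals.values())
    report = [
        (user, tokens, tokens / grand_total if grand_total else 0.0)
        for user, tokens in totals.items()
    ]
    return sorted(report, key=lambda item: item[1], reverse=True)


# Illustrative data only: one human user and one service account.
sample_rows = [("alice", 600), ("svc_etl", 300), ("alice", 100)]

report = token_share_by_user(sample_rows)
for user, tokens, share in report:
    print(f"{user}: {tokens} tokens ({share:.0%})")
```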

Cost Considerations

Duration: 2

Token Pricing

Cortex REST API costs vary by model. Monitor the CORTEX_REST_API_USAGE_HISTORY view to:

Track consumption trends
Set up alerts for unusual usage
Optimize model selection for cost vs quality
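
One simple way to approach alerts for unusual usage is to compare each day's total with the average of the preceding days. The sketch below does this with a trailing window; the `window` and `factor` defaults are arbitrary assumptions to tune, and this heuristic is only one option, not a method prescribed by this guide.

```python
def unusual_days(daily_totals, window=3, factor=2.0):
    """Flag days whose tokens exceed `factor` times the trailing average.

    `daily_totals` is a chronologically ordered list of (day, tokens).
    The first `window` days are never flagged because they lack a baseline.
    """
    flagged = []
    for i in range(window, len(daily_totals)):
        day, tokens = daily_totals[i]
        baseline = sum(t for _, t in daily_totals[i - window:i]) / window
        if baseline > 0 and tokens > factor * baseline:
            flagged.append(day)
    return flagged


# Illustrative data only: a spike on the fourth day.
sample_totals = [
    ("2026-03-01", 100),
    ("2026-03-02", 120),
    ("2026-03-03", 110),
    ("2026-03-04", 500),
    ("2026-03-05", 130),
]

alerts = unusual_days(sample_totals)
print("Unusual days:", alerts)
```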

Latency Note

The ACCOUNT_USAGE views have up to 2 hours of latency. For real-time monitoring, consider:

Using INFORMATION_SCHEMA views (15-minute latency, shorter retention)
Building streaming pipelines for critical monitoring
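
Because of this latency, the most recent hours on a dashboard may be incomplete. A small helper like the hypothetical one below can compute a cutoff before which data is expected to be complete, so the app can label or exclude anything newer. The 2-hour default mirrors the figure stated above.

```python
from datetime import datetime, timedelta, timezone


def complete_data_cutoff(now, latency_hours=2):
    """Return the timestamp before which usage data should have fully landed."""
    return now - timedelta(hours=latency_hours)


# Fixed timestamp so the example is reproducible.
now = datetime(2026, 3, 4, 12, 0, tzinfo=timezone.utc)
cutoff = complete_data_cutoff(now)
print("Treat data after", cutoff.isoformat(), "as possibly incomplete")
```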

Conclusion

Duration: 1

What You Learned

Querying Cortex REST API usage data from ACCOUNT_USAGE
Building interactive monitoring dashboards with Streamlit in Snowflake
Visualizing token consumption patterns

Updated 2026-03-04

This content is provided as is and is not maintained on an ongoing basis. It may be out of date with current Snowflake instances.