If You're Serious About Data Analysis, You Need a Semantic Model

Take a real question: "What was NRR last two quarters, and what are the top churn drivers by ARR?"
Most teams give their text-to-SQL system DDL, column comments, and a handful of sample queries. That system sees something like this:
```sql
-- Tables available:
-- subscriptions (id, account_id, start_date, end_date, mrr, plan_type)
-- churn_events (id, account_id, churn_date, reason_code, arr_impact)
-- accounts (id, name, segment, region, csm_id)

-- Sample query:
SELECT account_id, SUM(mrr) FROM subscriptions GROUP BY 1;
```

The system has to guess what "NRR" means: maybe (starting_mrr + expansion - contraction - churn) / starting_mrr, but over what cohort window, and which subscriptions rows count as expansion versus new business? The join between churn_events and subscriptions could easily double-count ARR, and nothing in the schema says how to avoid it. So the system guesses, and different tools guess differently.
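To see why the guess matters, here are two defensible readings of "NRR last quarter" against this schema. Both are sketches with assumed cohort dates and logic (nothing in the schema specifies any of this), and they return different numbers:

```sql
-- Reading A: cohort = accounts with an active subscription at quarter
-- start; NRR = cohort MRR at quarter end over cohort MRR at quarter start.
-- The cutoff dates are assumptions for illustration.
WITH cohort AS (
    SELECT account_id, SUM(mrr) AS start_mrr
    FROM subscriptions
    WHERE start_date <= '2025-07-01'
      AND (end_date IS NULL OR end_date > '2025-07-01')
    GROUP BY account_id
),
quarter_end AS (
    SELECT account_id, SUM(mrr) AS end_mrr
    FROM subscriptions
    WHERE start_date <= '2025-09-30'
      AND (end_date IS NULL OR end_date > '2025-09-30')
    GROUP BY account_id
)
SELECT SUM(COALESCE(q.end_mrr, 0)) / SUM(c.start_mrr) AS nrr
FROM cohort c
LEFT JOIN quarter_end q ON q.account_id = c.account_id;

-- Reading B: start from the same cohort MRR but subtract churn_events
-- ARR instead of re-measuring subscriptions. If churn is also reflected
-- in subscriptions.end_date, the two readings disagree by construction.
```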
Add a semantic view, and the system sees this instead:
```sql
CREATE OR REPLACE SEMANTIC VIEW revenue_metrics
  TABLES (
    accounts AS accounts PRIMARY KEY (id) WITH SYNONYMS ('customer'),
    subscriptions AS subscriptions PRIMARY KEY (id),
    churn_events AS churn_events PRIMARY KEY (id)
  )
  RELATIONSHIPS (
    subscriptions(account_id) REFERENCES accounts,
    churn_events(account_id) REFERENCES accounts
  )
  DIMENSIONS (
    accounts.region AS accounts.region,
    accounts.segment AS accounts.segment,
    churn_events.reason_code AS churn_events.reason_code
  )
  METRICS (
    net_revenue_retention AS <sql_expr> COMMENT = 'NRR over a rolling quarterly cohort',
    churned_arr AS SUM(churn_events.arr_impact) COMMENT = 'ARR impact from churn events'
  )
  COMMENT = 'Governed revenue and retention metrics';
```

NRR has a definition, the relationships between tables are in the schema, and every metric specifies how it aggregates. Because semantic views live as schema objects, they become the shared contract across BI, APIs and AI workflows, so every consumer gets the same answer.
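Once the view exists, consumers query it directly. A sketch using Snowflake's SEMANTIC_VIEW query syntax (clause details and output column names may vary by version; check the documentation for your account):

```sql
-- Every consumer asking this question gets the same NRR definition,
-- grouped by the governed region dimension.
SELECT *
FROM SEMANTIC_VIEW(
    revenue_metrics
    METRICS net_revenue_retention
    DIMENSIONS accounts.region
);
```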
If you're running analytics at scale, using semantic models isn't optional anymore.
The "DDL + sample queries" baseline breaks under real conditions
Most teams start with the same bootstrap package: DDL with column comments, a few sample queries and informal conventions passed through docs or tribal knowledge. It works for demos. But it falls apart in production for four reasons:
- Metric drift: Sample queries go stale. One team's definition of "active user" quietly diverges from another's, and nobody notices until the board deck has two conflicting numbers.
- Join ambiguity: When multiple plausible join paths exist between tables, small differences change the grain and the totals. A text-to-SQL system picking the wrong path silently returns a confidently wrong number.
- Concept mismatch: Business questions refer to concepts such as "NRR" or "customer health score" that don't map to any single physical column. Without an explicit definition, every consumer reinvents the logic.
- Tool duplication: Each BI tool and each AI agent independently learns meaning from the same raw schema and reproduces it inconsistently. You end up maintaining the same business logic in Tableau, in Power BI, in your internal Python scripts and in your LLM prompts.
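The join-ambiguity failure is easy to reproduce on the earlier schema. A sketch using the table and column names from the example above; the inflation factor is whatever the subscription count per account happens to be:

```sql
-- Correct grain: one churn_events row per churn event.
SELECT SUM(arr_impact) AS churned_arr
FROM churn_events;

-- Plausible-looking alternative: join through subscriptions to pick up
-- plan_type. Each churn row now repeats once per matching subscription
-- row, so SUM(arr_impact) is silently inflated.
SELECT SUM(c.arr_impact) AS churned_arr
FROM churn_events c
JOIN subscriptions s ON s.account_id = c.account_id;
```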
Semantic views address all four of these issues by moving meaning from informal convention into an executable, governed schema object. Metrics become first-class definitions. Relationships are declared explicitly. The warehouse becomes the single source of business semantics.
"This sounds like a lot of up-front work"
It used to be. Building a semantic model from scratch meant weeks of interviews, spreadsheets and manual YAML authoring before you saw any value.
Semantic View Autopilot makes this practical. It generates semantic views from the logic your organization already has (query patterns, BI dashboards, trusted SQL).
The practical workflow looks like this:
- Pick one domain like revenue or product usage and a small set of Tier 1 metrics.
- Feed Autopilot real context: trusted SQL queries, BI dashboards and historical query patterns. Autopilot uses query history, table metadata (including primary keys and cardinality) and the context you provide to validate relationships and extract business logic. When you supply SQL examples, Snowflake adds them as verified queries and infers relationships from them.
- Review and certify the generated semantic view. Autopilot proposes candidates when multiple definitions exist and you choose which one is canonical.
- Use verified queries as a regression harness. As business rules evolve, Autopilot monitors usage and proposes updates, so the model stays aligned with how the organization actually operates.
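The regression-harness idea can be sketched as plain SQL you schedule yourself. The trusted table name and tolerance here are placeholders, and the semantic-view output column names are assumptions; Autopilot's own monitoring is internal to Snowflake:

```sql
-- Compare a hand-maintained trusted number against the semantic view.
-- Any row returned flags drift between the two definitions.
WITH trusted AS (
    SELECT region, trusted_nrr
    FROM finance.trusted_nrr_by_region   -- placeholder trusted source
),
governed AS (
    SELECT *
    FROM SEMANTIC_VIEW(
        revenue_metrics
        METRICS net_revenue_retention
        DIMENSIONS accounts.region
    )
)
SELECT t.region, t.trusted_nrr, g.net_revenue_retention
FROM trusted t
JOIN governed g ON g.region = t.region
WHERE ABS(t.trusted_nrr - g.net_revenue_retention) > 0.001;
```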
The cost of creating a semantic model is no longer a reason to skip it.
Semantic views are where BI and AI converge
BI tools have always maintained their own semantic layers: Tableau has its data model, Looker has LookML, and Power BI has DAX measures. Every tool reinvents the same definitions independently. When the warehouse itself holds the semantic layer, that duplication collapses. You define "NRR" once in a semantic view, and every downstream tool reads it.
Open Semantic Interchange (OSI) is a standard designed to make semantic definitions portable across BI vendors and AI systems: define a metric once, and it renders correctly in every tool that adopts the standard (such as Tableau, Power BI and Sigma).
On the AI side, semantic views act as operational tools for agents. When you configure a Cortex Agent, you can attach semantic views as tools via Snowflake Cortex Analyst. The agent routes a natural-language question to the relevant semantic view, which provides the metric definitions and relationship graph needed to generate correct SQL.
Semantic models are the shared interface that lets BI dashboards and AI agents operate on the same definitions. As agentic analytics becomes a primary way people interact with data, having that shared interface in place is how you keep answers reliable.
If you want self-serve analytics that stays governed as AI usage scales, treat your semantic model as a first-class asset. The cost of not having a semantic model should worry you.

