When Should You Employ Snowflake Materialized Views?

Comments 0

Share to social media

I’ve used both Snowflake views and materialized views with varying degrees of success. Both have their place and it’s on us to determine when to use them based on what they offer. This guide will help you in understanding how views and materialized views work in Snowflake and show you practical examples along with potential benefits.

The “What” Before the “When”

Before we dive into when to use materialized views, let’s quickly refresh our understanding. A materialized view in Snowflake is essentially a pre-computed result set of a query that’s stored and maintained automatically. Think of it as a cache that Snowflake manages for you, keeping it in sync with your source data.

Views vs. Materialized Views: Understanding the Difference

Let’s compare traditional views with materialized views:

Traditional views serve as stored queries that execute at runtime. Their characteristics make them ideal for certain scenarios:

  • Computed at query time
  • Always reflect current data
  • No additional storage cost
  • Consume compute resources with each query
  • Best for: Simple transformations, current data requirements, frequently changing data

On the other hand, materialized views basically are turned into a table that is refreshed anytime data is changed. Here is an overview of their attributes:

  • Pre-computed and stored
  • Automatically maintained by Snowflake
  • Require additional storage
  • Minimal compute at query time
  • Best for: Complex computations, frequent access, relatively stable data

Understanding Real-World Query Patterns

Let’s examine a typical business intelligence scenario that many organizations face. Consider a sales analysis system that needs to provide:

  • Daily sales summaries by region
  • Customer purchase patterns
  • Product category performance
  • Year-over-year comparisons

Here’s an example query that many organizations might run frequently:

This query performs several resource-intensive operations:

  • Joining three tables (sales, products, customers)
  • Filtering for the last year’s data
  • Grouping by month, product category, and region
  • Performing multiple aggregations (SUM, COUNT DISTINCT, AVG)

Comparative Performance Analysis: Views vs. Materialized Views

When comparing regular views and materialized views in Snowflake, several key performance metrics reveal distinct operational characteristics. Let’s examine their performance profiles using a real-world business intelligence scenario.

Performance profile

A regular view in Snowflake executes its underlying query logic with each access. For our sample sales analysis query on a medium-sized dataset (about 100 million sales records, 1 million customers, and 10,000 products over multiple years), this means consistently processing raw data through multiple steps. Each execution requires 45-60 seconds of processing time as Snowflake reads from all three source tables, executes join operations, and calculates aggregations from scratch.

This processing consumes approximately 0.5 compute credits per execution due to the repeated full-scale data processing requirements. However, regular views require no additional storage space as they only maintain the query definition. On the other hand, materialized views present a different performance model. The initial creation requires a one-time cost of approximately 60 seconds, comparable to a single execution of the regular view. Subsequent query execution times drop dramatically to 1-2 seconds, representing a 95% improvement over the regular view. This efficiency stems from accessing pre-computed results rather than processing raw data.

One thing that is an interesting trade off is computing costs. While you do have to complete the results, once you are using the materialized views in queries the amount of compute may be quite a bit lower. The compute credit consumption reflects this efficiency, using only 0.02 credits per query—a 96% reduction compared to the regular view. This significant reduction occurs because the materialized view eliminates the need for repeated raw data processing and join operations.

Materialized views introduce two additional operational considerations. First, they require periodic refresh operations, taking a few seconds to update when source data changes. Snowflake optimizes these refreshes through incremental maintenance, processing only changed records. Second, they incur storage costs to store the pre-computed results and aggregated data.

Real-World Performance Impact

Consider a scenario where this report runs multiple times daily:

  • Core business hours: 8 AM to 6 PM
  • Updates every 15 minutes
  • Average of 50 concurrent users during peak hours
  • Each user runs 5-10 different report variations

Regular View Operation Costs:

  • Daily processing time: 83 minutes (5,000 seconds)
  • Monthly credit consumption: 1,500 credits
  • Monthly operational cost: $6,000 (at $4 per credit)
  • Storage cost: None

Materialized View Operation Costs:

  • Daily processing time: 2.5 minutes (150 seconds)
  • Monthly credit consumption: 132 credits (including refresh operations)
  • Monthly operational cost: $528 (at $4 per credit)
  • Storage cost: $40 monthly for 100MB
  • Total monthly cost: $568

The materialized view implementation delivers a net monthly savings of $5,432, representing a 90% reduction in operational costs.

Optimal Implementation Scenarios

The materialized view pattern demonstrates maximum effectiveness when a few different things occur. when the query frequency exceeds update frequency of the underlying data, allowing the benefits of pre-computation to outweigh refresh overhead. Implementation supports consistent reporting requirements, such as daily dashboards and regular reports. The queries involve complex calculations that benefit from reuse rather than recalculation. The source datasets are large, but the aggregated results maintain a significantly smaller footprint.

This performance analysis demonstrates that while both view types serve important purposes, materialized views offer substantial performance and cost benefits for frequently accessed, computation-heavy queries where data freshness requirements align with scheduled refresh intervals.

Data-Driven Decision Making: Let Usage Guide You

One of the smartest ways to decide whether you need a materialized view is to look at your query patterns. Snowflake makes this incredibly easy with its SNOWHOUSE.ACCOUNT_USAGE.ACCESS_HISTORY view. However, it’s important to understand how Snowflake tracks and matches queries before diving into the analysis. Snowflake’s ACCESS_HISTORY view tracks queries based on their exact text representation. Here are key points to understand:

  • Exact Matching: Queries must match exactly to be considered the same, including:
    • Spacing and formatting
    • Case sensitivity
    • Literal values in WHERE clauses
  • Common Variations That Create Different Entries:
    • Different WHERE clause parameters
    • Changed date ranges
    • Modified LIMIT values
    • Altered ORDER BY clauses

For example, these queries would be considered different:

To identify truly repetitive patterns, you might need to:

  • Look for similar query structures
  • Ignore specific parameter values
  • Consider parameterizing frequently used queries

Analyzing Query Patterns

Here’s a query that will show you the number of times a query has been executed and some additional statistics that will help you to see at a high level how much the query has been used and what it is likely costing you:

Interpreting Usage Metrics

When analyzing results, look for certain patterns. While your needs certainly may be different, here is what I do as a starting point:

  • High Frequency Queries (>100 executions per month)
  • Significant Execution Time (>10 seconds average)
  • Substantial Credit Usage (>0.1 credits per execution)
  • Large Data Scans (>1GB per query)

Real-world benchmark from a customer case, where they are implementing a Daily Sales Dashboard.

Before implementing materialized views for commonly reused queries:

  • Executions: 2,880/month (every 15 minutes)
  • Avg Time: 45 seconds
  • Credits: 1,440 credits/month
  • Cost: $5,760/month

After identifying and implementing the materialized views:

  • Executions: 2,880/month
  • Avg Time: 2 seconds
  • Credits: 57.6 credits/month
  • Cost: $230.4/month
  • MV Refresh: 96 credits/month
  • Storage: $20/month
  • Net Monthly Savings: $5,413.60

The Sweet Spot: When to Pull the Trigger

In this section, I will discuss the sweet spot to realize that you need to use materialized views versus regular views. The sweet spot for materialized views occurs when you have complex queries that are executed frequently but don’t require real-time data updates. Let’s explore two common scenarios where materialized views shine.

Complex Aggregations with Moderate Updates

Ideal scenarios for materialized views include complex aggregations that combine data from multiple tables and perform heavy computations are perfect candidates for materialization, especially when the underlying data updates are predictable and moderate. Consider a typical business intelligence scenario where you need daily sales summaries that combine transaction data with customer and product information.

This type of analysis typically involves:

  • Joining multiple large tables (sales, customers, products)
  • Performing multiple aggregations (sums, counts, averages)
  • Grouping by multiple dimensions
  • Processing millions of records
  • Running the same calculations repeatedly throughout the day

Here’s an example materialized view for such a scenario:

In a typical enterprise environment, this view might process:

  • Sales transactions spanning multiple years
  • Millions of customer records
  • Thousands of products across multiple categories
  • Data that updates daily but not in real-time

The performance impact is substantial when dealing with this scale of data:

  • Direct queries would need to process and join all historical data each time
  • Materialized views maintain pre-computed results, only processing new or changed records
  • Queries against the materialized view access pre-aggregated data, dramatically reducing processing time

Join-Heavy Queries

Another sweet spot for materialized views is when you need to combine data from multiple tables to create comprehensive views of your data, often called “360-degree views.” These views typically bring together information from across your data warehouse but don’t require real-time updates.

Consider this example that creates a customer-centric view combining orders, products, and returns data:

This type of view typically processes:

  • Multiple years of order history
  • Complex relationships across multiple tables
  • Large volumes of transaction data
  • Information that updates daily or weekly

The benefits of materialization in this scenario include:

  • Eliminating the need to repeatedly join large tables
  • Maintaining pre-computed aggregates across multiple dimensions
  • Providing fast access to commonly needed customer metrics
  • Reducing the compute resources needed for frequent reporting

In both scenarios, the materialized views provide significant benefits because:

  • The underlying data changes are predictable and less frequent than query access
  • The computations are complex and resource-intensive
  • The same data is queried frequently by multiple users or applications
  • The storage overhead of the materialized results is justified by the performance gains
  • Real-time data isn’t a requirement

When to Think Twice

While materialized views offer significant benefits in many scenarios, there are specific situations where they might not be the optimal solution. Let’s explore these cases to help you make informed decisions about when to use alternative approaches.

Highly Volatile Data

One of the primary scenarios where materialized views may not be ideal is when dealing with rapidly changing data. Consider real-time trading systems as an example. In such environments, data can change thousands of times per second, and maintaining materialized views becomes more costly than querying the base tables directly.

To determine if your data is too volatile for materialized views, you can analyze the refresh patterns using this query:

This query helps you understand:

  • How frequently your materialized views are refreshing
  • The average time spent on refreshes
  • The volume of data being updated
  • The pattern of updates (large batches vs. small frequent updates)

Consider avoiding materialized views when you see patterns like:

  • Multiple refreshes per minute
  • High average refresh times relative to query times
  • Large portions of the dataset being updated in each refresh
  • Consistent update patterns throughout the day

For example, in a trading system scenario:

  • Market data updates could occur thousands of times per second
  • Position calculations might change hundreds of times per minute
  • Risk calculations could update dozens of times per hour

In these cases, consider alternatives such as:

  • Regular views for real-time access to current data
  • Streaming solutions for real-time processing
  • Time-windowed materialized views for historical analysis
  • Hybrid approaches using both views and materialized views for different time periods

Storage Cost Analysis:

Understanding storage costs is crucial when implementing materialized views in Snowflake. Let’s explore how to analyze these costs and interpret the results to make informed decisions about materialized view implementations.

Analyzing Storage Usage

To effectively monitor and analyze storage costs for your materialized views, you can use the following query:

Example result set

This query and result set provides several key metrics that help you understand your storage utilization:

  • The name and type of each materialized view
  • The number of rows stored
  • The size in gigabytes
  • The estimated monthly storage cost in dollars

Understanding Storage Metrics and Cost Calculations

In this section we will look at some of the ways you can calculate and predict costs of storage.

Storage Size Calculation

The fundamental storage calculation in Snowflake converts raw byte measurements into more manageable gigabyte values using the expression bytes/POW(1024,3). (Note the POW function raises the number in the first parameter to the power in the second parameter.) This conversion follows the standard binary progression of digital storage units. Starting with the basic unit of a byte, we progress through kilobytes, megabytes, and finally to gigabytes. Each step up multiplies by 1024: a kilobyte contains 1024 bytes, a megabyte encompasses 1024 kilobytes (or 1024² bytes), and a gigabyte consists of 1024 megabytes (or 1024³ bytes).

To illustrate this conversion, consider a materialized view consuming 1073741824 bytes of storage. When we apply our conversion formula (1073741824/POW(1024,3)), the result simplifies to exactly 1.0 GB, providing a more intuitive measure of the storage footprint.

Cost Calculation Framework

Snowflake’s storage cost calculation builds upon this size conversion by incorporating the standard rate of $40 per terabyte per month. The formula bytes/POW(1024,3) * 40 provides a direct monthly cost estimate in dollars. This calculation takes the size in gigabytes and applies the proportional cost based on Snowflake’s terabyte pricing.

For practical application, consider a materialized view occupying 100GB of storage. The monthly cost calculation would be: 100 GB * ($40/1024) ≈ $3.91. This demonstrates how even relatively large, materialized views can maintain reasonable storage costs while providing significant performance benefits.

This cost framework helps data engineers and architects make informed decisions about materialized view implementations by providing clear visibility into the storage cost implications of their design choices. Understanding these metrics becomes particularly valuable when evaluating the trade-offs between query performance improvements and associated storage costs.

Monitoring and Maintenance

Effective monitoring of materialized views is crucial for maintaining optimal performance and cost efficiency. This section covers key monitoring strategies and maintenance practices to ensure your materialized views continue to provide value over time.

Performance MonitoringUnderstanding Refresh Operations

One of the most critical aspects to monitor is how your materialized views handle refresh operations. Refreshes occur when the underlying data changes, and understanding their patterns helps optimize view performance. Here’s a comprehensive monitoring query that tracks refresh operations:

This query helps you understand:

  • How frequently each view refreshes
  • The average time taken for refreshes
  • The volume of data being updated
  • Patterns in refresh timing
  • Overall impact on system resources

Key Monitoring Metrics

When monitoring materialized views, focus on these essential metrics:

Refresh Operations

  • Frequency: Track how often each view refreshes
  • Duration: Monitor the time taken for refresh operations
  • Data Volume: Measure the amount of data being processed
  • Success Rate: Track failed refreshes and their causes

Performance Impact

  • Query Response Time: Compare query performance before and after materialization
  • Resource Usage: Monitor compute credits consumed during refreshes
  • Concurrent Operations: Track impact on other warehouse operations

Storage Utilization

  • Size Growth: Monitor how view size changes over time
  • Storage Costs: Track monthly storage expenses
  • Data Distribution: Analyze clustering effectiveness

Conclusion

Materialized views in Snowflake can be incredibly powerful when used correctly. Based on the performance metrics and benchmarks shown above, they can provide significant performance improvements, reduction in cost and greater Return On Investment (ROI). The key is to let your usage patterns guide you – look at your query history, identify repetitive complex queries, and measure the impact. Start small, monitor the results, and scale up based on real performance improvements. Remember, the goal isn’t to create materialized views for everything, but to strategically use them where they’ll make the biggest impact on your performance and costs. Happy materializing!

Load comments

About the author

Anil Kumar Moka

See Profile

Anil Kumar Moka is a Polyglot Data Software Engineer and Visionary Technical Leader at a leading U.S. bank, bringing 17 years of expertise in data engineering, data security, software engineering and cloud solutions. His innovative enterprise-wide data remediation platforms safeguard over 100 million customers while achieving 95% faster processing times and multi-million dollar cost savings. An AWS Certified Solutions Architect and published researcher, Anil regularly authors influential white papers and technical articles on cloud computing and data engineering. His leadership in designing scalable solutions and driving technological excellence has established him as a thought leader in the industry. Anil holds a Bachelor of Technology in Electrical and Electronics Engineering, complemented by multiple professional certifications that underscore his commitment to technical mastery.