The Snowflake platform is architecturally different from most traditional database systems and cloud data warehouses: it separates compute from storage, and both layers are independently and highly elastic. To help you sustain high performance while optimizing resource usage, here are some best practices.

Snowflake's elasticity and its management

With Snowflake, you no longer have to worry about performing advanced resource planning, determining workload schedules, or preventing new workloads from being introduced to the system for fear of disk or CPU limitations. As a cloud data platform, Snowflake can scale rapidly to meet both planned and unexpected growth. This means that instead of paying for a fixed, limited amount of storage and compute, the amount of storage and compute grows and shrinks as your needs change over time.

Leveraging elasticity, a core cloud principle, compute can be scaled dynamically as concurrency needs or demand for raw compute power fluctuate throughout the day. Storage requirements for databases, tables, and metadata will likewise vary over time. All Snowflake account administrators should therefore apply some basic optimizations to their compute footprint and consider more advanced methods as that footprint grows. Because compute and storage are now independent and elastic, consumption, unexpected growth, and resource efficiency should all be monitored.

Snowflake accounts are virtually unlimited by default. Account administrators can implement guardrails to defend against rogue users or suboptimal resource and credit usage. For example, they can proactively control compute at the individual virtual warehouse or account level through resource monitors. Users, databases, tables, queries, and workloads can be monitored through the ACCOUNT_USAGE schema, which is shared with all Snowflake accounts.

With all of this in mind, here are nine best practices Snowflake account administrators should follow.

Best practices for optimizing your cloud resources

1. Enable auto-suspend

Make sure all virtual warehouses are set to auto-suspend. This way, when they finish processing queries, credit consumption will stop.
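
For example, here is a minimal Snowflake SQL sketch (the warehouse name my_wh is hypothetical); AUTO_SUSPEND is expressed in seconds:

    -- Suspend the warehouse after 60 seconds of inactivity
    ALTER WAREHOUSE my_wh SET AUTO_SUSPEND = 60;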

2. Enable auto-resume

Configure your virtual warehouses in Snowflake to resume automatically. If you're implementing automatic suspension and setting appropriate timeout limits, enabling automatic resuming is essential; otherwise, users won't be able to access the system.
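
A minimal sketch, again using the hypothetical warehouse my_wh:

    -- Wake the warehouse automatically when a new query arrives
    ALTER WAREHOUSE my_wh SET AUTO_RESUME = TRUE;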

3. Set appropriate auto-suspend timeouts for workloads

Every virtual warehouse should have an auto-suspend timeout suited to its workload (a sketch follows this list).

  • For data loading and ETL/ELT warehouses, set the warehouse to suspend immediately after its jobs complete.
  • For BI and SELECT query warehouses, set the suspension timeout to 10 minutes in most situations to keep data caches warm for frequent end-user access.
  • For DevOps, DataOps, and data science warehouses, set the suspension timeout to 5 minutes, because a warm cache is less important for ad hoc and highly unique queries.
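
As a sketch of the guidance above (all warehouse names are hypothetical):

    -- Data loading / ETL/ELT: suspend as soon as possible after jobs finish
    ALTER WAREHOUSE etl_wh SET AUTO_SUSPEND = 60;
    -- BI and SELECT queries: keep caches warm for about 10 minutes
    ALTER WAREHOUSE bi_wh SET AUTO_SUSPEND = 600;
    -- DevOps, DataOps, and data science: about 5 minutes
    ALTER WAREHOUSE ds_wh SET AUTO_SUSPEND = 300;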

4. Set timeouts for account statements

Use the STATEMENT_QUEUED_TIMEOUT_IN_SECONDS and STATEMENT_TIMEOUT_IN_SECONDS parameters to automatically stop queries that take too long to run, whether due to user error or a frozen cluster. These parameters can be set at the account, warehouse, session, and user levels; customize them according to your data strategy for long-running queries.
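
For example, assuming a hypothetical etl_wh warehouse and illustrative limits:

    -- Cancel queries that sit in the queue for more than 10 minutes
    ALTER WAREHOUSE etl_wh SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = 600;
    -- Cancel queries that run for more than 1 hour
    ALTER WAREHOUSE etl_wh SET STATEMENT_TIMEOUT_IN_SECONDS = 3600;
    -- The same parameters can also be set account-wide
    ALTER ACCOUNT SET STATEMENT_TIMEOUT_IN_SECONDS = 7200;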

5. Monitor warehouses that are approaching the cloud service billing threshold

In general, for an account and excluding serverless features, Snowflake charges for cloud services only when they exceed 10% of the daily credit consumption of your virtual warehouses. The cloud services layer handles metadata operations such as BI tool discovery queries, SHOW commands, caching, and other service optimization functions. So, if you use 100 compute credits in a day but also consume 15 cloud services credits, you will be charged 5 additional credits for exceeding the 10% allowance: 105 credits billed in total for that day, with the first 10 cloud services credits free.
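
One way to spot days approaching the threshold is to compare daily cloud services credits against compute credits; a sketch, assuming access to the shared SNOWFLAKE.ACCOUNT_USAGE.METERING_DAILY_HISTORY view:

    SELECT usage_date,
           SUM(credits_used_compute)        AS compute_credits,
           SUM(credits_used_cloud_services) AS cloud_services_credits,
           -- Days where this ratio exceeds 0.10 incur extra cloud services charges
           SUM(credits_used_cloud_services)
             / NULLIF(SUM(credits_used_compute), 0) AS cloud_services_ratio
    FROM snowflake.account_usage.metering_daily_history
    WHERE usage_date >= DATEADD(day, -30, CURRENT_DATE)
    GROUP BY usage_date
    ORDER BY cloud_services_ratio DESC;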

6. Drop unused tables

You may have unused tables that are candidates for dropping; just make sure no one is querying them. In fact, you may want to make a usage check mandatory before any table is dropped. Tables live within a specific database and schema, so be sure to run the check in each of your databases. Also watch for tables that are referenced only in view DDL, since dropping them will break those views.
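
One hedged way to build the usage check, assuming your edition exposes the ACCOUNT_USAGE.ACCESS_HISTORY view (Enterprise Edition or higher): list every table touched in the last 90 days, then treat tables absent from the list as drop candidates pending manual review.

    -- Tables referenced by any query in the last 90 days
    SELECT DISTINCT f.value:objectName::STRING AS accessed_table
    FROM snowflake.account_usage.access_history AS ah,
         LATERAL FLATTEN(input => ah.base_objects_accessed) AS f
    WHERE ah.query_start_time >= DATEADD(day, -90, CURRENT_TIMESTAMP())
      AND f.value:objectDomain::STRING = 'Table';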

7. Purge inactive users

It's a good idea to remove users who are inactive or have never logged into Snowflake from your account. Administrators can reactively monitor users, databases, tables, queries, and workloads through the ACCOUNT_USAGE schema shared with all accounts. This data is commonly used to forecast usage trends and to provide chargeback and showback billing for departments, teams, and workloads. Daily usage metrics are built into the platform for individual users, account administrators, and organization administrators.
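
A sketch of the inactivity check, using the shared SNOWFLAKE.ACCOUNT_USAGE.USERS view and an illustrative 90-day cutoff:

    -- Users who have never logged in, or not within the last 90 days
    SELECT name, created_on, last_success_login
    FROM snowflake.account_usage.users
    WHERE deleted_on IS NULL
      AND (last_success_login IS NULL
           OR last_success_login < DATEADD(day, -90, CURRENT_TIMESTAMP()));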

8. Look for warehouses that don't have resource monitors

Resource monitors are a great way to proactively control workload budgets and prevent unexpected resource spikes. They can help monitor both user and service-account usage in Snowflake. First, create dedicated virtual warehouses for each of your workloads: ELT, BI, reporting, data science, and so on. Accounts and warehouses can have total, yearly, monthly, weekly, and daily credit quotas.
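
To find warehouses with no monitor attached, you can scan the output of SHOW WAREHOUSES (in that output, the "resource_monitor" column reads 'null' when nothing is assigned):

    SHOW WAREHOUSES;
    -- Warehouses without a resource monitor
    SELECT "name", "size"
    FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
    WHERE "resource_monitor" = 'null';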

9. Apply resource monitors

You can apply a resource monitor through the user interface or SQL. Depending on your account notification settings, resource monitors can alert you when consumption reaches a lower threshold and then suspend the warehouse or account at a higher threshold.
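
A minimal SQL sketch (the monitor name, quota, and warehouse are hypothetical):

    -- Notify at 80% of the monthly quota, suspend at 100%
    CREATE RESOURCE MONITOR bi_monthly_quota
      WITH CREDIT_QUOTA = 100
           FREQUENCY = MONTHLY
           START_TIMESTAMP = IMMEDIATELY
           TRIGGERS ON 80 PERCENT DO NOTIFY
                    ON 100 PERCENT DO SUSPEND;

    ALTER WAREHOUSE bi_wh SET RESOURCE_MONITOR = bi_monthly_quota;

Note that DO SUSPEND lets in-flight queries finish before suspending; DO SUSPEND_IMMEDIATE cancels them as well.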

With Snowflake's highly elastic, per-second compute and billing model, account administrators must continuously monitor resource usage, growth, and efficiency to meet performance requirements and stay within budget. While Snowflake handles much of this optimization automatically, the best practices above will help keep your consumption predictable and your credits well spent.
