Right-sizing for a cloud service

Guest post

This is a guest post from SQL Watchmen. Your data is valuable. You need a data platform that allows you to scale and grow. If your foundation is rocky, then your company is at risk of losing its most valuable asset.

Based in Austin, Texas, SQL Watchmen helps protect companies and organizations – and their data. Our DBAs make sure your database is running as efficiently as possible, and can advise on everything from performance tuning to custom SSRS reports to cloud migrations.

When it’s time to purchase new resources for one or more applications, the non-scientific approach of guesstimating the requirements can be expensive, whether for on-premises capital expenditures or cloud-based services.

For stakeholders deciding how to provision data services, is it better to start small and grow the resources as needed? Certainly the elasticity of modern cloud services like Azure SQL Database makes growing into the application an easy decision. You can expand resources or move to a higher service tier with a single mouse click, or even automatically as the workload increases.

But for SQL applications that are destined for residency on a virtual machine (VM), it’s not as straightforward. Over-provisioning RAM and CPUs is costly, but it is often the first choice because it reassures business owners that more hardware will make their applications faster.

Unfortunately, this approach often leads to a lot of very big SQL Servers doing very little work. In time, when things start slowing down, which they will as data piles on, the first inclination is to add even more costly resources.

Reducing your footprint

The best approach to right-sizing SQL Server, especially if you’re moving from an on-premises server to a VM in a cloud service, is not guesstimating but interrogating.

Knowing the workload on a SQL Server instance doesn’t just mean knowing how much CPU or RAM is being used, or even how much physical space is required to house the databases. It means knowing which queries are running on your server and what they’re waiting on, and understanding how to address any issues in the queries themselves.

In other words, can that same query run faster on a smaller server? If you focus your efforts on poorly performing code, fewer resources will be required. Generally, this is a shared responsibility of the DBA and the developer.
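A quick first look at what the instance spends its time waiting on can come straight from the server itself. Here is a minimal sketch using the built-in sys.dm_os_wait_stats DMV; the exclusion list is a small, illustrative sample of benign system waits, not an exhaustive one:

    -- Top waits accumulated since the last restart (or since the stats were cleared).
    SELECT TOP (10)
           wait_type,
           wait_time_ms / 1000.0 AS wait_time_sec,
           signal_wait_time_ms / 1000.0 AS signal_wait_sec,
           waiting_tasks_count
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT IN (N'SLEEP_TASK', N'LAZYWRITER_SLEEP',
                            N'CHECKPOINT_QUEUE', N'XE_TIMER_EVENT',
                            N'BROKER_TASK_STOP', N'REQUEST_FOR_DEADLOCK_SEARCH')
    ORDER BY wait_time_ms DESC;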

Who has the time?

Operational or administrative DBAs don’t always have the time to devote to the work needed to reduce an application’s resource footprint. This is partly because they’re busy fighting fires: keeping other applications up and running, safely backed up, and secure. That is their primary job, after all.

When it comes time to discuss migrating the application to a cloud service or consolidating to save on licensing costs, the effort of determining what resources will be needed often becomes another guesstimate based on CPU and RAM consumption. There is simply not enough time to gather the numbers.

What information should be gathered and analyzed?

There are specific steps that should be taken when analyzing a workload with the goal of making the application perform better, while at the same time reducing resource consumption:

  • Establish a baseline. There are several tools on the market that will make this easier, but a free and effective approach is to enable Performance Monitor on the server and capture the relevant counters (a sketch of the in-server equivalent follows this list).
  • Gather metrics from the server itself. SQL Server stores a wealth of data about query execution. Using Dynamic Management Views (DMVs), you can quickly identify the worst-performing queries, along with their wait statistics (see the query sketch after this list).
  • Determine whether there are missing, unused, or duplicate indexes. This information, too, can easily be extracted from the server itself (see the index sketches after this list).
  • Monitor for blocking. Application issues often arise from inefficient use of locks and transactions (a blocking check follows this list).
  • Look for tuning opportunities once you’ve identified the worst-performing code that consumes the most resources. This is the most time-consuming part of the analysis, but the one that will yield the best results.
  • Repeat after the new code is applied, and continue to review the baseline and query performance.
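For the baseline, many of the Performance Monitor counters worth tracking are also exposed inside SQL Server through the sys.dm_os_performance_counters DMV. A minimal sketch, assuming a default instance (named instances report object names with an MSSQL$InstanceName prefix, which is why LIKE is used here):

    -- Spot-check two commonly baselined counters from inside SQL Server.
    -- Note: per-second counters such as Batch Requests/sec are cumulative;
    -- sample twice and take the delta to compute an actual rate.
    SELECT [object_name], counter_name, cntr_value
    FROM sys.dm_os_performance_counters
    WHERE (counter_name = N'Page life expectancy'
           AND [object_name] LIKE N'%Buffer Manager%')
       OR (counter_name = N'Batch Requests/sec'
           AND [object_name] LIKE N'%SQL Statistics%');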
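To find the worst-performing queries, the plan cache can be interrogated directly. The sketch below uses the standard sys.dm_exec_query_stats idiom; the TOP count is an arbitrary starting point, and the numbers are cumulative since each plan entered the cache:

    -- Top 10 cached statements by total CPU time (total_worker_time is in microseconds).
    SELECT TOP (10)
           qs.total_worker_time / 1000.0 AS total_cpu_ms,
           qs.execution_count,
           qs.total_worker_time / 1000.0 / qs.execution_count AS avg_cpu_ms,
           SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
                     ((CASE qs.statement_end_offset
                           WHEN -1 THEN DATALENGTH(st.text)
                           ELSE qs.statement_end_offset
                       END - qs.statement_start_offset) / 2) + 1) AS statement_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_worker_time DESC;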
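Index diagnostics are just as accessible. The first sketch below lists the optimizer’s missing-index suggestions (treat them as candidates to evaluate, not commands to run blindly); the second finds indexes that are written to but never read. Both sets of statistics reset when the instance restarts, so judge them against a representative uptime:

    -- Missing-index suggestions, weighted by the optimizer's own estimates.
    SELECT TOP (10)
           migs.avg_user_impact,      -- estimated % improvement
           migs.user_seeks,           -- times the index would have been used
           mid.statement AS table_name,
           mid.equality_columns, mid.inequality_columns, mid.included_columns
    FROM sys.dm_db_missing_index_group_stats AS migs
    JOIN sys.dm_db_missing_index_groups AS mig
         ON migs.group_handle = mig.index_group_handle
    JOIN sys.dm_db_missing_index_details AS mid
         ON mig.index_handle = mid.index_handle
    ORDER BY migs.avg_user_impact * migs.user_seeks DESC;

    -- Indexes in the current database that are maintained but never read.
    SELECT OBJECT_NAME(ius.object_id) AS table_name,
           i.name AS index_name,
           ius.user_updates
    FROM sys.dm_db_index_usage_stats AS ius
    JOIN sys.indexes AS i
         ON ius.object_id = i.object_id AND ius.index_id = i.index_id
    WHERE ius.database_id = DB_ID()
      AND i.name IS NOT NULL   -- skip heaps
      AND ius.user_seeks + ius.user_scans + ius.user_lookups = 0
      AND ius.user_updates > 0;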
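For blocking, a quick look at currently executing requests will often catch the pattern in the act; a minimal sketch:

    -- Sessions that are currently blocked, and who is blocking them.
    SELECT r.session_id,
           r.blocking_session_id,
           r.wait_type,
           r.wait_time AS wait_time_ms,
           st.text AS current_statement
    FROM sys.dm_exec_requests AS r
    CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS st
    WHERE r.blocking_session_id <> 0;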

Focus on low-hanging fruit first

When the tasks above are performed, it’s not uncommon to find that the worst-performing queries are also the easiest to fix, whether by rewriting them or by adding missing indexes, and that fixing them has the greatest impact on performance. This is the low-hanging fruit.

You may find, for example, that a single CPU-intensive query is running 250 times a day and taking over a minute for each run. And it’s a stored procedure for a report!
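That’s roughly 250 runs × 60+ seconds, or more than four hours of CPU time every day, for one report. If the culprit is a stored procedure, its aggregate cost can be confirmed directly; a minimal sketch using sys.dm_exec_procedure_stats, with a hypothetical procedure name standing in for yours:

    -- Aggregate cost of one (hypothetical) procedure since its plan was cached.
    -- total_worker_time and total_elapsed_time are in microseconds.
    SELECT OBJECT_NAME(ps.object_id, ps.database_id) AS procedure_name,
           ps.execution_count,
           ps.total_worker_time / 1000000.0 AS total_cpu_sec,
           ps.total_elapsed_time / 1000.0 / ps.execution_count AS avg_elapsed_ms
    FROM sys.dm_exec_procedure_stats AS ps
    WHERE OBJECT_NAME(ps.object_id, ps.database_id) = N'rpt_DailySales';  -- hypothetical name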

Your goal is to ensure you’re not paying for resources you don’t need. Conversely, you may need more resources than you’re currently paying for. Either way, it’s helpful to know where to look to get the right information.

If you don’t have staff with the right skills to right-size for the cloud, or if their time is constrained, contact us to see how we can help you.