The challenges of monitoring a highly complex database estate at the University of the Sunshine Coast

As the manager for enterprise applications and data at the University of the Sunshine Coast in Queensland, Australia, I face a lot of unique challenges. The university itself has around 25,000 students, 1,000 permanent staff and another 1,000 seasonal staff who assist with key academic sessions. They’re spread out across the flagship campus at Sippy Downs and a number of satellite campuses and research and teaching facilities in other locations.

As you might imagine, the IT environment that supports such a big organization is very complex, with dozens of servers and hundreds of database instances, mostly on-premises, with a small number in the cloud, along with dozens of different applications that access them.

It’s also a little different to the environment seen in commercial organizations. Our customers are the staff and students, and core identity information about them, along with degree subjects and timetables, is used by a number of different systems like HR payroll and student administration.

As with any IT team, our main tasks are the day-to-day monitoring of activities, backups, creating indexes and performance tuning, maintaining security updates, and upgrading the SQL layer of the application stack when the applications themselves are upgraded. This is an ongoing effort and, with four versions of SQL Server in use, the team also need to apply Microsoft patches to the operating system and the SQL layer on a monthly basis.

That would be a big enough job for any team in any sector, but we also have the added challenge of coping with enrolments twice a year. That’s when our systems come under huge pressure, with thousands and thousands of students choosing their classes and booking their favored timetable slots at the same time. If they have a bad enrolment experience, that’s what they remember, however well we’ve performed for the rest of the year.

The crucial role day-to-day monitoring plays

To provide a clear picture of everything that’s happening across the IT infrastructure, multiple layers of monitoring are in place. A suite of monitoring applications and processes provides a constant, rolling visual feed on monitors so that we can spot problems as soon as they arise.

While a database monitoring solution was in use, its features were limited, the user interface was outdated, and it wasn’t particularly easy to use. That was a problem for us because, in many ways, maintaining our servers is like tuning a Formula One car: you’re continually tinkering with them so that they perform at their optimum, and you need the right data and insights from your monitoring solution.

We reviewed a number of alternative solutions and decided to trial SQL Monitor, mostly because of its web-based graphical user interface, which would allow us to view our entire estate on one screen from any location.

We soon found out that it also gave us access to a wealth of information about our SQL Server instances, availability groups, clusters, and virtual machines. We could view the patch status of all of our SQL Servers, for example, which would help with our monthly patching requirement. We could also see the status of SQL Agent jobs, find and fix slow queries, set up pre-configured and customizable alerts, and view and track disk usage.

The real winner, though, was being able to pinpoint the causes of performance issues. This is the feature we now use regularly because we can identify what we need to focus our efforts on when it comes to resolving problems and tuning.

The bigger role monitoring plays at peak periods

The point at which SQL Monitor really came into its own was at the next of the twice-yearly enrolments I mentioned earlier. To handle the rush of thousands of students, we split it across three days. At 8am on a Monday morning, the first wave hits the system really hard and then eases off after just 20 minutes if everything goes well. This is repeated on the Wednesday and Friday of that week and, to cope with the surge, we do both proactive and reactive monitoring.

We prepare as best we can beforehand and, if something unexpected comes up during the first enrolment day, like an indexing issue, we have one day to fix it. SQL Monitor has a great way of visualizing activity on its dashboard, which is much quicker than trying to make sense of raw numbers buzzing across the screen. Using it, we were able to drill down to the cause of problems that came up, analyze them, find solutions and implement them before the next wave.

The real value of a monitoring tool is when people don’t know it’s there

SQL Monitor has now taken its natural place among the monitoring systems at the University of the Sunshine Coast. It’s particularly useful at very busy times like enrolment, but it’s also become invaluable in handling the day-to-day operational issues. It saves the team time when monitoring how the databases are performing, and takes care of the many administrative tasks that are necessary to manage the entire server estate at USC.

Importantly, the estate-wide overview and alerts it offers on a single screen fit in with the way we work. We like to know the moment a problem arises so that we can get to work resolving it before the helpdesk phone calls start coming in. For us, that’s the sign of an IT team doing its job well.

If you’d like to see how Redgate’s SQL Monitor can help you monitor large, mixed estates more effectively, you can download a fully functional 14-day free trial, or try our live online demo environment.

If you’d like to know more about how to optimize the performance and ensure the availability of your databases and servers, visit our monitoring resources page.

 
