
Let’s Tune Our AWS Aurora PostgreSQL Database

Guest post

This is a guest post from Grant Fritchey.

Grant Fritchey works for Redgate Software as a DevOps Advocate. Grant has worked in IT for more than thirty years as a developer and a DBA. He has built systems for everyone from major enterprises to small boutique companies, including distributed systems. Grant writes articles on various data-related topics for SQL Server Central and SimpleTalk. He is the author of multiple books, including SQL Server Execution Plans and SQL Server Query Performance Tuning. Grant currently serves on the Board of Directors of PASS, a non-profit professional organization for data professionals. He develops and presents complete structured learning plans to teach Azure, AWS, and other data-related topics to developers and other IS personnel. Grant is a Microsoft Data Platform MVP and an AWS Community Builder.

In case you don't know the back story: in order for me to play with radios and label it "work," I've created a PostgreSQL database running on AWS Aurora. The database is fed from API calls to aprs.fi through Lambda functions on AWS. Some of the DDL & code is mine; some is from Claude. Neither of us paid much attention to indexing when we were putting things together. I threw my own personal best practices out the window and just went for a "make it work, worry about everything else (except security) later" style of coding.

Yeah, we're starting to pay for that little decision.

There isn't that much data in the database. It's only when I'm outside with the radio (Bridgecom Systems Maverick, the latest & greatest DMR handheld on the market), with it set to receive GPS signals and to transmit APRS signals, that anything can go into the database. Further, in order to avoid limits, and costs, I'm only calling for data every three minutes. As of this writing, there are only 400+ individual points stored in the system. Yet queries are already running a little slower. Nothing radical, but I can tell.

Time to spend some time and get a little tuning done.

A Query and an Explain Plan

I'm writing this series of blog posts for several reasons. The biggest is to show that, for the most part, running PostgreSQL on AWS Aurora just isn't all that different from running it yourself. While there are a whole slew of reasons for answering the question "Why PostgreSQL?", one of my favorites is pretty simple: extensions. There are a whole slew of extensions available, and it's not hard to create your own (you can make 'em out of just SQL). However, with my background in insurance, and more specifically working for a company that prided itself on engineering and testing as much as on any kind of actuarial calculation, we frequently fell back on that old real estate saw: location, location, location. When we're talking geographic information systems in PostgreSQL, we can lean hard on PostGIS to get the job done. In the first post I showed how easy it was to get PostGIS up and running inside Aurora. Now I'd like to show a little more functionality, specific to PostGIS, but also specifically running inside Aurora as if it were just on your own servers.

In my last post on this topic I showed how to set up Aurora to capture explain plans in PostgreSQL. Even with my rudimentary understanding of query tuning within PostgreSQL, there are issues, not uncommon in SQL Server either, with the view reports.locations_visited:

If you're a PostgreSQL expert, I'm pretty sure you're already spotting some of the issues. Heck, I can see one pretty quickly myself. Remember, I relied on the AI to generate quite a few of these queries (not all). This is one of them. Spot it yet? How about using DISTINCT and GROUP BY together? Won't aggregating the data, grouping it, already make for a pretty "distinct" set of data? OK, but we're guessing here, right? First, we can test pretty quickly: will the results be the same? Running this query with and without the DISTINCT, the same data comes back:
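The query itself appears in the original post as a screenshot, so here's a hedged sketch of the pattern in question. The table and column names other than reports.location_info are my guesses, not the actual view definition:

```sql
-- Illustrative only: DISTINCT layered on top of GROUP BY.
-- GROUP BY already returns one row per grouping key, so the
-- DISTINCT forces an extra, redundant de-duplication pass.
SELECT DISTINCT ll.city,
                COUNT(li.location_id) AS visits
FROM reports.lookup_locations AS ll          -- hypothetical table name
LEFT OUTER JOIN reports.location_info AS li
    ON ST_DWithin(ll.center_point, li.gis_point, ll.radius_km * 1000)
GROUP BY ll.city;
```

Dropping the DISTINCT from a query shaped like this returns the identical result set.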

But how do we see if there is any change in performance in AWS Aurora PostgreSQL? The same way as anywhere else in PostgreSQL. Sure, I could set up pg_stat_statements and query that. However, remember, to a very large extent, running Postgres on Aurora is the same as running it locally. I can easily see, and measure, the two versions of the query to see the differences between them, like this:
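The measurement itself is stock PostgreSQL, nothing Aurora-specific. Wrapping each version of the query in EXPLAIN (ANALYZE, BUFFERS) reports actual timing alongside buffer usage:

```sql
-- Executes the query for real and reports actual execution time
-- plus buffer activity; identical on Aurora and local PostgreSQL.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM reports.locations_visited;
```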

With the following output:

Sure, we can examine what's happening within the execution plan, and we will, but let's focus on just a few points first. We went from 262 buffers hit to 3. I don't know about you, but I'm already excited. Execution time went from 110ms to just shy of 18ms. An 84% reduction, or, if you will, about 6 times faster. However you want to frame it, just that small change in code made an enormous difference.

There's more. I'm only, ever, going to be showing a single call sign, so filtering for 'KC1KCE-8' is never going to eliminate rows from our data set, and I can safely remove that. Also, the clause 'll.City IS NOT NULL' effectively turns our OUTER JOIN into an INNER JOIN. Writing it as an INNER JOIN won't necessarily improve things, but why not make the code clearer? The result is this query, which, as it happens, shaved another millisecond off the execution time: about 17ms instead of 18ms:
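Since the final query is also a screenshot in the original post, here's a hedged sketch of the trimmed-down shape: the call sign filter gone, and the join written as an explicit INNER JOIN. As before, names other than reports.location_info are assumptions:

```sql
-- Illustrative shape only; the real view uses different names.
SELECT ll.city,
       COUNT(li.location_id) AS visits
FROM reports.lookup_locations AS ll          -- hypothetical table name
INNER JOIN reports.location_info AS li
    ON ST_DWithin(ll.center_point, li.gis_point, ll.radius_km * 1000)
WHERE ll.city IS NOT NULL
GROUP BY ll.city;
```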

Is that it?

Almost. One other point worth noting: there isn't currently an index on the reports.location_info table. Because of this, our JOIN criteria is scanning that table: 'Join Filter: st_dwithin(ll.center_point, li.gis_point, ((ll.radius_km * '1000'::numeric))::double precision, true)'. With only 400 or so rows, that's no big deal. As the data set grows, though, we're absolutely going to need an index. I'm going to go ahead and create one now:
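ST_DWithin can take advantage of a spatial index on the column it filters. A minimal sketch of the index creation, with the index name being my own choice and the column name taken from the join filter quoted above:

```sql
-- GiST is the standard index type for PostGIS spatial columns;
-- ST_DWithin can use it instead of scanning the whole table.
CREATE INDEX ix_location_info_gis_point
    ON reports.location_info
    USING GIST (gis_point);
```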

I said it's probably not a big deal for performance yet, but adding the index dropped the execution time another millisecond, even though the buffers hit went up to 28. Here's the new explain plan:

I'm still learning how to properly read these, so bear with my explanation. Instead of having to materialize data for the join and then do a sequential scan, the optimizer can now use the index. That means a few more reads, but overall better performance, especially as this scales in size.

I'm sure there are more tuning opportunities to be had, but I'm going to leave it at that for the moment.

Conclusion

I kind of keep harping on this, but it's an important point. If I were tuning this locally, with no involvement from Aurora, the steps, tools, and methods I'd use would be the same. That's honestly the key point. If you know, or are learning, PostgreSQL, but now have to migrate to Aurora, I've got good news: your knowledge is going to transfer, mostly. Sure, there are distinct differences. However, to a very large degree, this is still just PostgreSQL. Poor code choices, like the ones the AI made when it first created this query, will lead to poor performance on Aurora, same as locally. Getting good indexes in place will help performance, whether you're local or on Aurora.

If you've spotted any other potential improvements, let me know in the comments. Anything you'd like to see added to this APRS project, also let me know in the comments. I've got a few ideas for some additional changes, additions, and probably a fix or two (not everything is working correctly, see if you can spot it). That means I'll be back here with another post or two on this project.

 
