{"id":96753,"date":"2023-05-12T00:00:06","date_gmt":"2023-05-12T00:00:06","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=96753"},"modified":"2024-03-07T11:52:55","modified_gmt":"2024-03-07T11:52:55","slug":"uncovering-the-mysteries-of-postgresql-auto-vacuum","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/postgresql\/uncovering-the-mysteries-of-postgresql-auto-vacuum\/","title":{"rendered":"Uncovering the mysteries of PostgreSQL (auto) vacuum"},"content":{"rendered":"<p>Welcome to the second blog of the \u201cmagic of parameters\u201d series. In the first entry, I covered <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/postgresql\/are-postgresql-memory-parameters-magic\/\">memory parameters<\/a>, and in this article. In this article will talk about PostgreSQL configuration parameters which manage the (auto)vacuum and (auto)analyze background processes.<\/p>\n<h2>Why vacuuming is necessary?<\/h2>\n<p>Before we start talking about vacuum and analyze-related parameters, we need to touch on the concept of vacuuming in PostgreSQL. This concept is Postgres-specific, and for DBAs coming from Oracle and Microsoft SQL Server it might feel confusing \u2013 you can\u2019t directly map it to any previous experiences. (Note: both Oracle SQL Server do share some similarities in some of their configurations. For example, SQL Server memory optimized tables have a similar process referred to as <a href=\"https:\/\/learn.microsoft.com\/en-us\/sql\/relational-databases\/in-memory-oltp\/in-memory-oltp-garbage-collection\">garbage collection<\/a>.)<\/p>\n<p>Let\u2019s start by looking at how data is stored in PostgreSQL database. The generic structure of a block is shown in Figure 1.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-96754\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/05\/word-image-96753-1.png\" width=\"528\" height=\"589\" \/><\/p>\n<p><strong>Figure 1<\/strong>. <em>Generic block structure<\/em><\/p>\n<p>Table rows are stored in a <em>heap<\/em>, and their order is not guaranteed. In Figure 1, you can see free space in the block, which can be filled with new rows of data. If your previous DBA experience is with a lock-based RDBMS engine, you would think that we should set up a fillfactor less than 100% for each table so that there will be some room to maintain most updated records in the same block.<\/p>\n<p>Be ready for a surprise! Although you can define a <code>fillfactor<\/code> &lt;100%, this option is rarely used in PostgreSQL. The reason is that PostgreSQL <strong>never<\/strong> updates a row in place due to how it implements concurrency controls!<\/p>\n<h3>Multi-version concurrency control<\/h3>\n<p>To allow multiple users to access data concurrently and to avoid waits when data is being updated, PostgreSQL uses multi-version concurrency control (MVCC). It is implemented using Snapshot Isolation (SI): each SQL statement sees a snapshot of data (a database version) as it was when the transaction started, regardless of the current state of the underlying data. This approach provides multiple advantages:<\/p>\n<ul>\n<li>It prevents statements from viewing inconsistent data due to table modifications made by other transactions<\/li>\n<li>It provides transaction isolation for each session<\/li>\n<\/ul>\n<p>The modified tuples are saved in a new place, within the same block or in a different block, but if some transactions still are still actively accessing the old versions of the modified tuples, these tuples are kept \u201calive.\u201d<\/p>\n<p>How does Postgres know which versions should be kept alive, and which can be recycled? Each table has several \u201chidden\u201d (system) attributes which you can\u2019t see when you execute <code>SELECT * FROM &lt;table&gt;<\/code>. Two of these hidden attributes are <code>xmin<\/code>, which contains the <code>ID<\/code> of the transaction which created the row, and <code>xmax<\/code>, stores the <code>ID<\/code> of the transaction which deleted the row (either by updating or by deleting).<\/p>\n<p>Now, for any processes transaction <code>ID<\/code> that is between greater or equal to <code>xmin<\/code> and less than <code>xmax<\/code>, this row is active and should be visible to that transaction. PostgreSQL marks a tuple dead if there are no more active transactions that the row could be visible to. Figure 2 presents a block layout with a dead tuple.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-96755\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2023\/05\/shape-description-automatically-generated-with-me.png\" alt=\"Shape\n\nDescription automatically generated with medium confidence\" width=\"480\" height=\"562\" \/><\/p>\n<p><strong>Figure 2<\/strong>. <em>Block layout with dead tuple.<\/em><\/p>\n<p>What is the effect of MVCC on the database performance? On the one hand, the absence of locking facilitates performance. On the other hand, having multiple versions and thereby dead tuples creates both <em>table and index bloat<\/em>.<\/p>\n<h3>What\u2019s VACUUM for?<\/h3>\n<p><code>VACUUM<\/code> checks all the blocks and marks the old tuples \u201cdead.\u201d It\u2019s important to remember that <code>VACUUM<\/code> does not rewrite the block and does not \u201ccompress\u201d the data. It simply marks the space occupied by dead tuples as \u201creusable.\u201d You can compress data and return the unused space back to the operating system by running <code>VACUUM FULL<\/code>.<\/p>\n<h3>What else does VACUUM do?<\/h3>\n<p>By this time, it might already feel like \u201ctoo much information\u201d, especially if that\u2019s the first time you are exposed to the concept of vacuuming in Postgres. But before we proceed further, let\u2019s note that the job of <code>VACUUM<\/code> is not limited to reclaiming space. In addition, it:<\/p>\n<ul>\n<li>Updates data statistics used by the\u00a0PostgreSQL\u00a0query planner.<\/li>\n<li>Updates the visibility map, marking the blocks with no dead tuples<\/li>\n<li>Protects against loss of very old data due to\u00a0<em>transaction ID wraparound<\/em>.<\/li>\n<\/ul>\n<p>If your job is to manage Postgres instances, you might want to do <a href=\"https:\/\/www.postgresql.org\/docs\/current\/routine-vacuuming.html\">more reading on the topic of vacuuming<\/a>, beyond what we will cover, but for the start, let\u2019s just keep in mind there are many things vacuum does.<\/p>\n<h3>What\u2019s Autovacuum for?<\/h3>\n<p>Autovacuum is a daemon which periodically invokes vacuuming of tables and indexes. Autovacuum does not require any scheduling, instead, its behavior is driven by several system parameters, which will be described later in this article.<\/p>\n<h2>Myths about the vacuuming process<\/h2>\n<p>Since the concept of vacuuming is not as common in other RDBMS and is not all that we;; understood by PostgreSQL newbies (and some experienced people too!), it is a source for multiple common myths.<\/p>\n<p>In this section I will do what I can to dispel these myths.<\/p>\n<h3>Myth #1. The vacuum process makes everything run slower<\/h3>\n<p>The reality. <code>VACUUM<\/code> (and by extension, <code>autovacuum<\/code>) is obviously not cost-free, but the cost is far less than if it was not executed. When rows are being modified, dead tuples begin to accumulate and everything can become slow when the <code>autovacuum<\/code> <strong>does not<\/strong> run regularly.<\/p>\n<p>Tables become bloated (more details on table bloat later in the article), which makes sequential scans slower. Visibility map is not updated, which prevents the usage of index-only scan (PostgreSQL still needs to check the heap to make sure that index does not point to any dead tuples).<\/p>\n<h3>Myth #2. The vacuum process blocks other operations<\/h3>\n<p>The reality. The <code>VACUUM<\/code> process is aborted if it is blocking any write operation. The wait time is determined by the <code>deadlock_timeout<\/code> parameter. In fact, on a busy table, it may be beneficial to run vacuum often (ideally using <code>autovacuum<\/code>), because otherwise, it might not have a chance to finish for days and weeks, because it is blocked by write operations.<\/p>\n<h3>Myth #3. On busy mission-critical databases, it\u2019s a good idea to disable autovacuum and run scheduled vacuum jobs during the quiet time.<\/h3>\n<p>The reality. \u201cBusy systems\u201d (in terms of modifying data) may be busy most of the day, not just certain times of the day, with very few exceptions. If you disable autovacuum and run vacuum on a schedule, such as during \u201cquiet time,\u201d vacuum will end up with more job to do (if the system is \u201cbusy\u201d there could easily be a lot of dead tuples by that time!). Now, you might end up blocking the tables for longer periods of time, so you need to make sure that there are no write operations for the extended period of time.<\/p>\n<p>In addition, even if the system behavior is predictable in general, there are chance of burst updates, and you may end up with more bloat than expected, with even more severe consequences.<\/p>\n<h3>Myth #4. If you use other ways to control bloat, like <a href=\"https:\/\/www.cybertec-postgresql.com\/en\/products\/pg_squeeze\/\">pg_squeeze<\/a>, you do not need to run vacuum at all.<\/h3>\n<p>The reality. Recall that reclaiming the space is only one of several functions performed by <code>VACUUM<\/code>. You might be able to reclaim the space, but other vacuum functions won\u2019t be performed, most importantly, old records won&#8217;t be frozen. Once again if we are talking about \u201cbusy systems\u201d the risk of <code>TXID<\/code> wraparound is higher.<\/p>\n<h3>Myth #5. You need to monitor the autovacuum runs and make sure that all tables are vacuumed at least daily<\/h3>\n<p>The reality. On the tables with few writes autovacuum might run once a week or even less frequent. The parameters explained in the next section determine how often vacuuming will be performed on specific tables.<\/p>\n<p>It is possible to view the statistics on updates for each table and to set up some monitoring based on this information, but that would effectively mean replicating the logic of <code>autovacuum<\/code>. More important is to monitor system performance and tables bloat<\/p>\n<h2>The most important parameters that govern the vacuum process<\/h2>\n<p>There are many vacuum-related parameters which allow very precise vacuum tuning. However, in practice, it is often enough to set up correctly just a handful of them. Most of them control autovacuum, to help tune the automatic execution of the vacuum process.<\/p>\n<ul>\n<li><code><strong>autovacuum_vacuum_cost_delay<\/strong><\/code>: <em>amount of time the process will sleep after the max cost exceeded <\/em>\n<ul>\n<li>The default value is 20 ms, which is very conservative and may result in vacuuming not keeping up with changes. Start with reducing it to 10ms, and if necessary, you can go as low as 2 ms.<\/li>\n<li>Note that this parameter is different from the naptime (see below)<\/li>\n<\/ul>\n<\/li>\n<li><code><strong>autovacuum_max_workers<\/strong><\/code>: <em>max parallel workers (across server) which are invoked for each autovacuum invocation.<\/em>\n<ul>\n<li>Most often, this parameter is set to half of the total number of parallel workers defined for the instance, however, it is often beneficial to increase this number even more.<\/li>\n<\/ul>\n<\/li>\n<li><code><strong>autovacuum_naptime<\/strong><\/code>: <em>minimum delay between autovacuum runs on any given database<\/em>.\n<ul>\n<li>Each time the <code>autovacuum<\/code> daemon starts, it examines the database and issues <code>VACUUM<\/code> and <code>ANALYZE<\/code> commands as needed for tables in that database. Since this setting determines the wake-up time per database, an <code>autovacuum<\/code> worker process will begin as frequently as <code>autovacuum_naptime \/ number of databases<\/code>. For example, if <code>autovacuum_naptime = 1 min<\/code> and we have five databases, an <code>autovacuum<\/code> worker process would be started every twenty seconds.<\/li>\n<li>The default value for this parameter is 1 min, however, on busy databases with many writes it can be beneficial to increase its value to prevent autovacuum waking up too often. With this parameter, like with many others, there is a trade-off between \u201ctoo often\u201d and \u201ctoo much work on each invocation\u201d.<\/li>\n<\/ul>\n<\/li>\n<li><code><strong>autovacuum_vacuum_scale_factor<\/strong><\/code>: <em>percentage of changes to the table after which a vacuum should run<\/em>\n<ul>\n<li>The default value for this parameter is 0.2; for larger tables, should be reduced to 0.05 (and consider this for all tables for that matter)<\/li>\n<\/ul>\n<\/li>\n<li><code><strong>autovacuum_analyze_scale_factor<\/strong><\/code>: <em>percentage of changes to the table after which an analyze should run<\/em><br \/>\nThe default value is 0.1; for larger tables, should be reduced to 0.05 (and consider for all tables here as well) respectively.<\/p>\n<ul>\n<li>Note: The default for the <code>autovacuum_vacuum_scale_factor<\/code> is 0.2 (20%) and <code>autovacuum_analyze_scale_factor<\/code> is 0.1 (10%). While the default values perform acceptably for tables of a modest size (up to around 500MB), for larger tables these values are usually too high.<\/li>\n<\/ul>\n<\/li>\n<li><strong>autovacuum_vacuum_cost_limit<\/strong>: The default value of -1 for <code>autovacuum_vacuum_cost_limit<\/code> results in <code>autovacuum_vacuum_cost_limit = vacuum_cost_limit<\/code>. However, this value is distributed proportionally among the running autovacuum workers. This is done in order that the sum of the limits of each worker never exceeds the limit on this variable. Therefore, the default value of 200 for <code>vacuum_cost_limit<\/code> is generally too low for a busy database server with multiple <code>autovacuum<\/code> workers.should be set to -1<\/li>\n<li><strong>vacuum_cost_limit:<\/strong> <em>cumulative cost after which the vacuum should stop, should be set to<\/em> 200 X number of workers<\/li>\n<\/ul>\n<h2>How to tune (auto)vacuum<\/h2>\n<p>Vacuuming is vital for PostgreSQL databases well-being, and autovacuum should never be turned off unless there are really exceptional and unusual circumstances. At the same time, autovacuum should be always tuned for specific environment needs.<\/p>\n<p>Tuning <code>autovacuum<\/code> is challenging, because we need to take into account the vacuuming speed, level of I\/O, and blocking.\u00a0To start with, the default set of vacuum-related parameters would work adequately. After some time passes, check system bloat and if it appears to be high and adjust the <code>autovacuum<\/code> settings. What should be considered a high bloat depends on many factors.<\/p>\n<p>On average, bloat below 20% is considered normal. For larger tables, 10% maybe considered a significant bloat, while for small tables, even 50% bloat maybe fine. If the bloat does not result in visible performance degradation, there is no pressing need to address it.<\/p>\n<h3>How to determine if your tables are bloated<\/h3>\n<p>There are examples of such queries that can be found in many PostgreSQL blogs and company\u2019s websites. Many of the solutions require extensions, if you are up to installing additional extensions to monitor your bloat, you can use the following:<\/p>\n<pre class=\"lang:tsql theme:ssms2012-simple-talk\">CREATE EXTENSION pgstattuple;<\/pre>\n<p>This is one of many extensions provided with PostgreSQL <code>contrib<\/code> package, so you do not need to download anything, just execute the <code>CREATE<\/code> statement. The available functions are documented in PostgreSQL documentation.<\/p>\n<p>If you do not want to install any extensions, the following queries will provide good estimates:<\/p>\n<p><a href=\"https:\/\/github.com\/pgexperts\/pgx_scripts\/blob\/master\/bloat\/table_bloat_check.sql\">table_bloat_check.sql<\/a><\/p>\n<p><a href=\"https:\/\/github.com\/pgexperts\/pgx_scripts\/blob\/master\/bloat\/index_bloat_check.sql\">index_bloat_check.sql<\/a><\/p>\n<p>If your system is even somewhat active, it is essential that you monitor the bloat in your data objects.<\/p>\n<h2>How to monitor<\/h2>\n<p>Although you can find the last time when <code>vacuum<\/code>\/<code>autovacuum<\/code> and <code>analyze<\/code>\/<code>autoanalyze<\/code> were executed in the <code>pg_stat_all_tables<\/code>, we do not recommend monitoring this value. As it was mentioned in the previous section, if the updates didn\u2019t reach a threshold, the is no need for autovacuum to run. However, the <code>pg_stat_all_tables<\/code> object contains a lot of valuable information which may be very useful for evaluating the database health.<\/p>\n<p>For each table, <code>pg_stat_all_tables<\/code> contains the following information:<\/p>\n<table>\n<tbody>\n<tr>\n<td>\n<p>Column<\/p>\n<\/td>\n<td>\n<p>Description<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>seq_scan<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Number of sequential scans on the table<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>seq_tup_read<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Number of tuples read using sequential scan<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>idx_scan<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Number of index scan accesses to the table<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>idx_tup_fetch<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Number of tuples fetched using index scan<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>n_tup_ins<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Number of tuples inserted <\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>n_tup_upd<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Number of tuples updated<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>n_tup_hot_upd<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Number of \u201chot updates\u201d (updates which keep the tuple in the same block)<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>n_live_tup<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Estimated number of live tuples in the table<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>n_dead_tuples<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Estimated number of dead tuples in the table<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>n_mod_since_analyze<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Estimated number of modifications since last analayze<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>n_ins_since_vacuum\u00a0<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Estimated number of inserts since last vacuum<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>last_vacuum<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Time of the last vacuum run on that table<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>last_autovacuum<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Time of the last autovacuum run on that table<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>last_analyze<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Time of the last analyze run on this table<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>last_autoanalyze<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Time of the last autoanalyze run on that table<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>vacuum_count<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Total number of vacuum runs on the table<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>auto_vacuum_count<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Total number of autovacuum runs on the table<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>analyze_count<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Total number of analyze runs on the table<\/code><\/p>\n<\/td>\n<\/tr>\n<tr>\n<td>\n<p><strong><code>autoanalyze_count<\/code><\/strong><\/p>\n<\/td>\n<td>\n<p><code>Total number of autoanalyze runs on the table<\/code><\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>As you can see, this table provides valuable information about the dynamics of each table. However, the only way to see whether the autovacuum is tuned correctly is to run <strong><em>bloat checking queries <\/em><\/strong>on regular basis.<\/p>\n<h2>Conclusion<\/h2>\n<p>It is not uncommon that new PostgreSQL users and even DBAs do not pay much attention to tuning autovacuum and monitoring tables and indexes bloat. This happens since these concepts are Postgres-specific and are not in the mental checklist of database professionals with previous experience elsewhere.<\/p>\n<p>Thereby, it is important to educate yourself about the role vacuum and analyze play in the PostgreSQL databases well-being.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Welcome to the second blog of the \u201cmagic of parameters\u201d series. In the first entry, I covered memory parameters, and in this article. In this article will talk about PostgreSQL configuration parameters which manage the (auto)vacuum and (auto)analyze background processes. Why vacuuming is necessary? Before we start talking about vacuum and analyze-related parameters, we need&#8230;&hellip;<\/p>\n","protected":false},"author":341426,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[53,143534],"tags":[159066],"coauthors":[158986],"class_list":["post-96753","post","type-post","status-publish","format-standard","hentry","category-featured","category-postgresql","tag-postgresql-101-webinar-sidebar"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96753","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/341426"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=96753"}],"version-history":[{"count":9,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96753\/revisions"}],"predecessor-version":[{"id":96764,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/96753\/revisions\/96764"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=96753"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=96753"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=96753"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=96753"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}